-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
[MRG] Optimize 3D display (continued) #11577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ready to be reviewed.
The CI passes and I tried to respond to all comments from the parent PR below. Some comments that suggested removing added changes were also taken into account even if there is no corresponding comment in the diff.
cc @WeatherGod
For a somewhat noisy and large iso-surface plotted with plot_trisurf
of 95000 vertices (image below), this PR results in a x2.1 speed-up (10 s from 21 s) when compared to master. I imagine for lighter plots it won't be as significant.
Please let me know if I can do anything to make the review easier. Thanks!
lib/mpl_toolkits/mplot3d/art3d.py
Outdated
@@ -169,12 +182,15 @@ def line_2d_to_3d(line, zs=0, zdir='z'): | |||
|
|||
def path_to_3d_segment(path, zs=0, zdir='z'): | |||
"""Convert a path to a 3D segment.""" | |||
seg3d = np.ones((3, len(path))) | |||
# Works both if zs is and array and a scalar | |||
seg3d[2, :] *= zs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seg3d[2, :] = zs
Done
#6085 (comment) by @eric-wieser
If you do that, use np.empty above too
When using np.empty
tests fail (I'm not quite sure why) so I kept the np.ones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't look done to me - you have *=
, not =
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, missed that -- this was also the reason for np.empty
not working. Fixed both.
lib/mpl_toolkits/mplot3d/art3d.py
Outdated
for path, pathz in zip(paths, zs)) | ||
segments, codes_list = zip(*path_generator) | ||
|
||
return np.asarray(segments), np.asarray(codes_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we use a preallocated array here?
I'm not sure setting a numpy array element by element would be faster than initializing it from a sequence. Rewrote the above code to use list comprehension.
lib/mpl_toolkits/mplot3d/art3d.py
Outdated
self._segments3d = np.asanyarray(segments) | ||
self._segments3d = [] | ||
if len(segments) > 0: | ||
self._seg_sizes = [len(c) for c in segments] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.full(len(segments3d), len(c))
Unless I'm mistaken it's not equivalent. We want the length for each segment (which can be variable), while in np.full
the second argument (fill_value
) can only be scalar.
#6085 (comment) by @WeatherGod
np.full did not come about until fairly recently, so I don't think we can
use it yet in matplotlib's codebase.
Currently in master
__version__numpy__ = '1.10.0' # minimum required numpy version
which does support np.full
: I have used it in other parts of this PR
|
||
self._segments3d_data = np.empty((n_segments, 4)) | ||
self._segments3d_data[:, :3] = np.vstack(segments) | ||
self._segments3d_data[:, 3] = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#6085 (comment) by @eric-wieser
This would be marginally better as
self._segments3d_data = np.empty((n_segments, 4)) self._segments3d_data[:,:3] = segments self._segments3d_data[:,3] = 1
Done.
Also, are you sure you mean "quaternion" and not "homogenous coordinate"?
Not sure, but I removed the "quaternion" term as suggested.
|
||
# For coveniency, store a view of the array in the original shape | ||
self._segments3d = _array_split(self._segments3d_data[:, :3], | ||
np.cumsum(self._seg_sizes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib/mpl_toolkits/mplot3d/proj3d.py
Outdated
return txs, tys, tzs, tis | ||
vecw /= vecw[3] | ||
# Integrating tis in the numpy array for optimization purposes | ||
vecw[3, :] = tis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#6085 (comment) by @eric-wieser
Get rid of this colon - it makes the function fail on 1d inputs
I have not changed it as I think this will work with scalar and 1D arrays, also CI pass. Unless I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If vecw.ndim == 1
, then vec[3,:]
is an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand - fixed.
vecw /= vecw[3] | ||
# Integrating tis in the numpy array for optimization purposes | ||
vecw[3, :] = tis | ||
return vecw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#6085 (review) by @eric-wieser
Storing coordinates in the first axis of the array is kind of unusual (see how np.linalg and scipy.misc.imread behave).
Perhaps these should all be xyz[..., 0] instead of xyz[0]?
Well in the current case xyz[0]
for default C aligned arrays will produce a contiguous array while xyz[..., 0]
won't unless I'm mistaken, which might have a performance cost (I haven't tested it).
Is it essentially the difference between np.meshgrid
with indexing="xy"
and indexing="ij"
?
I have not changed this, if you think this is critical I will.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct about contiguity, but the choice of internal storage order and external indexing order is orthogonal.
I'll have to come back to this later to decide if it's important
Going back to this PR, I would really like to get this merged. Below is a summary of the situations on benchmarks, API changes, and tests. Benchmarksimport sys
from pathlib import Path
import cProfile
import pstats
import io
import numpy as np
from skimage.measure import marching_cubes_lewiner
from time import time, sleep
from matplotlib.backends.backend_qt5agg import FigureCanvas
from matplotlib.figure import Figure
import matplotlib.pyplot as plt
from PyQt5 import QtWidgets, QtCore
from mpl_toolkits.mplot3d import Axes3D # noqa
profile = False
class FigureWidget(QtWidgets.QWidget):
def __init__(self, parent=None, canvas=None):
super().__init__(parent)
self.canvas = canvas
self.layout = QtWidgets.QVBoxLayout()
if canvas is not None:
self.layout.addWidget(self.canvas)
self.setLayout(self.layout)
class ExampleCanvas(FigureCanvas):
def __init__(self):
self.figure = Figure()
FigureCanvas.__init__(self, self.figure)
self.ax = self.figure.add_subplot(111, projection="3d")
def plot(self, X=None):
if X is None:
self.draw()
return
self.ax.clear()
print(f"# Input dataset shape : {X.shape}")
t0 = time()
verts, faces, normals, values = marching_cubes_lewiner(
X, level=X.max() * 0.5
)
dt = time() - t0
print(f" - marching_cubes_lewiner: {verts.shape[0]} vertices in {dt:.2f}s")
if profile:
pr = cProfile.Profile()
pr.enable()
t0 = time()
self.ax.plot_trisurf(
verts[:, 0], verts[:, 1], faces, verts[:, 2], color="r", antialiased=True
)
self.draw()
plt.pause(0.001)
dt = time() - t0
print(f" - plot_trisurf in {dt:.3f}s")
t0 = time()
angles = list(range(0, 360, 72))
for angle in angles:
self.ax.view_init(30, angle)
self.draw()
plt.pause(0.001)
dt = time() - t0
print(f" - rotation {dt/len(angles):.2f} s/frame")
if profile:
pr.disable()
s = io.StringIO()
ps = pstats.Stats(pr, stream=s)
ps.sort_stats('tottime')
# ps.print_stats()
ps.print_callers()
print(s.getvalue())
if __name__ == "__main__":
app = QtWidgets.QApplication(sys.argv)
canvas = ExampleCanvas()
widget = FigureWidget(canvas=canvas)
canvas.plot()
basedir = Path(__file__).parent
app = QtWidgets.QApplication(sys.argv)
canvas = ExampleCanvas()
widget = FigureWidget(canvas=canvas)
canvas.plot()
widget.show()
class BenchmarkThread(QtCore.QThread):
def run(self):
sleep(1)
X = np.load(basedir / "data_3d.npy")
X = np.abs(X)
for step in [4, 2, 1]:
Xsl = X[::step, ::step, :]
canvas.plot(Xsl)
thread = BenchmarkThread()
thread.start()
sys.exit(app.exec_()) run on this 3D dataset data_3d.npy (130 MB). The results are below,
This PR
so this speeds up Handle API changesThis PR changes quite a vew functions to return numpy arrays instead of lists. If I follow @anntzer 's suggestion (#11577 (comment)), it would mean deprecating,
then because we cange
I'm not sure what you think the ideal public API of Fix CISo CI was passing fine a few commits ago, but then after I merged master, I started to get two test failures,
For instance, would you have suggestions where this could be coming from? Apart for that I think I addressed most comments, apart for possible performance improvements in https://github.com/matplotlib/matplotlib/pull/11577/files/ad0e783d17be4dce4d3ee6d84807d388a3bbde14#diff-fc54f86f287bf203dfa9e78102e2678b, #11577 (review) by @eric-wieser . I think any additional performance improvements should be done in a follow up PR. Now I'm mostly concerned about the API deprecations and making CI pass, in order to make this mergable. |
I don't know if we need to completely deprecate all those functions. What if instead of deprecating them, we modify them to output lists if the input was a list (and deprecate that feature as we transition to allow only numpy arrays)? Note, I am not against changing the API, because these benefits are very compelling, and the list-based approach is so outdated. I am just trying to get a feel for our options. As for the test failure, the image difference is with the bar with no height. Perhaps there is a numerical instability somewhere that is causing the normal vector to flip? |
For the bar with no height, it's possible that you reversed the order of a list of polygons somewhere, but numerical instability would also make sense. I think in that case it would be fine to change the expected image |
if arr_chunks[-1].size == 0: | ||
if not remove_empty: | ||
# Preserve the 2D dimensionality of the last chunk that can be | ||
# empty (numpy <=1.10 replaces empty chunks by a 1D empty array) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we require numpy>=1.11 now.
Most of these functions are clearly helpers that should never have been part of the public API to start with. More importantly, adding code to each and every one of them to make them check whether the input is a list and return a list in that case (and then immediately deprecating that) is just code churn; plus it still breaks backcompat if someone was passing an array in and expecting a list out (which was the previous behavior), so it doesn't actually help anyone. |
Thanks for the feedback. Made a PR to deprecate these helper functions in #13030 |
I beleive @anntzer changed the test to no longer run into this issue |
Rebase conflicts are likely with #16472 |
Sorry I don't have the availability to continue working on this. So this PR can be marked as stalled (or a more appropriate label). If someone wants to continue it, they should certainly feel free to do so. |
Given that this has been stalled for 2 years and we have had a fair amount of refactoring to the mplot3d code base I am going to close this, however it may be worth re-considering the ideas in these changes to improve performance. Thankyou for your effort on this @rth , @eric-wieser and @Kojoley attn @scottshambaugh |
PR Summary
Faster 3D projections using numpy arrays instead if built-in Python objects
Continued #6085 (squashed and rebased WeatherGod's rebased version from 2017).
PR Checklist
New features are documented, with examples if plot related-> not applicable