Revise scatter_masked.py #24409
base: main
@@ -3,30 +3,77 @@
Scatter Masked
==============

Mask some data points and add a line demarking
masked regions.
A NumPy masked array (see `numpy.ma`) can be passed to `.Axes.scatter` or
`.pyplot.scatter` as the value of the *s* parameter in order to exclude certain
data points from the plot.

This example uses this technique so that data points within a particular radius
are not included in the plot.

"""
import matplotlib.pyplot as plt
import numpy as np

# Fixing random state for reproducibility
# Fix random state for reproducibility
np.random.seed(19680801)


# Create N random (x, y) points
N = 100
r0 = 0.6
x = 0.9 * np.random.rand(N)
y = 0.9 * np.random.rand(N)
area = (20 * np.random.rand(N))**2  # 0 to 10 point radii
c = np.sqrt(area)
r = np.sqrt(x ** 2 + y ** 2)
area1 = np.ma.masked_where(r < r0, area)
area2 = np.ma.masked_where(r >= r0, area)
plt.scatter(x, y, s=area1, marker='^', c=c)
plt.scatter(x, y, s=area2, marker='o', c=c)
# Show the boundary between the regions:
theta = np.arange(0, np.pi / 2, 0.01)
plt.plot(r0 * np.cos(theta), r0 * np.sin(theta))

# Create masked size array based on calculation of x and y values
size = np.full(N, 36)
radius = 0.6
masked_size = np.ma.masked_where(radius > np.sqrt(x ** 2 + y ** 2), size)

# Plot data points using masked array
subplot_kw = {
    'aspect': 'equal',
    'xlim': (0, max(x)),
    'ylim': (0, max(y))
}
fig, ax = plt.subplots(subplot_kw=subplot_kw)
ax.scatter(x, y, s=masked_size, marker='^', c="mediumseagreen")

# Show the boundary between the regions
theta = np.arange(-0, np.pi * 2, 0.01)
circle_x = radius * np.cos(theta)
circle_y = radius * np.sin(theta)
ax.plot(circle_x, circle_y, c="black")

plt.show()

###############################################################################
# This technique can also be used to plot a decision boundary, rather than
# masking certain data points so that they don't appear at all. This example
# uses the same data points and boundary as the example above, this time in the
# style of a decision boundary.

jklymak marked this conversation as resolved.

Comment on lines +46 to +52

I would tend to just be specific about what you are doing, but use general language. If you want the technical term you think people will search on, by all means include it. Maybe something like:

Suggested change

Ok gotcha, you're saying to use more general language to describe what the plotting code is doing. Not to explain the ML concept in more general language. That makes sense!
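
For anyone following this thread, the two masks in the revised example are complements of each other, so each point is drawn by exactly one of the two `scatter` calls. The sketch below checks that property in isolation. The four sample points are made up for illustration, and the names `distance`, `outside_size` and `inside_size` are not from the PR.

```python
import numpy as np

# Hypothetical sample points, chosen by hand for illustration only.
x = np.array([0.1, 0.5, 0.8, 0.3])
y = np.array([0.2, 0.5, 0.1, 0.9])
radius = 0.6
size = np.full(x.shape, 36)

distance = np.sqrt(x ** 2 + y ** 2)

# Hide the points strictly inside the radius; the rest keep their size.
outside_size = np.ma.masked_where(distance < radius, size)
# Hide the complement: points on or beyond the radius.
inside_size = np.ma.masked_where(distance >= radius, size)

# Boolean arrays that are True wherever an entry is masked (hidden).
outside_hidden = np.ma.getmaskarray(outside_size)
inside_hidden = np.ma.getmaskarray(inside_size)

# Each point should be hidden in exactly one of the two arrays.
print(np.all(outside_hidden ^ inside_hidden))
```

Passing `outside_size` and `inside_size` as the *s* argument of two separate `scatter` calls would then give each group its own marker, which is what the second figure in the diff does.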

# Create a masked array for values within the radius
masked_size_2 = np.ma.masked_where(radius <= np.sqrt(x ** 2 + y ** 2), size)

# Plot solution regions
fig, ax = plt.subplots(subplot_kw=subplot_kw)
ax.patch.set_facecolor('#D8EFE2')  # equivalent of 'mediumseagreen', alpha=0.2
ax.fill(circle_x, circle_y, color='#FFF7CC')  # equivalent of 'gold', alpha=0.2

# Plot data points using two different masked arrays
ax.scatter(x, y, s=masked_size, marker='^', c='mediumseagreen')
ax.scatter(x, y, s=masked_size_2, marker='o', c='gold')

# Plot boundary
ax.plot(circle_x, circle_y, c='black')

plt.show()

###############################################################################
#
# .. admonition:: References
#
#    The use of the following functions, methods, classes and modules is shown
#    in this example:
#
#    - `matplotlib.axes.Axes.scatter` / `matplotlib.pyplot.scatter`
#    - `matplotlib.axes.Axes.plot` / `matplotlib.pyplot.plot`
#    - `matplotlib.axes.Axes.fill` / `matplotlib.pyplot.fill`
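
The diff above builds one `subplot_kw` dictionary and passes it to both `plt.subplots` calls so that the two figures share the same axes properties. A minimal sketch of that pattern follows, next to the equivalent explicit `Axes.set` call. The limits `(0, 1)` are placeholders rather than the PR's `max(x)` and `max(y)`, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt

# Axes properties shared by both figures; the limits here are placeholders.
subplot_kw = {"aspect": "equal", "xlim": (0, 1), "ylim": (0, 1)}

# Both axes are created with identical properties from the same dictionary.
fig1, ax1 = plt.subplots(subplot_kw=subplot_kw)
fig2, ax2 = plt.subplots(subplot_kw=subplot_kw)

# The same result written out with an explicit setter call after creation.
fig3, ax3 = plt.subplots()
ax3.set(aspect="equal", xlim=(0, 1), ylim=(0, 1))

print(ax1.get_xlim(), ax2.get_xlim(), ax3.get_xlim())
```

The dictionary form keeps the shared settings in one place, which helps when the same limits must apply to several figures.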

I didn't know we could do this 😄 and I would maybe not put the comment here 'cause I think it makes it seem like this line has something to do w/ the masking specifically

It's a nice clean approach if you want to create two different plots with the same axes properties, rather than calling methods to set them 🙂
Fair point, what if I moved the comment like this: