DOC remove obsolete SVM example #27108

Merged
merged 5 commits into from
May 19, 2024
1 change: 1 addition & 0 deletions doc/conf.py
@@ -301,6 +301,7 @@
     "auto_examples/decomposition/plot_beta_divergence": (
         "auto_examples/applications/plot_topics_extraction_with_nmf_lda"
     ),
+    "auto_examples/svm/plot_svm_nonlinear": "auto_examples/svm/plot_svm_kernels",
Member

As noted in #26972 (comment), the XOR problem is the canonical example for non-linear decision functions.

I agree that auto_examples/svm/plot_svm_kernels illustrates similar motivations but I would rather redirect to an example that also includes the canonical XOR, maybe with more models / kernels as suggested by Guillaume.

Member Author

@ogrisel Do you think these days this example is still relevant? I would rather people read more realistic examples instead. Is there another example you prefer to be here instead?

I also don't really see how users would actually benefit from this redirection. Our example pages don't link to the same example in the new release. The only link we provide is to the home page of the latest release, not to the same page in the latest release. So I'd be happy to remove these redirections altogether.

Member

I was thinking about adding XOR to classifier comparison, wdyt?

Member

I would be inclined to keep this dataset because it is still an interesting synthetic one. The classifier comparison example is probably the right place for it. For instance, it is one of the default datasets in this playground (https://playground.tensorflow.org).

Member

So let's remove this one, and put the dataset into an existing one. We really have too many examples!

     "auto_examples/ensemble/plot_adaboost_hastie_10_2": (
         "auto_examples/ensemble/plot_adaboost_multiclass"
     ),
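
Editorial aside on the redirect entries discussed in this thread: a mapping like the one in doc/conf.py pairs a removed page with its replacement. The sketch below is a minimal illustration of how such a mapping could be turned into HTML stub pages. The names `redirects` and `redirect_stub` and the meta-refresh approach are assumptions made for illustration, not a description of how the scikit-learn documentation build actually consumes these entries.

```python
import posixpath

# Hypothetical mapping in the style of the doc/conf.py entries shown in the
# diff: old page path (without extension) -> replacement page path.
redirects = {
    "auto_examples/svm/plot_svm_nonlinear": "auto_examples/svm/plot_svm_kernels",
}


def redirect_stub(old_page: str, new_page: str) -> str:
    """Return a small HTML page that forwards ``old_page`` to ``new_page``."""
    # Express the target relative to the directory of the old page so the
    # stub does not depend on where the documentation is hosted.
    old_dir = posixpath.dirname(old_page)
    target = posixpath.relpath(new_page, start=old_dir) + ".html"
    return (
        "<!DOCTYPE html>\n"
        f'<meta http-equiv="refresh" content="0; url={target}">\n'
        f'<link rel="canonical" href="{target}">\n'
    )


# One stub per removed page, keyed by the file name it would be written to.
stubs = {old + ".html": redirect_stub(old, new) for old, new in redirects.items()}
```

A stub like this only helps a reader who lands on the old URL, which matches the point raised in the thread: without a per-page link between releases, few readers would ever follow such a redirect.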
59 changes: 45 additions & 14 deletions examples/svm/plot_svm_kernels.py
@@ -110,12 +110,15 @@
 from sklearn.inspection import DecisionBoundaryDisplay


-def plot_training_data_with_decision_boundary(kernel):
+def plot_training_data_with_decision_boundary(
+    kernel, ax=None, long_title=True, support_vectors=True
+):
     # Train the SVC
     clf = svm.SVC(kernel=kernel, gamma=2).fit(X, y)

     # Settings for plotting
-    _, ax = plt.subplots(figsize=(4, 3))
+    if ax is None:
+        _, ax = plt.subplots(figsize=(4, 3))
     x_min, x_max, y_min, y_max = -3, 3, -3, 3
     ax.set(xlim=(x_min, x_max), ylim=(y_min, y_max))

@@ -136,20 +139,26 @@ def plot_training_data_with_decision_boundary(kernel):
         linestyles=["--", "-", "--"],
     )

-    # Plot bigger circles around samples that serve as support vectors
-    ax.scatter(
-        clf.support_vectors_[:, 0],
-        clf.support_vectors_[:, 1],
-        s=250,
-        facecolors="none",
-        edgecolors="k",
-    )
+    if support_vectors:
+        # Plot bigger circles around samples that serve as support vectors
+        ax.scatter(
+            clf.support_vectors_[:, 0],
+            clf.support_vectors_[:, 1],
+            s=150,
+            facecolors="none",
+            edgecolors="k",
+        )

     # Plot samples by color and add legend
-    ax.scatter(X[:, 0], X[:, 1], c=y, s=150, edgecolors="k")
+    ax.scatter(X[:, 0], X[:, 1], c=y, s=30, edgecolors="k")
     ax.legend(*scatter.legend_elements(), loc="upper right", title="Classes")
-    ax.set_title(f" Decision boundaries of {kernel} kernel in SVC")
+    if long_title:
+        ax.set_title(f" Decision boundaries of {kernel} kernel in SVC")
+    else:
+        ax.set_title(kernel)

-    _ = plt.show()
+    if ax is None:
+        plt.show()


 # %%
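
A note on the `ax=None` pattern in the hunk above: as the diff reads, `ax` is rebound to freshly created axes near the top of the function, so the closing `if ax is None:` check would never be true and `plt.show()` would not run from inside the function. The following is a minimal sketch of one way to keep both behaviours, assuming a hypothetical helper `plot_points` (not part of the PR) and the headless `Agg` backend so that it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def plot_points(points, ax=None, title="points"):
    """Scatter ``points`` on ``ax``; create and show a figure if none is given.

    Hypothetical helper used only to illustrate the pattern.
    """
    # Record whether the caller supplied axes *before* rebinding ``ax``.
    owns_figure = ax is None
    if owns_figure:
        _, ax = plt.subplots(figsize=(4, 3))

    xs, ys = zip(*points)
    ax.scatter(xs, ys, s=30, edgecolors="k")
    ax.set_title(title)

    # Only the call that created the figure is responsible for showing it.
    if owns_figure:
        plt.show()
    return ax


# Standalone call: the helper creates (and shows) its own figure.
single_ax = plot_points([(0, 0), (1, 1)])

# Grid call: the caller owns the figure and decides when to show it.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
left = plot_points([(0, 1), (1, 0)], ax=axes[0], title="left")
right = plot_points([(0, 0), (1, 1)], ax=axes[1], title="right")
```

Storing the flag in a separate variable keeps the two decisions (create axes, show figure) tied to the same condition even after `ax` has been reassigned.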
@@ -237,7 +246,6 @@ def plot_training_data_with_decision_boundary(kernel):
 # using the hyperbolic tangent function (:math:`\tanh`). The kernel function
 # scales and possibly shifts the dot product of the two points
 # (:math:`\mathbf{x}_1` and :math:`\mathbf{x}_2`).
-
 plot_training_data_with_decision_boundary("sigmoid")

 # %%
@@ -271,3 +279,26 @@ def plot_training_data_with_decision_boundary(kernel):
 # parameters using techniques such as
 # :class:`~sklearn.model_selection.GridSearchCV` is recommended to capture the
 # underlying structures within the data.
+
+# %%
+# XOR dataset
+# -----------
+# A classical example of a dataset which is not linearly separable is the XOR
+# pattern. Here we demonstrate how different kernels work on such a dataset.
+
+xx, yy = np.meshgrid(np.linspace(-3, 3, 500), np.linspace(-3, 3, 500))
+np.random.seed(0)
+X = np.random.randn(300, 2)
+y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)
+
+_, ax = plt.subplots(2, 2, figsize=(8, 8))
+args = dict(long_title=False, support_vectors=False)
+plot_training_data_with_decision_boundary("linear", ax[0, 0], **args)
+plot_training_data_with_decision_boundary("poly", ax[0, 1], **args)
+plot_training_data_with_decision_boundary("rbf", ax[1, 0], **args)
+plot_training_data_with_decision_boundary("sigmoid", ax[1, 1], **args)
+plt.show()
+
+# %%
+# As you can see from the plots above, only the `rbf` kernel can find a
+# reasonable decision boundary for the above dataset.
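
The claim that only `rbf` finds a reasonable boundary on XOR can also be checked numerically rather than only visually. The sketch below generates XOR data the same way as the example, fits one `SVC` per kernel with `gamma=2` as in the example, and scores each model on a freshly drawn sample. The held-out sample, the `make_xor` helper, and the choice of accuracy as the metric are assumptions added here and are not part of the PR.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)


def make_xor(n_samples):
    """Draw 2-D Gaussian points labelled by the XOR of their coordinate signs."""
    X = rng.randn(n_samples, 2)
    y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)
    return X, y


X_train, y_train = make_xor(300)
X_test, y_test = make_xor(1000)  # assumption: extra sample for held-out scoring

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel, gamma=2).fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)

for kernel, score in scores.items():
    print(f"{kernel:>8}: {score:.2f}")
```

If the plots are representative, `rbf` should score well above the other kernels, though the exact numbers depend on the seed and the library version.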
45 changes: 0 additions & 45 deletions examples/svm/plot_svm_nonlinear.py

This file was deleted.
