Building interactive demos for examples #24878
@adrinjalali I think you're wrong in changing the title. Embedding the demos themselves is not about the Hugging Face Hub but about having interactive demos within the sklearn docs (see the Kornia docs for example). I updated the issue for better clarity 🙂
@merveenoyan where do these demos run? Don't they then run on a Space? If yes, then it's a Hub-related issue. We don't have the compute backend to actually run the demos on our servers (of which we don't have many). We could think of having them as pyodide though. What I'm saying is: who's maintaining the demo code, and who's maintaining the compute servers/infra? If the answers to those questions are people at Hugging Face, then it's a Hub thing.
@adrinjalali I see, but this is something done for the sklearn docs in the end, and that is what I mean 🙂 Wanted to ask @ArturoAmorQ for their opinion 🙂
Thanks for the demos @merveenoyan, they look great and they can really help the reader gain insights about the impact of hyperparameters on model behavior.

However, as of now the scikit-learn HTML documentation has no dependency on external online services: it can be generated and browsed locally without any internet connection. Furthermore, any code change in scikit-learn that breaks an example can be detected on our CI when we generate the HTML doc, which will not be the case for code hosted and executed on a third-party platform. So I am a bit worried about starting to do this.

I would love to have interactive demos directly in the scikit-learn documentation. However, I would prefer if we could find a way that does not rely on Python code execution on a remote server (as gradio requires), for the reason I mentioned above. Off the top of my head, I see the following two options:
I would like to probe the opinion of other users, contributors and maintainers. Maybe using the Gradio / Hugging Face Space combo is fine compared to the alternatives, which have their own limitations at this time. If maintainers do not want to deal with a doc dependency on remotely maintained code demos on Hugging Face Spaces, then at least it would be worth publishing a blog post with those demos included on https://blog.scikit-learn.org (via https://github.com/scikit-learn/blog).
BTW, for the classifier comparison demo it would be great to allow the user to interactively "brush" the 2d dataset in addition to loading one of the standard datasets such as blobs, two moons & co. The user would be able to use the mouse to click on a region of the 2d canvas to add random points of a selected class (red vs blue) around the position of the cursor at the time of the click, and the longer you hold, the higher the density of the points. I am sure I saw a demo in that spirit on Twitter a while ago but I cannot find it.
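For illustration, here is a minimal sketch of the point-generation logic behind such a brushing interaction. The click coordinates, hold duration and class label are assumed to come from whatever canvas widget ends up being used; all names here are hypothetical:

```python
import numpy as np

def brush_points(cx, cy, hold_seconds, label, rate=20.0, spread=0.3, rng=None):
    """Sample points of a given class around a click position.

    The number of points grows with how long the mouse button is held,
    and the points scatter around the cursor with a Gaussian spread.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_points = max(1, int(rate * hold_seconds))  # longer hold -> denser cloud
    xy = rng.normal(loc=[cx, cy], scale=spread, size=(n_points, 2))
    labels = np.full(n_points, label)  # e.g. 0 = red, 1 = blue
    return xy, labels

# Example: a 0.5 s click at (1.0, -0.5) adds ~10 points of class 0.
X_new, y_new = brush_points(1.0, -0.5, hold_seconds=0.5, label=0)
```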
I think having interactive examples really helps people get a feeling for a concept by allowing them to test their understanding a la "I think doing X will make change Y", then doing X and comparing what really happened to their expectation.

I think something that is embedded in the docs, like ipywidgets or pyodide, would be the best solution. It means the docs don't depend on an additional service (availability is the product of the individual uptimes, so more services almost always means lower availability: two services at 99% uptime each give only 0.99 × 0.99 ≈ 98%) and there are no data protection questions that need answering (to directly embed many social sharing widgets or YouTube content, you need to understand the privacy/regulatory implications).

An approach other projects take is to link out to other community-maintained blog posts, examples and tutorials. Maybe this is a model for these examples as well? It makes it clear to the reader that they are not maintained by the project itself, but are somehow noteworthy because they are linked to. It does make it harder to detect that they have drifted from what the docs describe, or broken :-/

I'd invest in pyodide, maybe by starting with examples that don't lead to errors. My assumption being that in the examples you'd not necessarily give people a full shell/editor experience but something graphical as in the examples above, so you can constrain users to the part of scikit-learn that works.

Another example is the demos on https://spacy.io/. They are code based and rely on an additional service (mybinder.org). spaCy seem happy with this setup, but see my comment above regarding "availability is the product of the individual uptimes".
Thanks for the initiative @merveenoyan! I do agree that interactive examples are very effective in generating intuitions and understanding. I think linking to the demos is doable as long as we make it clear to the reader that they are noteworthy but not maintained by the project itself, as mentioned by @betatim. This should probably include a disclaimer about the privacy/regulatory implications. Regarding @ogrisel's comment:
I don't fully agree, because we do link to Wikipedia and scientific papers, but I guess the spirit of the comment is more in line with the idea that pre-calculated demos would be better in terms of usability and even computational resources. Doing so would then be the responsibility of the external maintainers. The only concern about adding external links (even with a disclaimer) is that it may set a precedent that people may use opportunistically to demand linking to their blog posts/projects as long as the content is "relevant", instead of actually contributing to the development of the existing content. In any case, @ogrisel's solution has my +1:
For the moment I encourage other users, contributors and maintainers to give their opinion!
That is true. I think you can deal with this via PR reviews and technology, as well as an explicit statement about inclusion and spam. A good thing about linking is that it requires little up-front investment: you can start doing it relatively quickly, and if there is a flood of link-spam PRs you can also stop quite quickly again. At least compared to getting pyodide working well :D
Is it possible to add links in the examples to their associated demos? @betatim @ArturoAmorQ It makes sense for someone to see the example and then try the demo if they find it interesting. This is what we do with Keras examples; you can see one here. Also see below for how they look. It's actually just inside simple markdown with a badge 🙂 some people have directly put a markdown link. @ogrisel we have a canvas component that we could use for a 2D space of data points 🙂
@ogrisel @betatim @ArturoAmorQ we're discussing pyodide support for Gradio. I'll update you on this once everything's all clear 🙂
Someone pointed me to https://jupyterlite-sphinx.readthedocs.io/en/latest/directives/replite.html, which is a sphinx plugin that lets you have pyodide in the docs. Maybe this is a thing to investigate for the examples, or even for inline code in the docs.
For the scikit-learn examples, some work is needed to integrate JupyterLite into sphinx-gallery; see sphinx-gallery/sphinx-gallery#977 for an attempt by Andreas, for example (full disclosure: I haven't looked at it in detail). Also, there are caveats down the road:
A related issue that mentions thebe-light. Might be worth keeping an eye on, as jupyterbook is based on sphinx (I think). And thebe is an "old" project (probably the OG?!) related to making interactive computation possible (with support for widgets, etc., etc.).
Hey folks! 👋 While waiting for Gradio to integrate pyodide, we decided to kick off a community event to build the demos. For this, we drafted this guiding document (for now, I allowed you to leave comments too if you have any!). We would like to do it in collaboration with scikit-learn and were wondering if you would like to be involved?
FYI, an update on my earlier comment: the dev website examples now have a JupyterLite button. For example, you can run this example inside JupyterLite, or find your favourite example from the gallery and click on the JupyterLite button! For more details about the implementation, you can have a look at #25887.
So we kinda need a decision here: with the inclusion of the JupyterLite link in our docs now, do we think we'd like to do more in terms of interactivity?
> So we kinda need a decision here: with the inclusion of the JupyterLite link in our docs now, do we think we'd like to do more in terms of interactivity?

I would say: if it's client side (e.g. JavaScript) and very well integrated with our stack, yes.

The benefit of JupyterLite is that it requires pretty much no infrastructure (just making sure that scikit-learn works in the pyodide stack) and no changes to our examples.

Can we get interactivity that is also easy to maintain (integrated in the test suite) and to deploy?
It seems there's a port of Streamlit (stlite) which would allow us to do that; here's a demo: https://discuss.streamlit.io/t/new-library-stlite-a-port-of-streamlit-to-wasm-powered-by-pyodide/25556/28
@ArturoAmorQ would you have bandwidth to check how easy it would be to do a demo with stlite?
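To make the idea concrete, here is a minimal sketch of the kind of Streamlit script stlite can run in the browser (the dataset and estimator below are arbitrary placeholders, not an existing scikit-learn example; stlite executes ordinary Streamlit scripts via pyodide):

```python
# Hypothetical Streamlit app: a slider drives a k-NN fit and reports accuracy.
import streamlit as st
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier

st.title("k-NN on two moons")
n_neighbors = st.slider("n_neighbors", min_value=1, max_value=15, value=5)

# Re-fit the classifier each time the slider changes.
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
clf = KNeighborsClassifier(n_neighbors=n_neighbors).fit(X, y)

st.write(f"Training accuracy: {clf.score(X, y):.2f}")
```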
Hey all! Omar from Hugging Face here 🤗 In the recently co-organized community sprint, community members built ~80 Gradio-based apps based on scikit-learn. The Gradio team, including @whitphx (creator of stlite),
As soon as pyodide support for gradio is released (even as a tech preview), it would be interesting to open a DRAFT PR to prototype a possible integration of a small gradio app into a sphinx page of the doc. This way the scikit-learn CI will automatically generate the rendered HTML and we will be able to see the end result. I am not sure if such interactive examples necessarily need to be treated as
We could even think of including interactive demos directly in the user guide (as we do for generated
If we are to have interactive demos on the website, that means adding a dependency on those libraries, and having some contributors understand the framework we're adding. So there are a few open points:
I think we should be careful about what we add here, since in the long term what matters is what we and our contributors feel most comfortable working with, as well as the download size of each demo for each framework.
We might also want to explore the possibility of just using ipywidgets in regular sphinx-gallery examples. Combined with JupyterLite, that might be an interesting alternative with lighter-weight dependencies, albeit maybe at the cost of a more limited UI and the inability to be included as interactive demos in the user guide.
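As a rough illustration of that route, a minimal ipywidgets sketch (assuming a Jupyter or JupyterLite session with ipywidgets available; the estimator and dataset are arbitrary choices, not an existing example):

```python
# Minimal ipywidgets sketch: a log-scale slider re-fits an SVM on each change
# and prints the training accuracy.
import ipywidgets as widgets
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)

@widgets.interact(C=widgets.FloatLogSlider(value=1.0, base=10, min=-3, max=3))
def fit_and_report(C):
    clf = SVC(C=C).fit(X, y)
    print(f"C={C:.3g} -> training accuracy {clf.score(X, y):.2f}")
```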
And finally there is also https://shiny.rstudio.com/py/docs/shinylive.html.
Yes, I think the
Happy to discuss this async; I want to make sure there is an objective discussion here. Gradio has 3.4M monthly pip installs and a significant, thriving user base (with big projects such as Auto111, etc.), and from 🤗 Hub usage we've seen significant, organically accelerating usage. The fact that the community was able to quite independently build 80 demos in a few days also tells something, and we're always happy to take feedback on how to improve our docs in Gradio's repo if you feel anything is lacking for sklearn use cases. In any case, we <3 Streamlit and Shiny and are collaborating closely with them; I hope one of these options (or other pyodide solutions!) works well for you!
Hey @adrinjalali @ogrisel thanks a lot for the discussion!
I like the idea of having interactive figures directly in the user guide (somehow activated by user interaction to avoid large downloads). I think being able to explore stuff as you are reading about it is super cool. It would also be a lot of work to write the guide so that it works together with the example, etc. But luckily we have quite a few people with experience in teaching. Count this as a vote for having a way to include interactive figures in the user guide.
There is another contender, namely voici, the WASM version of voila dashboards for jupyter/ipywidgets. Here is a demo: I don't know if they can be embedded in a sphinx-rendered HTML page though. We could probably use an iframe if needed, but maybe there is a better way to do it. Edit: I opened a feature request for embeddable apps: voila-dashboards/voici#79.
Good news @ogrisel, that day is today! We just released gradio-lite. I ported this demo from the scikit-learn gradio hackathon to gradio-lite: https://huggingface.co/spaces/freddyaboulton/gradio-lite-sklearn

You can run it yourself locally by pasting this into an html file and opening it in your browser:

```html
<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto&display=swap">
    <style>
      body {
        font-family: 'Roboto', sans-serif;
        font-size: 16px;
      }
      .logo {
        height: 1em;
        vertical-align: middle;
        margin-bottom: 0.1em;
      }
    </style>
    <script type="module" crossorigin src="https://cdn.jsdelivr.net/npm/@gradio/lite@0.4.1/dist/lite.js"></script>
    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@gradio/lite@0.4.1/dist/lite.css" />
  </head>
  <body>
    <h2>
      <img src="https://gradio-builds.s3.amazonaws.com/assets/lite-logo.png" alt="logo" class="logo">
      Gradio and scikit-learn running entirely in your browser thanks to pyodide!
    </h2>
    <gradio-lite>
      🔥
      <gradio-requirements>
        scikit-learn
        plotly
        numpy
      </gradio-requirements>
      <gradio-file name="app.py" entrypoint>
import numpy as np
import plotly.graph_objects as go
from sklearn import decomposition
from sklearn import datasets
import gradio as gr

np.random.seed(5)

## PCA
def PCA_Pred(x1, x2, x3, x4):
    # Load data from the iris dataset:
    iris = datasets.load_iris()
    X = iris.data
    Y = iris.target
    label_data = [("Setosa", 0), ("Versicolour", 1), ("Virginica", 2)]

    # Create the model with 3 principal components:
    pca = decomposition.PCA(n_components=3)

    # Fit the model and transform (reduce the dimensions of) the iris dataset:
    pca.fit(X)
    X = pca.transform(X)

    # Create the figure with plotly:
    fig = go.Figure()
    for name, label in label_data:
        fig.add_trace(go.Scatter3d(
            x=X[Y == label, 0],
            y=X[Y == label, 1],
            z=X[Y == label, 2],
            mode='markers',
            marker=dict(
                size=8,
                color=label,
                colorscale='Viridis',
                opacity=0.8),
            name=name
        ))

    user_iris_data = np.array([[x1, x2, x3, x4]], ndmin=2)
    # Apply the same reduction to the user data:
    pc_output = pca.transform(user_iris_data)
    fig.add_traces([go.Scatter3d(
        x=np.array(pc_output[0, 0]),
        y=np.array(pc_output[0, 1]),
        z=np.array(pc_output[0, 2]),
        mode='markers',
        marker=dict(
            size=12,
            color=4,  # set color
            colorscale='Viridis',  # choose a colorscale
            opacity=0.8),
        name="User data"
    )])
    fig.update_layout(
        scene=dict(
            xaxis_title="1st PCA Axis",
            yaxis_title="2nd PCA Axis",
            zaxis_title="3rd PCA Axis"),
        legend_title="Species"
    )
    return [pc_output, fig]

title = "PCA example with Iris Dataset 🌺"
with gr.Blocks(title=title) as demo:
    gr.Markdown(f"## {title}")
    gr.Markdown(
        """
        The following app is a demo for PCA decomposition. It takes 4 dimensions as input, in reference \
        to the following image, and returns the transformed first three principal components (feature \
        reduction), taken from a pre-trained model with the Iris dataset.
        """)
    with gr.Row():
        with gr.Column():
            inp1 = gr.Slider(0, 7, value=1, step=0.1, label="Sepal Length (cm)")
            inp2 = gr.Slider(0, 5, value=1, step=0.1, label="Sepal Width (cm)")
            inp3 = gr.Slider(0, 7, value=1, step=0.1, label="Petal Length (cm)")
            inp4 = gr.Slider(0, 5, value=1, step=0.1, label="Petal Width (cm)")
            output = gr.Textbox(label="PCA Axes")
        with gr.Column():
            plot = gr.Plot(label="PCA 3D Space")
    Reduction = gr.Button("PCA Transform")
    Reduction.click(fn=PCA_Pred, inputs=[inp1, inp2, inp3, inp4], outputs=[output, plot])
    demo.load(fn=PCA_Pred, inputs=[inp1, inp2, inp3, inp4], outputs=[output, plot])

demo.launch()
      </gradio-file>
    </gradio-lite>
  </body>
</html>
```
This would probably require a SLEP comparing alternatives before we move forward. I don't have the bandwidth to drive that SLEP, but another maintainer might.
Hello @adrinjalali 👋 Since we already have 135 demos here that were built by the community using gradio, I think gradio-lite is a very good option to consider, given it doesn't require a lot of change except for the HTML parts at the top and bottom. We can ask the contributors to open PRs to the sklearn docs as well, because they were very eager to see their code in the docs! 😊 Also pinging @ArturoAmorQ and @francoisgoupil, who were very fond of them.
I suggest using Codapi for interactive code examples in the documentation. It's an open-source tool designed for this very purpose (disclaimer: I'm the author). Reasons to choose Codapi:
Here is how it looks: https://codapi.org/try/scikit-learn (source)
How is Codapi (for Python) different from something like Juniper and Thebe (server based) or using pyodide (browser based)? The reason I'm asking is that the former (Thebe and Juniper) are well tested, community-owned and supported tools that are IMHO unlikely to go unmaintained (at least they've survived for many years so far). They have the downside of needing a server (currently powered by mybinder.org). Pyodide and JupyterLite are newer but don't require a server, and there is already an ongoing effort within scikit-learn to use them (for example, all examples already run in JupyterLite). This means there are quite a lot of options already, and the question is "why add even more options instead of focussing on finishing one?"
I'm no expert on Juniper and Thebe, but at first glance they seem to lack the features you need to make existing documentation interactive (like the ones I mentioned: templates and snippet dependencies). I've taken one of the tutorials and made it interactive using Codapi (the link is in my comment above). You can try to do the same with Juniper/Thebe and see how it works out. As for pyodide, it is just an execution engine; Codapi can also use it. In any case, I'm not trying to say that Codapi is better and other tools are worse. I've presented it as a solution, described its strengths (as I see them), and given a specific example for one of the existing tutorials. The rest is up to the scikit-learn team.
Describe the issue linked to the documentation
This is not an issue but rather a good-to-have feature for the documentation, so this issue is more of a discussion.
It would be nice to link to, or embed, interactive demos made with Gradio inside the documentation.
I've built two apps so that you can see how it looks:
We can have the demo below (but a more official version 😅) embedded in your documentation.

The current workflow for users is to download the Python code or run Binder. This reduces the amount of work in that workflow too!
If you think embedded demos are a bit overkill, what we could do instead is create these demos, host them on Spaces, and put a link to the Space (like it is with Keras or Kornia examples; you can see them here and here, linked from their tutorials on both Keras.io/examples and the Kornia docs). We will also host the Spaces on better hardware (8 vCPU, 32 GiB RAM) to make sure they're always running and all good. ✨
You can see how it's implemented in Kornia for rst docs here. It looks like this inside the Kornia docs.
As for Keras, we just put a redirection link inside the docs since their document generation is more layered.

We definitely want to reduce the core maintainers' workload, so what would be cool is that we (the team at Gradio & the community) will develop the demos and maintain them through GitHub Actions, such that they will not break every time there's a breaking change in sklearn (this happens rarely, though 🙂).
Hoping to see how we could collaborate!