{ "cells": [ { "cell_type": "markdown", "id": "201bf295-3c31-4348-9429-893dcab6be94", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "id": "pN6wiKBax7Pa", "metadata": { "id": "pN6wiKBax7Pa", "tags": [] }, "source": [ "# Quickstart - Analyze Dataset for Potential Issues\n", "\n", "[![Open in Colab](https://img.shields.io/badge/Open%20in%20Colab-blue?style=for-the-badge&logo=google-colab&labelColor=gray)](https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/quickstart.ipynb)\n", "[![Open in Kaggle](https://img.shields.io/badge/Open%20in%20Kaggle-blue?style=for-the-badge&logo=kaggle&labelColor=gray)](https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/fastdup/blob/main/examples/quickstart.ipynb)\n", "[![Explore the Docs](https://img.shields.io/badge/Explore%20the%20Docs-blue?style=for-the-badge&labelColor=gray&logo=read-the-docs)](https://docs.visual-layer.com/docs/getting-started-with-fastdup)\n", "\n", "Welcome to the fastdup Quickstart Guide! 🎉\n", "\n", "This notebook demonstrates how to efficiently analyze an image dataset for potential issues using [fastdup](https://github.com/visual-layer/fastdup), a powerful tool designed for image and video dataset exploration.\n", "\n", "### Objectives\n", "By the end of this tutorial, you'll be able to:\n", "- Detect and identify **broken images**.\n", "- Spot **duplicates** or **near-duplicates** within your dataset.\n", "- Discover **outliers** that may affect model performance.\n", "- Find **dark, bright, or blurry images** for potential quality adjustments.\n", "\n", "### What's Included\n", "In addition to identifying dataset issues, this guide will help you:\n", "- Visualize **clusters of visually similar images**, enabling a high-level understanding of your dataset's structure.\n", "- Learn the core functionalities of fastdup with simple, step-by-step examples." 
] }, { "cell_type": "markdown", "id": "c0727302-dbe5-46b3-a5ff-b039811a7e7e", "metadata": { "tags": [] }, "source": [ "## Installation\n", "First, let's start with the installation:\n", "\n", "> ✅ **Tip** - If you're new to fastdup, we encourage you to run the notebook in [Google Colab](https://colab.research.google.com/github/visual-layer/fastdup/blob/main/examples/quick-dataset-analysis.ipynb) or [Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/fastdup/blob/main/quick-dataset-analysis.ipynb) for the best experience. If you'd just like to skim through the notebook, we recommend viewing it with [nbviewer](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/quick-dataset-analysis.ipynb).\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "8e6dd3e6-0f72-456b-9b16-2e53d5d5c099", "metadata": {}, "outputs": [], "source": [ "import sys\n", "if \"google.colab\" in sys.modules:\n", " # Running in Google Colab\n", " !pip install --force-reinstall --no-cache-dir numpy==1.26.4 scipy fastdup\n", "else:\n", " # Running outside Colab\n", " !pip install -Uq fastdup\n" ] }, { "cell_type": "markdown", "id": "488abfbf", "metadata": {}, "source": [ "Now, test the installation by printing the version. If there's no error message, we are ready to go!" ] }, { "cell_type": "code", "execution_count": 2, "id": "e301485f", "metadata": { "id": "e301485f", "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/or_barsheshet/Library/Python/3.9/lib/python/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. 
See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] }, { "data": { "text/plain": [ "'2.14'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import os\n", "os.environ['JPY_PARENT_PID'] = '1'\n", "\n", "# Verify fastdup installation\n", "import fastdup\n", "fastdup.__version__" ] }, { "cell_type": "markdown", "id": "2d30a901-4ba8-48cf-9a2f-37e0f70fa1ae", "metadata": { "tags": [] }, "source": [ "## Download Dataset\n", "\n", "For demonstration, we will use the generally well-curated [Oxford-IIIT Pet dataset](https://www.robots.ox.ac.uk/~vgg/data/pets/). Feel free to swap this dataset with your own.\n", "\n", "The dataset consists of images and annotations for 37 pet categories, with roughly 200 images per class.\n", "\n", "> 🗒 **Note** - fastdup works on both unlabeled and labeled images. For now, though, we are only interested in finding issues in the images, not the annotations. 
\n", "> If you're interested in finding annotation issues, head to:\n", "> + 🖼 [**Analyze Image Classification Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb)\n", "> + 🎁 [**Analyze Object Detection Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb).\n", "\n", "\n", "Let's download only the images from the dataset and extract them into the local directory:" ] }, { "cell_type": "code", "execution_count": null, "id": "d91abfc1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "zsh:1: command not found: wget\n", "tar: Error opening archive: Failed to open 'images.tar.gz'\n" ] } ], "source": [ "!wget https://thor.robots.ox.ac.uk/~vgg/data/pets/images.tar.gz -O images.tar.gz\n", "!tar xf images.tar.gz" ] }, { "cell_type": "markdown", "id": "8cd8a7da-2e05-4c38-aa37-33fd466a61e2", "metadata": { "tags": [] }, "source": [ "## Run fastdup\n", "\n", "Once the extraction completes, we can run fastdup on the images.\n", "\n", "To do that, let's initialize fastdup and specify the input directory, which points to the folder of images." ] }, { "cell_type": "code", "execution_count": 6, "id": "fe4d8211-89b2-4a2f-91f4-8074d2314aef", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Warning: fastdup create() without work_dir argument, output is stored in a folder named work_dir in your current working path.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "fastdup By Visual Layer, Inc. 2024. All rights reserved.\n", "\n", "A fastdup dataset object was created!\n", "\n", "Input directory is set to \u001b[0;35m\"images\"\u001b[0m\n", "Work directory is set to \u001b[0;35m\"work_dir\"\u001b[0m\n", "\n", "The next steps are:\n", " 1. Analyze your dataset with the \u001b[0;35m.run()\u001b[0m function of the dataset object\n", " 2. 
Interactively explore your data on your local machine with the \u001b[0;35m.explore()\u001b[0m function of the dataset object\n", "\n", "For more information, use \u001b[0;35mhelp(fastdup)\u001b[0m or check our documentation https://docs.visual-layer.com/docs/getting-started-with-fastdup.\n", "\n" ] } ], "source": [ "fd = fastdup.create(input_dir=\"images/\")" ] }, { "cell_type": "markdown", "id": "4acb64a1-ab06-4fa2-8111-65b5d4f2a335", "metadata": {}, "source": [ "> 🗒 **Note** - The `.create` method also has an optional `work_dir` parameter, which specifies the directory in which to store artifacts from the run.\n", "\n", "In other words, you can run `fastdup.create(input_dir=\"images/\", work_dir=\"my_work_dir/\")` if you'd like to store the artifacts in `my_work_dir`.\n", "\n", "Now, let's run fastdup." ] }, { "cell_type": "code", "execution_count": null, "id": "beac4c50-3084-47fe-9b22-b14c3d3cb139", "metadata": { "tags": [] }, "outputs": [], "source": [ "fd.run(overwrite=True)" ] }, { "cell_type": "markdown", "id": "24b9d94d-7458-42f0-bf77-1b33491279f2", "metadata": {}, "source": [ "## View Run Summary\n", "\n", "After the run completes, you can optionally view the summary with:" ] }, { "cell_type": "code", "execution_count": 15, "id": "b546398f-e555-42b7-83ad-fd9ba9286d41", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " ########################################################################################\n", "\n", "Dataset Analysis Summary: \n", "\n", " Dataset contains 7390 images\n", " Valid images are 99.92% (7,384) of the data, invalid are 0.08% (6) of the data\n", " For a detailed analysis, use `.invalid_instances()`.\n", "\n", " Components: failed to find images clustered into components, try to run with lower cc_threshold.\n", " Outliers: 6.14% (454) of images are possible outliers, and fall in the bottom 5.00% of similarity values.\n", " For a detailed list of outliers, use `.outliers()`.\n", 
"\n" ] }, { "data": { "text/plain": [ "['Dataset contains 7390 images',\n", " 'Valid images are 99.92% (7,384) of the data, invalid are 0.08% (6) of the data',\n", " 'For a detailed analysis, use `.invalid_instances()`.\\n',\n", " 'Components: failed to find images clustered into components, try to run with lower cc_threshold.',\n", " 'Outliers: 6.14% (454) of images are possible outliers, and fall in the bottom 5.00% of similarity values.',\n", " 'For a detailed list of outliers, use `.outliers()`.\\n']" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.summary()" ] }, { "cell_type": "markdown", "id": "9cde5da4-960b-469e-bba2-32736c5131f8", "metadata": { "id": "67205fab", "tags": [] }, "source": [ "## Invalid Images\n", "From the summary above, we see there are a few invalid images. These are broken images that cannot be read.\n", "\n", "You can get a list of broken images with:" ] }, { "cell_type": "code", "execution_count": 5, "id": "883435db-3097-4449-ab1a-c522d48edbd9", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " filename index error_code is_valid fd_index\n", "136 images/Abyssinian_34.jpg 136 ERROR_CORRUPT_IMAGE False 136\n", "1042 images/Egyptian_Mau_139.jpg 1042 ERROR_CORRUPT_IMAGE False 1042\n", "1049 images/Egyptian_Mau_145.jpg 1049 ERROR_CORRUPT_IMAGE False 1049\n", "1070 images/Egyptian_Mau_167.jpg 1070 ERROR_CORRUPT_IMAGE False 1070\n", "1079 images/Egyptian_Mau_177.jpg 1079 ERROR_CORRUPT_IMAGE False 1079\n", "1095 images/Egyptian_Mau_191.jpg 1095 ERROR_CORRUPT_IMAGE False 1095" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.invalid_instances()" ] }, { "cell_type": "markdown", "id": "98a0333c", "metadata": {}, "source": [ "## Interactive Exploration\n", "In addition to the static galleries shown later in this notebook, fastdup also offers interactive exploration of the dataset.\n", "\n", "To explore the dataset and issues interactively in a browser, run:" ] }, { "cell_type": "code", "execution_count": null, "id": "4f2298e8", "metadata": {}, "outputs": [], "source": [ "fd.explore()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> 🗒 **Note** - This currently requires you to sign up (for free) to view the interactive exploration. Alternatively, you can visualize fastdup results in a non-interactive way using fastdup's built-in galleries, shown in the upcoming cells.\n", "\n", "You'll be presented with a web interface that lets you conveniently view, filter, and curate your dataset.\n", "\n", "\n", "![image.png](https://vl-blog.s3.us-east-2.amazonaws.com/fastdup_assets/cloud_preview.gif)" ] }, { "cell_type": "markdown", "id": "c1330de5", "metadata": {}, "source": [ "## Visualize Image Clusters\n", "\n", "One of fastdup's coolest features is visualizing image clusters. 
In this section, we group similar-looking images (or even duplicates) into clusters and visualize them in a gallery.\n", "\n", "To do so, run:\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "7e56690a", "metadata": {}, "outputs": [], "source": [ "fd.vis.component_gallery()" ] }, { "cell_type": "markdown", "id": "22e04b25-0fe7-409d-8bd9-3b92c2ec8c5b", "metadata": {}, "source": [ "## Duplicates/Near-duplicates\n", "\n", "One of the lowest-hanging fruits in cleaning a dataset is finding and eliminating duplicates.\n", "\n", "fastdup provides a handy way of visualizing duplicates/near-duplicates using the `duplicates_gallery` method. The `Distance` value indicates how visually similar the image pairs in the gallery are. A `Distance` of `1.0` indicates an exact copy, and lower values indicate less similar pairs." ] }, { "cell_type": "code", "execution_count": 6, "id": "27b091e6-fffa-4701-8a9a-19b7b087314a", "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f0b35ad31c454d779888a71a79cd0b4d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Generating gallery: 0%| | 0/20 [00:00\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Duplicates Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "Duplicates Report\n", "\n",
"Distance: 1.0, From: /Bombay_11.jpg, To: /Bombay_100.jpg\n",
"Distance: 1.0, From: /Bombay_220.jpg, To: /Bombay_126.jpg\n",
"Distance: 1.0, From: /Bombay_189.jpg, To: /Bombay_164.jpg\n",
"Distance: 1.0, From: /Bombay_99.jpg, To: /Bombay_202.jpg\n",
"Distance: 1.0, From: /Bombay_185.jpg, To: /Bombay_190.jpg\n",
"Distance: 1.0, From: /boxer_82.jpg, To: /boxer_114.jpg\n",
"Distance: 1.0, From: /Bombay_198.jpg, To: /Bombay_69.jpg\n",
"Distance: 1.0, From: /newfoundland_137.jpg, To: /newfoundland_153.jpg\n",
"Distance: 1.0, From: /Egyptian_Mau_183.jpg, To: /Egyptian_Mau_10.jpg\n",
"Distance: 1.0, From: /keeshond_59.jpg, To: /keeshond_54.jpg\n",
"Distance: 1.0, From: /Bombay_194.jpg, To: /Bombay_32.jpg\n",
"Distance: 1.0, From: /Bombay_193.jpg, To: /Bombay_22.jpg\n",
"Distance: 1.0, From: /Bombay_109.jpg, To: /Bombay_206.jpg\n",
"Distance: 1.0, From: /newfoundland_152.jpg, To: /newfoundland_147.jpg\n",
"Distance: 1.0, From: /Egyptian_Mau_224.jpg, To: /Egyptian_Mau_71.jpg\n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.duplicates_gallery()" ] }, { "cell_type": "markdown", "id": "530988f2-a98e-4516-90e1-0d94bcac9951", "metadata": {}, "source": [ "## Outliers\n", "\n", "Similar to duplicate pairs, you can visualize potential outliers in your dataset with:" ] }, { "cell_type": "code", "execution_count": 7, "id": "7d83835b-0223-445f-9700-052fc4ca58a1", "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "43b67a7169da4bb2bd7af9a32aef1cac", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Generating gallery: 0%| | 0/20 [00:00\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Outliers Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "Outliers Report\n", "Showing image outliers, one per row\n", "\n",
"Distance: 0.597075, Path: /Bengal_105.jpg\n",
"Distance: 0.616418, Path: /Sphynx_128.jpg\n",
"Distance: 0.624279, Path: /beagle_142.jpg\n",
"Distance: 0.629087, Path: /staffordshire_bull_terrier_51.jpg\n",
"Distance: 0.629917, Path: /american_pit_bull_terrier_72.jpg\n",
"Distance: 0.633318, Path: /german_shorthaired_173.jpg\n",
"Distance: 0.633533, Path: /miniature_pinscher_76.jpg\n",
"Distance: 0.634925, Path: /Bengal_131.jpg\n",
"Distance: 0.639585, Path: /chihuahua_6.jpg\n",
"Distance: 0.642, Path: /basset_hound_197.jpg\n",
"Distance: 0.643355, Path: /boxer_149.jpg\n",
"Distance: 0.643534, Path: /beagle_147.jpg\n",
"Distance: 0.645831, Path: /Bombay_204.jpg\n",
"Distance: 0.653168, Path: /Bombay_36.jpg\n",
"Distance: 0.6535, Path: /Abyssinian_226.jpg\n",
"Distance: 0.654307, Path: /miniature_pinscher_191.jpg\n",
"Distance: 0.655955, Path: /staffordshire_bull_terrier_76.jpg\n",
"Distance: 0.660908, Path: /chihuahua_164.jpg\n",
"Distance: 0.661223, Path: /german_shorthaired_121.jpg\n",
"Distance: 0.667204, Path: /Bombay_188.jpg\n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.outliers_gallery()" ] }, { "cell_type": "markdown", "id": "789da241-e9cd-4568-9d19-aa5c80567415", "metadata": {}, "source": [ "## Blurry, Dark and Bright Images\n", "\n", "fastdup also lets you visualize images from your dataset using statistical metrics.\n", "\n", "For example, with `metric='dark'` we can visualize the darkest images in the dataset, and with `metric='bright'` the brightest ones. Passing `metric='blur'` surfaces the blurriest images in the same way." ] }, { "cell_type": "code", "execution_count": 8, "id": "292bdd75-5df0-4617-bd1e-8bcdd147e215", "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "013b326375884f1cadff1df4134f87ad", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Generating gallery: 0%| | 0/20 [00:00\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Dark Image Report\n", " \n", " \n", "\n", "\n", "\n", "
\n", "Dark Image Report\n", "Showing example images, sort by ascending order\n", "\n",
"mean: 15.7118, filename: images/Abyssinian_4.jpg\n",
"mean: 18.7883, filename: images/Abyssinian_114.jpg\n",
"mean: 19.5741, filename: images/Abyssinian_18.jpg\n",
"mean: 19.8396, filename: images/Bombay_191.jpg\n",
"mean: 26.7209, filename: images/Bombay_108.jpg\n",
"mean: 27.4072, filename: images/Abyssinian_62.jpg\n",
"mean: 28.5051, filename: images/scottish_terrier_171.jpg\n",
"mean: 29.4029, filename: images/Sphynx_119.jpg\n",
"mean: 29.9286, filename: images/Maine_Coon_134.jpg\n",
"mean: 31.4749, filename: images/shiba_inu_137.jpg\n",
"mean: 31.599, filename: images/chihuahua_78.jpg\n",
"mean: 32.7848, filename: images/shiba_inu_27.jpg\n",
"mean: 33.2283, filename: images/Egyptian_Mau_59.jpg\n",
"mean: 33.7525, filename: images/japanese_chin_175.jpg\n",
"mean: 33.7692, filename: images/beagle_180.jpg\n",
"mean: 33.9768, filename: images/Abyssinian_30.jpg\n",
"mean: 34.0113, filename: images/american_bulldog_150.jpg\n",
"mean: 34.3895, filename: images/Abyssinian_46.jpg\n",
"mean: 34.8092, filename: images/Sphynx_46.jpg\n",
"mean: 35.634, filename: images/japanese_chin_40.jpg\n", "
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.stats_gallery(metric='dark')" ] }, { "cell_type": "code", "execution_count": 9, "id": "6e4cd628-ee7b-4eb9-b2d0-bde4f0beb22d", "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9889a1605f854cfa81f5cc173daceed4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Generating gallery: 0%| | 0/20 [00:00\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Bright Image Report\n", " \n", " \n", "\n", "\n", "\n", "
Bright Image Report

Showing example images, sorted in descending order.

| mean | filename |
|---|---|
| 235.6992 | images/saint_bernard_183.jpg |
| 234.3785 | images/saint_bernard_188.jpg |
| 233.4722 | images/Egyptian_Mau_99.jpg |
| 232.2554 | images/saint_bernard_186.jpg |
| 230.1848 | images/Abyssinian_127.jpg |
| 226.9057 | images/saint_bernard_187.jpg |
| 226.3688 | images/British_Shorthair_274.jpg |
| 223.6878 | images/Egyptian_Mau_1.jpg |
| 223.2687 | images/great_pyrenees_88.jpg |
| 220.246 | images/Bengal_20.jpg |
| 218.5597 | images/pug_76.jpg |
| 217.9169 | images/Egyptian_Mau_39.jpg |
| 216.7688 | images/Maine_Coon_267.jpg |
| 214.4495 | images/staffordshire_bull_terrier_25.jpg |
| 213.1254 | images/Birman_136.jpg |
| 212.3259 | images/basset_hound_24.jpg |
| 211.3064 | images/boxer_172.jpg |
| 211.2815 | images/saint_bernard_14.jpg |
| 211.1101 | images/pug_96.jpg |
| 210.7337 | images/Egyptian_Mau_45.jpg |
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.stats_gallery(metric='bright')" ] }, { "cell_type": "code", "execution_count": 10, "id": "aeb3a18e-1c2e-4ce4-94c0-61cdddae0619", "metadata": { "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4f14b06bbd194f29a73ff5231a596550", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Generating gallery: 0%| | 0/20 [00:00\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " Blurry Image Report\n", " \n", " \n", "\n", "\n", "\n", "
Blurry Image Report

Showing example images, sorted in ascending order.

| blur | filename |
|---|---|
| 65.1586 | images/Persian_228.jpg |
| 68.6347 | images/Ragdoll_254.jpg |
| 71.8926 | images/pomeranian_170.jpg |
| 76.9661 | images/pomeranian_183.jpg |
| 77.3129 | images/pug_166.jpg |
| 77.8375 | images/Ragdoll_255.jpg |
| 79.21 | images/yorkshire_terrier_123.jpg |
| 83.2725 | images/pomeranian_166.jpg |
| 88.556 | images/pomeranian_123.jpg |
| 91.0464 | images/chihuahua_124.jpg |
| 93.68 | images/chihuahua_161.jpg |
| 96.0024 | images/pomeranian_117.jpg |
| 99.3509 | images/pomeranian_176.jpg |
| 104.3721 | images/chihuahua_187.jpg |
| 105.5227 | images/Siamese_250.jpg |
| 108.3876 | images/Persian_260.jpg |
| 111.6988 | images/pomeranian_173.jpg |
| 113.5611 | images/pomeranian_172.jpg |
| 115.5061 | images/Bombay_85.jpg |
| 115.5061 | images/Bombay_200.jpg |
\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "0" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd.vis.stats_gallery(metric='blur')" ] }, { "cell_type": "markdown", "id": "6c3135e1", "metadata": {}, "source": [ "## Wrap Up\n", "\n", "That's a wrap! In this notebook we showed how you can run fastdup on a dataset or any folder of images.\n", "\n", "We've seen how to use fastdup to find:\n", "\n", "+ Broken images.\n", "+ Duplicates and near-duplicates.\n", "+ Outliers.\n", "+ Dark, bright, and blurry images.\n", "+ Image clusters.\n", "\n", "Next, feel free to check out the other tutorials:\n", "\n", "+ 🧹 [**Clean Image Folder**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/cleaning-image-dataset.ipynb): Learn how to analyze a folder of images for potential issues, clean it, and export a list of problematic files for further action. If you have an unorganized folder of images, this is a good place to start.\n", "+ 🖼 [**Analyze Image Classification Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-image-classification-dataset.ipynb): Learn how to load a labeled image classification dataset and analyze it for potential issues. If you have a labeled ImageNet-style folder structure, have a go!\n", "+ 🎁 [**Analyze Object Detection Dataset**](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb): Learn how to load bounding box annotations for object detection and analyze them for potential issues. If you have a COCO-style labeled object detection dataset, give this example a try.\n", "\n", "As usual, feedback is welcome! Questions?
Drop by our [Slack channel](https://visualdatabase.slack.com/join/shared_invite/zt-19jaydbjn-lNDEDkgvSI1QwbTXSY6dlA#/shared-invite/email) or open an issue on [GitHub](https://github.com/visual-layer/fastdup/issues).\n" ] }, { "cell_type": "markdown", "id": "6034a6ad-2aa2-454e-ad2d-bd320e7fe6bb", "metadata": {}, "source": [ "
\n", "
\n", " \n", " \"site\"\n", " \n", " \"blog\"\n", " \n", " \"github\"\n", " \n", " \"slack\"\n", " \n", " \"linkedin\"\n", " \n", " \"youtube\"\n", " \n", " \"twitter\"\n", "
\n", "
\n", "
\n", " \"logo\"\n", "
Copyright © 2024 Visual Layer. All rights reserved.
\n", "
\n", "\n", "
" ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" } }, "nbformat": 4, "nbformat_minor": 5 }