
Update to yolo #232

Closed
wants to merge 6 commits into from
1 change: 1 addition & 0 deletions .devcontainer.json
@@ -1,6 +1,7 @@
{
"name": "example-repos-dev",
"image": "mcr.microsoft.com/devcontainers/python:3.10",
"runArgs": ["--ipc=host"],
"extensions": ["Iterative.dvc", "ms-python.python", "redhat.vscode-yaml"],
"features": {
"ghcr.io/devcontainers/features/nvidia-cuda:1": {
54 changes: 3 additions & 51 deletions example-get-started-experiments/code/README.md
@@ -1,7 +1,5 @@
[![DVC](https://img.shields.io/badge/-Open_in_Studio-grey.svg?style=flat-square&logo=dvc)](https://studio.iterative.ai/team/Iterative/projects/example-get-started-experiments-y8toqd433r)
[![DVC-metrics](https://img.shields.io/badge/dynamic/json?style=flat-square&colorA=grey&colorB=F46737&label=Dice%20Metric&url=https://github.com/iterative/example-get-started-experiments/raw/main/results/evaluate/metrics.json&query=dice_multi)](https://github.com/iterative/example-get-started-experiments/raw/main/results/evaluate/metrics.json)

[Train Report](./results/train/report.md) - [Evaluation Report](./results/evaluate/report.md)
[![DVC Studio](https://img.shields.io/badge/-Open_in_Studio-grey.svg?style=flat-square&logo=dvc)](https://studio.iterative.ai/team/Iterative/projects/example-get-started-experiments-y8toqd433r)
[![DVC-metrics](https://img.shields.io/badge/dynamic/json?style=flat-square&colorA=grey&colorB=F46737&label=Dice%20Metric&url=https://github.com/iterative/example-get-started-experiments/raw/main/dvclive/metrics.json&query=metrics/mAP50(M))](https://github.com/iterative/example-get-started-experiments/raw/main/dvclive/metrics.json)

# DVC Get Started: Experiments

@@ -11,8 +9,6 @@ This is an auto-generated repository for use in [DVC](https://dvc.org)
This is a Computer Vision (CV) project that solves the problem of segmenting out
swimming pools from satellite images.

[Example results](./results/evaluate/plots/images/)

We use a slightly modified version of the [BH-Pools dataset](http://patreo.dcc.ufmg.br/2020/07/29/bh-pools-watertanks-datasets/):
we split the original 4k images into tiles of 1024x1024 pixels.

@@ -58,7 +54,7 @@ $ dvc pull
## Running in your environment

Run [`dvc exp run`](https://man.dvc.org/exp/run) to reproduce the
[pipeline](https://dvc.org/doc/user-guide/pipelines/defining-pipelinese):
[pipeline](https://dvc.org/doc/user-guide/pipelines/defining-pipelines):

```console
$ dvc exp run
@@ -107,47 +103,3 @@ This tag also contains a GitHub Actions workflow that reruns the pipeline if any
changes are introduced to the pipeline-related files.
[CML](https://cml.dev/) is used in this workflow to provision a cloud-based GPU
machine as well as report model performance results in Pull Requests.

## Deploying the model
Review comment from the PR author (Contributor):

Removing this section for now.
#230 will be properly closed in a subsequent PR that adds the SageMaker code.


Check out the [PR](https://github.com/iterative/example-get-started-experiments/pulls)
that adds this model to
[Iterative Studio Model Registry](https://dvc.org/doc/studio/user-guide/model-registry/what-is-a-model-registry).
You can [trigger CI/CD](https://dvc.org/doc/studio/user-guide/model-registry/use-models#deploying-and-publishing-models-in-cicd)
by [registering versions](https://dvc.org/doc/studio/user-guide/model-registry/register-version)
and [assigning stages](https://dvc.org/doc/studio/user-guide/model-registry/assign-stage)
in Model Registry, building and publishing Docker images with the model,
or deploying the model to the cloud.

## Project structure

The data files, DVC files, and results change as stages are created one by one.
After cloning and using [`dvc pull`](https://man.dvc.org/pull) to download
data, models, and plots tracked by DVC, the workspace should look like this:

```console
$ tree -L 2
.
├── LICENSE
├── README.md
├── data                  # <-- Directory with raw and intermediate data
│   ├── pool_data         # <-- Raw image data
│   ├── pool_data.dvc     # <-- .dvc file - a placeholder/pointer to raw data
│   ├── test_data         # <-- Processed test data
│   └── train_data        # <-- Processed train data
├── dvc.lock
├── dvc.yaml              # <-- DVC pipeline file
├── models
│   └── model.pkl         # <-- Trained model file
├── notebooks
│   └── TrainSegModel.ipynb # <-- Initial notebook (refactored into `dvc.yaml`)
├── params.yaml           # <-- Parameters file
├── requirements.txt      # <-- Python dependencies needed in the project
├── results               # <-- DVCLive reports and plots
│   ├── evaluate
│   └── train
└── src                   # <-- Source code to run the pipeline stages
    ├── data_split.py
    ├── evaluate.py
    └── train.py
```
178 changes: 178 additions & 0 deletions example-get-started-experiments/code/TrainSegModel.ipynb
@@ -0,0 +1,178 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"from pathlib import Path\n",
"\n",
"import cv2\n",
"from ultralytics import YOLO\n",
"\n",
"DATA = Path(\"datasets\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load data and split it into train/test\n",
"\n",
"We have some [data in DVC](https://dvc.org/doc/start/data-management/data-versioning) that we can pull. \n",
"\n",
"This data includes:\n",
"* satellite images\n",
"* masks of the swimming pools in each satellite image\n",
"\n",
"DVC can help connect your data to your repo, but it isn't necessary to have your data in DVC to start tracking experiments with DVC and DVCLive."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!dvc pull"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Convert to YOLO Dataset format\n",
"\n",
"https://docs.ultralytics.com/datasets/segment/"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def mask_to_yolo_annotation(mask):\n",
" contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)\n",
" annotation = \"\"\n",
" for contour in contours:\n",
" single_annotation = \"0\"\n",
" for row, col in contour.squeeze():\n",
" single_annotation += f\" {round(col / mask.shape[1], 3)} {round(row / mask.shape[0], 3)}\"\n",
" annotation += f\"{single_annotation}\\n\"\n",
" return annotation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"test_regions = [\"REGION_1-\"]\n",
"\n",
"train_data_dir = DATA / \"yolo_dataset\" / \"train\"\n",
"train_data_dir.mkdir(exist_ok=True, parents=True)\n",
"test_data_dir = DATA / \"yolo_dataset\" / \"val\"\n",
"test_data_dir.mkdir(exist_ok=True, parents=True)\n",
"\n",
"for img_path in DATA.glob(\"pool_data/images/*.jpg\"):\n",
" yolo_annotation = mask_to_yolo_annotation(\n",
" cv2.imread(\n",
" str(DATA / \"pool_data\" / \"masks\" / f\"{img_path.stem}.png\"),\n",
" cv2.IMREAD_GRAYSCALE\n",
" )\n",
" )\n",
"\n",
" if any(region in str(img_path) for region in test_regions):\n",
" dst = test_data_dir / img_path.name\n",
" else:\n",
" dst = train_data_dir / img_path.name\n",
"\n",
" shutil.copy(img_path, dst)\n",
" dst.with_suffix(\".txt\").write_text(yolo_annotation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"yolo_dataset_yaml = DATA / \"yolo_dataset.yaml\"\n",
"yolo_dataset_yaml.write_text(\n",
" \"\"\"\n",
"path: ./yolo_dataset\n",
"train: train\n",
"val: val\n",
"\n",
"names:\n",
" 0: pool\n",
" \"\"\"\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train model\n",
"Set up model training, using DVCLive to capture the results of each experiment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"imgsz = 512\n",
"epochs = 20\n",
"model = \"yolov8n-seg.pt\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": false
},
"outputs": [],
"source": [
"yolo = YOLO(model)\n",
"\n",
"yolo.train(data=yolo_dataset_yaml, epochs=epochs, imgsz=imgsz)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
},
"vscode": {
"interpreter": {
"hash": "949777d72b0d2535278d3dc13498b2535136f6dfe0678499012e853ee9abcab1"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}
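
The notebook's conversion cell writes one label line per pool contour in the YOLO segmentation format: a class index followed by polygon coordinates normalized to the 0-1 range, x before y. The following is a minimal standalone sketch of that line format, not the notebook's code; the function name `polygon_to_yolo_line` and the sample rectangle are hypothetical, and it assumes points arrive as (x, y) pixel pairs, which is the order OpenCV contours use.

```python
def polygon_to_yolo_line(points, width, height, class_id=0, ndigits=3):
    """Format one polygon as a YOLO segmentation label line.

    `points` is a sequence of (x, y) pixel coordinates, where x is the
    column and y is the row. `width` and `height` are the image size in
    pixels. All names here are illustrative, not part of the PR.
    """
    coords = []
    for x, y in points:
        # Normalize x by the image width and y by the image height.
        coords.append(f"{round(x / width, ndigits)} {round(y / height, ndigits)}")
    return f"{class_id} " + " ".join(coords)


# A hypothetical rectangle inside an image 200 px wide and 100 px high.
line = polygon_to_yolo_line(
    [(50, 25), (150, 25), (150, 75), (50, 75)], width=200, height=100
)
print(line)  # 0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75
```

Using a non-square image in a quick check like this makes a swapped x/y order visible, since the two axes then normalize by different sizes.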
3 changes: 0 additions & 3 deletions example-get-started-experiments/code/data/.gitignore

This file was deleted.

2 changes: 2 additions & 0 deletions example-get-started-experiments/code/datasets/.gitignore
@@ -0,0 +1,2 @@
/pool_data
/yolo_dataset
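
The two ignored directories hold the raw data pulled by DVC and the YOLO dataset generated by the notebook, which sends images to `train` or `val` depending on a region prefix in the file path. A rough sketch of that routing follows, assuming hypothetical file names such as `REGION_1-0_0.jpg`; the helper names `split_for` and `destination` are illustrative and do not appear in the PR. Unlike the notebook, which matches against the full path string, this sketch matches against the file name only.

```python
from pathlib import PurePosixPath


def split_for(image_name, test_regions=("REGION_1-",)):
    """Return "val" for images from a held-out region, "train" otherwise."""
    # Substring match, mirroring the `any(region in ...)` check in the notebook.
    if any(region in image_name for region in test_regions):
        return "val"
    return "train"


def destination(image_name, root="datasets/yolo_dataset", test_regions=("REGION_1-",)):
    """Build the path an image would be copied to under the dataset root."""
    return PurePosixPath(root) / split_for(image_name, test_regions) / image_name


# Hypothetical file names, used only for illustration.
for name in ["REGION_1-0_0.jpg", "REGION_2-3_1.jpg", "REGION_10-0_0.jpg"]:
    print(name, "->", destination(name))
```

The trailing hyphen in `REGION_1-` matters for a substring match: without it, a name such as `REGION_10-0_0.jpg` would also be routed to the validation set.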