- Investigate using multiscale grids in a Vision Transformer Masked Autoencoder.
- Will it be worth the computational requirements?
- Smallest model will likely do
- https://github.com/techmn/satmae_pp
- A few notes : https://github.com/RichardScottOZ/satmae_pp
- LICENSE - Apache 2.0
- Haven't run this one yet
- https://www.research-collection.ethz.ch/handle/20.500.11850/581338 - Self-Supervised Representation Learning for Remote Sensing https://github.com/RichardScottOZ/satmae_pp/tree/main
- V100s - 16GB?
- Trained on https://github.com/fMoW/dataset [70GB tarball] https://purl.stanford.edu/vg497cb6002
- To investigate structure
- Presumably 3 band groupings for 10, 20 and 60m resolution patches around pictures of locations of interest - airports, zoos, etc.
- Designed to classify these
- Metadata file - csv with location/polygon coordinates, class type etc.
```
  category  location_id  image_id  timestamp             polygon
0 airport   0            6         2015-07-25T08:45:14Z  POLYGON ((32.666164117900003 39.932541952376475, 32.711078120537337 39.932541952376475, 32.711078120537337 39.967113357199999, 32.666164117900003 39.967113357199999, 32.666164117900003 39.932541952376475))
```
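geopandas/shapely would normally parse the polygon column (`shapely.wkt.loads(...).bounds`); as a stdlib-only sketch, the bounding box can be pulled out of a WKT string like the one above with a regex (the parser below is my own, not part of the repo):

```python
import re

def wkt_polygon_bounds(wkt):
    """Return (minx, miny, maxx, maxy) from a simple WKT POLYGON string.
    A stdlib stand-in for shapely.wkt.loads(wkt).bounds."""
    coords = [tuple(map(float, pair.split()))
              for pair in re.findall(r"-?\d+(?:\.\d+)?\s+-?\d+(?:\.\d+)?", wkt)]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)

wkt = ("POLYGON ((32.666164117900003 39.932541952376475, "
       "32.711078120537337 39.932541952376475, "
       "32.711078120537337 39.967113357199999, "
       "32.666164117900003 39.967113357199999, "
       "32.666164117900003 39.932541952376475))")
```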
- pytorch as per the install instructions
- geopandas to get bonus gdal
- rasterio via conda-forge
- tensorboard via conda-forge
- pip install timm
- pip install opencv-python
- [so far]
-
I started with Python 3.10 and default-installed the rest, which gave timm 0.9.16 and an error
-
satmae advises
- python 3.8
- pytorch 1.10
- cuda 11.1
- timm 0.4.12
- rasterio.errors.RasterioIOError: '/dataset/fmow_sentinel/fmow-sentinel/train\parking_lot_or_garage/parking_lot_or_garage_927/parking_lot_or_garage_927_109.tif' does not exist in the file system, and is not recognized as a supported dataset name.
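The mixed `/` and `\` in that path suggests `os.path.join` on Windows mixing separators with a forward-slash root; a minimal normalization sketch (function name is mine, not from the repo):

```python
def normalize_sep(path: str) -> str:
    """GDAL/rasterio can reject mixed-separator paths; force forward slashes."""
    return path.replace("\\", "/")
```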
- Multiscale adaptation for segmentation based on general layers
- Could be remote sensing, but any geoscience domain: geophysics, geology, structure, etc.
- Targets might be continuous or one-hot
- Assume MSE by default for testing this and getting it working
- To keep it in human finger space [and patch space]
- Take a set of geophysics grids at 100m resolution
- Take another set at 200m resolution
- Planets, surface etc. - not rectangles
- Likely want on the fly grid slicing into tiles, not directory structures full of sliced up grids in folders
- Is this useful for autoencoders here beyond smoothing reasons?
- Needs to be general
- Resolution groupings - this is a satmae parameter already
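The on-the-fly slicing above could be sketched with plain numpy, assuming the grid is already in memory (rasterio windowed reads would avoid loading the whole grid; names below are mine):

```python
import numpy as np

def iter_tiles(grid, tile=96, stride=96):
    """Yield (row, col, window) tiles from a 2D grid on the fly,
    skipping windows that are entirely nodata (NaN here) -
    planets and surveys are not rectangles."""
    h, w = grid.shape
    for r in range(0, h - tile + 1, stride):
        for c in range(0, w - tile + 1, stride):
            window = grid[r:r + tile, c:c + tile]
            if np.isnan(window).all():
                continue  # entirely outside the valid survey area
            yield r, c, window
```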
- Might be fun to get an xarray based training loop going
- Assume all files the same
- Read from a data directory
- Stack
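Under the "all files the same" assumption, reading a directory and stacking into a channels-first array might look like the sketch below (`.npy` files stand in for georeferenced grids; rioxarray/xarray would give the labeled-dimension version):

```python
import numpy as np
from pathlib import Path

def load_stack(data_dir, pattern="*.npy"):
    """Read every grid in data_dir (assumed identical shape and
    georeferencing) and stack them into a (channels, H, W) array."""
    files = sorted(Path(data_dir).glob(pattern))
    if not files:
        raise FileNotFoundError(f"no {pattern} grids in {data_dir}")
    return np.stack([np.load(f) for f in files], axis=0)
```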
```shell
python main_pretrain.py \
    --batch_size 8 --accum_iter 16 \
    --epochs 1 --warmup_epochs 1 \
    --input_size 96 --patch_size 8 \
    --mask_ratio 0.75 \
    --model_type group_c \
    --dataset_type grid \
    --grouped_bands 0 --grouped_bands 1 \
    --blr 0.0001 --num_workers 8 \
    --output_dir ./output_dir \
    --log_dir ./output_dir
```
- python main_pretrain.py --batch_size 8 --accum_iter 16 --epochs 3 --warmup_epochs 1 --input_size 96 --patch_size 8 --mask_ratio 0.75 --model_type group_c --dataset_type grid --grouped_bands 0 --grouped_bands 1 --blr 0.0001 --num_workers 8 --output_dir ./output_dir --log_dir ./output_dir [note: `python -m main_pretrain.py` fails; `-m` takes a module name without the `.py` suffix]
python main_pretrain.py --batch_size 8 --accum_iter 16 --epochs 1 --warmup_epochs 1 --input_size 96 --patch_size 8 --mask_ratio 0.75 --model_type group_c --dataset_type grid --grouped_bands 0 --grouped_bands 1 --blr 0.0001 --num_workers 8 --input_channels 2 --output_dir ./output_dir --log_dir ./output_dir
#parser.add_argument('--model', default='mae_vit_base_patch16', type=str, metavar='MODEL', help='Name of model to train')
python main_pretrain.py --model mae_vit_base_patch16_small --batch_size 8 --accum_iter 16 --epochs 30 --warmup_epochs 1 --input_size 96 --patch_size 8 --mask_ratio 0.75 --model_type group_c --dataset_type grid --grouped_bands 0 --grouped_bands 1 --blr 0.0001 --num_workers 8 --input_channels 2 --output_dir ./output_dir_small --log_dir ./output_dir_small
python ww_test.py --model mae_vit_base_patch16_small --batch_size 8 --accum_iter 16 --epochs 30 --warmup_epochs 1 --input_size 96 --patch_size 8 --mask_ratio 0.75 --model_type group_c --dataset_type grid --grouped_bands 0 --grouped_bands 1 --blr 0.0001 --num_workers 8 --input_channels 2 --output_dir ./output_dir_small --log_dir ./output_dir_small --weightwatcher_path ww_test_details_small.csv
- Training
- Handle nodata [same thing as below basically]
- Handle nodata
- Handle valid data
- Handle one hot data [although mostly interested in other things here]
- Handle different loss functions
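For the nodata/valid-data bullets, one option is a masked MSE that only scores valid pixels (numpy sketch; the torch version is the same with tensors, and the sentinel value is an assumption):

```python
import numpy as np

NODATA = -9999.0  # assumed nodata sentinel; real grids vary

def masked_mse(pred, target, nodata=NODATA):
    """MSE over valid pixels only; nodata cells contribute nothing."""
    mask = target != nodata
    if not mask.any():
        return 0.0
    diff = pred[mask] - target[mask]
    return float(np.mean(diff ** 2))
```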
- The Hard Part
- geospatial inference [BASICS DONE]
- Check for edge cases
- reference dataset - take from first of the list [currently hardcoded as a trial]