tensorflow_cifar10

This folder contains configuration examples for submitting jobs to the OpenPAI platform.

Here we provide 4 examples. Each example trains a VGG16 model on the CIFAR-10 task in one of four modes: 4 CPUs, 1 GPU, 4 GPUs with TensorFlow distributed training, or 4 GPUs with Horovod.

The performance of the tasks with different hyper-parameters (batch size) is shown in the tables below:

With batch-size=32:

| Mode | Accuracy | Run Time | Job Level Metrics |
| --- | --- | --- | --- |
| 4 CPU | 97.29% | 5h 3m | Details |
| 1 GPU | 97.44% | 48m | Details |
| 4 GPU (with tensorflow distributed training) | 96.17% | 52m | Details |
| 4 GPU (with horovod) | 97.28% | 1h 3m | Details |

With batch-size=256:

| Mode | Accuracy | Run Time | Job Level Metrics |
| --- | --- | --- | --- |
| 4 CPU | 95.62% | 5h 15m | Details |
| 1 GPU | 95.57% | 42m | Details |
| 4 GPU (with tensorflow distributed training) | 93.99% | 24m | Details |
| 4 GPU (with horovod) | 90.00% | 28m | Details |
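
To compare the modes more directly, the run times in the tables above can be converted to minutes and expressed as a speedup relative to the 1 GPU run at the same batch size. The following is a minimal sketch, not part of the example jobs: the helper names `to_minutes` and `speedup_vs_single_gpu` and the shortened mode labels are illustrative assumptions, and the run times are copied by hand from the tables.

```python
# Run times copied by hand from the tables above, keyed by (batch size, mode).
# The mode labels are shortened for illustration.
RUN_TIMES = {
    (32, "4 CPU"): "5h 3m",
    (32, "1 GPU"): "48m",
    (32, "4 GPU (tensorflow distributed)"): "52m",
    (32, "4 GPU (horovod)"): "1h 3m",
    (256, "4 CPU"): "5h 15m",
    (256, "1 GPU"): "42m",
    (256, "4 GPU (tensorflow distributed)"): "24m",
    (256, "4 GPU (horovod)"): "28m",
}


def to_minutes(text):
    """Convert a duration such as '5h 3m' or '48m' to whole minutes."""
    minutes = 0
    for part in text.split():
        if part.endswith("h"):
            minutes += 60 * int(part[:-1])
        elif part.endswith("m"):
            minutes += int(part[:-1])
        else:
            raise ValueError(f"unrecognized duration part: {part!r}")
    return minutes


def speedup_vs_single_gpu(batch_size):
    """Return {mode: speedup} relative to the 1 GPU run at this batch size.

    A value above 1.0 means the mode finished faster than the 1 GPU run.
    """
    baseline = to_minutes(RUN_TIMES[(batch_size, "1 GPU")])
    return {
        mode: baseline / to_minutes(duration)
        for (size, mode), duration in RUN_TIMES.items()
        if size == batch_size
    }


if __name__ == "__main__":
    for batch_size in (32, 256):
        print(f"batch-size={batch_size}")
        for mode, speedup in speedup_vs_single_gpu(batch_size).items():
            print(f"  {mode}: {speedup:.2f}x")
```

By this measure, both 4 GPU runs at batch-size=32 took longer than the 1 GPU run (52m and 1h 3m versus 48m), while at batch-size=256 they finished in less time (24m and 28m versus 42m). The accuracy columns should be read alongside these ratios, since the faster runs at batch-size=256 also report lower accuracy.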
