This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

tensorflow_cifar10

This folder includes some configuration examples for submitting jobs to the OpenPAI platform.

We provide four examples. Each trains a VGG16 model on the CIFAR-10 task, using one of the following setups:

  • 4 CPU (Intel(R) Xeon(R) CPU E5-2690 v3): cifar10_vgg16_tf_cpu.yaml
  • 1 GPU (Tesla K80): cifar10_vgg16_tf_gpu.yaml
  • 4 GPU (with tensorflow native distributed training): cifar10_vgg16_tf_gpu.yaml
  • 4 GPU (with horovod): cifar10_vgg16_tf_gpu.yaml

The performance of the tasks with different batch sizes is shown in the tables below:

With batch-size=32:

| Mode | Accuracy | Run Time | Job Level Metrics |
| --- | --- | --- | --- |
| 4 CPU | 97.29% | 5h 3m | Details |
| 1 GPU | 97.44% | 48m | Details |
| 4 GPU (with tensorflow distributed training) | 96.17% | 52m | Details |
| 4 GPU (with horovod) | 97.28% | 1h 3m | Details |

With batch-size=256:

| Mode | Accuracy | Run Time | Job Level Metrics |
| --- | --- | --- | --- |
| 4 CPU | 95.62% | 5h 15m | Details |
| 1 GPU | 95.57% | 42m | Details |
| 4 GPU (with tensorflow distributed training) | 93.99% | 24m | Details |
| 4 GPU (with horovod) | 90.00% | 28m | Details |
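
To make the run times above easier to compare, the following is a minimal sketch that converts them to minutes and computes each mode's speedup relative to the 1 GPU run. The run-time strings are copied from the tables; the helper names (`to_minutes`, `speedups`, `RUN_TIMES`) and the shortened mode labels are illustrative assumptions and are not part of the example configurations.

```python
import re

# Run times copied from the tables above, keyed by batch size.
# The mode labels are shortened here for illustration only.
RUN_TIMES = {
    32: {
        "4 CPU": "5h 3m",
        "1 GPU": "48m",
        "4 GPU (tensorflow distributed)": "52m",
        "4 GPU (horovod)": "1h 3m",
    },
    256: {
        "4 CPU": "5h 15m",
        "1 GPU": "42m",
        "4 GPU (tensorflow distributed)": "24m",
        "4 GPU (horovod)": "28m",
    },
}


def to_minutes(run_time: str) -> int:
    """Convert a run time such as '5h 3m' or '48m' into whole minutes."""
    hours = re.search(r"(\d+)h", run_time)
    minutes = re.search(r"(\d+)m", run_time)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


def speedups(batch_size: int, baseline: str = "1 GPU") -> dict:
    """Return baseline_time / mode_time for every mode, rounded to 2 places."""
    times = RUN_TIMES[batch_size]
    base = to_minutes(times[baseline])
    return {mode: round(base / to_minutes(t), 2) for mode, t in times.items()}


if __name__ == "__main__":
    for batch_size in RUN_TIMES:
        print(f"batch-size={batch_size}")
        for mode, factor in speedups(batch_size).items():
            print(f"  {mode}: {factor}x vs 1 GPU")
```

By this measure, neither 4 GPU run is faster than the 1 GPU run at batch-size=32 (52m and 1h 3m versus 48m), while at batch-size=256 the 4 GPU runs finish in 24m and 28m versus 42m. These are wall-clock job run times taken from the tables, so they may include overhead other than training itself.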