Project Report
Tools/Techniques Used
Tools used
• Jupyter Notebook/Colab
• PyCharm
• VS Code
Packages Used
• TensorFlow/Keras
• OpenCV (cv2)
• PyTorch
• Scikit
Techniques Used
• Convolutional Neural Network
• VGG16
• ResNet
• Inception Resnet V2
• Stacked Ensemble model
• GAN
• DCGAN
• Color Detection using OpenCV
Convolutional Neural Networks
CNNs are widely used for image classification. Each image is passed
through a series of convolutional layers with filters, max-pooling
layers, a flattening step and dense layers. Feature extraction happens
in the convolutional and pooling layers; the dense layers at the end
perform the classification.
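The pipeline described above (convolution, max pooling, flattening, dense
layer) can be sketched in a few lines of NumPy. This is only an illustration
with random weights and a tiny 8x8 single-channel input; the filter count,
kernel size and all names are assumptions, not the architecture used in the
project.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one filter."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(image, kernels, dense_w, dense_b):
    # Convolution + ReLU + pooling per filter, then flatten and classify.
    maps = [max_pool(np.maximum(conv2d(image, k), 0.0)) for k in kernels]
    flat = np.concatenate([m.ravel() for m in maps])
    return softmax(dense_w @ flat + dense_b)

rng = np.random.default_rng(0)
image = rng.random((8, 8))                 # hypothetical tiny input
kernels = rng.normal(size=(4, 3, 3))       # 4 filters of size 3x3 (assumed)
# 8x8 -> conv 3x3 -> 6x6 -> pool 2x2 -> 3x3; 4 filters give 36 features.
dense_w = rng.normal(size=(6, 36)) * 0.1   # 6 output classes
dense_b = np.zeros(6)
probs = forward(image, kernels, dense_w, dense_b)
```

In practice a framework such as Keras provides these layers with trainable
weights; the loops here only make the data flow explicit.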
Resnet
Inception Resnet V2
GAN / DCGAN
We use a DCGAN for apparel generation. During the training phase, the API
calls the DCGAN model, which pairs a GAN generator built from a series of
convolutional layers with a CNN discriminator that classifies each apparel
image as real or fake.
In the testing and validation phase, the user interface calls the apparel
generator API. The generated garments are then sent as output to the
fashion designer, who can verify the auto-generated designs.
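The real-versus-fake role of the discriminator can be made concrete with the
standard GAN objective. The report does not state which loss was used, so the
sketch below assumes binary cross-entropy with the common non-saturating
generator loss; the function names are illustrative.

```python
import numpy as np

def bce(probs, labels, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    probs = np.clip(probs, eps, 1 - eps)
    return float(-np.mean(labels * np.log(probs)
                          + (1 - labels) * np.log(1 - probs)))

def discriminator_loss(d_real, d_fake):
    """The discriminator should output 1 for real images and 0 for generated ones."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    """The generator is rewarded when the discriminator labels its images as real."""
    return bce(d_fake, np.ones_like(d_fake))

# Discriminator outputs for a hypothetical batch of real and generated images.
d_real = np.array([0.9, 0.8, 0.95])
d_fake = np.array([0.1, 0.3, 0.2])
d_loss = discriminator_loss(d_real, d_fake)
g_loss = generator_loss(d_fake)
```

During training the two losses are minimized alternately, so the generator
improves only as far as it can fool the current discriminator.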
The second part of the system architecture is the stacked ensemble model.
The fashion designer can select any apparel of his/her choice and find out the
features he/she would like to know about the garment. This will help him/her
analyze the data and help in making quick decisions.
The garment is fed to several models for classification. Five models are
used as weak learners and stacked to extract features:
• 2 CNN models
• ResNet
• Inception ResNet V2
• VGG Net (VGG 16)
For the cloth pattern attribute classification, each weak learner has
distinct features and different accuracy and loss values which, when stacked
together, contribute to an enhanced accuracy.
The outputs of all the models are combined and provided as input to the
stacked ensemble model, which is tasked with classifying the garment.
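A minimal sketch of stacking, under stated assumptions: the base-model
outputs are taken to be per-class probabilities, they are combined by
concatenation, and the meta-learner is a plain softmax regression. The report
does not specify any of these details, and the base-model outputs below are
synthetic stand-ins rather than real predictions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def stack_features(base_probs):
    """Concatenate every base model's class probabilities into one row per image."""
    return np.concatenate(base_probs, axis=1)

def train_meta_learner(features, labels, n_classes, lr=0.1, steps=200):
    """Softmax regression trained with batch gradient descent."""
    n, d = features.shape
    w = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    losses = []
    for _ in range(steps):
        p = softmax(features @ w + b)
        losses.append(float(-np.mean(np.log(p[np.arange(n), labels] + 1e-12))))
        grad = (p - onehot) / n            # gradient of the loss w.r.t. the logits
        w -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return w, b, losses

# Synthetic stand-ins for the five weak learners (6 pattern classes).
rng = np.random.default_rng(0)
n_images, n_classes, n_models = 120, 6, 5
labels = rng.integers(0, n_classes, size=n_images)
base_probs = []
for _ in range(n_models):
    logits = rng.normal(size=(n_images, n_classes))
    logits[np.arange(n_images), labels] += 1.5   # each model is noisily correct
    base_probs.append(softmax(logits))

features = stack_features(base_probs)            # shape (120, 30)
w, b, losses = train_meta_learner(features, labels, n_classes)
pred = softmax(features @ w + b).argmax(axis=1)
```

With real models, the meta-learner should be fitted on predictions for images
the base models did not train on, otherwise it learns from over-confident
inputs.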
The classified output from the ensemble model is then passed to the
front end.
Dataset
The dataset contains around 15k images with details of dress patterns.
The dataset was taken from
https://s3.eu-central-1.amazonaws.com/fashion-gan/images.zip.
Image Classification
In the first module we developed a classifier model to detect
various attributes from a given image (256x256). The following are the
attributes we obtain from the image:
1. Pattern
2. Sleeve Length
3. Length
4. Color
5. Fit
6. Neckline
Dress Generation
We used a GAN and a DCGAN to generate new dresses by training on
a wide range of different dress images. The training was done on Colab
with nearly 2k different dress images, at a resolution of 256x256 for
the GAN and 64x64 for the DCGAN.
System Performance
Pattern Classification
There are 6 output classes for the design patterns of the clothes. We
developed multiple CNN models as base models, over which a stacked
ensemble model is built for classification. The data is not uniformly
distributed across the classes.
Base Model-1
We trained a sequential convolutional neural network on the dataset
and observed a validation accuracy of 78.01%.
Base Model-2
We trained a VGG Net (VGG 16) on the dataset and observed a
validation accuracy of 79.66%.
The hyperparameters were the Adam optimizer with a learning rate
of 0.00005. The following plot shows the training and test set loss:
Fig – Pattern Classification Base Model 2 – loss evolution over 5 epochs.
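For reference, a single Adam update with the learning rate quoted above can
be written as follows. The values of beta1, beta2 and eps are the commonly
used defaults and are assumptions here, since the report only states the
learning rate.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.00005,
              beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of the gradient
    v = beta2 * v + (1 - beta2) * grad ** 2    # running mean of its square
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

param = np.array([1.0, -2.0])
grad = np.array([0.5, -4.0])
m = np.zeros_like(param)
v = np.zeros_like(param)
new_param, m, v = adam_step(param, grad, m, v, t=1)
```

On the first step each parameter moves by roughly the learning rate,
whatever the gradient magnitude, which is why such a small rate gives slow
but stable fine-tuning.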
Base Model-3
Fig – Pattern Classification Base Model 3 – loss and accuracy evolution over
30 epochs with dataset of 4000 images.
Base Model-4
Fig – Pattern Classification Base Model 4 – loss and accuracy evolution over
5 epochs with dataset of 15k images.
Fig – Pattern Classification Base Model 4 – Classification Report
Base Model-5
A sequential convolutional neural network was trained, and we
observed an accuracy of 79.81%.
Dropout layers with a 25% rate were added throughout the network.
The two convolution layers before the dense layers used L2 kernel
regularization with a penalty of 0.001. This base model also used the
Adam optimizer, with a learning rate of 0.0001, and gave a better
accuracy than base model 1.
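The two regularizers mentioned above can be sketched as follows: inverted
dropout at a 25% rate and an L2 kernel penalty of 0.001 that is added to the
training loss. The function names are illustrative; in Keras the same effect
comes from a Dropout layer and a kernel regularizer on the convolution layer.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return activations
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

def l2_penalty(kernel, penalty=0.001):
    """L2 kernel regularization term: penalty times the sum of squared weights."""
    return penalty * float(np.sum(kernel ** 2))

rng = np.random.default_rng(0)
activations = np.ones(10000)
dropped = dropout(activations, rate=0.25, rng=rng)
kernel = rng.normal(size=(3, 3, 32, 64))   # hypothetical convolution kernel
extra_loss = l2_penalty(kernel)
```

Rescaling by 1 / (1 - rate) keeps the expected activation unchanged, so no
adjustment is needed at inference time, when dropout is switched off.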
Ensemble model
Fig – Sleeve length Classification CNN Model – loss and accuracy evolution
over 20 epochs
Neckline Classification
There are 6 output classes for the neckline of the clothes in the
dataset: neckline-round, neckline-v, neckline-deep, neckline-wide,
neckline-lined and neckline-back. We used a VGG Net (VGG 16) model
for this classification and trained it on 9319 images belonging to
these 6 classes.
Color Prediction
Using OpenCV (cv2), we extract the hex code of the predominant color in
the image with k-means color clustering (2 clusters); the color of the
larger cluster is taken as the color of the dress.
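The idea can be sketched without OpenCV so that it runs anywhere: the snippet
below is a plain NumPy k-means, not the project's code, and the
farthest-point initialization and the function name are assumptions. With
OpenCV installed, cv2.kmeans could take the place of the clustering loop.

```python
import numpy as np

def dominant_color_hex(pixels, k=2, iters=10):
    """Return the hex code of the largest k-means cluster.

    pixels: array of shape (N, 3) holding RGB values in 0..255.
    """
    pixels = pixels.astype(float)
    # Farthest-point initialization keeps the starting centers distinct.
    centers = [pixels[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(pixels - c, axis=1) for c in centers], axis=0)
        centers.append(pixels[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = pixels[assign == c].mean(axis=0)
    counts = np.bincount(assign, minlength=k)
    r, g, b = (int(x) for x in np.rint(centers[counts.argmax()]))
    return "#{:02x}{:02x}{:02x}".format(r, g, b)

# Hypothetical 10x10 image: a red dress (70 pixels) on a white band (30 pixels).
image = np.zeros((10, 10, 3), dtype=np.uint8)
image[:, :] = (200, 30, 30)
image[:3, :] = (255, 255, 255)
color = dominant_color_hex(image.reshape(-1, 3))
```

Note that OpenCV loads images in BGR order, so the channels would need to be
converted to RGB before formatting the hex code.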
Pattern classification base models: accuracy
Model               Accuracy
CNN-1               78.01%
CNN-2               79.81%
ResNet              76.58%
Inception ResNet    77.87%
VGG Net             79.66%
DCGAN
The results of the GAN trained on 256x256 images are somewhat spotty
compared with those produced by the DCGAN. Using convolutional and
transposed convolutional layers instead of fully connected layers
improved the image clarity, because convolutional layers look for
spatial correlations within the images.
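To illustrate how a transposed convolution upsamples a feature map, here is
a small NumPy sketch for a single channel without padding. The stride and
kernel size are illustrative assumptions, not the values used in the DCGAN
generator.

```python
import numpy as np

def conv_transpose2d(x, kernel, stride=2):
    """Single-channel transposed convolution with no padding.

    Each input value stamps a scaled copy of the kernel onto the output,
    so the output size is (in - 1) * stride + kernel_size per dimension.
    """
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += x[i, j] * kernel
    return out

x = np.arange(4.0).reshape(2, 2)          # 2x2 feature map
up = conv_transpose2d(x, np.ones((2, 2)))  # upsampled to 4x4
```

Because neighbouring output pixels come from the same input value and a
shared kernel, the generator produces locally coherent patches, which a
fully connected layer has no built-in reason to do.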
Embedded figures (not reproduced here): Resnet_model.png,
inception_resnet_v2.png, Ensemble Model.png, CNN_model_sleevelength.png
Fig – 8 – DCGAN Architecture for Discriminator
Bibliography
https://medium.com/the-owl/building-inception-resnet-v2-in-keras-from-scratch-a3546c4d93f0