Mini Project - Cold Storage Case Study

Project Report
Table of Contents

1 Project Objective
2 Assumptions
3 Exploratory Data Analysis – Step by step approach
3.1 Environment Set up and Data Import
3.1.1 Install necessary Packages and Invoke Libraries
3.1.2 Set up working Directory
3.1.3 Import and Read the Dataset
3.2 Variable Identification
3.2.1 Variable Identification – Inferences
3.3 Univariate Analysis
3.4 Bi-Variate Analysis
3.5 Variable Transformation / Feature Creation
4 Conclusion
5 Appendix A – Source Code

1. Project Objective

The objective of this report is to explore the cold storage data set (“Cold
Storage Case Study”) in R and generate insights about the data set. The
exploration report consists of the following:

• Importing the dataset in R
• Understanding the structure of the dataset
• Graphical exploration
• Descriptive statistics
• Insights from the dataset

2. Assumptions

• The freshness of the products is expected to remain good within the
temperature range of 2 to 4 degrees Celsius.
• Designing the cold store and choosing a suitable cooling system are
important for effective cooling and for creating suitable storage conditions.

3. Exploratory Data Analysis – Step by step approach

A typical data exploration activity consists of the following steps:

1. Environment Set up and Data Import
2. Variable Identification
3. Univariate Analysis
4. Bi-Variate Analysis
5. Variable Transformation / Feature Creation
6. Feature Exploration

3.1 Environment Set up and Data Import

3.1.1 Install necessary Packages and Invoke Libraries

Use this section to install the necessary packages and load the associated
libraries. Keeping all package loads in one place improves code readability.

List of packages to be installed:

1. library(readr)
2. library(ggplot2)
3. library(readxl)
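
As a minimal sketch, the packages listed above could be installed only when they are missing and then loaded. The helper name missing_packages is hypothetical and is not part of the report's source code; requireNamespace(), install.packages() and library() are standard R functions.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("readr", "ggplot2", "readxl")
to_install <- missing_packages(required)

# Install only in an interactive session, and only what is missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# Attach every required package that is available
for (pkg in required) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    library(pkg, character.only = TRUE)
  }
}
```

Checking first avoids reinstalling packages on every run of the script.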

3.1.2 Set up working Directory

Setting a working directory at the start of the R session makes importing and
exporting data files and code files easier. The working directory is the
location/folder on the PC where the data, code, etc. related to the project
are kept.

Please refer to Appendix A for the source code.
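
The following is a small illustration of setting and restoring the working directory. It assumes a hypothetical folder named cold_storage_project inside the session's temporary directory, not the folder used in the report. On Windows, paths in R strings need forward slashes or doubled backslashes, because a single backslash starts an escape sequence.

```r
# Hypothetical project folder inside the session's temporary directory
project_dir <- file.path(tempdir(), "cold_storage_project")
dir.create(project_dir, showWarnings = FALSE, recursive = TRUE)

# setwd() invisibly returns the previous working directory
old_dir <- setwd(project_dir)
current_dir <- getwd()
print(current_dir)

# Restore the previous working directory
setwd(old_dir)
```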

3.1.3 Import and Read the Dataset

The given dataset is in .csv format. Hence, the command ‘read.csv’ is used for
importing the file.

Please refer to Appendix A for the source code.
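
To illustrate the import step without the real file, the sketch below writes a tiny synthetic .csv to a temporary location and reads it back with read.csv. The column names Season and Temperature follow the appendix; the values are made up for illustration only.

```r
# Synthetic stand-in for the real file (values are illustrative only)
demo_file <- tempfile(fileext = ".csv")
writeLines(c(
  "Season,Temperature",
  "Rainy,3.0",
  "Summer,3.2",
  "Winter,2.7"
), demo_file)

# header = TRUE treats the first line as column names
demo_data <- read.csv(demo_file, header = TRUE)
print(head(demo_data))
```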

3.2 Variable Identification

meantemp = Mean temperature for the full year
sdtemp = Standard deviation of temperature for the full year
probtemp = Probability of temperature having fallen below 2 deg C
probtemp2 = Probability of temperature having gone above 4 deg C
P = Probability of temperature falling outside the 2 to 4 deg C range (probtemp + probtemp2), used to determine the penalty for the AMC Company
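
The relationship between these variables can be sketched as follows, assuming (as the use of pnorm() implies) that temperature is treated as normally distributed. The vector temps is synthetic and stands in for the Temperature column; its values are illustrative only.

```r
# Synthetic temperatures in deg C (illustrative only)
temps <- c(2.4, 2.8, 3.0, 3.1, 3.3, 3.6)

meantemp <- mean(temps)   # mean temperature
sdtemp <- sd(temps)       # sample standard deviation

# P(temperature < 2) under a normal distribution with this mean and sd
probtemp <- pnorm(2, mean = meantemp, sd = sdtemp)

# P(temperature > 4): upper tail, hence lower.tail = FALSE
probtemp2 <- pnorm(4, mean = meantemp, sd = sdtemp, lower.tail = FALSE)

# Probability of being outside the 2 to 4 deg C range
P <- probtemp + probtemp2
print(c(meantemp = meantemp, sdtemp = sdtemp, P = P))
```

The two tail events cannot occur together, so their probabilities can simply be added.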

3.2.1 Variable Identification – Inferences

setwd() = To set the working directory
getwd() = To get the working directory
attach() = To attach the data frame so that variables can be called directly (without using $)
summary() = To summarize the data
nrow() = For the number of samples (rows)
ncol() = For the number of variables (columns)
dim() = For the dimensions of the data
str() = To understand the datatype of each variable
plot() = For graphical representation of the data
col = Argument of plot() that sets the colour of the box plot
mean() = To calculate the mean
sd() = To calculate the standard deviation
pnorm() = To calculate a probability under the normal distribution
aggregate() = To calculate the mean temperature season wise
list() = To create a list, for example of the grouping variables passed to aggregate()
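
As a brief illustration of the inspection functions listed above, the sketch below applies them to a small synthetic data frame. The data frame demo and its values are assumptions made for this example; only the column names Season and Temperature come from the appendix.

```r
# Synthetic data frame (values are illustrative only)
demo <- data.frame(
  Season = c("Rainy", "Rainy", "Summer", "Summer", "Winter", "Winter"),
  Temperature = c(3.0, 3.2, 3.1, 3.3, 2.6, 2.8)
)

print(dim(demo))      # rows and columns together
print(nrow(demo))     # number of samples (rows)
print(ncol(demo))     # number of variables (columns)
str(demo)             # datatype of each variable
print(summary(demo))  # per-column summary statistics
```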

3.3 Univariate Analysis

Dataframe ATemp is used to tabulate the mean temperature season wise.

Season    Mean Temperature (deg C)
Rainy     3.039344
Summer    3.153333
Winter    2.70813
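
A season-wise mean table of this kind can be produced with aggregate(). The sketch below uses the formula interface on a synthetic data frame, which is one possible way to do it and not necessarily the exact call used for ATemp; the values are illustrative only.

```r
# Synthetic data frame (values are illustrative only)
demo <- data.frame(
  Season = c("Rainy", "Rainy", "Summer", "Summer", "Winter", "Winter"),
  Temperature = c(3.0, 3.2, 3.1, 3.3, 2.6, 2.8)
)

# One row per season, holding the mean of Temperature within that season
season_means <- aggregate(Temperature ~ Season, data = demo, FUN = mean)
print(season_means)
```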

3.4 Bi-Variate Analysis

The plot() function is used to draw a box plot of the cold storage
temperature for each season.
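
A horizontal box plot by season can be sketched with base R's boxplot(), shown here on synthetic data. This is an illustration under assumed values, not the report's exact plotting call. When run non-interactively, R writes the figure to its default graphics device.

```r
# Synthetic data frame (values are illustrative only)
demo <- data.frame(
  Season = c("Rainy", "Rainy", "Summer", "Summer", "Winter", "Winter"),
  Temperature = c(3.0, 3.2, 3.1, 3.3, 2.6, 2.8)
)

# With horizontal = TRUE the temperature values run along the x axis
bp <- boxplot(Temperature ~ Season, data = demo,
              horizontal = TRUE, col = "blue",
              main = "Cold storage temperature season wise",
              xlab = "Temperature", ylab = "Season")

# boxplot() also returns the statistics it drew, one column per season
print(bp$names)
```

Note that a box plot marks the median of each group, so it complements the season-wise mean table and does not reproduce it.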

3.5 Variable Transformation / Feature Creation

No variable needed to be transformed. A few new variables were
created to better understand the data and to present the results.

4. Conclusion

1. The mean cold storage temperature found season wise was:

• Rainy: 3.039344
• Summer: 3.153333
• Winter: 2.70813

2. The overall mean temperature calculated for the full year is 2.96.
3. The overall standard deviation calculated for the full year is 0.50.
4. The probability of temperature having fallen below 2 deg C was calculated to
be 2.91%.
5. The probability of temperature having gone above 4 deg C was calculated to
be 2.07%.
6. The combined probability of temperature falling outside the 2 to 4 deg C
range is 2.91% + 2.07% = 4.98%, and the penalty calculated for the AMC Company
is 10%.

5. Appendix A – Source Code

# Environment setup and data import

# Set working directory (forward slashes, since "\" starts an escape sequence in R strings)
setwd("C:/Users/PKG/Desktop/R Files")

# Get working directory
getwd()

# Importing data
mydata = read.csv("Cold_Storage_Temp_Data (1).csv", header = TRUE)

# To view your dataset in R window
View(mydata)

# By attaching you can call variables directly (you could avoid using $)
attach(mydata)

# Analyzing/Summary data
summary(mydata)

# Dimensions of the data
dim(mydata)
nrow(mydata)  # Number of samples (rows)
ncol(mydata)  # Number of variables (columns)

# Total number of records: 365, with 4 variables/columns

# Datatype for each variable
str(mydata)


# Exploratory Data Analysis

# Question 1
# Mean cold storage temperature for Summer, Winter and Rainy Season
# Creating a data frame for same
ATemp = aggregate(Temperature, by = list(Season = Season), FUN = mean)

# Viewing dataframe
View(ATemp)

# Printing the season wise means
ATemp

# Plotted a box plot to understand temperature season wise
# (Season is converted to a factor so that plot() draws box plots;
#  geom is not an argument of plot() and has been dropped)
plot(factor(Season), Temperature, horizontal = TRUE,
     col = "blue",
     main = "Cold storage temperature season wise",
     xlab = "Temperature",
     ylab = "Season")

#Question 2
#Overall mean for the full year
#Created a variable to store the mean temperature for full year
meantemp = mean(Temperature)

#Question 3

#Standard Deviation for the full year

#Created a variable to store the standard deviation for full year
sdtemp = sd(Temperature)

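# Question 4
# Probability of temperature having fallen below 2 deg C,
# assuming temperature is normally distributed
probtemp = pnorm(2, mean = meantemp, sd = sdtemp)

# To view probability in R window
probtemp

# Question 5
# Probability of temperature having gone above 4 deg C
probtemp2 = pnorm(4, mean = meantemp, sd = sdtemp,
                  lower.tail = FALSE)

# To view probability in R window
probtemp2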

#Question 6
#To calculate penalty for the AMC Company
P = probtemp + probtemp2


