Shivam Vora - Sa Exp 1

Academic Year 2021-22 SAP ID: 60003190051

DEPARTMENT OF INFORMATION TECHNOLOGY

COURSE CODE: DJ19TEL5013 DATE: 20/09/21


COURSE NAME: Statistical Analysis Lab CLASS: TY B. TECH

EXPERIMENT NO. 1
CO/LO: Apply the basic statistical concepts for data sampling and probability distributions, and summarize them using suitable diagrams.

AIM / OBJECTIVE:

To get started with the R programming language for statistical analysis by installing it and
understanding the basic data types of the language.

DESCRIPTION OF EXPERIMENT:

This experiment involves:

• Understanding the basic history of the R programming language and getting the gist of
the evolution and features that made it so popular for statistical and data analysis.
• Working in R requires an IDE that supports the language. RStudio provides the
required features.
• After installation, we study basic computations in R and learn the essentials of
R programming.

PROCEDURE:
We divide the section into two segments:
1. Essentials of R programming
R has five basic classes of objects: character, numeric, integer, complex and logical. Objects
of these classes can carry attributes such as names, dimensions, class and length. Since everything
in this language is an object, things revolve around these classes.
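
The five basic classes and a few of the attributes listed above can be inspected from the console. The following is a minimal sketch; the variable names are illustrative and not part of the original write-up.

```r
# One value of each basic class (variable names are hypothetical).
ch  <- "hello"   # character
num <- 3.14      # numeric (double)
int <- 7L        # integer (the L suffix requests an integer)
cpx <- 1 - 2i    # complex
lgl <- TRUE      # logical

classes <- c(class(ch), class(num), class(int), class(cpx), class(lgl))
print(classes)

# Attributes of a small matrix: dimensions and length.
m <- matrix(1:6, nrow = 3, ncol = 2)
print(dim(m))     # 3 rows, 2 columns
print(length(m))  # total number of elements: 6
```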

2. Data types in R

We look at four data types in R:

a. Vector

A vector is a one-dimensional sequence of m elements, comparable to an (m * 1) or (1 * m) matrix,
in which every element belongs to the same class. We can create a vector using c().

E.g.: a = c(1.2, 1.3, 1.4)
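
Because a vector holds a single class, mixing classes inside c() makes R coerce every element to a common class. A small sketch of that behaviour, where the vector `mixed` is an illustrative assumption:

```r
a <- c(1.2, 1.3, 1.4)
print(class(a))      # "numeric"

# Mixing a number, a string and a logical: everything becomes character.
mixed <- c(1.2, "ab", TRUE)
print(class(mixed))  # "character"
print(mixed)
```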

b. List:

A list is a special kind of vector that can hold objects of different classes, unlike ordinary
vectors, which hold only objects of the same class.

E.g.: lst = list(22, "ab", 1.3, TRUE, 1 - 2i)
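
To see that each element of the list keeps its own class, the classes can be collected with sapply(). This is a sketch using the list from the example above; the names `elem_classes`, `second` and `sub` are illustrative.

```r
lst <- list(22, "ab", 1.3, TRUE, 1 - 2i)

# Class of every element, in order.
elem_classes <- sapply(lst, class)
print(elem_classes)

# [[ ]] extracts the element itself; [ ] returns a shorter list.
second <- lst[[2]]
sub    <- lst[2]
print(class(second))  # "character"
print(class(sub))     # "list"
```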

c. Matrices:

When a vector is arranged into rows and columns, it becomes a matrix. It is a two-dimensional
data structure that holds elements of the same class.

E.g.: mat = matrix(1:6, nrow=3, ncol=2)
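
By default matrix() fills column by column; the byrow argument switches to row-wise filling. A short sketch of the difference, where the name `by_row` is illustrative:

```r
mat    <- matrix(1:6, nrow = 3, ncol = 2)                # filled by column
by_row <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)  # filled by row

print(mat)
print(by_row)

# Element in row 2, column 1 of each matrix.
print(mat[2, 1])     # 2
print(by_row[2, 1])  # 3
```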

d. Data Frame:

This is used to store tabular data. Unlike a matrix, a data frame can hold columns of different
classes: it is a list of equal-length vectors, where each column is one vector. Tabular data read
into R, for example from a CSV file, is stored as a data frame.

E.g.: df = data.frame(name=c('a', 'b'), score=c(95, 96))
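
The per-column classes of a data frame can be checked with sapply(). In this sketch, stringsAsFactors = FALSE is set explicitly so that the name column stays character on R versions older than 4.0 as well; the name `col_classes` is illustrative.

```r
df <- data.frame(name = c("a", "b"), score = c(95, 96),
                 stringsAsFactors = FALSE)

# Each column has its own class.
col_classes <- sapply(df, class)
print(col_classes)

# Columns are accessed with $ and behave like ordinary vectors.
print(mean(df$score))  # 95.5
print(nrow(df))        # 2
```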

TECHNOLOGY STACK USED: R Programming



EXERCISE:
1. Compare the R and Python programming languages.

SOLUTION:
R Programming | Python Programming
R is a statistical programming language that integrates statistical computing and graphics. | Python is an all-purpose programming language that can be used for data analysis, data science and machine learning.
It offers a lot of features that help with statistical analysis and visualization. | It can be used to create GUI and web applications.

2. What are the advantages of R programming?

SOLUTION:
• R is a well-developed, simple, and effective programming language which includes
conditionals, loops, user-defined recursive functions, and input and output facilities.
• R has an effective data handling and storage facility.
• R provides a suite of operators for calculations on arrays, lists, vectors, and matrices.
• R provides a large, coherent, and integrated collection of tools for data analysis.
• R provides graphical facilities for data analysis and display, either directly on screen
or in print.
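
The language features named in the first point (conditionals, loops and user-defined recursive functions) can be illustrated briefly. This is a sketch only; the function name `fact` and the vector `squares` are hypothetical examples, not taken from the exercise.

```r
# User-defined recursive function with a conditional: factorial of n.
fact <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * fact(n - 1)
}
print(fact(5))  # 120

# A for loop that collects the squares of 1 to 5.
squares <- numeric(0)
for (i in 1:5) {
  squares <- c(squares, i^2)
}
print(squares)  # 1 4 9 16 25
```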

3. Write an R program to create a vector which contains 10 random integer values between -50
and +50.
v = sample (-50:50, 10, replace=TRUE)
print ("The vector:")
print ("10 random integer values between -50 and +50:")
print(v)

OUTPUT:
> v = sample (-50:50, 10, replace=TRUE)
> print ("The vector:")
[1] "The vector:"
> print ("10 random integer values between -50 and +50:")
[1] "10 random integer values between -50 and +50:"
> print(v)
[1] -34 12 -41 -42 25 10 33 -14 -21 -42
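
Since sample() draws new values on every run, the output above will generally differ between runs. If a repeatable result is wanted, a seed can be set first. A minimal sketch, where the seed value 42 is an arbitrary assumption:

```r
# Same seed, same draw.
set.seed(42)
v1 <- sample(-50:50, 10, replace = TRUE)

set.seed(42)
v2 <- sample(-50:50, 10, replace = TRUE)

print(identical(v1, v2))  # TRUE

# Every value lies in the requested range.
print(all(v1 >= -50 & v1 <= 50))  # TRUE
```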

4. Write an R program to find the maximum and the minimum value of a given vector.

vector1 = c (1, 2, 3, 4, 5, 6)
print ('The vector:')
print(vector1)
print (paste ("Maximum value of the vector:",max(vector1)))
print (paste ("Minimum value of the vector:",min(vector1)))

OUTPUT:
> vector1 = c (1, 2, 3, 4, 5, 6)
> print ('The vector:')
[1] "The vector:"
> print(vector1)
[1] 1 2 3 4 5 6
> print (paste ("Maximum value of the vector:", max(vector1)))
[1] "Maximum value of the vector: 6"
> print (paste ("Minimum value of the vector:", min(vector1)))
[1] "Minimum value of the vector: 1"
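
As a possible variation on the solution above, range() returns the minimum and maximum in one call, and which.min() and which.max() report where they occur. A sketch on the same vector, where the names `r`, `pos_min` and `pos_max` are illustrative:

```r
vector1 <- c(1, 2, 3, 4, 5, 6)

# range() returns c(minimum, maximum).
r <- range(vector1)
print(r)  # 1 6

# Positions (indices) of the smallest and largest elements.
pos_min <- which.min(vector1)
pos_max <- which.max(vector1)
print(pos_min)  # 1
print(pos_max)  # 6
```
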
5. Write an R program to create three vectors a, b, c with 3 integers each. Combine the three
vectors into a 3×3 matrix where each column represents a vector. Print the content of the
matrix.
a=c (2,4,6)
b=c (8,10,12)
c=c (14,16,18)
d=cbind(a,b,c)
print(d)

OUTPUT:
> a=c (2,4,6)
> b=c (8,10,12)
> c=c (14,16,18)
> d=cbind (a, b, c)
> print(d)
     a  b  c
[1,] 2  8 14
[2,] 4 10 16
[3,] 6 12 18
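
For contrast, rbind() stacks the same vectors as rows instead of columns. A short sketch using the vectors from the solution above; the names `by_col` and `by_row` are illustrative:

```r
a <- c(2, 4, 6)
b <- c(8, 10, 12)
c <- c(14, 16, 18)

by_col <- cbind(a, b, c)  # each vector becomes a column
by_row <- rbind(a, b, c)  # each vector becomes a row

print(by_col[, "b"])  # column b: 8 10 12
print(by_row["b", ])  # row b:    8 10 12
print(dim(by_row))    # 3 3
```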

6. Write an R program to create an array with three columns, three rows, and two "tables",
taking two vectors as input to the array. Print the array.

vector1=c (10,20,30,40)
vector2=c (50,60,70,80,90)
array1=array (c (vector1, vector2), dim=c(3,3,2))
print ("The array is:")
print(array1)

OUTPUT:
> vector1=c (10,20,30,40)
> vector2=c (50,60,70,80,90)
> array1=array (c (vector1, vector2),dim=c(3,3,2))
> print ("The array is:")
[1] "The array is:"
> print(array1)
, , 1

     [,1] [,2] [,3]
[1,]   10   40   70
[2,]   20   50   80
[3,]   30   60   90

, , 2

     [,1] [,2] [,3]
[1,]   10   40   70
[2,]   20   50   80
[3,]   30   60   90
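
Both tables contain the same values because the two input vectors supply only 9 values, while a 3 x 3 x 2 array has 18 cells; array() recycles the input to fill the remainder. A sketch that checks this, where the name `values` is illustrative:

```r
vector1 <- c(10, 20, 30, 40)
vector2 <- c(50, 60, 70, 80, 90)
values  <- c(vector1, vector2)

array1 <- array(values, dim = c(3, 3, 2))

print(length(values))     # 9 input values
print(prod(dim(array1)))  # 18 cells to fill

# The second table is a recycled copy of the first.
print(identical(array1[, , 1], array1[, , 2]))  # TRUE

# Indexing is [row, column, table].
print(array1[2, 3, 1])  # 80
```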

7. Write an R program to create a data frame which contains details of 5 employees and
display a summary of the data.

df=data.frame(name=c("ash","jane","paul","mark"),
              id=c(67,56,87,91),
              age=c(23,21,25,26),
              gender=c("m","m","m","m"))
print("Summary")
print(summary(df))

OUTPUT:
> df= data.frame(name = c("ash","jane","paul","mark"),id = c(67,56,87,91),age =
c(23,21,25,26),gender=c("m","m","m","m"))
> print("Summary")
[1] "Summary"
> print(summary(df))
     name                 id             age           gender
 Length:4           Min.   :56.00   Min.   :21.00   Length:4
 Class :character   1st Qu.:64.25   1st Qu.:22.50   Class :character
 Mode  :character   Median :77.00   Median :24.00   Mode  :character
                    Mean   :75.25   Mean   :23.75
                    3rd Qu.:88.00   3rd Qu.:25.25
                    Max.   :91.00   Max.   :26.00
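
The exercise asks for five employees, while the data frame above holds four. The same approach extends to a fifth row, as sketched below. The fifth record ("lisa", 72, 24, "f") is hypothetical data added purely for illustration, and the name `df5` is likewise an assumption.

```r
# Four employees from the solution plus one hypothetical fifth record.
df5 <- data.frame(
  name   = c("ash", "jane", "paul", "mark", "lisa"),
  id     = c(67, 56, 87, 91, 72),
  age    = c(23, 21, 25, 26, 24),
  gender = c("m", "m", "m", "m", "f"),
  stringsAsFactors = FALSE
)

print(nrow(df5))  # 5
print(summary(df5))
```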

OBSERVATIONS / DISCUSSION OF RESULT:


From the study and the practical exercises performed, we observe that R is well suited to
data analytics, covering everything from processing to visualization of data, and that it can
perform mathematical operations on large data sets. It provides a large set of options with an
easy-to-use syntax. We were able to work with the basic classes of objects and all the data
types by performing some operations on them.

CONCLUSION:
Thus, we successfully installed RStudio and worked through the basics of the R language, covering
its essentials, class objects and data types, followed by some operations on them.

