Data Analytics using R Programming
By
Dr. K. Sasirekha, M.C.A., M.Phil., Ph.D.
Department of Computer Science
Periyar University

Agenda
• Data Analytics – Basics
• Getting Started with R
• Some basic operations in R
• Packages in R
• Data Analytics with R

Data Analytics - Basics
What is Data?
• Data is either qualitative or quantitative.
• Quantitative data is either discrete (e.g. 5) or continuous (e.g. 3.45).

Data Analytics - Basics
What is Data Analytics?
• Analyzing raw data in order to draw conclusions from it

Data Analytics - Basics
• Descriptive analytics: What has happened?
• Diagnostic analytics: What has happened, in depth (why did it happen)?
• Predictive analytics: What might happen?
• Prescriptive analytics: What should we do?

Data Analytics - Basics

Applications
• Business Analytics
• Health Analytics
• Web Analytics
• Risk Analytics

Getting Started with R
• R (the language) was created in the early 1990s
• It is based upon the S language
• It is a high-level language
• It is an interpreted language

Installing R
• To install R, first go to http://www.r-project.org and follow the CRAN link to the list of mirrors.
• Once you’ve chosen a mirror close to you, click that link and select your platform.

Choosing an IDE
• If you use R under Windows or Mac OS X, a graphical user interface (GUI) is available to you.
• Some of the best GUIs are:
➢ Eclipse/Architect
➢ RStudio (https://www.rstudio.com/)
➢ Revolution-R
➢ Live-R
➢ Tinn-R

Variable Assignment
• Assign values to variables with the assignment operator "="
• Note that another form of assignment operator, "<-", is also in use

> X = 2
> X
[1] 2

> X <- 5
> X
[1] 5

• Comments start with #

Basic Data Types
➢ Numeric
➢ Integer
➢ Complex
➢ Logical
➢ Character
➢ Factor
➢ Date

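Of the types listed above, the factor is the only one not demonstrated on the slides that follow. A minimal sketch, assuming a made-up vector of blood groups (the data and the names blood and blood_f are illustrative and not part of the original slides):

```r
# Hypothetical sample data: blood groups of five people
blood <- c("A", "B", "O", "A", "O")

# Convert the character vector to a factor (a categorical variable)
blood_f <- factor(blood)

class(blood_f)    # "factor"
levels(blood_f)   # the distinct categories: "A" "B" "O"
table(blood_f)    # number of observations per category
```

A factor stores each distinct value once as a level, which is why functions such as table() can count observations per category directly.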
Numeric
> x = 10.5       # assign a decimal value
> x              # print the value of x
[1] 10.5
> class(x)       # print the class name of x
[1] "numeric"

Integer
• In order to create an integer variable in R, we invoke the as.integer function.
• For example,

> y = as.integer(3)
> y              # print the value of y
[1] 3
> class(y)       # print the class name of y
[1] "integer"
> is.integer(y)  # is y an integer?
[1] TRUE

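Besides as.integer, R also accepts an L suffix on a number to create an integer directly. A short sketch (the variable name y1 is illustrative):

```r
y1 <- 3L            # the L suffix creates an integer literal
class(y1)           # "integer"
class(3)            # "numeric": a plain number is stored as a double
as.integer(3.99)    # 3: the fractional part is dropped, not rounded
```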
Complex
• A complex value in R is defined via the pure imaginary value i
• For example,

> z = 1 + 2i     # create a complex number
> z              # print the value of z
[1] 1+2i
> class(z)       # print the class name of z
[1] "complex"

Logical
> x = 1; y = 2   # sample values
> z = x > y      # is x larger than y?
> z              # print the logical value
[1] FALSE
> class(z)       # print the class name of z
[1] "logical"

Character
> x = as.character("hai")
> x              # print the character string
[1] "hai"
> class(x)       # print the class name of x
[1] "character"

Date
> temp <- c("12-09-1973")
> z <- as.Date(temp, "%d-%m-%Y")
> z
[1] "1973-09-12"
> class(z)
[1] "Date"

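Once a string has been converted with as.Date, it can be reformatted and used in date arithmetic. A small sketch that reuses the date from the slide above (the variable name d and the second date are illustrative):

```r
d <- as.Date("12-09-1973", format = "%d-%m-%Y")

format(d, "%d/%m/%Y")                    # "12/09/1973"
d + 30                                   # 30 days later: "1973-10-12"
as.numeric(as.Date("1973-12-31") - d)    # days until the end of 1973: 110
```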
Data structures
➢ Before you can perform statistical analysis in R, your data has to be structured in some coherent way. To store your data, R has the following structures:
➢ Vector
➢ Matrix
➢ Array
➢ Data frame
➢ List
➢ Time-series

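The time-series structure in the list above is not covered on the later slides. A minimal sketch using the base function ts(), with made-up monthly sales figures (the data, the year 2020, and the names sales and sales_ts are assumptions for illustration):

```r
# Hypothetical monthly sales figures for one year
sales <- c(12, 15, 14, 18, 21, 25, 24, 22, 19, 17, 14, 13)

# Build a monthly time series starting in January 2020
sales_ts <- ts(sales, start = c(2020, 1), frequency = 12)

class(sales_ts)       # "ts"
frequency(sales_ts)   # 12 observations per year

# Extract June to August 2020
summer <- window(sales_ts, start = c(2020, 6), end = c(2020, 8))
summer
```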
Vector
➢ A vector is a sequence of data elements of the same basic type.
➢ For example, here is a vector containing three numeric values 2, 3, 5.

> c(2, 3, 5)
[1] 2 3 5

➢ Here is a vector of logical values.

> c(TRUE, FALSE, TRUE, FALSE, FALSE)
[1] TRUE FALSE TRUE FALSE FALSE

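Because all elements of a vector share one basic type, R silently converts mixed input to a common type. A short sketch of this coercion (the names v and w are illustrative):

```r
# Mixing numbers, strings, and logicals: everything becomes character
v <- c(1, "a", TRUE)
v           # "1" "a" "TRUE"
class(v)    # "character"

# Mixing numbers and logicals: TRUE becomes 1, FALSE becomes 0
w <- c(1, TRUE, FALSE)
w           # 1 1 0
class(w)    # "numeric"
```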
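Vector Operations
# Creating a vector using the ':' operator
a = 1:5; a
b = -3:4; b

# Creating a vector using the seq function
c = seq(from = 1, to = 10, by = 2); c

# Accessing elements of a vector
a[3]                                   # third element
a[1:3]                                 # first three elements
a[c(FALSE, TRUE, TRUE, FALSE, TRUE)]   # elements where the mask is TRUE
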
Vector Operations Cont’d
# Performing vector arithmetic (all operations are element-wise)
a = 1:4; b = 5:8
a
b
c = a + b; c
c = a - b; c
c = a * b; c
c = a / b; c
c = a + (b)^2; c
c = 2 + a; c
c = 2 + 3 * b; c
c = (2 + 3) * b; c

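The arithmetic above uses vectors of equal length, or a single number with a vector. When the lengths differ, R recycles the shorter vector. A small sketch of that rule (the values are illustrative):

```r
a <- 1:4
b <- c(10, 20)

# b is reused as 10, 20, 10, 20 to match the length of a
a + b    # 11 22 13 24

# A single number is simply a vector of length one, recycled the same way
a * 2    # 2 4 6 8
```

If the longer length is not a multiple of the shorter one, R still recycles but issues a warning.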
Vector Operations Cont’d
# Vector repetition
e = rep(5, 4); e

# Replace a single element
e[1] = 10
e

# Delete elements by value (drops every element equal to 10)
e = e[e != 10]
e

# Delete a single element by position
e = e[-1]
e

# Delete the entire vector
e = NULL
e

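Matrix Operations
# A matrix is a two-dimensional array
# Creating a matrix (filled column by column unless byrow = TRUE)
A = matrix(1:9, nrow = 3); A
B = matrix(1:9, nrow = 3, byrow = TRUE); B

# Accessing elements of a matrix
A[2, 3]    # element in row 2, column 3
A[2, ]     # entire row 2
A[, 3]     # entire column 3

# Combining matrices
a = matrix(1:9, 3, 3); a
b = matrix(10:18, 3, 3); b
cbind(a, b)    # side by side (adds columns)
rbind(a, b)    # stacked (adds rows)
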
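Matrix Operations Cont’d
# Matrix arithmetic (element-wise), using a and b from the previous slide
c = a + b; c
c = a - b; c
c = a * b; c
c = a / b; c

# Modify matrix elements
a[3, 3] = 0; a     # set a single element
a[a > 5] = 0; a    # set every element greater than 5 to 0
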
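The * operator on matrices multiplies element by element. The algebraic matrix product uses the %*% operator instead. A short sketch contrasting the two on small 2 x 2 matrices (the values are illustrative):

```r
a <- matrix(1:4, nrow = 2)   # rows: (1, 3) and (2, 4)
b <- matrix(5:8, nrow = 2)   # rows: (5, 7) and (6, 8)

elementwise <- a * b         # multiplies matching positions
product <- a %*% b           # row-by-column matrix product

elementwise   # rows: (5, 21) and (12, 32)
product       # rows: (23, 31) and (34, 46)
t(a)          # transpose of a
```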
Array
• In R, arrays are generalizations of vectors and matrices.
> z = array(1:27, dim = c(3, 3, 3))
> dim(z)
[1] 3 3 3
> print(z)     # print all three 3 x 3 slices
> z[, , 3]     # print the third slice only

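Individual array elements are addressed with one index per dimension. A brief sketch on the same 3 x 3 x 3 array as above:

```r
z <- array(1:27, dim = c(3, 3, 3))

z[2, 3, 1]      # row 2, column 3 of the first slice: 8
z[1, 1, 3]      # first element of the third slice: 19
dim(z[, , 3])   # a single slice is an ordinary 3 x 3 matrix
```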
List Operations
# A list contains elements of different types, such as numbers, strings, and vectors
mylist = list(c(1, 1, 2, 5, 14, 42), month.abb, matrix(c(3, -8, 1, -3), nrow = 2))
mylist

# Naming list elements
names(mylist) = c("numbers", "months", "matrix")
mylist

# A list's length is the number of top-level elements that it contains
length(mylist)

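Named list elements can be retrieved with $ or [[ ]], while single brackets return a shorter list rather than the element itself. A minimal sketch using a two-element version of the list above:

```r
mylist <- list(numbers = c(1, 1, 2, 5, 14, 42), months = month.abb)

mylist$numbers             # the numeric vector itself
mylist[["months"]][1:3]    # "Jan" "Feb" "Mar"

class(mylist[1])           # "list": single brackets keep the list wrapper
class(mylist[[1]])         # "numeric": double brackets extract the element
```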
List Operations Cont’d
# Arithmetic on list contents: extract the vectors with [[ ]] first,
# because arithmetic operators do not work on lists directly
L1 = list(1:5)
L1

L2 = list(6:10)
L2

L1[[1]] + L2[[1]]
L1[[1]] - L2[[1]]
L1[[1]] * L2[[1]]
L1[[1]] / L2[[1]]

Data Frame
# A data frame is a two-dimensional data structure in R
# Its columns can hold different types of data
# A data frame is created with the data.frame() function
# mydata <- data.frame(col1, col2, ..., colN)
# where col1, col2, ..., colN are column vectors of any type
# (such as character, numeric, or logical)

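To illustrate that each column of a data frame can have its own type, here is a small sketch with made-up columns (the names df, id, name, and passed are illustrative). stringsAsFactors = FALSE is set explicitly so that the result is the same on R versions before 4.0, where strings were converted to factors by default:

```r
df <- data.frame(
  id = 1:3,                        # integer column
  name = c("a", "b", "c"),         # character column
  passed = c(TRUE, FALSE, TRUE),   # logical column
  stringsAsFactors = FALSE
)

str(df)                            # compact overview of columns and types
col_types <- sapply(df, class)     # class of each column
col_types
```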
Data Frame Operations
# Creating a data frame
patientID <- c(1, 2, 3, 4)
age <- c(25, 34, 28, 52)
diabetes <- c("Type1", "Type2", "Type1", "Type1")
status <- c("Poor", "Improved", "Excellent", "Poor")
patientdata <- data.frame(patientID, age, diabetes, status)
patientdata

# Data frame properties
nrow(patientdata)    # number of rows
ncol(patientdata)    # number of columns

Data Frame Operations
# Accessing elements of a data frame
patientdata[1:2]                 # first two columns

# Modifying elements in a data frame
patientdata[1, "age"] <- 30
patientdata

# Adding a row to a data frame
patientdata <- rbind(patientdata, list(5, 40, "Type2", "Improved"))
patientdata

# Deleting components from a data frame
patientdata$gender <- NULL       # removes the column named gender, if there is one
patientdata
patientdata[-5, ]                # returns a copy without row 5

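Rows of a data frame can also be selected by a condition on a column. A minimal sketch that rebuilds the patient data from the slides so that it runs on its own (the name older is illustrative):

```r
patientdata <- data.frame(
  patientID = c(1, 2, 3, 4),
  age = c(25, 34, 28, 52),
  diabetes = c("Type1", "Type2", "Type1", "Type1"),
  status = c("Poor", "Improved", "Excellent", "Poor")
)

patientdata$age    # a single column as a vector

# Rows where age is greater than 30 (all columns)
older <- patientdata[patientdata$age > 30, ]
older

# Rows with Type1 diabetes, keeping only two columns
patientdata[patientdata$diabetes == "Type1", c("patientID", "status")]
```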
Functions and Control Statements
# The Fibonacci series: each number is the sum of the two preceding numbers.
# This function returns the first n terms, starting from 0: 0, 1, 1, 2, 3, 5, 8, etc.
Fibonacci <- function(n)
{
  # if else statement
  if (n == 1)
  {
    x <- 0
  }
  else
  {
    x <- c(0, 1)
    # while loop
    while (length(x) < n)
    {
      position <- length(x)
      new <- x[position] + x[position - 1]
      x <- c(x, new)
    }
  }
  return(x)
}

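A usage sketch for the function above. The definition is repeated in compact form so that the snippet runs on its own, and a for loop is added as one more control statement (the loop is an illustration, not part of the original slide):

```r
Fibonacci <- function(n) {
  if (n == 1) {
    x <- 0
  } else {
    x <- c(0, 1)
    while (length(x) < n) {
      position <- length(x)
      x <- c(x, x[position] + x[position - 1])
    }
  }
  return(x)
}

Fibonacci(1)    # 0
Fibonacci(8)    # 0 1 1 2 3 5 8 13

# for loop: print the first 1, 2, and 3 terms in turn
for (n in 1:3) {
  print(Fibonacci(n))
}
```

Note that the function assumes n is at least 1. For n = 0 it would still return the two starting values.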
Packages
➢ Packages are collections of R functions, compiled code, data, documentation, and tests, in a well-defined format.
➢ The directory where packages are stored is called the library.
➢ R comes with a standard set of packages.
➢ Others are available for download and installation.

> library()                   # see all packages installed
> install.packages("class")   # download and install a package
> search()                    # see packages currently loaded

Packages Cont’d
• Adding Packages

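One common way to add a package is to install it only when it is missing and then load it. A hedged sketch, using the "class" package from the previous slide as the example (it normally ships with R as a recommended package, so the install step is usually skipped; the variable name pkg is illustrative):

```r
pkg <- "class"

# Install only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package; character.only = TRUE lets us pass the name as a string
library(pkg, character.only = TRUE)

# Confirm that the package is now on the search path
paste0("package:", pkg) %in% search()
```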
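Statistical Operations
# Get the iris dataset without its fifth column (Species)
dm = iris[, -5]

# Convert the dataset into a matrix
dm = as.matrix(dm)

# Mean, median, and standard deviation over all values in the matrix
meandm = mean(dm)
meandm

mediandm = median(dm)
mediandm

sddm = sd(dm)
sddm
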
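The statistics above pool all four measurements into one number. To get one value per column instead, colMeans() and apply() can be used. A minimal sketch on the same iris matrix (the names col_means, col_medians, and col_sds are illustrative):

```r
dm <- as.matrix(iris[, -5])            # the four numeric iris columns

col_means <- colMeans(dm)              # mean of each column
col_medians <- apply(dm, 2, median)    # 2 means "apply over columns"
col_sds <- apply(dm, 2, sd)

col_means
col_medians
col_sds
```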
Data Exploration Operations
s = c(50, 80, 90, 25, 70)

maximum = max(s)
minimum = min(s)

total = sum(s)
average = mean(s)

squareroot = sqrt(s)
rounded = round(squareroot)

summary(s)

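A few more base functions are useful when exploring a numeric vector. A short sketch on the same scores (the choice of functions is illustrative):

```r
s <- c(50, 80, 90, 25, 70)

summary(s)      # Min 25, 1st Qu. 50, Median 70, Mean 63, 3rd Qu. 80, Max 90
range(s)        # smallest and largest value: 25 90
sort(s)         # 25 50 70 80 90
which.max(s)    # position of the largest value: 3
```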
DATA VISUALIZATION OPERATIONS
# Visualization of Average Rainfall in India for Last 10 Years
Year = c(2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018)
Rainfall = c(69.43, 43.15, 35.23, 50.03, 60.02, 47.62, 48.38, 38.69, 52.48, 58.18)
names(Rainfall) = Year

# Pie Chart
pie(Rainfall, col = Year, main = "Average Rainfall in India for Last 10 Years")

# Bar Chart
barplot(Rainfall, col = Year, main = "Average Rainfall in India for Last 10 Years")

DATA VISUALIZATION OPERATIONS Cont’d
# Histogram
hist(Rainfall, col = "yellow", border = "blue")

# Line Graph
plot(Year, Rainfall, type = "o", col = "blue", main = "Average Rainfall in India for Last 10 Years")

# Scatterplot
plot(Year, Rainfall, col = "red", main = "Average Rainfall in India for Last 10 Years")

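The plots above are drawn on screen. To keep a plot as a file, a graphics device such as pdf() can be opened before plotting and closed afterwards. A minimal sketch using the rainfall data from the slides (the file name rainfall.pdf and the use of a temporary directory are assumptions for illustration):

```r
Year <- c(2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018)
Rainfall <- c(69.43, 43.15, 35.23, 50.03, 60.02, 47.62, 48.38, 38.69, 52.48, 58.18)

# Hypothetical output path in the session's temporary directory
out <- file.path(tempdir(), "rainfall.pdf")

pdf(out, width = 8, height = 6)     # open a PDF device (size in inches)
plot(Year, Rainfall, type = "o", col = "blue",
     xlab = "Year", ylab = "Rainfall",
     main = "Average Rainfall in India for Last 10 Years")
invisible(dev.off())                # close the device so the file is written

file.exists(out)                    # TRUE once the file has been saved
```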
