R Programming by Adi
R-Programming Introduction
Data Objects in R
Variables in R
Operators in R
Introduction
R Programming:
"R is an interpreted computer programming language which was created by Ross
Ihaka and Robert Gentleman at the University of Auckland, New Zealand."
➢ It is also a software environment used for statistical analysis, graphical
representation, reporting, and data modeling.
➢ R is an implementation of the S programming language, combined with
lexical scoping semantics.
➢ R is one of the most important tools used by researchers, data analysts,
statisticians, and marketers for retrieving, cleaning, analyzing, visualizing, and
presenting data.
➢ R allows integration with procedures written in C, C++, .NET, Python, and
FORTRAN to improve efficiency.
Prof. Dr. K. Adisesha
History of R Programming:
The history of R goes back to the early 1990s. R was developed by Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand, and is currently
developed by the R Development Core Team.
➢ The name of the language is taken from the first letter
of both developers' first names.
➢ R can communicate with other languages and
can call Python, Java, and C++ code.
➢ The big data world is also accessible to R. We can
connect R with big data platforms such as Spark or
Hadoop.
Features of R programming:
R is a domain-specific programming language designed for data analysis. Arguably
its most important feature is vector notation, which lets us perform a
complex operation on a set of values in a single command.
➢ R programming has the following features:
❖ R is an interpreted language used as data analysis software.
❖ R supports procedural programming with functions and object-oriented programming
with generic functions.
❖ It has a consistent and integrated set of tools for data analysis.
❖ It is open-source, powerful, and highly extensible software.
❖ It provides highly extensible graphical techniques.
❖ It allows us to perform multiple calculations at once using vectors.
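The vector feature can be illustrated with a minimal sketch. The variable names and the sample prices below are made up for illustration; they are not part of the original slides.

```r
# Hypothetical sample data: four prices.
prices <- c(120, 80, 45.5, 300)

# One command applies the arithmetic to every element; no explicit loop is needed.
discounted <- prices * 0.9

# A comparison is also vectorized, and the logical result can be used to filter.
expensive <- prices[prices > 100]

print(discounted)
print(expensive)
```

The same idea carries over to most built-in functions, which accept a whole vector and return a result per element.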
Why R programming:
Various tools are available in the market for data visualization, such as R, Power BI,
Spark, and QlikView.
➢ R, SAS, and SPSS are three statistical languages. Of these three,
R is the only one that is open source.
➢ Uses of R programming:
❖ R is a programming and statistical language.
❖ R is used for data analysis and visualization.
❖ R is simple and easy to learn, read, and write.
❖ R is an example of FLOSS (Free/Libre and Open Source Software): one can
freely distribute copies of this software, read its source code, modify it, etc.
Applications of R:
R is used for statistical analysis and data representation, so a working
knowledge of statistical theory is helpful.
➢ R is used in practice by organizations such as:
❖ Facebook
❖ Google
❖ Twitter
❖ HRDAG
❖ Sunlight Foundation
❖ RealClimate
❖ NDAA
❖ XBOX ONE
❖ ANZ
❖ FDA
Installation of R:
R is a very popular language, and to work with it we install two
things: R and RStudio. R and RStudio work together to create a project in R.
➢ The official site https://cloud.r-project.org provides
binary files for major operating systems including
Windows, Linux, and Mac OS.
➢ First, we download the R setup from
https://cloud.r-project.org/bin/windows/base/.
➢ When we click on Download R 3.6.1 for Windows
(the version number changes with each release), the
download of the R setup starts. Once the download
is finished, we run the R setup.
Computations in R:
To understand computations in R, two slogans are helpful:
❖ Everything that exists is an object.
❖ Everything that happens is a function call.
➢ Variables, Datatypes in R: Everything in R is an object. R has six atomic vector types;
the five below are the most common (the sixth, raw, holds bytes).
By atomic, we mean the vector only holds data of a single type.
❖ Character : "a", "adi"
❖ Numeric (real or decimal) : 10, 25.5
❖ Integer : 2L (the L tells R to store this as an integer)
❖ Logical : TRUE, FALSE
❖ Complex : 1+4i (complex numbers with real and imaginary parts)
R provides many functions to examine features of vectors and other objects, for
example:
➢ class() - what kind of object is it (high-level)?
➢ typeof() - what is the object’s data type (low-level)?
➢ length() - how long is it? What about two dimensional objects?
➢ attributes() - does it have any metadata?
v = "Adisesha"
print(class(v))
o/p: [1] "character"

v = 2L
print(class(v))
o/p: [1] "integer"
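As a rough sketch of how these inspection functions behave, the example below applies typeof() to one value of each common atomic type and then examines a two-dimensional object. The object names are illustrative only.

```r
# typeof() on one value of each common atomic type.
types <- sapply(list("a", 25.5, 2L, TRUE, 1+4i), typeof)
print(types)

# A two-dimensional object: length() counts all elements,
# while dim() reports rows and columns.
m <- matrix(1:6, nrow = 2, ncol = 3)
print(typeof(m))
print(length(m))
print(dim(m))

# The dimensions are stored as metadata, which attributes() exposes.
print(attributes(m))
```

Note that a plain number such as 25.5 has class "numeric" but low-level type "double", which is why class() and typeof() are both useful.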
Data Objects in R:
Data objects are used to store information. In R, we do not need to declare a variable
as some data type.
➢ Variables are assigned R objects, and the data type of the R object becomes
the data type of the variable.
➢ There are six main types of data objects in R:
❖ Vectors
❖ Lists
❖ Matrices
❖ Arrays
❖ Factors
❖ Data Frames
List: Lists are R objects which can contain elements of different types, such as numbers,
strings, vectors, and even another list.
➢ A list can also contain a matrix or a function as its elements. A list is created using the
list() function.
> n = c(1, 2, 3, 4, 5)
> s = c("adi", "sunny", "prajwal")
> x = list(n, s, TRUE)
> x
O/p –
[[1]]
[1] 1 2 3 4 5

[[2]]
[1] "adi"     "sunny"   "prajwal"

[[3]]
[1] TRUE
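A small sketch of how list elements can be named and accessed, including a function stored as an element. The element names (numbers, people, flag, square) are hypothetical and chosen only for this example.

```r
# A named list mixing a numeric vector, a character vector,
# a logical value, and a function.
x <- list(numbers = c(1, 2, 3),
          people  = c("adi", "sunny"),
          flag    = TRUE,
          square  = function(v) v^2)

# [[ ]] or $ extracts the element itself.
nums <- x[["numbers"]]

# [ ] keeps the list wrapper and returns a list of length one.
sub <- x["numbers"]

# A function stored in the list can be called like any other function.
squared <- x$square(x$numbers)

print(nums)
print(class(sub))
print(squared)
```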
Matrices: Matrices are R objects in which the elements are arranged in a
two-dimensional rectangular layout. A matrix is created using the matrix() function.
➢ Syntax: matrix(data, nrow, ncol, byrow, dimnames) where,
❖ data is the input vector which becomes the data elements of the matrix.
❖ nrow is the number of rows to be created.
❖ ncol is the number of columns to be created.
❖ byrow is a logical flag. If TRUE, the input vector elements are arranged by row.
❖ dimnames is the names assigned to the rows and columns.
➢ Example:
> M = matrix(c('a','a','b','c','b','a'), nrow = 2, ncol = 3, byrow = TRUE)
> print(M)
o/p:
     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "c"  "b"  "a"
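To make the effect of byrow and dimnames concrete, here is a minimal sketch that builds the same six values both ways. The row and column labels are invented for illustration.

```r
vals <- 1:6

# Default (byrow = FALSE): values fill the matrix column by column.
by_col <- matrix(vals, nrow = 2, ncol = 3)

# byrow = TRUE fills row by row; dimnames takes a list of
# row names followed by column names.
by_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE,
                 dimnames = list(c("r1", "r2"), c("a", "b", "c")))

print(by_col)
print(by_row)

# With dimnames set, elements can be looked up by label.
print(by_row["r2", "c"])
```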
Arrays: Arrays are R data objects which can store data in more than two
dimensions.
➢ For example, if we create an array of dimension (2, 3, 4), it creates 4 rectangular
matrices, each with 2 rows and 3 columns. While matrices are confined to two
dimensions, arrays can have any number of dimensions.
➢ An array is created using the array() function. It takes vectors as input and uses the
values in the dim parameter to create an array.
➢ Example: Here we create one array holding two 3x3 matrices. The nine input values
are recycled, so both matrices have the same contents.
> v1 <- c(5,9,3)
> v2 <- c(10,11,12,13,14,15)
> result <- array(c(v1,v2), dim = c(3,3,2))
> result
Output –
, , 1
     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15

, , 2
     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15
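The (2, 3, 4) case described above can be sketched as follows. The values 1 to 24 are arbitrary filler chosen so that every position is distinct.

```r
# An array of dimension (2, 3, 4): four matrices of 2 rows and 3 columns.
a <- array(1:24, dim = c(2, 3, 4))

print(dim(a))

# Leaving the first two indices empty selects a whole 2x3 matrix.
first <- a[, , 1]
print(first)

# A single element is addressed as [row, column, matrix].
last <- a[2, 3, 4]
print(last)
```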
Factors: Factors are data objects which are used to categorize data and store it
as levels. They can store both strings and integers.
➢ They are useful in data analysis for statistical modeling.
➢ Factors are created using the factor() function. The nlevels() function gives the count of
levels.
➢ Example:
# Create a vector.
apple_colors <- c('green','green','yellow','red','red','red','green')
# Create a factor object from the vector.
factor_apple <- factor(apple_colors)
# Print the factor.
print(factor_apple)
print(nlevels(factor_apple))
Output –
[1] green  green  yellow red    red    red    green
Levels: green red yellow
[1] 3
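By default the levels are sorted alphabetically, as in the output above. A minimal sketch, using made-up size labels, shows how the order can be set explicitly and how each value is stored as an integer code:

```r
sizes <- c("medium", "small", "large", "small")

# The levels argument fixes the order of the categories.
f <- factor(sizes, levels = c("small", "medium", "large"))

print(levels(f))

# Internally each value is stored as the position of its level.
codes <- as.integer(f)
print(codes)

# table() counts how many observations fall in each level.
counts <- table(f)
print(counts)
```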
Variables: A variable provides us with named storage that our programs can
manipulate.
➢ A variable in R can store an atomic vector, a group of atomic vectors, or a combination of
many R objects.
➢ A valid variable name consists of letters, numbers, and the dot or underscore characters.
➢ The variable name starts with a letter, or with a dot not followed by a number.
➢ Variables can be assigned values using the leftward, rightward, and equal-to operators.
➢ The values of variables can be printed using the print() or cat() function.
➢ Example: var_name2. is valid: it has letters, numbers, dot, and underscore.
var_name% is invalid: it has the character '%'. Only dot (.) and underscore are allowed.
# Assignment using equal operator.
var.1 = c(0,1,2,3)
# Assignment using leftward operator.
var.2 <- c("learn","R")
# Assignment using rightward operator.
c(TRUE,1) -> var.3
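The naming rules can be checked programmatically. The sketch below uses the base function make.names(), which rewrites a string into a syntactically valid name; a string is valid exactly when it comes back unchanged. The helper is_valid_name is a hypothetical name introduced for this example.

```r
# Hypothetical helper: TRUE when the string is already a valid R name.
is_valid_name <- function(n) make.names(n) == n

print(is_valid_name("var_name2."))  # letters, numbers, dot, underscore
print(is_valid_name("var_name%"))   # contains '%'
print(is_valid_name(".var"))        # dot followed by a letter
print(is_valid_name(".2var"))       # dot followed by a number

# The rightward assignment from the slide: TRUE is coerced to the number 1.
c(TRUE, 1) -> var.3
print(class(var.3))

# print() shows the vector with its index prefix; cat() writes the bare values.
print(var.3)
cat("var.3 holds:", var.3, "\n")
```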
Variables in R
Data Type of a Variable: In R, a variable itself is not declared with any data type;
rather, it gets the data type of the R object assigned to it.
➢ So R is called a dynamically typed language, which means that we can change the
data type of the same variable again and again when using it in a program.
➢ To list all the variables currently available in the workspace, we use the ls() function:
print(ls())
➢ Variables can be deleted by using the rm() function:
rm(var.1)
➢ Example:
> var_x<- "Hello"
> cat("The class of var_x is ",class(var_x),"\n")
> var_x<- 34.5
> cat(" Now the class of var_x is ",class(var_x),"\n")
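A minimal sketch of ls() and rm() in action. To keep the example independent of whatever is already in the workspace, it uses a separate environment created with new.env(); this is an assumption of the sketch, not something the slides do. In an interactive session the same calls work without the envir argument.

```r
# A scratch environment standing in for the workspace.
ws <- new.env()
assign("var_x", "Hello", envir = ws)
assign("var_y", 34.5, envir = ws)

# ls() lists the variable names, sorted alphabetically.
before <- ls(ws)
print(before)

# rm() deletes a variable by name.
rm("var_x", envir = ws)
after <- ls(ws)
print(after)

# exists() confirms that the variable is gone.
print(exists("var_x", envir = ws, inherits = FALSE))
```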
Keywords in R
R – Keywords: Keywords are specific reserved words in R, each of which has a specific
feature associated with it.
➢ Almost all of the words which help one to use the functionality of the R language are
included in the list of keywords.
➢ In R, one can view these keywords by using either help(reserved) or ?reserved.
➢ Example:
if        in      NaN
else      next    NA
while     break   NA_integer_
repeat    TRUE    NA_real_
for       FALSE   NA_complex_
function  NULL    NA_character_
Inf
Operators in R
Miscellaneous Operators: These are special-purpose operators in R, used for tasks
such as membership tests and matrix multiplication.
➢ There are two kinds of Miscellaneous Operators:
❖ %in% Operator: Checks if an element belongs to a vector and returns the logical value TRUE if
the value is present, else FALSE.
val <- 0.1
list1 <- c(TRUE, 0.1, "apple")
print(val %in% list1)
Output: [1] TRUE
Checks for the value 0.1 in the specified vector. The vector is coerced to character, and 0.1
is matched as "0.1". It exists, so TRUE is printed.
❖ %*% Operator: This operator performs matrix multiplication. Here it is used to multiply a
matrix with its transpose. The transpose of a matrix is obtained by interchanging its rows
and columns.
pro = mat %*% t(mat)
print(pro)
# R program to illustrate the use of Miscellaneous operators
mat <- matrix(1:4, nrow = 1, ncol = 4)
print("Matrix elements using : ")
print(mat)
product = mat %*% t(mat)
print("Product of matrices")
print(product)
cat("does 1 exist in prod matrix :", "1" %in% product)

Output
[1] "Matrix elements using : "
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[1] "Product of matrices"
     [,1]
[1,]   30
does 1 exist in prod matrix : FALSE
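Since %*% is general matrix multiplication and not limited to transposes, a short sketch contrasting it with the element-wise * operator may help. The matrices A and B are arbitrary examples.

```r
A <- matrix(1:4, nrow = 2)            # columns are (1, 2) and (3, 4)
B <- matrix(c(0, 1, 1, 0), nrow = 2)  # swaps the two columns of A when multiplied

# * multiplies matching positions.
elementwise <- A * B
print(elementwise)

# %*% takes row-by-column dot products.
product <- A %*% B
print(product)

# %in% also works against the elements of a matrix.
print(3 %in% A)
```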
Control Statements
Classes in R Programming:
Classes and Objects are basic concepts of Object-Oriented Programming that revolve
around the real-life entities. Everything in R is an object.
➢ An object is simply a data structure that has some methods and attributes. A class is just
a blueprint or a sketch of these objects.
➢ It represents the set of properties or methods that are common to all objects of one type.
➢ Unlike most other programming languages, R has three class systems:
❖ S3 Classes
❖ S4 Classes
❖ Reference Classes
S3 Class : S3 is the simplest yet the most popular OOP system. It lacks a formal
definition and structure.
➢ An object of this type can be created by just adding a class attribute to it.
➢ In S3 systems, methods don't belong to the class. They belong to generic functions.
Example:
# create a list with required components
Course <- list(name = "K. Adi", Dept = "Computers")
# give a name to your class
class(Course) <- "BCA"
print(Course)
Output:
$name
[1] "K. Adi"

$Dept
[1] "Computers"

attr(,"class")
[1] "BCA"
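The point that S3 methods belong to generic functions can be sketched as follows. The class name "BCA" follows the slide; the generic describe and its methods are hypothetical names introduced only for this illustration.

```r
# structure() creates the list and sets its class attribute in one step.
course <- structure(list(name = "K. Adi", Dept = "Computers"), class = "BCA")

# print() is an existing generic; defining print.BCA adds a method for our class.
print.BCA <- function(x, ...) {
  cat("Course by", x$name, "in", x$Dept, "\n")
  invisible(x)
}

# A user-defined generic: UseMethod() dispatches on the class of x.
describe <- function(x, ...) UseMethod("describe")
describe.BCA <- function(x, ...) paste("BCA course:", x$name)
describe.default <- function(x, ...) "no description available"

print(course)
print(describe(course))
print(describe(42))
```

Objects without a matching method fall through to describe.default, which is how S3 handles classes it has not been told about.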
S4 Class : Programmers of other languages like C++ or Java might find S3 very
different from their usual idea of classes, as it lacks the structure that classes are
supposed to provide.
➢ S4 is a slight improvement over S3, as its objects have a proper definition and it gives a
proper structure to its objects.
➢ setClass() is used to define a class and new() is used to create the objects.
➢ Example:
library(methods)
# definition of S4 class
setClass("Course", slots = list(name = "character", Subject = "character"))
# creating an object using new() by passing class name and slot values
Course <- new("Course", name = "Adi", Subject = "R Programming")
Course
Output:
An object of class "Course"
Slot "name":
[1] "Adi"

Slot "Subject":
[1] "R Programming"
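A brief sketch of what the "proper structure" of S4 means in practice: slots are type-checked when an object is created, and methods are attached through formal generics. The class Student, its slots, and the generic grade are hypothetical names used only for this example.

```r
library(methods)

# Class definition with typed slots.
setClass("Student", slots = c(name = "character", marks = "numeric"))

s <- new("Student", name = "Adi", marks = 82)

# Slots are read with the @ operator.
print(s@name)

# A formal generic and a method for the class.
setGeneric("grade", function(obj) standardGeneric("grade"))
setMethod("grade", "Student", function(obj) {
  if (obj@marks >= 50) "pass" else "fail"
})
result <- grade(s)
print(result)

# A value of the wrong type for a slot is rejected by new().
bad <- tryCatch(new("Student", name = 1, marks = 82),
                error = function(e) "rejected")
print(bad)
```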
Reference Class: Reference Class is an improvement over S4 Class. Here the methods
belong to the classes. These are very similar to the object-oriented classes of other
languages.
➢ Defining a Reference class is similar to defining S4 classes. We use setRefClass()
instead of setClass() and "fields" instead of "slots".
Example:
library(methods)
# setRefClass returns a generator
Course <- setRefClass("Course", fields = list(name = "character",
                                              Subject = "character"))
# now we can use the generator to create objects
Dept <- Course(name = "Adi", Subject = "R Pgm")
Dept
Output:
Reference class object of class "Course"
Field "name":
[1] "Adi"
Field "Subject":
[1] "R Pgm"
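To illustrate methods that belong to the class, and the reference behaviour that gives these classes their name, here is a minimal sketch. The class Counter and its method increment are hypothetical names chosen for this example.

```r
library(methods)

# A Reference class with one field and one method defined inside the class.
Counter <- setRefClass("Counter",
  fields = list(count = "numeric"),
  methods = list(
    increment = function() {
      # <<- modifies the field of the object itself.
      count <<- count + 1
      invisible(.self)
    }
  )
)

a <- Counter$new(count = 0)

# Plain assignment does not copy: b refers to the same object as a.
b <- a
b$increment()
print(a$count)

# copy() makes an independent object.
d <- a$copy()
d$increment()
print(a$count)
print(d$count)
```

This differs from S3 and S4 objects, which are copied on modification like ordinary R values.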
Data Visualization in R:
Data visualization is the technique used to deliver insights from data using visual cues such
as graphs, charts, maps, and many others.
➢ This is useful because it gives an intuitive and easy understanding of large quantities of
data and thereby helps us make better decisions about it.
➢ Some of the types of visualizations offered by R are:
❖ R Plotting: The plot() function is used to draw points (markers) in a diagram.
❖ R Line: To create a line, use the plot() function and add the type parameter with the value "l".
❖ R Scatter Plot: A "scatter plot" is a type of plot used to display the relationship between two
numerical variables; it plots one dot for each observation along the x-axis and y-axis.
❖ R Pie Charts: A pie chart is a circular graphical view of data, drawn using the pie() function.
❖ R Bar Charts: A bar chart uses rectangular bars to visualize data, drawn using the barplot() function.
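The plotting functions listed above can be tried with a short sketch. The data values and labels are invented, and the plots are written to a temporary PDF file so that the example also runs without a display; in RStudio the pdf() and dev.off() calls can simply be left out.

```r
# Send the plots to a temporary PDF file (one plot per page).
out <- tempfile(fileext = ".pdf")
pdf(out)

x <- 1:10
y <- x^2

plot(x, y)               # points: one marker per observation (scatter plot)
plot(x, y, type = "l")   # the same data drawn as a line

# Pie chart and bar chart from a small made-up data set.
pie(c(10, 20, 30), labels = c("A", "B", "C"))
barplot(c(3, 5, 2), names.arg = c("A", "B", "C"))

# Close the device so the file is written.
dev.off()
print(file.exists(out))
```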
Queries ?
Prof. K. Adisesha
9449081542