r_programme notes
Part-III Semester V
SUBJECT – STATISTICS - XII
DSE-E16: R-Programming and Quality Management
Unit-1: R Programming:
1.1: Introduction: History, Features of R, Character sets,
Identifiers: Variable, Constants, Symbolic constant, Keywords, Data Types and
Data Structure.
Features of R
As stated earlier, R is a programming language and software environment for
statistical analysis, graphics representation and reporting. The following are the
important features of R:
R is a well-developed, simple and effective programming language which includes
conditionals, loops, user defined recursive functions and input and output facilities.
R provides a large collection of tools for data collection.
R has an effective data handling and storage facility,
R provides a suite of operators for calculations on arrays, lists, vectors and
matrices.
R provides a large, coherent and integrated collection of tools for data analysis.
R provides graphical facilities for data analysis and display, either directly on the
computer screen or printed on paper.
Identifiers
Variables are used to store data whose value can be changed as needed.
The unique name given to a variable (or to a function or other object) is called an identifier.
Rules for writing Identifiers
1. Identifiers can be a combination of letters, digits, period (.) and underscore (_).
2. It must start with a letter or a period. If it starts with a period, it cannot be followed
by a digit.
3. Reserved words in R cannot be used as identifiers.
Example:
Valid identifiers
Sum, .fine.with.dot, this_is_acceptable, Number5
Invalid identifiers
tot@l, 5um, _fine, TRUE, .0ne
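The rules above can be checked programmatically. The sketch below relies on base R's make.names(), which rewrites a string into a syntactically valid name; a string that comes back unchanged is already a valid identifier. The helper name is_valid_identifier is hypothetical and not part of R.

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged.
is_valid_identifier <- function(x) make.names(x) == x

valid   <- c("Sum", ".fine.with.dot", "this_is_acceptable", "Number5")
invalid <- c("tot@l", "5um", "_fine", "TRUE", ".0ne")

print(is_valid_identifier(valid))    # all TRUE
print(is_valid_identifier(invalid))  # all FALSE
```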
1. Variable
A variable provides us with named storage that our programs can manipulate.
A variable in R can store an atomic vector, a group of atomic vectors or a
combination of many R objects. A valid variable name consists of letters,
numbers and the dot or underscore characters. The variable name starts with a
letter, or with a dot not followed by a number.
Types of variable
1) Boolean Variables: This is the simplest type of variable. It holds one of two values,
TRUE or FALSE, and indicates a binary result (0 and 1, yes and no, or true and false).
e.g. a = TRUE
b = FALSE
2) Integer variables: Numbers with no fractional part are called integers. In R,
a plain number such as 5 is stored as a numeric (double) value. To get an
integer, add the suffix L, as in 5L.
3) Numeric Variables: Numeric variables are used to store numbers. It can contain
floating point numbers.
e.g. a = 1
b = 3.14
5) String Variables: String variables are those variables which contain one or
more characters.
e.g. x = "abcd2"
y = "Hello World"
z = "x"
2. Constants
Constants are entities within a program whose value can't be changed. There are 2
basic types of constant. These are numeric constants and character constants.
1) Numeric Constants: Numeric constants are numbers, which can be of type integer,
double or complex. You can check the type of a constant with the typeof()
function. A numeric constant with the suffix L is of integer type, and one with the
suffix i is of complex type.
e.g. > typeof(10)
[1] "double"
> typeof(10L)
[1] "integer"
> typeof(10i)
[1] "complex"
3) Built-in Constants: Some of the built-in constants of R along with their values
are shown below:
e.g. > LETTERS
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T"
[21] "U" "V" "W" "X" "Y" "Z"
> letters
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t"
[21] "u" "v" "w" "x" "y" "z"
> month.name
[1] "January" "February" "March" "April" "May" "June"
[7] "July" "August" "September" "October" "November" "December"
> month.abb
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
> pi
[1] 3.141593
3. Keywords
In programming, a keyword is a word which is reserved by the language because it
has a special meaning. A keyword can be a command or a parameter. Like C,
C++ and Java, R has a set of keywords. A keyword can't be used as a
variable name. Keywords are also called "reserved words".
The following keywords are listed by the ?reserved or help(reserved) command:
if else repeat while function
for next break TRUE FALSE
NULL Inf NaN NA NA_integer_
NA_real_ NA_complex_ NA_character_
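A short sketch of what happens when a reserved word is used as a variable name. The code is wrapped in parse() and tryCatch() only so that the failure can be shown without stopping the script; the name my_if is hypothetical.

```r
# Assigning to a reserved word fails when the code is parsed.
result <- tryCatch(parse(text = "if <- 10"),
                   error = function(e) "parse error")
print(result)

# A name that is not reserved parses and evaluates normally.
eval(parse(text = "my_if <- 10"))
print(my_if)
```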
4. Data Types
In R, there are several data types such as integer, character, etc. The operating
system allocates memory based on the data type of the variable and decides what
can be stored in the reserved memory.
The data types used in R programming include numeric, integer, complex, logical and character.
Note: We can use the capital 'L' suffix to denote that a particular
value is of the integer data type.
> class(int3)
[1] "integer"
> typeof(int3)
[1] "integer"
> class(comp)
[1] "complex"
> typeof(comp)
[1] "complex"
4) Logical Data Type :
The logical data type stores logical or boolean values of TRUE or FALSE.
e.g.
> class(logi)
[1] "logical"
> typeof(logi)
[1] "logical"
> class(char)
[1] "character"
> typeof(char)
[1] "character"
Code :
comp <- 22-6i
int2
char2
char3
int4
comp2
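A minimal sketch that ties the class() and typeof() outputs above together. The values assigned here are assumptions chosen for illustration; only comp <- 22-6i appears in the notes.

```r
# Illustrative values; the names mirror those used in the outputs above.
int3 <- 14L          # the L suffix makes an integer
num  <- 14           # without L the value is a double
comp <- 22-6i        # complex
logi <- TRUE         # logical
char <- "R notes"    # character

for (v in list(int3, num, comp, logi, char)) {
  cat(class(v), typeof(v), "\n")
}
```

Note that for a plain number class() reports "numeric" while typeof() reports "double".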
5. Data Structure.
A data structure is a particular way of organizing data in a computer so that it
can be used effectively. The idea is to reduce the space and time
complexities of different tasks. Data structures in R programming are tools
for holding multiple values.
R’s base data structures are often organized by their dimensionality (1D, 2D,
or nD) and whether they’re homogeneous (all elements must be of the
same type) or heterogeneous (the elements can be of different types).
This gives rise to the six data structures which are most frequently used in data
analysis.
The most essential data structures used in R include:
Vectors
Lists
Data frames
Matrices
Arrays
Factors
a. Vectors
A vector is an ordered collection of values of a basic data type, with a given length. The
key point is that all the elements of a vector must be of the same
data type, i.e. vectors are homogeneous data structures. Vectors are one-dimensional
data structures.
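A minimal sketch of creating a vector with c() and of what homogeneity means in practice. The values are illustrative.

```r
x <- c(1, 3, 5, 7, 8)      # numeric vector
print(length(x))           # 5
print(class(x))            # "numeric"

# Mixing types does not fail; R coerces all elements to one common type.
y <- c(1, 3, "r")
print(class(y))            # "character"
```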
>X
[[1]]
[1] 1
[[2]]
[1] 3
[[3]]
[1] 5
[[4]]
[1] 7
[[5]]
[1] 8
[[6]]
[1] "r"
> length(X)
[1] 6
> class(X)
[1] "list"
> numberOfEmp = 4
[[1]]
[1] 1 2 3 4
[[2]]
[[3]]
[1] 4
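A list, unlike a vector, can hold elements of different types and lengths. The following is a minimal sketch of building a list similar to the ones printed above; the names empId and empList and the chosen values are assumptions.

```r
empId       <- c(1, 2, 3, 4)
numberOfEmp <- 4

# A list may mix a vector, a string and a single number.
empList <- list(empId, "r", numberOfEmp)

print(length(empList))   # 3
print(class(empList))    # "list"
print(empList[[1]])      # the first element is the whole vector
```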
c. Data frames
Data frames are generic data objects of R which are used to store
tabular data. Data frames are among the most popular data objects in R
programming because data is easy to read in
tabular form. They are two-dimensional, heterogeneous data structures.
A data frame is a list of vectors of equal length.
Data frames have the following constraints placed upon them:
A data-frame must have column names and every row should have a unique
name.
Each column must have the identical number of items.
Each item in a single column must be of the same data type.
Different columns may have different data types.
To create a data frame we use the data.frame() function.
> print(df)
Output:
1 Amiya R 22
2 Raj Python 25
3 Asish Java 45
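A minimal sketch of how a data frame with the rows shown above might be created with data.frame(). The column names Name, Language and Age are assumptions, since the header row is not shown.

```r
# Column names are assumed for illustration.
df <- data.frame(
  Name     = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age      = c(22, 25, 45)
)
print(df)
str(df)            # each column keeps its own data type
print(df$Age[2])   # 25
```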
d. Matrices
A matrix is a rectangular arrangement of numbers in rows and columns. In a
matrix, as we know rows are the ones that run horizontally and columns are the
ones that run vertically. Matrices are two-dimensional, homogeneous data
structures.
Now, let’s see how to create a matrix in R. To create a matrix in R you need to
use the function called matrix(). Its first argument is the vector of
elements. You also pass the number of rows (nrow) and the number of
columns (ncol) you want in your matrix. An important point to remember is
that, by default, matrices are filled in column-wise order; use byrow = TRUE
to fill them row-wise.
e.g. A = matrix(
c(1, 2, 3, 4, 5, 6, 7, 8, 9),
nrow = 3, ncol = 3,
byrow = TRUE
)
> print(A)
Output:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
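The difference between the default column-wise filling and byrow = TRUE can be seen side by side in this short sketch (the names by_col and by_row are illustrative):

```r
by_col <- matrix(1:9, nrow = 3)                 # default: filled column by column
by_row <- matrix(1:9, nrow = 3, byrow = TRUE)   # filled row by row
print(by_col)
print(by_row)
print(all(by_col == t(by_row)))   # TRUE: one is the transpose of the other
```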
e. Arrays
Arrays are the R data objects which store the data in more than two
dimensions. Arrays are n-dimensional data structures. For example, if we
create an array of dimensions (2, 3, 3) then it creates 3 rectangular matrices
each with 2 rows and 3 columns. They are homogeneous data structures.
Now, let’s see how to create arrays in R. To create an array in R you need to
use the function called array(). The arguments to this array() are the set of
elements in vectors and you have to pass a vector containing the
dimensions of the array.
e.g. > A = array(
c(1, 2, 3, 4, 5, 6, 7, 8),
dim = c(2, 2, 2)
)
> print(A)
Output:
, , 1
     [,1] [,2]
[1,]    1    3
[2,]    2    4
, , 2
     [,1] [,2]
[1,]    5    7
[2,]    6    8
f. Factors
Factors are the data objects which are used to categorize the data and
store it as levels. They are useful for storing categorical data. They can
store both strings and integers. They are useful to categorize unique
values in columns like “TRUE” or “FALSE”, or “MALE” or “FEMALE”, etc..
They are useful in data analysis for statistical modeling.
Now, let’s see how to create factors in R. To create a factor in R you
need to use the function called factor(). The argument to this factor() is
the vector.
e.g.
> print(fac)
Output:
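A minimal sketch of creating a factor, assuming a small hypothetical vector of gender labels:

```r
# Hypothetical data: the values are illustrative.
gender <- c("MALE", "FEMALE", "FEMALE", "MALE", "FEMALE")
fac <- factor(gender)

print(fac)
print(levels(fac))   # "FEMALE" "MALE" (levels are sorted alphabetically)
print(table(fac))    # number of observations per level
```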
The character data type stores text (strings). In R, there are two different ways
to create a character value: converting with the as.character() function, or typing a
string between double quotes ("") or single quotes ('').
e.g.
d<-'shubham'
e<-"Arpita"
f<-65
f<-as.character(f)
d
e
f
char_vec<-c(1,2,3,4,5)
char_vec<-as.character(char_vec)
char_vec1<-c("shubham","arpita","nishka","vaishali")
char_vec
class(d)
class(e)
class(f)
class(char_vec)
class(char_vec1)
Operators:
Operators are symbols that direct the R interpreter to perform various kinds of
operations on the operands. Operators carry out the various mathematical,
logical, and decision operations on complex numbers, integers
and other numeric values given as input operands.
Following are the types of operators:
1. Arithmetic 2. Relational
3. Logical 4. Assignment
5. Special operators
R has no increment (++) or decrement (--) operators; write x <- x + 1 or x <- x - 1 instead.
1. Arithmetic operator :
Arithmetic operations simulate various math operations, like addition,
subtraction, multiplication, division, and modulo using the specified operator
between operands, which may be either scalar values, complex numbers, or
vectors. The operations are performed element-wise at the corresponding
positions of the vectors.
a)Addition operator (+):
The values at the corresponding positions of both the operands are added.
Consider the following R snippet to add two vectors:
e.g
Input : a <- c (1, 0.1)
b <- c (2.33, 4)
print (a+b)
Output : 3.33 4.10
Examples:
vec1 <- c(0, 2)
vec2 <- c(2, 3)
v <- c( 2,5.5,6)
t <- c(8, 3, 4)
print(v+t)
it produces the following result −
[1] 10.0 8.5 10.0
v <- c( 2,5.5,6)
t <- c(8, 3, 4)
print(v*t)
it produces the following result −
[1] 16.0 16.5 24.0
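The remaining arithmetic operators work element-wise in the same way. A short sketch using the same two vectors:

```r
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)

print(v - t)    # subtraction
print(v / t)    # division
print(v %% t)   # remainder (modulo)
print(v %/% t)  # integer division
print(v ^ 2)    # exponent
```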
2. Relational operator :
The relational operators carry out comparison operations between the
corresponding elements of the operands. Each comparison returns TRUE if the
first operand satisfies the relation compared to the second, and FALSE otherwise.
In comparisons TRUE is treated as 1 and FALSE as 0, so TRUE is greater than FALSE.
Less than (<):
Returns TRUE if the corresponding element of the first operand is less than that
of the second operand. Else returns FALSE.
Input : list1 <- c(TRUE, 0.1)
list2 <- c(0,0.1)
print(list1<list2)
Output : FALSE FALSE
Less than equal to (<=):
Returns TRUE if the corresponding element of the first operand is less than or
equal to that of the second operand. Else returns FALSE.
Input : list1 <- c(TRUE, 0.1)
list2 <- c(0,0.1)
print(list1<=list2)
Output : FALSE TRUE
Examples:
> x <- 5
> y <- 16
> x<y
[1] TRUE
> x>y
[1] FALSE
> x<=5
[1] TRUE
> y>=20
[1] FALSE
> y == 16
[1] TRUE
> x != 5
[1] FALSE
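The relational operators are: > (greater than), < (less than), == (equal to),
<= (less than or equal to), >= (greater than or equal to) and != (not equal to).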
3.Logical Operator :
Logical operations are element-wise decision operations, based on the
specified operator between the operands, which evaluate to either a
TRUE or FALSE boolean value. Any non-zero numeric value is considered as
TRUE, be it a complex or a real number.
Element-wise Logical AND operator (&):
Returns TRUE if both the operands are TRUE.
Input : list1 <- c(TRUE, 0.1)
list2 <- c(0,4+3i)
print(list1 & list2)
Output : FALSE TRUE
& : It is called Element-wise Logical AND operator. It combines each element of
the first vector with the corresponding element of the second vector and gives
an output TRUE if both the elements are TRUE.
v <- c(3,1,TRUE,2+3i)
t <- c(4,1,FALSE,2+3i)
print(v&t)
it produces the following result:
[1] TRUE TRUE FALSE TRUE
| : It is called Element-wise Logical OR operator. It combines each element of
the first vector with the corresponding element of the second vector and gives
an output TRUE if one of the elements is TRUE.
v <- c(3,0,TRUE,2+2i)
t <- c(4,0,FALSE,2+3i)
print(v|t)
it produces the following result:
[1] TRUE FALSE TRUE TRUE
The logical operators && and || compare single values and give a single TRUE or
FALSE as output. Since R 4.3.0 both operands must have length one, and passing a
longer vector is an error; earlier versions used only the first element of each
vector. The examples below therefore select the first elements explicitly.
&& : Called Logical AND operator. Gives TRUE only if both values are TRUE.
v <- c(3,0,TRUE,2+2i)
t <- c(1,3,TRUE,2+3i)
print(v[1] && t[1])
it produces the following result:
[1] TRUE
|| : Called Logical OR operator. Gives TRUE if one of the values is TRUE.
v <- c(0,0,TRUE,2+2i)
t <- c(0,3,TRUE,2+3i)
print(v[1] || t[1])
it produces the following result:
[1] FALSE
4. Assignment Operator:
Assignment operators are used to assign values to various data objects in R.
The objects may be integers, vectors, or functions. These values are then
stored under the assigned variable names. There are two kinds of assignment
operators: Left and Right
<- or = or <<- : Called Left Assignment
v1 <- c(3,1,TRUE,2+3i)
v2 <<- c(3,1,TRUE,2+3i)
v3 = c(3,1,TRUE,2+3i)
print(v1)
print(v2)
print(v3)
it produces the following result:
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
-> or ->> : Called Right Assignment
c(3,1,TRUE,2+3i) -> v1
c(3,1,TRUE,2+3i) ->> v2
print(v1)
print(v2)
it produces the following result:
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
%in% Operator:
Checks whether a value, for example 0.1, is present in the specified vector. If it
exists the result is TRUE, otherwise FALSE.
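A minimal sketch of such a membership check, with hypothetical values:

```r
val   <- 0.1
list1 <- c(TRUE, 0.1, 4)   # TRUE is coerced to 1, so this is c(1, 0.1, 4)

print(val %in% list1)   # TRUE
print(5 %in% list1)     # FALSE
```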
Colon Operator(:):
Creates a sequence of numbers starting with the element before the colon and ending
with the element after it.
Input : print (1:5)
Output : 1 2 3 4 5
%*% Operator:
This operator is used to multiply a matrix with its transpose. Transpose of the
matrix is obtained by interchanging the rows to columns and columns to rows.
The number of columns of first matrix must be equal to number of rows of
second matrix. Multiplication of the matrix A with its transpose, B, produce a
square matrix.
Input : mat = matrix(c(1,2,3,4,5,6),nrow=2,ncol=3)
print (mat)
print( t(mat))
pro = mat %*% t(mat)
print(pro)
Output : [,1] [,2] [,3] #original matrix of order 2x3
[1,] 1 3 5
[2,] 2 4 6
[,1] [,2] #transposed matrix of order 3x2
[1,] 1 2
[2,] 3 4
[3,] 5 6
[,1] [,2] #product matrix of order 2x2
[1,] 35 44
[2,] 44 56
: (Colon operator)
It creates the series of numbers in sequence for a vector.
v <- 2:8
print(v)
it produces the following result:
[1] 2 3 4 5 6 7 8
%in%
This operator is used to identify if an element belongs to a vector.
v1 <- 8
v2 <- 12
t <- 1:10
print(v1 %in% t)
print(v2 %in% t)
it produces the following result:
[1] TRUE
[1] FALSE
%*%
This operator is used to multiply a matrix with its transpose.
M = matrix( c(2,6,5,1,10,4), nrow = 2,ncol = 3,byrow = TRUE)
t = M %*% t(M)
print(t)
it produces the following result:
     [,1] [,2]
[1,]   65   82
[2,]   82  117
Input/Output Functions in R
With R, we can read inputs from the user or a file using simple and easy-to-
use functions. Similarly, we can display the complex output or store it to a file
using the same. R’s base package has many such functions, and there are
packages that provide functions that can do the same and process the
information in the required ways at the same time.
In R, there are multiple ways to read and save input given by the user. Here are a
few of them:
1. readline() function
We can read the input given by the user in the terminal with the readline() function.
Code:
input_read <- readline()
User Input:
412803 is pin code of wai
Code:
input_read
2. scan() function
We can also use the scan() function to read user input. By default this function expects
numeric values and returns a numeric vector; if a non-numeric input is given, the
function gives an error. Other types can be read by setting the what argument,
for example scan(what = character()).
E.g.:
1)
input_scan <- scan()
User Input:
34 54 65 75 25
input_scan
2)
input_scan2 <- scan()
User Input:
34 566 2 a 2+1i
input_scan2
output:
Error in scan() : scan() expected ‘a real’, got ‘a’
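Both behaviours can be tried without typing at the console. The sketch below uses the text argument of scan() to supply the input as a string; the variable names nums and words are illustrative.

```r
# text= feeds scan() from a string, which keeps the example non-interactive.
nums <- scan(text = "34 54 65 75 25", quiet = TRUE)
print(nums)

# what = character() makes scan() return strings instead of numbers,
# so mixed input no longer raises an error.
words <- scan(text = "34 566 2 a 2+1i", what = character(), quiet = TRUE)
print(words)
```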
How to Display Output in R?
To display the output of your program to the screen, you can use one of the following
functions:
1. print() functions
We can use the print() function to display the output to the terminal. The print()
function is a generic function. This means that the function has a lot of different
methods for different types of objects it may need to print. The function takes an
object as the argument. For example:
Example 1:
print(input_read)
Example 2:
print(input_scan)
Example 3:
print("abc")
Example 4:
print(34)
2. cat() function
We can also use the cat() function to display a string. The cat() function
concatenates all of the arguments and forms a single string which it then prints. For
example:
Code:
cat("hello", "this","is","techvidvan",12345,TRUE)
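A short sketch of how cat() differs from print() and paste(): cat() writes its arguments separated by spaces, adds no newline unless asked, and accepts a sep argument, while paste() returns the combined string so it can be stored. The variable s is illustrative.

```r
cat("hello", "this", "is", "techvidvan", 12345, TRUE, "\n")

cat("a", "b", "c", sep = "-")   # custom separator
cat("\n")

s <- paste("hello", "this", 12345, TRUE)   # returns the string instead of printing it
print(s)
```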
Data Import
1) Importing data from CSV files
Method-1:
Using read.csv() function
Syntax:
read.csv("path/file_name.csv", header=TRUE)
Or
read.csv(file.choose(), header=TRUE)
Method-2:
Syntax:
read.csv("path.csv", header=TRUE, sep=",")
where,
path = the path of the file to be imported
header = whether the first row holds the column names (TRUE by default)
sep = the separator between values in each row; with sep = ","
the values in each row are separated by commas
Exporting Data
Importing data into R is important for the user. However, exporting
data from R to other platforms is equally important: we may want to
export the data from the R workspace to an Excel file, a CSV file or a text file.
1) Exporting the data into text files from R-
Syntax-
write.table(data, "path/file_name.txt", row.names=FALSE)
where data is the R object (for example a data frame) to be written.
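Export and import can be tried together in one round trip. This sketch assumes a small hypothetical data frame and writes it to a temporary file created with tempfile(), so no real path is needed.

```r
# Hypothetical data and a temporary file path, used only for illustration.
sales <- data.frame(shop = c("A", "B", "C"), amount = c(100, 190, 210))
path  <- tempfile(fileext = ".csv")

write.csv(sales, path, row.names = FALSE)   # export
back <- read.csv(path, header = TRUE)       # import

print(back)
print(identical(names(sales), names(back)))   # TRUE
```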
R built-in functions
Functions which are already created or defined in the programming framework
are known as built-in functions. R has a rich set of functions that can be used
to perform almost every task for the user. These built-in functions are divided into
the following categories based on their functionality.
Math Functions
R provides the various mathematical functions to perform the mathematical
calculation. These mathematical functions are very helpful to find absolute value,
square value and much more calculations. In R, there are the following functions
which are used:
3. ceiling(x): It returns the smallest integer which is larger than or equal to x.
x<- 4.5
print(ceiling(x))
Output
[1] 5
4. floor(x): It returns the largest integer which is smaller than or equal to x.
x<- 2.5
print(floor(x))
Output
[1] 2
5. trunc(x): It returns the truncated value of input x.
x<- c(1.2,2.5,8.1)
print(trunc(x))
Output
[1] 1 2 8
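The difference between ceiling(), floor() and trunc() shows up with negative numbers. The sketch below also uses abs() and sqrt(), the functions for the absolute value and square root mentioned above; the input vector is illustrative.

```r
x <- c(-4.7, 2.5, 9)

print(abs(x))        # absolute values
print(sqrt(abs(x)))  # square roots of the absolute values
print(ceiling(x))    # rounds up
print(floor(x))      # rounds down
print(trunc(x))      # drops the fractional part, i.e. rounds towards zero
```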
String Function
R provides various string functions to perform tasks. These string functions allow us to
extract sub string from string, search pattern etc. There are the following string functions
in R:
1. sub(pattern, replacement, x, ignore.case=FALSE, fixed=FALSE): It finds the first
occurrence of pattern in x and replaces it with the replacement (new) text.
st1<- "England is beautiful but not a part of EU"
sub("England", "UK", st1)
Output
[1] "UK is beautiful but not a part of EU"
Algorithm
When you are telling the computer what to do, you also get to choose how it’s going
to be done. That’s where computer algorithms come in. An algorithm is the basic
technique used to get the job done. For example, let’s say you have a friend arriving
at the airport and she needs to get from the airport to your house. She might use the
following algorithm:
You will note that the algorithm is written in the order in which it is to be executed. It
wouldn’t make sense to perform Step 4 (Walk two blocks north) until after the other
three steps have been computed.
Let’s say we have a bunch of words - say, the names of colors. We want to compute
the average number of characters in these words. If we were going to do this by
hand, we would use the following algorithm:
Code:
Col_name=c("grey", "grey", "brown", "orange", "olive", "green", "cyan", "blue", "purple", "pink",
"red")
Col_length=nchar(Col_name) # number of characters in each word
mean(Col_length) # average number of characters
Flowchart
Flowchart is a graphical representation of an algorithm. Programmers often
use it as a program-planning tool to solve a problem. It makes use of symbols
which are connected among them to indicate the flow of information and
processing. The process of drawing a flowchart for an algorithm is known as
“flowcharting”.
Basic Symbols used in Flowchart Designs
1. Terminal: The oval symbol indicates Start, Stop and Halt in a program’s
logic flow. A pause/halt is generally used in a program logic under some
error conditions. Terminal is the first and last symbols in the flowchart.
Conditional Statements
Conditional statements are mainly used for decision making in R
programming. Here we discuss two types of conditional statement:
1. If statement
2. If-else statement
1) If statement:
If statement is one of the Decision-making statements in the R programming
language. It is one of the easiest decision-making statements. It is used to
decide whether a certain statement or block of statements will be executed or
not i.e if a certain condition is true then a block of statement is executed
otherwise not.
The basic structure of if statement is given by
Syntax:
if (expression) {
#statement to execute if condition is true
}
If the expression is TRUE, the statement gets executed. But if the expression
is FALSE, nothing happens. The expression must evaluate to a single logical or
numeric value; since R 4.2.0 a condition of length greater than one is an error
(older versions used only the first element). For a numeric value, zero is taken
as FALSE, the rest as TRUE.
Flowchart R Programming if statement
Examples:
1)# assigning value to variable a
a <- 5
if(a > 0)
{
print("Positive Number") # Statement
}
2) If-else statement:
Syntax:
if (condition)
{
# Executes this block if
# condition is true
} else
{
# Executes this block if
# condition is false
}
Output:
[1] "5 is less than 10"
Here in the above code, Firstly, x is initialized to 5, then if-condition is
checked(x > 10), and it yields false. Flow enters the else block and prints the
statement “5 is less than 10”.
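The if-else example described here can be sketched as follows, assuming the condition and message named in the explanation; the wording of the first message and the variable msg are assumptions.

```r
x <- 5
if (x > 10) {
  msg <- paste(x, "is greater than 10")
} else {
  msg <- paste(x, "is less than 10")
}
print(msg)
```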
2) x <- 5
# Check if value is equal to 10
# (else must be on the same line as the closing brace of the if block)
if(x == 10)
{
print(paste(x, "is equal to 10"))
} else
{
print(paste(x, "is not equal to 10"))
}
Output:
[1] "5 is not equal to 10"
Loops:
Loop statements are essential for writing structured, block-style
programs. Here we discuss two types of loop:
1. for loop
2. while loop
1)for loop:
For loop in R Programming Language is useful to iterate over the elements of
a list, dataframe, vector, matrix, or any other object. It means, the for loop can
be used to execute a group of statements repeatedly depending upon the
number of elements in the object. It is an entry controlled loop, in this loop the
test condition is tested first, then the body of the loop is executed, the loop body
would not be executed if the test condition is false.
Examples:
1) # the use of for loop
for (i in 1: 4)
{
print(i ^ 2)
}
Output:
[1] 1
[1] 4
[1] 9
[1] 16
In the above example, we iterated over the range 1 to 4, which was our vector.
Now there can be several variations of this general for loop. Instead of using a
sequence such as 1:4, we can use the concatenate function c() as well.
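A minimal sketch of a for loop over a vector built with c(); the values and the name squares are illustrative.

```r
squares <- c()
for (i in c(-3, 2.5, 10)) {
  squares <- c(squares, i ^ 2)   # append the square of the current element
}
print(squares)
```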
2)While Loop:
It is a type of control statement which will run a statement or a set of statements
repeatedly as long as the given condition is true. It is also an entry
controlled loop: in this loop the test condition is tested first, then the body of the
loop is executed. The loop body is not executed if the test condition is
false.
Examples:
Example 1: Program to display numbers from 1 to 5 using while loop in R.
# R program to demonstrate the use of while loop
val = 1
Output:
[1] 120
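A sketch of two while loops. The first follows the title of Example 1 and prints the numbers 1 to 5. The second is an assumption based on the output 120 shown above: it computes the factorial of 5.

```r
# Example 1: display numbers from 1 to 5
val <- 1
while (val <= 5) {
  print(val)
  val <- val + 1
}

# Assumed second example: factorial of 5
n <- 5
fact <- 1
while (n > 0) {
  fact <- fact * n
  n <- n - 1
}
print(fact)
```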
Unconditional Statement
In R programming, we require a control structure to run a block of code multiple
times. Loops come in the class of the most fundamental and strong
programming concepts. A loop is a control statement that allows multiple
executions of a statement or a set of statements. The word ‘looping’ means
cycling or iterating.
Jump statements are used in loops to terminate the loop at a particular iteration
or to skip a particular iteration in the loop. The two most commonly used jump
statements in loops are:
Break Statement
Next Statement
Note: In R language continue statement is referred to as the next statement.
Break Statement
The break keyword is a jump statement that is used to terminate the loop at a
particular iteration.
Syntax:
if (test_expression) {
break
}
Examples:
Example 1: Using break in For-loop
# R program for break statement in For-loop
no<- 1:10
Output:
[1] "Values are: 1"
[1] "Values are: 2"
[1] "Values are: 3"
[1] "Values are: 4"
[1] "Coming out from for loop Where i= 5"
Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
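A minimal sketch of a for loop with break that would print the first output shown above. The loop variable name val is an assumption.

```r
no <- 1:10
for (val in no) {
  if (val == 5) {
    print(paste("Coming out from for loop Where i=", val))
    break   # leave the loop; values 6 to 10 are never reached
  }
  print(paste("Values are:", val))
}
```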
Next Statement
The next statement is used to skip the current iteration in the loop and move to
the next iteration without exiting from the loop itself.
Syntax:
if (test_condition)
{
next
}
Example 1: Using next statement in For-loop
# R Next Statement Example
no<- 1:10
Output:
[1] "Values are: 1"
[1] "Values are: 2"
[1] "Values are: 3"
[1] "Values are: 4"
[1] "Values are: 5"
[1] "Skipping for loop Where i = 6"
[1] "Values are: 7"
[1] "Values are: 8"
[1] "Values are: 9"
[1] "Values are: 10"
Output:
[1] 2
[1] 4
[1] 5
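A minimal sketch of a for loop with next that would print the first output shown above. The names val and printed are assumptions; printed only records which values were not skipped.

```r
no <- 1:10
printed <- c()
for (val in no) {
  if (val == 6) {
    print(paste("Skipping for loop Where i =", val))
    next   # skip the rest of this iteration and continue with 7
  }
  print(paste("Values are:", val))
  printed <- c(printed, val)
}
```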
goto statement in R Programming
Goto statement in a general programming sense is a command that takes the
code to the specified line or block of code provided to it. This is helpful when
the need is to jump from one programming section to the other without the use
of functions and without creating an abnormal shift.
Unfortunately, R doesn’t support goto but its algorithm can be easily converted
to depict its application. By using following methods this can be carried out
more smoothly:
Use of if and else
Using break, next and return
Functions in R Programming
Functions are useful when you want to perform a certain task multiple times. A
function accepts input arguments and produces the output by executing valid R
commands that are inside the function. In R Programming Language when you
are creating a function the function name and the file in which you are creating
the function need not be the same and you can have one or more function
definitions in a single R file.
Functions in R Language
Note: In the function syntax f <- function(arguments) { statements }, f is the
function name. This means that you are creating a function with the name f which
takes certain arguments and executes the statements in its body.
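A minimal sketch of this general form. The argument names x and y and the default value are illustrative.

```r
# General form: name <- function(arguments) { statements }
f <- function(x, y = 2) {
  result <- x * y     # statements executed on each call
  return(result)      # value handed back to the caller
}

print(f(3))           # uses the default y = 2
print(f(3, y = 5))
```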
Single Input Single Output
Now create a function in R that will take a single input and gives us a single
output.
Example-1:
# A simple R function to calculate
# area of a circle
areaOfCircle = function(radius){
area = pi*radius^2
print(area)
}
areaOfCircle(2)
Example-2:
# A simple R function to check
# whether x is even or odd
evenOdd = function(x){
if(x %% 2 == 0)
print("even")
else
print("odd")
}
evenOdd(4)
evenOdd(3)
Output:
[1] "even"
[1] "odd"
Rectangle(2, 3)
Output:
6
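A function can also take several inputs. The sketch below is a hypothetical definition consistent with the call Rectangle(2, 3) and the output 6 shown above; the argument names len and wid are assumptions.

```r
# Hypothetical definition: area of a rectangle from its two sides.
Rectangle <- function(len, wid) {
  area <- len * wid
  return(area)
}

print(Rectangle(2, 3))
```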
Graphical Representation of R programming
R is widely used for statistics and data analytics, where data often needs to be
represented graphically. Charts and graphs are used for this in R.
We discuss here three types of graphical representation:
Histogram
Frequency polygon
Ogive curve
Histogram
Example-1:
# Create data for the graph.
v <- c(19, 23, 11, 5, 16, 21, 32,
14, 19, 27, 39)
Output:
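A minimal sketch of drawing a histogram of the data vector v with hist(). The axis label, title and colour are assumptions. hist() also returns the class breaks and counts it used, which can be inspected.

```r
# Create data for the graph.
v <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39)

# Draw the histogram and keep the returned object.
h <- hist(v, xlab = "Value", main = "Histogram of v", col = "lightblue")

print(h$breaks)   # class boundaries chosen by hist()
print(h$counts)   # frequency in each class
```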
Frequency polygon
Frequency polygons are the plots of the values in a data frame to visualize the
shape of the distribution of the values. It helps us in comparing different data
frames and visualizing the cumulative frequency distribution of the data frames.
The frequency polygon indicates the number of occurrences for each distinct
class in the data frame.
To create a basic frequency polygon in the R Language, we first create a line
plot for the variables under consideration. Then we use the polygon() function to
create the frequency polygon.
Syntax:
plot( x, y )
polygon( c( xmin, x, xmax ), c( ymin, y, ymax ), col )
where,
x and y: determines the data vector for x and y axes data.
xmin and ymin: determines the minimum limit of x and y axis.
xmax and ymax: determines the maximum limit of x and y axis.
col: determines the color of frequency polygon.
Example-1:
x<-1:40
y<-sample(5:40,40,replace=TRUE)
plot(x,y,type="l")
polygon(c(1,x,40),c(0,y,0),col="green")
output:
Bar plot or Bar Chart in R is used to represent the values in data vector as
height of the bars. The data vector passed to the function is represented over y-
axis of the graph. Bar chart can behave like histogram by using table() function
instead of data vector.
Note: To know about more optional parameters in barplot() function, use the
below command in R console:
Syntax: barplot(data, xlab, ylab)
where:
data is the data vector to be represented on y-axis
xlab is the label given to x-axis
ylab is the label given to y-axis
Example-1:
x <- c(7, 15, 23, 12, 44, 56, 32)
# plotting vector
barplot(x,xlab = "GeeksforGeeks Audience",
ylab = "Count", col = "white",
col.axis = "darkgreen",
col.lab = "darkgreen")
output:
Pie chart is a circular chart divided into different segments according to the ratio
of data provided. The whole pie represents 100% and each segment shows its
fraction of the whole. It is another method to represent statistical data in
graphical form, and the pie() function is used to draw it.
Note: To know about more optional parameters in pie() function, use the below
command in the R console:
output:
Example-2:
geeks <- c(23, 56, 20, 63)
labels <- c("Mumbai", "Pune", "Chennai", "Bangalore")
piepercent<- round(100 * geeks / sum(geeks), 1)
# Plot the chart.
pie(geeks, labels = piepercent,
main = "City pie chart", col = rainbow(length(geeks)))
legend("topright", c("Mumbai", "Pune", "Chennai", "Bangalore"),
cex = 0.5, fill = rainbow(length(geeks)))
output:
Programmes:
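5) To check whether a number is prime:
n = as.integer(readline(prompt = "Enter a number :"))
f = 1 # flag: 1 means no divisor has been found
i = 2
while (i <= n / 2) {
if (n %% i == 0) {
f = 0
break
}
i=i+1
}
if (f == 1) {
print(paste("Number is prime :", n))
} else{
print(paste("Number is not prime :", n))
}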
Given Range
n = as.integer(readline(prompt = "Enter a number :"))
# test every number j from 2 up to n
for (j in 2:n)
{
f=1
i=2
while (i <= j / 2)
{
if (j %% i == 0)
{
f=0
break
}
i=i+1
}
if (f == 1)
{
print(paste("Number is prime :", j))
}
}
6)To check if number is odd or even:
a = as.integer(readline(prompt = "Enter a number :"))
a
if(a %% 2 == 0)
{
print(paste("The number", a ,"is even"))
}else
{
print(paste("The number", a ,"is odd"))
}
7)To check leap year:
year = as.integer(readline(prompt="Enter a year: "))
if((year %% 4) == 0)
{
if((year %% 100) == 0)
{
if((year %% 400) == 0)
{
print(paste(year,"is a leap year"))
} else
{
print(paste(year,"is not a leap year"))
}
} else
{
print(paste(year,"is a leap year"))
}
} else
{
print(paste(year,"is not a leap year"))
}
8) To find sum of first n natural numbers:
num = as.integer(readline(prompt = "Enter a number: "))
if(num < 0) {
print("Enter a positive number")
} else {
sum = 0
# use while loop to iterate until zero
while(num > 0) {
sum = sum + num
num = num - 1
}
print(paste("The sum is", sum))
}
9) To find AM, GM, and HM for ungrouped data:
Example-1:
Monthly sales of 10 small shops are given below
100,190, 210, 160, 150, 160, 190, 200, 170, 152
Calculate A.M. , G.M., H.M. of the above data and also calculate median, mode
and quartiles.
Sol:
x=c(100,190, 210, 160, 150, 160, 190, 200, 170, 152)
n=length(x)
am=mean(x)
lx=log10(x)
gm=10^mean(lx)
hm=n/sum(1/x)
tx=table(x); m=which(tx==max(tx)); stx=sort(unique(x)); mo=stx[m]
me=median(x)
q1=quantile(x,0.25); q2=quantile(x,0.50); q3=quantile(x,0.75)
Example-2:
For the following frequency distribution
x: 1 2 3 4 5
f: 7 11 9 8 3
Calculate A.M.,G.M. and H.M.
sol:
x=1:5
f=c(7, 11, 9, 8, 3)
n=sum(f)
y=rep(x,f)
am=mean(y)
ly=log10(y)
gm=10^mean(ly)
hm=n/sum(f/x)
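The same three means can be computed directly from the frequencies, without expanding the data with rep(). This is a sketch using the weighted formulas AM = sum(f*x)/n, GM = exp(sum(f*log(x))/n) and HM = n/sum(f/x); it should agree with the solution above.

```r
x <- 1:5
f <- c(7, 11, 9, 8, 3)
n <- sum(f)

am <- sum(f * x) / n              # weighted arithmetic mean
gm <- exp(sum(f * log(x)) / n)    # weighted geometric mean
hm <- n / sum(f / x)              # weighted harmonic mean

print(c(AM = am, GM = gm, HM = hm))
```

For positive data the three means always satisfy AM >= GM >= HM, which is a quick check on the results.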
10) To find Mean deviation, Variance, Standard deviation for ungrouped data:
Example-1:
The number of mistakes in a page recorded for 20 pages are as follows.
2, 5, 9, 7, 11, 6, 5, 2, 7, 9, 3, 2, 8, 12, 14, 6, 3, 9, 8, 7
Calculate mean deviation about mean, variance and standard deviation.
Sol:
x=c(2, 5, 9, 7, 11, 6, 5, 2, 7, 9, 3, 2, 8, 12, 14, 6, 3, 9, 8, 7)
n=length(x)
mx=mean(x)
md=sum(abs(x-mx))/n
v1=var(x)
v=((n-1)/n)*v1
sd=sqrt(v)
cv=sd*100/abs(mx)
Example-2:
Calculate mean deviation about median, variance, standard deviation and also
calculate quartile deviation and its coefficient.
Match score: 0 1 2 3 4
No. of matches: 27 9 8 5 4
Sol:
x=0:4
f=c(27,9,8,5,4)
n=sum(f)
y=rep(x,f)
mx=sum(f*x)/n
q1=quantile(y,0.25); q2=quantile(y,0.50); q3=quantile(y,0.75)
md=sum(f*abs(x-q2))/n
v=sum(f*(x-mx)^2)/n
sd=sqrt(v)
qd=(q3-q1)/2
cqd=(q3-q1)/(q3+q1)