R PROGRAMMING

The document provides an overview of R programming, including its definition, development, and various concepts such as data types, data structures, statistical methods, and error handling. It covers topics like mean, median, mode, probability distributions, hypothesis testing, and different types of tests like Z-test and T-test. Additionally, it explains operations on data frames and matrices, as well as exception handling and recursion in R.

Uploaded by

dadavalibdadu

R PROGRAMMING

SECTION- A
I Answer any SIX of the following: (6 x 2=12)
1. What is R programming? Who developed R?
R is a free, open-source programming language for statistical computing and
data visualization.
R was developed by Ross Ihaka and Robert Gentleman.
2. List any two differences between vector and list.
Vector:
1. A vector contains elements of the same data type, such as numeric,
character, or logical.
2. A vector is created using the c() function.
List:
1. A list is a collection of elements of different types, such as numbers,
vectors, matrices, etc.
2. A list is created using the list() function.
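The difference can be checked with a short sketch. The values below are illustrative only; the point is that c() coerces mixed input to one common type, while list() keeps each element's own type.

```r
# A vector holds a single type, so mixed input is coerced to a common type.
v <- c(1, "a", TRUE)
class(v)            # "character"

# A list keeps the type of every element.
l <- list(1, "a", TRUE)
sapply(l, class)    # "numeric" "character" "logical"
```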
3. Explain NA and NaN with example.
NA stands for Not Available. NA is used to represent missing values.
Example: x <- c(4, 5, 6, NA, 6, 8, NA)
print(x)
NaN stands for Not a Number. It applies to numeric values and to the real and
imaginary parts of complex values, but not to values of an integer vector.
NaN usually appears when you divide zero by zero or take the log of a negative
number.
Example 1: x <- 0
y <- 0
print(x / y)
Example 2: x <- -5
print(log(x))
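A minimal sketch of how the two values are detected and skipped in calculations, using base R functions only (the vector is made up for illustration):

```r
x <- c(4, 5, NA, 8, 0 / 0)   # 0 / 0 evaluates to NaN

is.na(x)     # FALSE FALSE TRUE FALSE TRUE  (is.na() is also TRUE for NaN)
is.nan(x)    # FALSE FALSE FALSE FALSE TRUE (NA is not NaN)

# Missing values propagate through arithmetic unless they are removed.
clean_mean <- mean(x, na.rm = TRUE)   # mean of 4, 5 and 8
print(clean_mean)
```

Note that na.rm = TRUE drops both NA and NaN before the mean is computed.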
4. Explain timings in R.
Sys.time(): This function returns the current date and time. Calling
Sys.time() before and after a piece of code and subtracting the two values
gives the duration of that code.
Sys.sleep(): Suspends the execution of R expressions for a specified time
interval of n seconds.
system.time(): This function measures the execution time of an expression.
Unlike Sys.time(), system.time() returns three values: the user, system, and
elapsed time.
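The three functions can be tried together in a small sketch. The sleep length and the loop are arbitrary choices made only to have something to time.

```r
# Time a block of code by hand with Sys.time().
start <- Sys.time()
Sys.sleep(0.2)                    # pause execution for 0.2 seconds
elapsed <- Sys.time() - start     # a difftime object
print(elapsed)

# Let system.time() do the measuring for a single expression.
timing <- system.time(for (i in 1:100000) sqrt(i))
print(timing)                     # user, system and elapsed times
```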
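5. Find mean, median and mode for the following:
12, 10, 9, 11, 7, 11, 8, 13, 6.

Mean = sum of the numbers / count of the numbers
= 87/9 = 9.667
Median: sorting the values gives 6, 7, 8, 9, 10, 11, 11, 12, 13, and the
middle (5th) value is 10.
Median = 10
Mode = 11 (the only value that occurs more than once)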
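The same answers can be verified in R. Base R has mean() and median() but no function for the statistical mode, so the sketch below derives it from a frequency table; mode_value is a name chosen here for illustration.

```r
x <- c(12, 10, 9, 11, 7, 11, 8, 13, 6)

mean(x)      # 87 / 9
median(x)    # middle value of the sorted data

# Mode: the value (or values) with the highest frequency.
counts <- table(x)
mode_value <- as.numeric(names(counts)[counts == max(counts)])
print(mode_value)
```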
6. Define:
(a) Statistics: Statistics is used to process complex problems in the real
world. It can be used to derive meaningful insights from data by performing
mathematical computations on it.
(b) Probability: It is a measure of the likelihood of an event occurring. It
is a number between 0 and 1, where 0 represents an impossible event and 1
represents a certain event.
7. What is Student's t-distribution?
Student's t-distribution is a probability distribution that is used to
estimate population parameters when the sample size is small and the
population variance is unknown.
8. Define:
(a) Simple Linear Regression: It is a statistical method used to model the
relationship between a dependent variable (Y) and a single independent
variable (x).
(b) Multiple Regression: It is a statistical method used to model the
relationship between a dependent variable (Y) and two or more independent
variables (x1, x2, ..., xn).
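Both kinds of model can be fitted with the base R function lm(). The sketch below uses made-up data in which Y is an exact linear function of the predictors, so the fitted coefficients are easy to check; the variable names are hypothetical.

```r
# Hypothetical predictors.
x1 <- c(1, 2, 3, 4, 5, 6)
x2 <- c(2, 1, 4, 3, 6, 5)

# Hypothetical responses built from known coefficients.
y_simple   <- 3 + 2 * x1
y_multiple <- 1 + 2 * x1 - 0.5 * x2

simple_fit   <- lm(y_simple ~ x1)          # one independent variable
multiple_fit <- lm(y_multiple ~ x1 + x2)   # two independent variables

coef(simple_fit)     # intercept and slope
coef(multiple_fit)   # intercept and one coefficient per predictor
```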

SECTION- B

II. Answer any FOUR of the following: (4 x5=20)


9. Explain different data types available in R.
1. Numeric data: All decimal values come under numeric data. Numbers are
stored as "numeric" by default, covering real numbers as well as whole
numbers.
Ex: 10.5, 10
2. Integer data: An integer is written with an 'L' at the end of the number.
The 'L' declares that the value is an integer.
Ex: 10L, 5L
3. Complex data: Complex numbers are numbers that have a real part and an
imaginary part.
Ex: 4+2i
4. Character data: A group of characters or an individual character comes
under the character type.
Ex: "R programming"
5. Logical data: It is a special type of data that has only two values,
either TRUE or FALSE.
Ex: TRUE
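The type of each example value above can be confirmed with class(). This is a small sketch; types is a name used here only to collect the results.

```r
class(10.5)               # "numeric"
class(10)                 # "numeric" (whole numbers are numeric unless written with L)
class(10L)                # "integer"
class(4+2i)               # "complex"
class("R programming")    # "character"
class(TRUE)               # "logical"

# The same check for all values at once.
types <- sapply(list(10.5, 10, 10L, 4+2i, "R programming", TRUE), class)
print(types)
```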

10. What is a data frame? Explain its operations with example.

A data frame in R is used to store tabular data, where each column can hold a
different data type.
Operations with example:
1. Create a data frame using data.frame()
student <- data.frame(
Name = c("ram", "arya"),
Marks = c(90, 70),
Address = c("Tumkur", "Hassan"))
2. Add a new column to the data frame
Use the $ symbol to add a column to the data frame
student$phno <- c(897112556, 94812333)
3. Removing columns: remove a column by setting it to NULL.
Example: student$Address <- NULL
4. Removing rows: remove rows by using negative indices.
Example: student <- student[-2, ]
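The four operations can be run in sequence as one script. This sketch uses a single data frame named student throughout (an assumption, since the examples above switch names) and prints the shape after each step.

```r
# 1. Create the data frame.
student <- data.frame(
  Name    = c("ram", "arya"),
  Marks   = c(90, 70),
  Address = c("Tumkur", "Hassan")
)

# 2. Add a column with $.
student$phno <- c(897112556, 94812333)
ncol(student)     # 4

# 3. Remove a column by assigning NULL.
student$Address <- NULL
names(student)    # "Name" "Marks" "phno"

# 4. Remove the second row with a negative index.
student <- student[-2, ]
nrow(student)     # 1
```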
11.Explain Exception handling in R with example.
It is the process of handling errors that might occur in the code so as to
avoid an abrupt halt of the program.
Syntax:
tryCatch(
  expr = {
    code that may raise an error or a warning
  },
  error = function(e) {
    code for handling the error
  },
  warning = function(w) {
    code for handling the warning
  },
  finally = {
    code executed for all inputs
  }
)
Errors: Errors occur when a condition is encountered that prevents the
program from continuing. When an error occurs, the program typically
halts, and an error message is displayed.
Warnings: Warnings are issued when R encounters a condition that might
indicate a problem, but the program can still continue. It's a way for R to let
you know that something might be wrong, but it doesn't prevent the
program from executing further.
stop(): We can also use stop() to generate custom error messages and stop
the execution of the program.
finally: It executes its statements whether or not an error occurs.
expr: This specifies the expression we want to evaluate.

Example:
divide_numbers <- function(num1, num2) {
  tryCatch(
    expr = {
      result <- num1 / num2
      print(paste("The result is:", result))
    },
    error = function(e) {
      print(paste("An error occurred:", e$message))
    },
    warning = function(w) {
      print(paste("A warning occurred:", w$message))
    }
  )
}
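To see each handler fire, it helps to use an operation that really does signal a warning and an error. The sketch below uses log() for that purpose; safe_log is a hypothetical helper written for this illustration, not part of R.

```r
safe_log <- function(x) {
  tryCatch(
    expr = {
      log(x)
    },
    error = function(e) {
      # Runs when log() fails, for example on character input.
      message("Error caught: ", conditionMessage(e))
      NA
    },
    warning = function(w) {
      # Runs when log() warns, for example on a negative number.
      message("Warning caught: ", conditionMessage(w))
      NaN
    },
    finally = {
      # Runs in every case, after the expression or a handler.
      message("safe_log() finished")
    }
  )
}

safe_log(10)     # no condition is raised: returns log(10)
safe_log(-1)     # warning handler runs: returns NaN
safe_log("a")    # error handler runs: returns NA
```

The value of the handler that ran becomes the value of tryCatch(), which is why the function returns NaN or NA instead of stopping.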
12. What is a function? Write R code to find the factorial of a number using
recursion.
A function is a block of code that runs only when it is called.

Factorial <- function(n)
{
  if (n == 0) {
    return(1)
  } else {
    return(n * Factorial(n - 1))
  }
}
N <- as.integer(readline(prompt = "Enter a number: "))
Factorial(N)

13.What is Probability distribution? Explain Probability density functions.


A probability distribution is a mathematical function that defines the
likelihood of the different outcomes or values of a variable occurring in a
random experiment.
Probability density function: It is the function that describes the
probability distribution of a continuous random variable. Its value at x is
not itself a probability; the probability that the variable falls in an
interval is the area under the function over that interval.
Formula: $f(x) = \lim_{\Delta x \to 0} \dfrac{P(x \le X \le x + \Delta x)}{\Delta x}$
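As an illustration, the standard normal density dnorm() can be integrated numerically to show that probabilities are areas under the density. The normal distribution is chosen here only as a familiar example; integrate() and pnorm() are base R functions.

```r
dnorm(0)   # height of the standard normal density at x = 0 (not a probability)

# The total area under a density is 1.
area_total <- integrate(dnorm, lower = -Inf, upper = Inf)$value

# P(-1 <= X <= 1) is the area under the density between -1 and 1.
area_1sd <- integrate(dnorm, lower = -1, upper = 1)$value

# The cumulative distribution function gives the same probability.
prob_1sd <- pnorm(1) - pnorm(-1)

print(c(area_total, area_1sd, prob_1sd))
```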

14.What is hypothesis testing? Explain its components.


Hypothesis testing is a statistical method used to determine whether there is
enough evidence in sample data to draw a conclusion about the population.
Components of hypothesis testing:
1. Null hypothesis: It states that there is no relationship between the two
variables under study, that is, changes in the independent variable do not
change the dependent variable. It is denoted by H0: μ = μ1.
2. Alternative hypothesis: It states that there is a relationship between the
two variables under study, that is, changes in the independent variable affect
the dependent variable. It is denoted by H1: μ ≠ μ1.
3. Critical value: It is the value that defines the rejection region. It is
determined by the level of significance and divides the distribution into two
areas, the rejection area and the acceptance area.
4. P-value: The probability of obtaining a result at least as extreme as the
one observed, given that the null hypothesis is true.
5. Significance level: The significance level is the probability of rejecting
the null hypothesis when it is true.
6. Conclusion: It is the final decision of the hypothesis test. The conclusion
must always be clearly stated, communicating the decision based on the
components of the test.
7. Test statistic: It is a value computed from the sample data that is used in
making a decision about the rejection of the null hypothesis.
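The components can be seen together in a one-sample t-test. The sketch below uses a made-up sample and a hypothesised mean of 50; both are assumptions for illustration, and t.test() and qt() are base R functions.

```r
# Hypothetical sample. H0: mu = 50, H1: mu != 50.
sample_data <- c(52, 48, 55, 51, 53, 49, 54, 56, 50, 52)
alpha <- 0.05                                   # significance level

result <- t.test(sample_data, mu = 50, alternative = "two.sided")

t_stat   <- unname(result$statistic)            # test statistic
critical <- qt(1 - alpha / 2, df = length(sample_data) - 1)   # critical value
p_value  <- result$p.value                      # p-value

# Conclusion: both decision rules lead to the same answer.
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
print(c(t_stat = t_stat, critical = critical, p_value = p_value))
print(decision)
```

Comparing the absolute test statistic with the critical value and comparing the p-value with the significance level are two equivalent ways of reaching the conclusion.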

SECTION- C

III. Answer any FOUR of the following: (4 x 7=28)


15.Explain different data structures available in R with example.
1.Vectors: A vector in R is a basic data structure that contains elements of
the same data type, such as numeric, character, or logical. It is one of the
simplest and most commonly used structures in R.
Example: vec <- c(10, 20, 30, 40)
2.Lists : A list in R is a data structure that can hold elements of different
types (numeric, character, logical, vectors, matrices, data frames, or even
other lists). Unlike vectors, lists can contain heterogeneous data.
Example: my_list <- list(1, "banana", TRUE)
3.Arrays: Arrays hold homogeneous data in contiguous memory and have a
fixed number of dimensions.
Example: my_array <- array(1:9, dim = c(3, 3))
4.Matrices : A matrix in R is a two-dimensional data structure where all
elements are of the same data type (numeric, character, or logical).
Matrices are essentially vectors with a dimension attribute that creates
rows and columns.
Example: my_matrix <- matrix(1:9, nrow = 3, ncol = 3)
5.Factors : A factor in R is a data structure used to represent categorical
data. Factors are stored as integers, where each integer represents a level
(a unique category or value) and a label corresponds to these levels.
Example: genders <- factor(c("male", "female", "male"), levels = c("female",
"male"))
6.Data Frame: A data frame in R is a two-dimensional, table-like data
structure that can hold different data types in different columns (e.g.,
numeric, character, factor, etc.).
Example: df <- data.frame(Name = c("John", "Sarah", "Mike"), Age = c(23,
25, 22), Gender = c("M", "F", "M"))
16.(a) What is Matrix? Explain its operation with example.
(b)Explain Coercion with an example.

(a) Matrices: A matrix in R is a two-dimensional data structure where all
elements are of the same data type (numeric, character, or logical).
Matrices are essentially vectors with a dimension attribute that creates
rows and columns.
Operations on Matrices:
1. Creating Matrices: Matrices can be created using the matrix() function,
which takes data, the number of rows, and the number of columns.
Example: my_matrix <- matrix(1:9, nrow = 3, ncol = 3)
2. Modifying Elements in a Matrix: Modify specific elements by
referencing them with their row and column indices.
Example: my_matrix <- matrix(1:6, nrow = 2, ncol = 3)
my_matrix[1, 2] <- 10
3. Viewing Elements in a Matrix: Access specific elements, rows, or columns
using indexing.
Example: my_matrix <- matrix(1:6, nrow = 2, ncol = 3)
my_matrix[1, 2]   # element in row 1, column 2
my_matrix[1, ]    # first row
my_matrix[, 2]    # second column
4. Combining Matrices: Combine matrices using the rbind() and cbind()
functions to add rows or columns.
Example: mat1 <- matrix(1:4, nrow = 2, ncol = 2)
mat2 <- matrix(5:8, nrow = 2, ncol = 2)
combined_matrix <- rbind(mat1, mat2)
(b) Coercion is the process of converting data from one data type to another.
1. as.logical(): It converts values to logical type; 0 becomes FALSE and
non-zero values become TRUE.
Ex: x <- c(5, 1, 0, 8)
as.logical(x)
2. as.character(): It converts objects or elements to character type.
Ex: x <- c(5, 1, 0, 5)
as.character(x)
3. as.integer(): It converts objects or elements to integer type.
Ex: x <- as.integer(5.6)
4. as.double(): It converts an integer or other numeric value to double.
Ex: x <- c(5, 1, 8)
as.double(x)
5. as.complex(): It converts the object to complex type.
Ex: x <- c(5, 1, 8)
as.complex(x)
6. as.list(): It converts a vector into a list, with one list element per
vector element.
Ex: x <- c(5, 1, 8)
as.list(x)
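The results of these conversions can be inspected directly. The sketch below reuses one small vector for most of the calls; the comments show what each call returns.

```r
x <- c(5, 1, 0, 8)

as.logical(x)      # TRUE TRUE FALSE TRUE
as.character(x)    # "5" "1" "0" "8"
as.integer(5.6)    # 5 (the fractional part is dropped)
as.double(5L)      # 5, now stored as a double
as.complex(x)      # 5+0i 1+0i 0+0i 8+0i

x_list <- as.list(x)   # a list with four elements
length(x_list)         # 4
```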
17.Explain Bernoulli and Binomial distributions.
Bernoulli distribution: The Bernoulli distribution concerns a discrete
random variable which has only two possible outcomes.
The first is called success and is given the value 1; the second is called
failure and is given the value 0.
Formula: $f(x) = p^{x} (1-p)^{1-x}$, for x = 0 or 1
Functions (available from an add-on package such as Rlab; in base R a
Bernoulli variable is handled as a binomial variable with size = 1):
1. dbern(x, prob): It is a function used to calculate the density
(probability mass) function of the Bernoulli distribution.
2. pbern(q, prob): It is a function used to calculate the cumulative
distribution function of the Bernoulli distribution.
3. qbern(p, prob): It is a function used to calculate the quantile function
of the Bernoulli distribution.
4. rbern(n, prob): It is a function used to generate random numbers from the
Bernoulli distribution.
Binomial distribution: It is a probability distribution that models the
number of successes in n independent trials, each with two possible outcomes
and the same probability of success p.
Formula (PMF): $b(x; n, p) = \binom{n}{x} p^{x} (1-p)^{n-x} = \dfrac{n!}{(n-x)!\,x!}\, p^{x} (1-p)^{n-x}$
Functions:
1. dbinom(x, size, prob): It is a function used to calculate the density
(probability mass) function of the Binomial distribution.
2. pbinom(q, size, prob): It is a function used to calculate the cumulative
distribution function of the Binomial distribution.
3. qbinom(p, size, prob): It is a function used to calculate the quantile
function of the Binomial distribution.
4. rbinom(n, size, prob): It is a function used to generate random numbers
from the Binomial distribution.
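Both distributions can be explored with the base R binomial functions, treating the Bernoulli case as a binomial with size = 1. The probabilities below (0.3 and 0.5) are arbitrary illustrative values.

```r
# Bernoulli: one trial with success probability p.
p <- 0.3
dbinom(1, size = 1, prob = p)    # P(X = 1) = 0.3
dbinom(0, size = 1, prob = p)    # P(X = 0) = 0.7

# Binomial: number of successes in n = 5 trials with p = 0.5.
n <- 5
dbinom(2, size = n, prob = 0.5)      # P(X = 2) = 0.3125
choose(n, 2) * 0.5^2 * 0.5^(n - 2)   # same value from the PMF formula
pbinom(2, size = n, prob = 0.5)      # P(X <= 2) = 0.5
qbinom(0.9, size = n, prob = 0.5)    # smallest x with P(X <= x) >= 0.9, which is 4

set.seed(1)                          # make the random draws repeatable
draws <- rbinom(10, size = n, prob = 0.5)
print(draws)
```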

18.(a) Explain ANOVA


(b) What is error? Explain its types.
(a) ANOVA (analysis of variance) was developed by Ronald Fisher and is also
called "Fisher's analysis of variance". It is an extension of the t-test and
z-test.
ANOVA is an analysis tool used to compare the means of two or more groups. It
is a statistical method that yields values that can be tested to determine
whether a significant relation exists between variables.
Two types of ANOVA:
1. One-way ANOVA
2. Two-way ANOVA

1. One-way ANOVA: In one-way ANOVA there is one factor (independent
variable), and the means of three or more levels of that factor are compared.
One-way ANOVA assumes that the variance within each group is roughly equal;
this is known as the homogeneity of variance assumption.
Formula: F = variance between groups / variance within groups
If F is sufficiently greater than 1 (larger than the critical value), reject
the null hypothesis.
If F is close to or less than 1, do not reject the null hypothesis.
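A one-way ANOVA can be run in base R with aov(). The sketch below uses made-up scores for three hypothetical teaching methods and also recomputes F by hand from the formula above, so the two values can be compared.

```r
# Hypothetical scores for three groups of five observations each.
scores <- c(80, 85, 78, 90, 82,
            70, 75, 72, 68, 74,
            88, 92, 95, 90, 91)
method <- factor(rep(c("A", "B", "C"), each = 5))

fit <- aov(scores ~ method)
anova_table <- summary(fit)[[1]]
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]

# F by hand: variance between groups / variance within groups.
group_means <- tapply(scores, method, mean)
ms_between  <- sum(5 * (group_means - mean(scores))^2) / (3 - 1)
ms_within   <- sum((scores - ave(scores, method))^2) / (15 - 3)
manual_f    <- ms_between / ms_within

print(c(f_value = f_value, manual_f = manual_f, p_value = p_value))
```

In practice the decision is made by comparing the p-value with the chosen significance level, which is equivalent to comparing F with its critical value.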

(b) In R programming, an error is an unexpected event or condition that occurs
during the execution of code, causing the program to stop or produce
unexpected results. In hypothesis testing, an error is a wrong decision about
the null hypothesis.
Types of error:
1. Type I Error (α-error): A Type I error occurs when a true null hypothesis is
rejected. This type of error is also known as a "false positive" error.
2. Type II Error (β-error): A Type II error occurs when a false null hypothesis is
not rejected. This type of error is also known as a "false negative" error.

19.Explain the properties of Z-test, T-test and Chi-Square test.


Properties of Z-test:
The Z-test is a parametric test which is used for large sample sizes, that is
n > 30.
It is used for testing the significance of the difference between mean values
when the sample size is large and the population standard deviation is
available.
Assumptions:
The population distribution is normal.
Samples are random and independent.
The sample size is large.
The population standard deviation is available.
Properties of T-test:
The T-test is a parametric test which is used for small sample sizes, that is
n < 30.
It is used for testing the significance of the difference between mean values
when the sample size is small and the population standard deviation is not
available.
Assumptions:
The population distribution is normal.
Samples are random and independent.
The sample size is small.
The population standard deviation is not available.

Properties of chi-Square test:

It is a non-parametric hypothesis test. Chi-square can be used as a test of
(1) goodness of fit and (2) independence of two variables.
1. It helps in assessing the goodness of fit between a set of observed values
and those expected theoretically.
2. It makes a comparison between the expected frequencies (E) and the observed
frequencies (O).
3. The greater the difference, the greater the value of chi-square. If there
is no difference between the expected and observed frequencies, then the value
of chi-square is zero.
4. It is also called the goodness of fit test, which determines whether a
particular distribution fits the observed data or not. It is also used to test
the independence of two variables.
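The chi-square properties can be checked numerically with a goodness of fit test. The counts and expected proportions below are made up for illustration; chisq.test() is a base R function.

```r
observed      <- c(18, 32, 50)       # hypothetical observed frequencies (O)
expected_prob <- c(0.2, 0.3, 0.5)    # hypothetical expected proportions
expected      <- sum(observed) * expected_prob   # expected frequencies (E)

# Chi-square by hand: sum of (O - E)^2 / E.
chi_sq <- sum((observed - expected)^2 / expected)

# The same statistic from chisq.test().
test <- chisq.test(x = observed, p = expected_prob)
print(c(by_hand = chi_sq, from_test = unname(test$statistic)))

# When observed and expected frequencies agree exactly, chi-square is zero.
perfect <- chisq.test(x = c(20, 30, 50), p = expected_prob)
print(unname(perfect$statistic))
```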
20.(a)Explain plot customization with example.
(b)What is 3D Scatter plots?

(a) Plot customization in R:

R provides a wide range of options for customizing plots to suit your specific
needs. Here are some ways to customize plots in R:

1. Adding Titles and Labels:
Use the title() function, or the main, xlab, and ylab arguments of plot(), to
add a title and axis labels to your plot.

Example:
plot(x, y, ann = FALSE)
title(main = "My Plot", xlab = "X Axis", ylab = "Y Axis")

2. Changing Colors:
Use the col argument to change the color of the plot.

Example:

plot(x, y, col = "red")

3. Adding Legends:
Use the legend() function to add a legend to your plot.

Example:

plot(x, y)
legend("topright", c("Line 1", "Line 2"), lty = c(1, 2), col = c("red", "blue"))

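4. Customizing Axis:
Use the axis() function to customize the axis. Suppress the default axis first
(xaxt = "n" for the x-axis) so that the custom axis is not drawn over it.

Example:

plot(x, y, xaxt = "n")
axis(1, at = c(0, 10, 20), labels = c("0", "10", "20"))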

5. Adding Grid Lines:


Use the grid() function to add grid lines to your plot.

Example:
plot(x, y)
grid()

6. Changing Plot Type:


Use the type argument to change the plot type.

Example:

plot(x, y, type = "l") # line plot


plot(x, y, type = "p") # point plot
plot(x, y, type = "b") # both line and point plot

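7. Customizing Fonts:
Use the font argument to customize the font (1 = plain, 2 = bold, 3 = italic,
4 = bold italic). The font.main, font.lab, and font.axis arguments set the
font of the title, axis labels, and tick labels separately.

Example:

plot(x, y, font = 2) # bold font
plot(x, y, font = 3) # italic font
plot(x, y, font = 4) # bold and italic font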

8. Adding Text:
Use the text() function to add text to your plot.

Example:

plot(x, y)
text(10, 10, "Hello World")

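9. Customizing Plot Size:
The plot size is set on the graphics device rather than in plot(). Use the
width and height arguments of a device function such as dev.new() or png().

Example:

dev.new(width = 8, height = 6) # size in inches
plot(x, y)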

10. Saving Plots:


Use the dev.copy() and dev.off() functions to save plots.

Example:

plot(x, y)
dev.copy(png, "myplot.png")
dev.off()

These are just a few examples of the many ways you can customize plots in R.
By using these options, you can create high-quality plots that effectively
communicate your data insights.
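The examples above assume that x and y already exist. The sketch below defines sample data and combines several of the options in one script. It writes to a PDF file in a temporary directory, which is an assumption made so the script runs without a display; png() can be used the same way, with width and height given in pixels.

```r
# Sample data (hypothetical).
x <- seq(0, 20, by = 2)
y <- x^2

# Open a file device; width and height are in inches for pdf().
out_file <- file.path(tempdir(), "myplot.pdf")
pdf(out_file, width = 8, height = 6)

plot(x, y, type = "b", col = "red", ann = FALSE, xaxt = "n")   # points and lines, no default labels
title(main = "My Plot", xlab = "X Axis", ylab = "Y Axis")      # title and axis labels
axis(1, at = c(0, 10, 20), labels = c("0", "10", "20"))        # custom x-axis
grid()                                                         # grid lines
text(10, 300, "Hello World")                                   # text at (10, 300)
legend("topleft", legend = "y = x^2", lty = 1, pch = 1, col = "red")

dev.off()               # close the device so the file is written
file.exists(out_file)   # TRUE if the plot was saved
```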

(b) 3D scatter plots are a type of plot that displays the relationship between
three variables in a three-dimensional space. They are useful for visualizing
complex relationships between variables and can be used to identify patterns,
trends, and correlations.

Creating 3D Scatter Plots in R:
1. The plot3d() function from the rgl package
2. The scatterplot3d() function from the scatterplot3d package
