Functions vs Scripts and Datasets
answer the questions based on R
Q) i. What are the advantages of using functions over scripts?
ii. Suppose you have two datasets A and B.
Dataset A has the following data: 6 7 8 9
Dataset B has the following data: 1 2 4 5.
Which function is used to combine the data from both datasets into dataset C?
Demonstrate the function with the input values and write the output.
i) Advantages of Using Functions Over Scripts
Functions offer several advantages over scripts in R:
1. Code Reusability: Functions can be reused in multiple parts of your program, reducing
redundancy and saving time.
2. Modularity: Functions allow you to break your code into smaller, manageable chunks, making it
easier to read and debug.
3. Ease of Maintenance: When code is organized into functions, making changes or fixing issues
becomes simpler without affecting unrelated parts of the code.
4. Parameterization: Functions can accept arguments, allowing you to handle different inputs
dynamically.
5. Improved Readability: Functions encapsulate specific tasks, making the overall program structure
cleaner and more understandable.
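As a rough illustration of reusability and parameterization, the sketch below wraps one calculation in a function and calls it with different inputs. The function name `to_percent` and the sample marks are hypothetical, chosen only for this example.

```r
# Hypothetical helper: convert raw marks to a percentage of a maximum score.
to_percent <- function(marks, max_marks = 100) {
  round(marks / max_marks * 100, 1)
}

# The same function handles different inputs without copying any code.
section_a <- c(45, 38, 50)   # marks out of 50 (made-up values)
section_b <- c(72, 64)       # marks out of 100 (made-up values)

to_percent(section_a, max_marks = 50)   # 90 76 100
to_percent(section_b)                   # 72 64
```

In a plain script the same arithmetic would have to be repeated, and edited, for each set of marks.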
ii) Combining Two Datasets into a Single Dataset
To combine datasets in R, you can use the `c()` function, which concatenates values into a single vector.
Example
Given:
Dataset A: `6, 7, 8, 9`
Dataset B: `1, 2, 4, 5`
You can combine these using `c()` as follows:
# Define datasets
A <- c(6, 7, 8, 9)
B <- c(1, 2, 4, 5)
Printed using Save ChatGPT as PDF, powered by PDFCrowd HTML to PDF API. 1/25
# Combine datasets
C <- c(A, B)
# Print the combined dataset
print(C)
Output:
[1] 6 7 8 9 1 2 4 5
Q) List and explain various functions that allow users to handle data in R workspace with
appropriate examples.
Functions to Handle Data in R Workspace
In R, the workspace is where all your objects, such as variables, data frames, vectors, and functions,
reside during your session. Below is a list of essential functions to handle data in the R workspace, along
with explanations and examples:
1. `ls()`
Description: Lists all objects currently in the workspace.
Example:
x <- 5
y <- c(1, 2, 3)
ls()
Output:
[1] "x" "y"
2. `rm()`
Description: Removes specified objects from the workspace.
Example:
rm(x) # Removes object 'x'
ls()
Output:
[1] "y"
3. `rm(list = ls())`
Description: Clears the entire workspace by removing all objects.
Example:
rm(list = ls())
ls()
Output:
character(0)
4. `get()`
Description: Retrieves the value of an object by its name.
Example:
z <- 10
get("z")
Output:
[1] 10
5. `assign()`
Description: Assigns a value to an object dynamically.
Example:
assign("new_var", 42)
print(new_var)
Output:
[1] 42
6. `exists()`
Description: Checks whether an object exists in the workspace.
Example:
exists("new_var")
Output:
[1] TRUE
7. `save()`
Description: Saves specific objects to a file for later use.
Example:
y <- c(1, 2, 3) # recreate y, since the workspace was cleared in step 3
save(y, file = "my_data.RData")
8. `load()`
Description: Loads objects from a saved file back into the workspace.
Example:
load("my_data.RData")
print(y)
9. `save.image()`
Description: Saves the entire workspace to a file.
Example:
save.image(file = "workspace.RData")
10. `history()`
Description: Displays the command history of the current session.
Example:
history()
11. `gc()`
Description: Invokes garbage collection to free up memory.
Example:
gc()
12. `objects()`
Description: Similar to `ls()`, lists all objects in the workspace.
Example:
objects()
13. `str()`
Description: Displays the structure of an object.
Example:
str(y)
Output:
num [1:3] 1 2 3
14. `summary()`
Description: Provides a summary of an object, such as statistical measures for a dataset.
Example:
summary(y)
Output:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0     1.5     2.0     2.0     2.5     3.0
15. `dir()`
Description: Lists files in the current working directory.
Example:
dir()
These functions allow users to efficiently manage, organize, and retrieve data in the R workspace.
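Several of these functions take an `envir` argument, so they can be tried without touching the global workspace. The following is a minimal sketch under that assumption; the environment `ws`, the object name `scores`, and the temporary file are all illustrative.

```r
# Use a scratch environment so the global workspace is left untouched.
ws <- new.env()

assign("scores", c(1, 2, 3), envir = ws)         # create an object by name
exists("scores", envir = ws, inherits = FALSE)   # TRUE
get("scores", envir = ws)                        # 1 2 3
ls(ws)                                           # "scores"

# Save to a temporary file, remove the object, then load it back.
path <- tempfile(fileext = ".RData")
save(list = "scores", file = path, envir = ws)
rm("scores", envir = ws)
exists("scores", envir = ws, inherits = FALSE)   # FALSE
load(path, envir = ws)
get("scores", envir = ws)                        # 1 2 3 again
```

Passing `inherits = FALSE` keeps `exists()` from also searching enclosing environments, which makes the check specific to `ws`.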
Q) Consider the following data frame given below:
course id class marks
1      11 1     53
2      12 2     56
3      13 1     75
4      14 2     48
5      15 1     69
6      16 2     84
i. Create a subset of course less than 3 by using [ ] brackets and demonstrate the output.
ii. Create a subset where the course column is less than 3 or the class equals 2 by using the subset() function and demonstrate the output.
Q) List and discuss various types of data structures in R.
1. Subsetting the Data Frame
Given the data frame:
df <- data.frame(
course = c(1, 2, 3, 4, 5, 6),
id = c(11, 12, 13, 14, 15, 16),
class = c(1, 2, 1, 2, 1, 2),
marks = c(53, 56, 75, 48, 69, 84)
)
i) Subset of `course < 3` using `[ ]` Brackets
We can subset rows where the `course` column is less than 3:
subset1 <- df[df$course < 3, ]
print(subset1)
Output:
course id class marks
1 1 11 1 53
2 2 12 2 56
ii) Subset with `course < 3` OR `class == 2` using `subset()`
The `subset()` function allows conditional filtering with readable syntax:
subset2 <- subset(df, course < 3 | class == 2)
print(subset2)
Output:
course id class marks
1 1 11 1 53
2 2 12 2 56
4 4 14 2 48
6 6 16 2 84
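For comparison, the same OR condition can be written with `[ ]` brackets. The sketch below rebuilds the data frame so that it runs on its own; the names `by_brackets` and `by_subset` are illustrative.

```r
df <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id     = c(11, 12, 13, 14, 15, 16),
  class  = c(1, 2, 1, 2, 1, 2),
  marks  = c(53, 56, 75, 48, 69, 84)
)

# Same filter as subset(df, course < 3 | class == 2), written with brackets.
by_brackets <- df[df$course < 3 | df$class == 2, ]
by_subset   <- subset(df, course < 3 | class == 2)

all.equal(by_brackets, by_subset)   # TRUE

# subset() can also keep only some columns via its select argument.
subset(df, course < 3 | class == 2, select = c(course, marks))
```

The two forms differ when the condition evaluates to `NA`: bracket indexing returns a row of `NA`s for such cases, while `subset()` drops them.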
2. Various Types of Data Structures in R
R supports several data structures, each suited for different types of data. Here’s a discussion of the
main types:
1. Vectors
Description: A one-dimensional data structure that holds elements of the same type (numeric,
character, logical, etc.).
Example:
vec <- c(1, 2, 3, 4)
print(vec)
Output:
[1] 1 2 3 4
2. Matrices
Description: Two-dimensional arrays where all elements must be of the same type.
Example:
mat <- matrix(1:9, nrow = 3)
print(mat)
Output:
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
3. Arrays
Description: Multi-dimensional data structures with elements of the same type. An array can
have any number of dimensions; a matrix is the two-dimensional case.
Example:
arr <- array(1:8, dim = c(2, 2, 2))
print(arr)
Output:
, , 1
[,1] [,2]
[1,] 1 3
[2,] 2 4
, , 2
[,1] [,2]
[1,] 5 7
[2,] 6 8
4. Data Frames
Description: Two-dimensional structures that can hold data of different types in columns.
Example:
df <- data.frame(name = c("A", "B"), age = c(25, 30))
print(df)
Output:
name age
1 A 25
2 B 30
5. Lists
Description: A versatile data structure that can hold elements of different types, including other
lists.
Example:
lst <- list(name = "John", age = 25, marks = c(85, 90))
print(lst)
Output:
$name
[1] "John"
$age
[1] 25
$marks
[1] 85 90
6. Factors
Description: Used to represent categorical data with fixed values (levels).
Example:
gender <- factor(c("Male", "Female", "Male"))
print(gender)
Output:
[1] Male Female Male
Levels: Female Male
7. Tibble
Description: A modern version of a data frame with enhanced features.
Example (requires the `tibble` package):
library(tibble)
tib <- tibble(name = c("A", "B"), age = c(25, 30))
print(tib)
Output:
# A tibble: 2 × 2
name age
<chr> <dbl>
1 A 25
2 B 30
8. NULL
Description: Represents an empty or undefined value.
Example:
x <- NULL
print(x)
Output:
NULL
These data structures provide flexibility in data handling and are foundational to programming in R.
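One way to see how these structures differ is to check what happens to element types. The sketch below uses only base R; the variable names are illustrative, and the column class shown for `name` assumes R 4.0 or later, where strings are no longer converted to factors by default.

```r
# A vector holds one type, so mixed input is coerced to a common type.
mixed <- c(1, "a", TRUE)
class(mixed)            # "character"

# A list keeps each element's own type.
kept <- list(1, "a", TRUE)
sapply(kept, class)     # "numeric" "character" "logical"

# A data frame is a list of equal-length columns.
df <- data.frame(name = c("A", "B"), age = c(25, 30))
is.list(df)             # TRUE
sapply(df, class)       # name: "character", age: "numeric" (R >= 4.0)

# A matrix is an array with two dimensions.
mat <- matrix(1:9, nrow = 3)
is.array(mat)           # TRUE
dim(mat)                # 3 3
```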
i. The following table shows the number of units of different products sold on
different days:
Product        Monday Tuesday Wednesday Thursday Friday
Bread          12     3       5         20       23
Milk           11     21      27        18       9
Cola Cans      20     10      1         33       6
Chocolate bars 12     6       7         4        13
Detergent      5      8       12        12       15
Create five sample numeric vectors from this data.
ii. Name and explain the operators used to form data subsets in R.
i) Creating Numeric Vectors from the Data
The given table is:
Product Monday Tuesday Wednesday Thursday Friday
Bread 12 3 5 20 23
Milk 11 21 27 18 9
Cola Cans 20 10 1 33 6
Chocolate Bars 12 6 7 4 13
Detergent 5 8 12 12 15
Creating Numeric Vectors
Here, we define one vector for each product:
# Bread sales
bread <- c(12, 3, 5, 20, 23)
# Milk sales
milk <- c(11, 21, 27, 18, 9)
# Cola Cans sales
cola_cans <- c(20, 10, 1, 33, 6)
# Chocolate Bars sales
chocolate_bars <- c(12, 6, 7, 4, 13)
# Detergent sales
detergent <- c(5, 8, 12, 12, 15)
# Print vectors to verify
print(bread)
print(milk)
print(cola_cans)
print(chocolate_bars)
print(detergent)
Output:
[1] 12 3 5 20 23
[1] 11 21 27 18 9
[1] 20 10 1 33 6
[1] 12 6 7 4 13
[1] 5 8 12 12 15
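If the five vectors need to be used together, one option is to stack them into a matrix with `rbind()` and label the columns by day. This is only a sketch of one possible next step; the names `days` and `sales` are introduced here for illustration.

```r
bread          <- c(12, 3, 5, 20, 23)
milk           <- c(11, 21, 27, 18, 9)
cola_cans      <- c(20, 10, 1, 33, 6)
chocolate_bars <- c(12, 6, 7, 4, 13)
detergent      <- c(5, 8, 12, 12, 15)

days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Each vector becomes one row; row names are taken from the variable names.
sales <- rbind(bread, milk, cola_cans, chocolate_bars, detergent)
colnames(sales) <- days

sales["milk", "Wednesday"]   # 27
rowSums(sales)               # total units sold per product
```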
ii) Operators Used to Form Data Subsets in R
R provides several operators for subsetting data. These can be applied to vectors, matrices, data frames,
and lists.
1. Indexing by Position
Use square brackets `[ ]` to specify the position of the elements you want to extract.
Example:
vec <- c(10, 20, 30, 40)
vec[2] # Second element
Output:
[1] 20
2. Indexing by Logical Vectors
Use logical conditions to extract elements that satisfy a condition.
Example:
vec <- c(10, 20, 30, 40)
vec[vec > 20] # Elements greater than 20
Output:
[1] 30 40
3. Indexing by Names
Use named elements for subsetting.
Example:
sales <- c(Monday = 10, Tuesday = 20, Wednesday = 30)
sales["Tuesday"]
Output:
Tuesday
     20
4. Using the `subset()` Function
Extracts rows or columns of a data frame based on a condition.
Example:
df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 15))
subset(df, sales > 12)
Output:
product sales
2 Milk 15
5. The `$` Operator
Used to extract a column from a data frame or a list.
Example:
df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 15))
df$sales
Output:
[1] 12 15
6. Indexing Matrices
Specify row and column positions using `[row, column]`.
Example:
mat <- matrix(1:9, nrow = 3)
mat[1, 2] # First row, second column
Output:
[1] 4
7. Using `%in%`
Extracts elements that match specified values.
Example:
vec <- c(10, 20, 30, 40)
vec[vec %in% c(20, 30)]
Output:
[1] 20 30
8. Negative Indexing
Excludes specific elements by their positions.
Example:
vec <- c(10, 20, 30, 40)
vec[-2] # Exclude the second element
Output:
[1] 10 30 40
These operators make subsetting in R versatile and powerful.
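The operators above are shown on vectors, matrices, and data frames; for lists, the distinction between `[ ]` and `[[ ]]` also matters. The sketch below uses a made-up `student` list to show that `[ ]` returns a smaller list while `[[ ]]` and `$` return the element itself.

```r
student <- list(name = "John", age = 25, marks = c(85, 90))

student["marks"]        # a list of length 1 that contains the marks
student[["marks"]]      # the numeric vector 85 90
student$marks           # same as student[["marks"]]

class(student["marks"])     # "list"
class(student[["marks"]])   # "numeric"

# Data frames are lists of columns, so the same rules apply.
df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 15))
df["sales"]       # one-column data frame
df[["sales"]]     # numeric vector 12 15
```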
The provided table includes data for the number of units of various products sold on different days.
Here's how to address the questions:
a) Create Five Sample Numeric Vectors from the Data
From the table, each row corresponds to a product's sales over the days of the week. We can create a
numeric vector for each product:
# Numeric vectors for each product (Monday to Friday)
bread <- c(12, 3, 5, 20, 23) # Sales of Bread
milk <- c(11, 21, 27, 18, 9) # Sales of Milk
cola_cans <- c(20, 10, 1, 33, 6) # Sales of Cola Cans
chocolate_bars <- c(12, 6, 7, 4, 13) # Sales of Chocolate Bars
detergent <- c(5, 8, 12, 12, 15) # Sales of Detergent
# Printing the vectors
print(bread)
print(milk)
print(cola_cans)
print(chocolate_bars)
print(detergent)
Output:
[1] 12  3  5 20 23
[1] 11 21 27 18  9
[1] 20 10  1 33  6
[1] 12  6  7  4 13
[1]  5  8 12 12 15
b) Operators Used to Form Data Subsets in R
Here is an explanation of the operators with examples for subsetting:
1. `[ ]` Brackets
Used to extract specific elements by their index position.
Example:
vec <- c(1, 2, 3, 4, 5)
vec[1:3] # Extracts first three elements
Output:
[1] 1 2 3
2. `$` Operator
Used to access specific columns in a data frame by name.
Example:
df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 21))
df$sales # Access 'sales' column
Output:
[1] 12 21
3. `subset()` Function
Allows filtering rows based on conditions.
Example:
df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 21))
subset(df, sales > 15) # Rows where sales > 15
Output:
product sales
2 Milk 21
4. Logical Indexing
Selects elements based on logical conditions.
Example:
vec <- c(12, 21, 15, 18)
vec[vec > 15] # Extract elements greater than 15
Output:
[1] 21 18
5. Named Indexing
Subset elements using their names.
Example:
vec <- c(Monday = 12, Tuesday = 21, Wednesday = 15)
vec["Tuesday"] # Access value for 'Tuesday'
Output:
Tuesday
21
6. Negative Indexing
Excludes elements by their positions.
Example:
vec <- c(1, 2, 3, 4, 5)
vec[-c(1, 3)] # Exclude first and third elements
Output:
[1] 2 4 5
7. `%in%` Operator
Selects elements that match specified values.
Example:
vec <- c("Bread", "Milk", "Cola")
vec[vec %in% c("Milk", "Cola")] # Selects "Milk" and "Cola"
Output:
[1] "Milk" "Cola"
8. Multiple Conditions
Combine conditions using logical operators (`&`, `|`).
Example:
vec <- c(12, 21, 15, 18)
vec[vec > 15 & vec < 20] # Elements greater than 15 and less than 20
Output:
[1] 18
These operators provide robust and flexible options for filtering and extracting data in R.
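Logical indexing behaves differently when the data contains missing values, which is worth checking before combining conditions. The sketch below uses a made-up vector with one `NA`; `which()` and `is.na()` are base R functions.

```r
vec <- c(12, NA, 21, 18)

# A comparison with NA yields NA, and an NA index returns NA.
with_na <- vec[vec > 15]
with_na                        # NA 21 18

# which() keeps only the positions that are TRUE, dropping NA.
vec[which(vec > 15)]           # 21 18

# The same result using an explicit condition combined with &.
vec[!is.na(vec) & vec > 15]    # 21 18
```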
Q) Describe applications of data visualization.
Q) Consider the following data frame given below:
course id class marks
1      11 1     53
2      12 2     56
3      13 1     75
4      14 2     48
5      15 1     69
6      16 2     84
i. Create a subset of course less than 5 by using [ ] brackets and demonstrate
the output.
ii. Create a subset where the course column is less than 4 or the class equals
1 by using the subset() function and demonstrate the output.
Q)i. Write a script to create a dataset named data1 in R containing the following text.
Text: 2, 3, 4, 5, 6.7, 7, 8.1, 9
ii. Explain the various functions provided by R to combine different sets of data.
Q) Applications of Data Visualization
Data visualization has widespread applications across various domains. Some key applications include:
1. Business Intelligence
Helps companies analyze sales trends, customer behavior, and financial performance.
Example: Dashboards displaying monthly sales performance.
2. Scientific Research
Visualizes experimental results, trends, and patterns in scientific data.
Example: Scatter plots to analyze the correlation between variables.
3. Healthcare Analytics
Tracks patient health trends and identifies patterns in disease outbreaks.
Example: Heatmaps to display COVID-19 case density across regions.
4. Education
Assists in understanding complex concepts by presenting data visually.
Example: Using bar charts to illustrate student performance trends.
5. Social Media and Marketing
Analyzes audience engagement, campaign performance, and demographics.
Example: Pie charts to visualize the age distribution of social media users.
6. Government and Policy-Making
Helps policymakers understand societal issues and monitor programs.
Example: Maps to visualize unemployment rates across states.
7. Big Data Analytics
Simplifies the presentation of vast and complex datasets.
Example: Using interactive dashboards for web traffic analysis.
Q) Data Frame Subsetting
Data Frame
Given data:
Course ID Class Marks
1      11 1     53
2      12 2     56
3      13 1     75
4      14 2     48
5      15 1     69
6      16 2     84
i) Subset of `course` Less than 5 Using `[ ]`
# Create the data frame
data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id = c(11, 12, 13, 14, 15, 16),
  class = c(1, 2, 1, 2, 1, 2),
  marks = c(53, 56, 75, 48, 69, 84)
)
# Subset where course < 5
subset1 <- data[data$course < 5, ]
# Print the subset
print(subset1)
Output:
  course id class marks
1      1 11     1    53
2      2 12     2    56
3      3 13     1    75
4      4 14     2    48
ii) Subset Where `course` < 4 or `class` = 1 Using `subset()`
# Subset using conditions
subset2 <- subset(data, course < 4 | class == 1)
# Print the subset
print(subset2)
Output:
  course id class marks
1      1 11     1    53
2      2 12     2    56
3      3 13     1    75
5      5 15     1    69
Q) Create Dataset and Combine Data
i) Creating a Dataset
# Create a dataset named data1
data1 <- c(2, 3, 4, 5, 6.7, 7, 8.1, 9)
# Print the dataset
print(data1)
Output:
[1] 2.0 3.0 4.0 5.0 6.7 7.0 8.1 9.0
ii) Functions to Combine Datasets
R provides several functions to combine datasets:
1. `c()`:
Combines elements into a vector.
Example:
vec <- c(1, 2, 3, 4)
2. `rbind()`:
Combines datasets by rows.
Example:
df1 <- data.frame(a = c(1, 2), b = c(3, 4))
df2 <- data.frame(a = c(5, 6), b = c(7, 8))
combined <- rbind(df1, df2)
3. `cbind()`:
Combines datasets by columns.
Example:
vec1 <- c(1, 2, 3)
vec2 <- c(4, 5, 6)
combined <- cbind(vec1, vec2)
4. `merge()`:
Merges two data frames by matching values in common columns (or by row names).
Example:
df1 <- data.frame(ID = c(1, 2), Value = c(10, 20))
df2 <- data.frame(ID = c(2, 3), Value = c(30, 40))
merged <- merge(df1, df2, by = "ID")
5. `list()`:
Combines objects into a list.
Example:
lst <- list(vec1 = c(1, 2), vec2 = c(3, 4))
Each function has specific use cases depending on the structure and requirement of the data.
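To make the difference between these functions concrete, the sketch below reuses the `df1` and `df2` data frames from the `merge()` example and shows what each call returns. The result names `inner`, `outer`, and `stacked` are illustrative; `all = TRUE` is a standard `merge()` argument that requests a full outer join.

```r
df1 <- data.frame(ID = c(1, 2), Value = c(10, 20))
df2 <- data.frame(ID = c(2, 3), Value = c(30, 40))

# Default merge: keeps only IDs present in both data frames.
inner <- merge(df1, df2, by = "ID")
inner
#   ID Value.x Value.y
# 1  2      20      30

# all = TRUE keeps every ID and fills the gaps with NA.
outer <- merge(df1, df2, by = "ID", all = TRUE)
outer
#   ID Value.x Value.y
# 1  1      10      NA
# 2  2      20      30
# 3  3      NA      40

# rbind() stacks rows; both data frames must have the same column names.
stacked <- rbind(df1, df2)
nrow(stacked)   # 4
```

Because both inputs have a non-key column called `Value`, `merge()` disambiguates them with the `.x` and `.y` suffixes.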