Introduction To R


K.S. INSTITUTE OF TECHNOLOGY, BENGALURU - 560109


LABORATORY MANUAL 2023-2024 ODD SEMESTER

Degree : B. E Semester : III


Branch : COMPUTER SCIENCE & ENGINEERING Course Code : BCS358B
Course Title : R Programming

INTRODUCTION TO R
Aim: To understand the history of R, the basics of R programming, and the importance of R
for statistics.
Objective: The students will be able to
• Install and start R for implementation
• Understand the history, evolution, features, advantages and disadvantages, and applications of
R programming, and its comparison with Python.
Concepts / Theory:

History of R
• R is a language and environment for statistical computing and graphics. It is a GNU
project which is similar to the S language and environment which was developed at Bell
Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and
colleagues. R can be considered as a different implementation of S. There are some
important differences, but much code written for S runs unaltered under R.
• R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical
tests, time-series analysis, classification, clustering, …) and graphical techniques, and is
highly extensible. The S language is often the vehicle of choice for research in statistical
methodology, and R provides an Open-Source route to participation in that activity.
• One of R’s strengths is the ease with which well-designed publication-quality plots can be
produced, including mathematical symbols and formulae where needed. Great care has
been taken over the defaults for the minor design choices in graphics, but the user retains
full control.
• R is available as Free Software under the terms of the Free Software Foundation’s GNU
General Public License in source code form. It compiles and runs on a wide variety of
UNIX platforms and similar systems (including FreeBSD and Linux), Windows and
macOS.
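
The point above about plots with mathematical symbols can be illustrated with a minimal sketch. It assumes only base R graphics; the choice of the standard normal density and the variable names (x, y, title_expr) are illustrative and not part of this manual.

```r
# Sample the standard normal density on a regular grid (base R only).
x <- seq(-4, 4, length.out = 200)
y <- dnorm(x)

# A plotmath expression: R typesets it as a formula instead of plain text.
title_expr <- expression(phi(x) == frac(1, sqrt(2 * pi)) * e^{-x^2 / 2})

# Line plot with a Greek letter on the y-axis label and the formula as title.
plot(x, y, type = "l", lwd = 2,
     xlab = "x", ylab = expression(phi(x)),
     main = title_expr)
```

Every default used here (axis ranges, tick marks, margins) can be overridden through further arguments to plot(), which is the sense in which the user retains full control.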

The R environment
• R is an integrated suite of software facilities for data manipulation, calculation and
graphical display. It includes
1. an effective data handling and storage facility,
2. a suite of operators for calculations on arrays, in particular matrices,

Namyapriya Dayananda, Asst Professor, CSE, KSIT


3. a large, coherent, integrated collection of intermediate tools for data analysis,
4. graphical facilities for data analysis and display either on-screen or on hardcopy, and
5. a well-developed, simple and effective programming language which includes
conditionals, loops, user-defined recursive functions and input and output facilities.
• The term “environment” is intended to characterize it as a fully planned and coherent
system, rather than an incremental accretion of very specific and inflexible tools, as is
frequently the case with other data analysis software.
• R is an open-source programming language that is widely used as statistical software and as a
data analysis tool. R generally comes with a command-line interface and is available
across widely used platforms such as Windows, Linux, and macOS.
• It was designed by Ross Ihaka and Robert Gentleman at the University of Auckland,
New Zealand, and is currently developed by the R Development Core Team. The R
programming language is an implementation of the S programming language,
combined with lexical scoping semantics inspired by Scheme. The project was
conceived in 1992, with an initial version released in 1995 and a stable beta version in 2000.
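
The facilities listed in this section (operators on matrices, conditionals, loops, user-defined recursive functions) and the lexical scoping mentioned in the last point can be sketched in a few lines. This is an illustrative example only; the names m, factorial_rec, squares and make_counter are made up for this sketch.

```r
# Operators for calculations on matrices
m    <- matrix(1:4, nrow = 2)  # filled column-wise: rows are (1, 3) and (2, 4)
m_t  <- t(m)                   # transpose
m_sq <- m %*% m                # matrix product (not element-wise)

# A user-defined recursive function containing a conditional
factorial_rec <- function(n) {
  if (n <= 1) 1 else n * factorial_rec(n - 1)
}

# A loop that accumulates the squares of 1..5
squares <- numeric(0)
for (i in 1:5) {
  squares <- c(squares, i^2)
}

# Lexical scoping: the inner function keeps access to `count`, which lives
# in the environment where the inner function was created.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}
counter <- make_counter()
counter()  # first call
counter()  # second call; the count persists between calls

print(m_sq)
print(factorial_rec(5))
print(squares)
```

Each call to make_counter() creates a fresh environment, so two counters made this way would count independently.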

Why R Programming Language?

• R is a leading tool for machine learning, statistics, and data
analysis. Objects, functions, and packages can easily be created in R.
• It is a platform-independent language: it runs on all major operating
systems.
• It is free, open-source software, so anyone can install it in any
organization without purchasing a license.
• R is not only a statistics package; it can also be integrated with other
languages (C, C++), and it can interact with many
data sources and statistical packages.
• R has a vast and growing community of users.
• R is currently among the more sought-after programming languages in the data science
job market.
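
As a small illustration of the first point, the sketch below creates an object and a function. The data and the names student and average_marks are invented for this example.

```r
# An object: a named list holding one student's marks (made-up data)
student <- list(name = "Asha", marks = c(72, 85, 91))

# A function: takes such an object and returns the average mark
average_marks <- function(s) {
  mean(s$marks)
}

avg <- average_marks(student)
print(avg)
```

Functions like this one are the building blocks that an R package groups together with its documentation.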



Install R GUI and R Studio
R GUI
1. To install R, go to cran.r-project.org
2. Depending on your operating system, click Download R for (your operating system). The
remaining steps follow the Windows installer.
3. Click on install R for the first time.
4. Click Download R for Windows. Open the downloaded file.
5. Select the language you would like to use during the installation. Then click OK.
6. Click Next.
7. Select where you would like R to be installed. It will default to the Program Files folder on your
C drive. Click Next.
8. You can then choose which installation you would like.
9. (Optional) If your computer is 64-bit, you can choose the 64-bit User Installation. Then
click Next.
10. Then specify whether you want to customize your startup options or just use the defaults. Then click
Next.
11. Then choose the folder that you want R to be saved within, or keep the default R
folder that was created. Once you have finished, click Next.
12. You can then select additional shortcuts if you would like. Click Next.
13. Click Finish.
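
Once the installer has finished, a quick way to confirm that R works is to start it and type a few commands at the prompt. The sketch below uses only built-in objects and functions; the variable name result is arbitrary.

```r
# Version string of the running R installation
print(R.version.string)

# Directory where R is installed
print(R.home())

# A trivial calculation to confirm that the interpreter evaluates input
result <- 1 + 1
print(result)
```

If these commands print a version, a folder path and the number 2, the installation is usable.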
R Studio
1. To download RStudio, go to www.rstudio.com
2. Click Download RStudio.
3. Click Download under RStudio Desktop - Open-Source License.
4. Click on the installer for the operating system that you are working with.
5. The RStudio installation wizard will pop up. Click Next and go through the installation
steps.
Features of R Programming Language

R is a domain-specific programming language aimed at data analysis. It has some
unique features which make it very powerful, arguably the most important being the notion
of vectors. Vectors allow us to perform a complex operation on a set of values in a single
command. R programming has the following features:

• It is a simple and effective programming language which has been well developed.
• It is data analysis software.
• It is a well-designed, easy, and effective language which supports user-defined
functions, loops, conditionals, and various I/O facilities.
• It has a consistent and integrated set of tools for data analysis.

• R contains a suite of operators for different types of calculation on arrays, lists and
vectors.
• It provides an effective data handling and storage facility.
• It is open-source, powerful, and highly extensible software.
• It provides highly extensible graphical techniques.
• It allows us to perform multiple calculations using vectors.
• R is an interpreted language.
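
The vector feature described in this section can be seen in a short sketch. The numbers and variable names below are made up for illustration.

```r
# Two numeric vectors of equal length (illustrative values)
prices   <- c(10, 20, 30)
quantity <- c(2, 1, 4)

# One command operates on every element: element-wise multiplication
totals <- prices * quantity

# Aggregate over the whole vector
grand_total <- sum(totals)

# A single number is recycled across the vector
discounted <- prices * 0.9

# Logical indexing: keep only the prices above the mean price
above_avg <- prices[prices > mean(prices)]

print(totals)
print(grand_total)
print(discounted)
print(above_avg)
```

No explicit loop is needed for any of these steps, which is what "a complex operation on a set of values in a single command" means in practice.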

Advantages and Disadvantages of R


R is one of the most popular programming languages for statistical modeling and analysis. Like other
programming languages, R has both advantages and disadvantages. It is a continuously
evolving language, which means that many of its drawbacks may fade away with future updates to
R.

Applications of R:
• Data Science
R provides its users with a statistical computing environment suited to analysing statistical
information and offers a wide range of libraries used in statistics.
• Research
R is used to perform complex calculations. It is used as a statistical research tool for
techniques such as linear and non-linear modelling, classical statistical tests, and time-series
analysis.
• IT Industry
R is used as a business intelligence tool in IT and product-based companies, for example Infosys,
Google, IBM, Microsoft, and Paytm.
• Finance

R plays a significant role in making commercial and finance-related decisions. R’s data
visualisation and analysis tools are used to make candlestick charts, graphical studies,
financial data mining, etc.
• Healthcare
R is used to make data processing and data analysis simple. It is used to analyse genetic
sequences in fields like genetics and to research and test chemical reactions in areas like
drug development.
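
The linear modelling and classical statistical tests mentioned under Research can be tried with a minimal sketch. It assumes the mtcars data set that ships with R; the choice of variables (mpg, wt, am) is only for illustration.

```r
# Linear model: fuel efficiency (mpg) explained by car weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))   # intercept and slope of the fitted line

# Classical test: two-sample t-test comparing mpg between the two
# transmission types recorded in the `am` column
tt <- t.test(mpg ~ am, data = mtcars)
print(tt$p.value)
```

summary(fit) prints a fuller report for the model, including standard errors and R-squared.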

• We use R for data science. It gives us a broad variety of libraries related to statistics.
It also provides an environment for statistical computing and design.
• R is used by many quantitative analysts as a programming tool; it helps with data
importing and cleaning.
• R is widely used by data analysts and research programmers, and
it is used as a fundamental tool in finance.
• Tech giants like Google, Facebook, Bing, Twitter, Accenture, Wipro and many more
use R.
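
Data importing and cleaning, mentioned in the points above, can be sketched with base R alone. The CSV text below is made-up data standing in for a file on disk; the names csv_text, scores and clean are illustrative.

```r
# Made-up CSV content; with a real file, pass its path to read.csv() instead
csv_text <- "name,score
Asha,82
Ravi,NA
Meena,91"

# Import: parse the text into a data frame
scores <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Clean: drop the rows whose score is missing
clean <- scores[!is.na(scores$score), ]

print(nrow(clean))
print(mean(clean$score))
```

Dropping incomplete rows is only one possible cleaning strategy; replacing missing values is another, depending on the analysis.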
R in Comparison with Python Programming

Comparison Index: Overview
R: R is an interpreted computer programming language which was created by Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand. The R Development Core Team
currently develops R. R is also a software environment used for statistical analysis,
graphical representation, reporting, and data modeling.
Python: Python is an interpreted, high-level programming language used for general-purpose
programming. Guido van Rossum created it, and it was first released in 1991. Python has a
very simple and clean code syntax. It emphasizes code readability, and debugging is also
simple in Python.

Comparison Index: Specialties for data science
R: R packages have advanced techniques which are very useful for statistical work. The CRAN
Task Views catalogue many useful R packages, covering everything from Psychometrics to
Genetics to Finance.
Python: For finding outliers in a data set, R and Python are equally good. For developing a
web service that allows people to upload datasets and find outliers, Python is better.

Comparison Index: Functionalities
R: For data analysis, R has built-in functionalities.
Python: Most of the data analysis functionalities are not built in. They are available
through packages like NumPy and pandas.

Comparison Index: Key domains of application
R: Data visualization is a key aspect of analysis. R packages such as ggplot2, ggvis, and
lattice make data visualization easier.
Python: Python is better for deep learning because Python packages such as Caffe, Keras, and
OpenNN allow deep neural networks to be developed in a very simple way.

Comparison Index: Availability of packages
R: There are hundreds of packages and ways to accomplish the required data science tasks.
Python: Python has a few main packages, such as scikit-learn and pandas, for machine learning
and data analysis, respectively.
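
The comparison mentions finding outliers and R's built-in data analysis functionality. A minimal sketch of both, assuming base R only and using made-up measurements, is shown here; the boxplot rule used by boxplot.stats() is one common outlier definition among several.

```r
# Made-up measurements with one obviously extreme value
x <- c(12, 14, 13, 15, 14, 13, 98)

# Built-in summary statistics, no extra packages required
print(summary(x))

# boxplot.stats() flags points lying more than 1.5 times the
# hinge spread beyond the lower or upper hinge
stats    <- boxplot.stats(x)
outliers <- stats$out
print(outliers)
```

Here only the value 98 is flagged, because the remaining values lie within the boxplot whiskers.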

Conclusion: We were able to understand the R programming platform, its features, its
real-world applications, and its statistical approach to data visualization.
