(Slides) - R As A Calculator and Vectors - Final PDF
INTRODUCTION
INSTRUCTED BY :
TUSHAR KAKAIYA
PROFESSOR OF PRACTICE
NARAYANA BUSINESS SCHOOL
The R Language
R is a language and environment for statistical computing and graphics.
R provides a wide variety of statistical techniques, such as:
linear and nonlinear modelling,
classical statistical tests,
time-series analysis,
classification, clustering, and many more.
It is highly extensible.
R has become one of the most popular tools for computational statistics, visualization and
data science.
Evolution of R Language
R was inspired by, and is mostly compatible with,
the statistical language S, developed at Bell
Laboratories (formerly AT&T, now Lucent
Technologies).
Although there are some very important
differences between R and S, much of
the code written for S runs unaltered on R.
R was initially written by Ross Ihaka and Robert
Gentleman at the Department of Statistics of the
University of Auckland in Auckland, New Zealand. R
made its first appearance in 1993.
A large group of individuals has contributed to R by
sending code and bug reports.
Since mid-1997 there has been a core group (the
"R Core Team") who can modify the R source code
archive.
Features of R
As stated earlier, R is a programming language and software environment for statistical analysis,
graphical representation and reporting.
The following are the important features of R:
R is a well-developed, simple and effective programming language which includes conditionals, loops,
user-defined recursive functions and input and output facilities.
R has an effective data handling and storage facility.
R provides a suite of operators for calculations on arrays, lists, vectors and matrices.
R provides a large, coherent and integrated collection of tools for data analysis.
R provides graphical facilities for data analysis and display, either directly on screen or printed on
paper.
Why should you learn R?
If you need to run statistical calculations in your application, learn and deploy R! It integrates
with programming languages such as Java, C++, Python and Ruby.
If you need to run your own analysis, think of R.
If you are working on an optimization problem, use R.
If you need reusable libraries to solve a complex problem, leverage the thousands of free
packages available for R.
If you wish to create compelling charts, leverage the power of R.
If you aspire to be a data scientist, learn R.
If you wish to have fun with statistics, learn R.
As of August 2021, R is one of the top five programming languages of the year, so it’s a favorite
among data analysts and research programmers.
Why should you learn R?
R is free. It is available in source code form under the terms of the Free Software Foundation’s
GNU General Public License.
It is available for Windows, macOS and a wide variety of Unix platforms (including FreeBSD and
Linux).
In addition to enabling statistical operations, it is a general programming language, so
you can automate your analyses and create new functions.
It has excellent tools for creating graphics, from bar charts to scatter plots to multi-panel
lattice charts.
It supports both object-oriented and functional programming.
It has the support of a robust, vibrant community.
Why should you learn R?
It has a flexible analysis tool kit: this makes it easy to access data in various formats,
manipulate it (transform, merge, aggregate, etc.), and subject it to traditional and modern
statistical models (such as regression, ANOVA, tree models etc.)
R can be extended easily via packages
R relates easily to other programming languages. Existing software as well as emerging
software can be integrated with R packages to make them more productive
R can easily import data from MS Excel, MS Access, MySQL, SQLite, Oracle, etc.
It can easily connect to databases using ODBC (Open Database Connectivity) and
the ROracle package.
Why should you learn R?
Advanced Statistics
Great Visualization
Easy extensibility
Cross-platform compatibility
Where is R being used?
Google
LinkedIn
Facebook
IBM
Bing
Mozilla
SAP
Oracle
New York Times
Airbnb
Microsoft
Uber
HP
Twitter
American Express
And many more…
Where is R being used?
Fintech companies (financial services)
Academic research
Government (FDA, National Weather Service)
Retail
Social Media
Data Journalism
Manufacturing
Healthcare
And Many More…
R Usage
In order of increasing complexity:
Use as a calculator.
Compute several statistics about data.
Plot data.
Develop machine learning algorithms.
R as a Calculator
R performs most of the mathematical calculations you can think of:
(10/10) / (5/5) = 1
exp(0) * 20 = 20
exp(sqrt(9)) = 20.08554
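The three calculations above can be typed directly at the R console. A minimal sketch follows; the variable names a, b and c3 are illustrative only and do not come from the slides.

```r
# Parentheses group terms; otherwise the usual arithmetic precedence applies.
a <- (10 / 10) / (5 / 5)   # 1 divided by 1
b <- exp(0) * 20           # exp(0) is 1
c3 <- exp(sqrt(9))         # same as exp(3)

print(a)    # 1
print(b)    # 20
print(c3)   # 20.08554 (R prints 7 significant digits by default)
```

Storing each result with <- is optional; typing the expression alone prints its value.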
R Objects
Most of the things we work with in R are objects.
Examples of R objects:
Vector;
Matrix;
List;
Data frame.
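As a rough illustration of the four object types listed above, the sketch below builds one of each with base R functions. The names v, m, l and df and the sample values are assumptions made for this example only.

```r
# A vector holds elements of a single type.
v <- c(3.4, 3.1, 3, 4.5)

# A matrix is a two-dimensional arrangement of elements of a single type.
m <- matrix(1:6, nrow = 2, ncol = 3)

# A list can hold elements of different types and lengths.
l <- list(name = "melon", weights = v)

# A data frame is a table whose columns are vectors of equal length.
df <- data.frame(id = 1:4, weight = v)

is.vector(v)       # TRUE
is.matrix(m)       # TRUE
is.list(l)         # TRUE
is.data.frame(df)  # TRUE
```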
R Vectors
“There was a problem when weighing, the melons have only half the weight”:
melons/2
“There was a problem when weighing, the melons have two times the weight”:
melons*2
We can also add a second vector of the same length, for example melons + c(0.4, 0.2, 0.4, 0.3).
Each melon will have its weight summed with the corresponding
element of the new vector, so the resulting vector will be:
(3.4+0.4, 3.1+0.2, 3+0.4, 4.5+0.3)
The cool thing is that we can make this more meaningful by giving the
c(0.4, 0.2, 0.4, 0.3) vector a related name, such as
adjust_weight, and then our calculation could be:
new_melons <- melons + adjust_weight
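The element-by-element behaviour described above can be checked with a short sketch. It uses the melons vector c(3.4, 3.1, 3, 4.5) from the slides; half_melons and double_melons are illustrative names that do not appear in the original.

```r
melons <- c(3.4, 3.1, 3, 4.5)            # melon weights used in the slides
adjust_weight <- c(0.4, 0.2, 0.4, 0.3)   # one correction per melon

half_melons <- melons / 2                # every element divided by 2
double_melons <- melons * 2              # every element multiplied by 2
new_melons <- melons + adjust_weight     # element-by-element sum

print(half_melons)     # 1.70 1.55 1.50 2.25
print(double_melons)   # 6.8 6.2 6.0 9.0
print(new_melons)      # 3.8 3.3 3.4 4.8
```

A single number such as 2 is applied to every element, while two vectors of equal length are combined position by position.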
Vector Operations
melons <- c(3.4, 3.1, 3, 4.5)
“The actual weight of each melon is the square root of the value we gave
you”
sqrt(melons)
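If a vector contains an infinite or a missing (NA) element, the result of a calculation is no
longer meaningful:
sum(c(3.4, 3.1, 3, 4.5/0)) would yield Inf
sum(c(3.4, 3.1, 3, NA)) would yield NA
Fortunately, most functions can skip missing values by using an extra argument na.rm =
TRUE
sum(c(3.4, 3.1, 3, NA), na.rm=TRUE) would yield 9.5
Note that na.rm = TRUE removes only NA and NaN values; it does not remove Inf.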
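The sketch below runs the vector operations from this slide and shows how sum() treats Inf and NA differently. The names with_inf, with_na and finite_total are assumptions made for this example, and filtering with is.finite() is one possible way to ignore infinite values, not something the slides prescribe.

```r
melons <- c(3.4, 3.1, 3, 4.5)
sqrt(melons)                          # square root of every element

with_inf <- c(3.4, 3.1, 3, 4.5 / 0)   # 4.5 / 0 evaluates to Inf
with_na <- c(3.4, 3.1, 3, NA)         # NA marks a missing value

sum(with_inf)                  # Inf
sum(with_inf, na.rm = TRUE)    # still Inf: na.rm drops only NA and NaN
sum(with_na)                   # NA
sum(with_na, na.rm = TRUE)     # 9.5

# Keep only the finite elements before summing to ignore Inf.
finite_total <- sum(with_inf[is.finite(with_inf)])
print(finite_total)            # 9.5
```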