23CS302 - dslab - experiment 1


23CS311 DATA SCIENCE LABORATORY L T P C

0 0 3 1.5

Objectives:
• To understand the Python libraries for data science
• To understand the basic statistical and probability measures for data science
• To learn descriptive analytics on the benchmark data sets
• To apply correlation and regression analytics on standard data sets
• To present and interpret data using visualization packages in Python

List of experiments:
1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and
Pandas packages.
2. Implementation of Python programs using NumPy arrays.
3. Implementation of Python programs using Pandas data frames.
4. Implementation of Python programs to perform descriptive analytics on the Iris data set by
reading it from text files, Excel files and the web.
5. Implementation of Python programs to perform the following analysis on the diabetes data set
from UCI and the Pima Indians Diabetes data set:
a. Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard
Deviation, Skewness and Kurtosis.
b. Bivariate analysis: Linear and logistic regression modelling
c. Multiple Regression analysis
d. Compare the results of the above analysis for the two data sets.
6. Implementation of Python programs to apply and explore various plotting functions on
UCI data sets.
a. Normal curves
b. Density and contour plots
c. Correlation and scatter plots
d. Histograms
e. Three dimensional plotting
7. Implementation of Python programs for visualizing Geographic Data with Basemap

Ex.No.1 Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and Pandas packages
Date:

Aim:
To download, install and explore the features of NumPy, SciPy, Jupyter,
Statsmodels and Pandas packages.

NumPy:
NumPy (Numerical Python) is a fundamental open-source library for numerical
computing in Python, providing support for large, multi-dimensional arrays and matrices,
along with a collection of mathematical functions to operate on these arrays.

SciPy:
SciPy is a Python library that extends NumPy's capabilities by providing additional
functions for scientific and technical computing. It includes modules for optimization,
integration, interpolation, eigenvalue problems, and more, making it a versatile tool for
complex mathematical and scientific tasks.

Jupyter:
Jupyter is an open-source web application that allows you to create and share
documents containing live code, equations, visualizations, and narrative text, facilitating
interactive computing and data analysis.

Statsmodels:
Statsmodels is a Python library for estimating and interpreting statistical models,
providing tools for regression analysis, hypothesis testing, and various statistical methods.

Pandas:
Pandas is a Python library for data manipulation and analysis, offering data structures
like DataFrames and Series to handle and analyze tabular data with ease.

Downloading and Installing NumPy:


1. Ensure Python is Installed:
• Before installing NumPy, make sure you have Python installed on your system.
2. Open Command Prompt or Terminal:
• On Windows, open Command Prompt (search for cmd) or PowerShell.
• On macOS or Linux, open the Terminal application.
3. Use pip to Install NumPy:
• pip is the package installer for Python. Run the following command in your
Command Prompt or Terminal to install NumPy:
    pip install numpy
4. Verify the Installation:
• After installation, verify that NumPy is installed correctly by opening a
Python interpreter and importing NumPy:
import numpy as np
print(np.__version__)
Sample Python program using NumPy:
import numpy as np
# Create a 1D array
array = np.array([1, 2, 3, 4, 5])
print("Original Array:")
print(array)
# Add 10 to each element in the array
result = array + 10
print("\nArray after adding 10 to each element:")
print(result)
Output:
Original Array:
[1 2 3 4 5]

Array after adding 10 to each element:
[11 12 13 14 15]
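
The NumPy description above also mentions multi-dimensional arrays and mathematical functions, which the 1D sample does not exercise. The following is a small illustrative sketch (the array values are made up for this example) showing a 2D array, reductions along an axis and an element-wise function:

```python
import numpy as np

# Build a 2 x 3 array (2 rows, 3 columns) from a nested list
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print("Shape:", matrix.shape)              # (rows, columns)
print("Column sums:", matrix.sum(axis=0))  # add down each column
print("Row means:", matrix.mean(axis=1))   # average across each row
print("Square roots:")
print(np.sqrt(matrix))                     # applied element by element
```

Here axis=0 runs down the rows (one result per column) and axis=1 runs across the columns (one result per row).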

Downloading and Installing SciPy:


Run the following command to install SciPy:
    pip install scipy

Verify the Installation:

• To ensure that SciPy has been installed correctly, check the version by
opening a Python interpreter and running the following commands:

import scipy
print(scipy.__version__)
Sample Program:
from scipy.optimize import minimize

# Function to minimize: f(x) = x^2 + 3x + 2
def quadratic_function(x):
    return x**2 + 3*x + 2

# minimize() passes x as a 1-element array, so index it to return a scalar
result = minimize(lambda x: quadratic_function(x[0]), x0=0)  # x0 is the initial guess
print("Optimization Result:")
print("Optimal value of x:", result.x[0])
print("Minimum value of the function:", result.fun)

Output (values are approximate; the exact digits printed depend on the SciPy version):
Optimization Result:
Optimal value of x: -1.5
Minimum value of the function: -0.25

Example 2: Integration - Integrate x^2 from 0 to 1

from scipy.integrate import quad

# Function to integrate: f(x) = x^2
def integrand(x):
    return x**2

# quad() returns the integral and an estimate of the absolute error
integral, error = quad(integrand, 0, 1)
print("\nIntegration Result:")
print("Integral of x^2 from 0 to 1:", integral)
print("Estimated error:", error)
Output (the last digits may differ slightly between systems):

Integration Result:
Integral of x^2 from 0 to 1: 0.33333333333333337
Estimated error: 3.700743415417189e-15
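
The SciPy description above also lists interpolation and eigenvalue problems. The following is a minimal sketch of both, using made-up sample data; make_interp_spline with k=1 is used here as one of several ways to do linear interpolation in SciPy:

```python
import numpy as np
from scipy.interpolate import make_interp_spline
from scipy.linalg import eigh

# Interpolation: estimate y at x = 1.5 from four samples of y = x^2
x = np.array([0.0, 1.0, 2.0, 3.0])
y = x ** 2
linear = make_interp_spline(x, y, k=1)  # k=1 gives piecewise-linear interpolation
estimate = float(linear(1.5))
print("Interpolated value at x = 1.5:", estimate)

# Eigenvalue problem: eigenvalues of a symmetric 2 x 2 matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = eigh(A)  # eigh is meant for symmetric matrices
print("Eigenvalues:", eigenvalues)
```

The linear estimate at x = 1.5 is 2.5, while the true value of x^2 there is 2.25; the gap illustrates the error of interpolating a curve with straight segments.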

Downloading and Installing Statsmodels:

1. Open a Terminal or Command Prompt:
• On Windows, you can open Command Prompt or PowerShell.
• On macOS or Linux, open the Terminal.

2. Run the Installation Command:
    pip install statsmodels

3. Using Conda (if you use Anaconda or Miniconda)
• Open Anaconda Prompt or Terminal.
• Run the Installation Command:
    conda install statsmodels

4. Verifying the Installation
• After installation, you can verify that Statsmodels is installed correctly by running the
following in a Python interpreter or script:

import statsmodels.api as sm
print(sm.__version__)

Downloading and Installing Pandas:

1. Open a Terminal or Command Prompt:
• On Windows, you can open Command Prompt or PowerShell.
• On macOS or Linux, open the Terminal.

2. Run the Installation Command:
    pip install pandas

3. Using Conda (if you use Anaconda or Miniconda)
• Open Anaconda Prompt or Terminal.
• Run the Installation Command:
    conda install pandas

4. Verifying the Installation
• After installing Pandas, you can verify that it is correctly installed by running the
following in a Python interpreter or script:
import pandas as pd
print(pd.__version__)
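
Sample Python program using Pandas:
To explore the DataFrame structure mentioned in the Pandas description, the following minimal sketch builds a small table, computes a column mean and filters rows. The names and marks are made-up illustrative data:

```python
import pandas as pd

# Made-up marks for four students (illustrative data only)
data = {
    "Name": ["Asha", "Bala", "Chitra", "Deepak"],
    "Marks": [82, 67, 91, 74],
}
df = pd.DataFrame(data)

print("Full table:")
print(df)

# A single column is a Series; mean() averages its values
print("Average marks:", df["Marks"].mean())

# Boolean indexing keeps only the rows where the condition holds
top = df[df["Marks"] > 80]
print("Students scoring above 80:")
print(top)
```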

Downloading and Installing Jupyter:

1. Open a Terminal or Command Prompt:
• On Windows, you can use Command Prompt or PowerShell.
• On macOS or Linux, open the Terminal.

2. Run the Installation Command for Jupyter Notebook:
    pip install notebook
• This installs Jupyter Notebook, which provides a web-based interface for interactive
computing.
• For JupyterLab, which is a more modern and flexible interface, use:
    pip install jupyterlab

3. Using Conda (if you use Anaconda or Miniconda)
• Open Anaconda Prompt or Terminal.
• Run the Installation Command for Jupyter Notebook:
    conda install notebook
• For JupyterLab, use:
    conda install jupyterlab

4. Running Jupyter
• After installation, you can start Jupyter Notebook or JupyterLab:
• For Jupyter Notebook:
    jupyter notebook
• For JupyterLab:
    jupyter lab

5. Verifying the Installation
• To check if Jupyter is installed correctly, you can run:
    jupyter --version
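
As an optional cross-check, all of the packages installed in this experiment can be verified from one Python script. The following is a minimal sketch using the standard-library importlib.metadata module (Python 3.8 or later); the helper name installed_versions and the PACKAGES list are chosen for this example only:

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as used with pip in the steps above
PACKAGES = ["numpy", "scipy", "pandas", "statsmodels", "notebook"]

def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # package is not installed in this environment
    return found

for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:12s} {ver if ver else 'not installed'}")
```

Any package reported as not installed can be installed with the corresponding pip or conda command given earlier.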

Result:

Thus the NumPy, SciPy, Jupyter, Statsmodels and Pandas packages are downloaded
and installed successfully.
