Python Codes Test 2
import os
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
import scipy.stats as sps
import statsmodels.api as sm
import statsmodels.formula.api as smf
"""
Example:
Student's Roll Number = 012345
my_df = df.sample(n=2345, random_state=12345, ignore_index=True)
"""
A1. Identify the Nominal, Categorical, Ordinal & Continuous Variables in the Data.
# A2.
# Select non-categorical variables
non_categorical_vars = ['Age', 'Yearly_Average_Salary',
'Yearly_Average_Balance', 'Credit_Score', 'Tenure_Years', 'Customer_Id']
print(my_df[non_categorical_vars].describe())
Here, my_df[non_categorical_vars] selects only the columns corresponding to non-categorical variables from the DataFrame my_df.
The describe() function is then applied to this subset of the DataFrame.
The describe() function provides summary statistics of the selected variables, including count, mean, standard deviation, minimum, 25th
percentile (Q1), median (50th percentile or Q2), 75th percentile (Q3), and maximum.
# A3.
# Select two categorical variables
cat_var1 = 'Country'
cat_var2 = 'Gender'
print(my_df[cat_var1].value_counts())
print(my_df[cat_var2].value_counts())
These lines print the frequency distributions for the two categorical variables. The value_counts() function counts the
occurrences of each unique value in a column (for example, 'Country'), providing a frequency distribution.
B1. Calculate Maximum Value of the above Continuous Variable lying at the Bottom 10%.
# B1.
# Selecting the continuous variable 'Age'
continuous_variable = 'Age'
bottom_10_percent_value = my_df[continuous_variable].quantile(0.1)
print(f"Maximum value at the bottom 10% of {continuous_variable}: {bottom_10_percent_value}")
B1
The inference from the calculated maximum value at the bottom 10% of 'Age' (which is 28.0) is that within the lowest
10% of ages in your dataset, the oldest individual is 28 years old. This provides a specific data point that describes the
upper boundary of the age distribution for the bottom decile (10%) of your dataset.
B2. Calculate Minimum Value of the above Continuous Variable lying at the Top 10%.
# B2.
# Calculate Minimum Value of the Continuous Variable lying at the Top 10%
top_10_percent_value = my_df[continuous_variable].quantile(0.9)
print(f"Minimum value at the top 10% of {continuous_variable}: {top_10_percent_value}")
The quantile(0.9) function is used to calculate the value below which 90% of the data falls. In other words, it calculates the 90th percentile
of the 'Age' variable. This value represents the boundary above which the top 10% of the data points lie.
B2
The output indicates that the minimum value at the top 10% of the continuous variable 'Age' in your DataFrame (my_df) is 53.0. This
means that among the highest 10% of ages in your dataset, the youngest individual is 53 years old.
C1. Taking a Categorical Variable having 02 Categories and a Continuous Variable, Conduct a T-Test.
# C1.
# Extract 'Age' values for the two categories of 'Gender'
category_1 = my_df[my_df['Gender'] == 'Male']['Age']
category_2 = my_df[my_df['Gender'] == 'Female']['Age']
# Performing an independent two-sample t-test
t_statistic, p_value = sps.ttest_ind(category_1, category_2)
print(f"T-Test results - t-statistic: {t_statistic}, p-value: {p_value}")
These lines extract the 'Age' values for two categories of the 'Gender' variable. category_1 contains ages for males, and category_2
contains ages for females.
The ttest_ind function from the scipy.stats module is used to perform an independent two-sample t-test. It calculates the t-statistic and
the p-value associated with the test.
C1
The t-test results indicate that there is a statistically significant difference between the two groups (Male and Female) based on the
continuous variable 'Age'.
1. T-Statistic: The t-statistic is approximately -2.62. This value represents the difference between the means of the two groups in terms
of the number of standard deviations. A negative t-statistic suggests that, on average, the 'Male' group has a lower age than the
'Female' group.
2. P-Value: The p-value is approximately 0.0088. This p-value is below the commonly used significance level of 0.05. The p-value
represents the probability of observing such extreme results (or more extreme) under the assumption that the null hypothesis is true. In
this context, a p-value below 0.05 suggests that the observed difference in age between the 'Male' and 'Female' groups is unlikely to
have occurred by random chance alone. Therefore, you may reject the null hypothesis and conclude that there is a statistically
significant difference in age between the two gender groups.
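To check the direction of this difference directly, one can compare the group means; a minimal sketch, assuming my_df as above:
# Mean age per gender group (the sign of the t-statistic follows from these means)
print(my_df.groupby('Gender')['Age'].mean())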
C2. Taking a Categorical Variable having 03 or more Categories and a Continuous Variable, Conduct an ANOVA.
# C2.
import statsmodels.api as sm
from statsmodels.formula.api import ols
anova_model = ols('Age ~ C(Country)', data=my_df).fit()
anova_results = sm.stats.anova_lm(anova_model, typ=2)
print(anova_results)
This line specifies a linear model using Ordinary Least Squares (OLS) regression to predict the 'Age' variable based on the 'Country' variable.
The .fit() method fits the model to the data.
The sm.stats.anova_lm function is used to perform the ANOVA. The typ=2 argument specifies that Type II sums of squares are used for the ANOVA table.
Finally, the code prints the ANOVA results, which include various statistics such as the sum of squares, degrees of freedom, mean squares,
F-statistic, and p-value.
C2
The ANOVA results suggest that there is a statistically significant difference in the mean age across the categories of the 'Country'
variable. Here's a breakdown of the key components of the ANOVA table:
1. sum_sq (Sum of Squares): This column represents the sum of squared deviations from the mean. For 'Country', the sum of squares is
2095.841380.
2. df (Degrees of Freedom): The degrees of freedom associated with the 'Country' variable is 2. This is the number of categories in
'Country' minus 1.
3. F (F-statistic): The F-statistic is a ratio of the variance between group means to the variance within groups. In this case, the F-statistic
is 9.296834.
4. PR(>F) (p-value): The p-value associated with the F-statistic is 0.000093, which is much smaller than the commonly used significance
level of 0.05. This suggests that the difference in mean age across the categories of 'Country' is unlikely to be due to random chance
alone. Therefore, you may reject the null hypothesis and conclude that there is a statistically significant difference in the mean age
across different countries.
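C3. The question text and code for this part are not shown above; a minimal sketch consistent with the explanation below, assuming my_df and the two columns named there:
# C3.
correlation = my_df['Yearly_Average_Salary'].corr(my_df['Yearly_Average_Balance'])
print(f"Correlation between Yearly_Average_Salary and Yearly_Average_Balance: {correlation}")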
The .corr() method is used to calculate the correlation coefficient between 'Yearly_Average_Salary' and 'Yearly_Average_Balance'. The
result is stored in the variable correlation.
This line prints the computed correlation coefficient between the two variables.
A value of 1 indicates a perfect positive correlation (both variables increase or decrease together).
A value of -1 indicates a perfect negative correlation (one variable increases as the other decreases).
A value of 0 indicates no linear correlation.
The calculated correlation between 'Yearly_Average_Salary' and 'Yearly_Average_Balance' is approximately -0.0016. This correlation is
very close to zero, indicating a very weak or negligible linear relationship between the two continuous variables.
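A scatter plot makes this near-zero relationship visible; a minimal sketch using the seaborn and matplotlib imports at the top of the file:
# Visualize the (lack of) linear relationship between the two variables
sns.scatterplot(x='Yearly_Average_Salary', y='Yearly_Average_Balance', data=my_df)
plt.show()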
# C4.
# Assuming 'Credit_Score' is a continuous variable
statistic, p_value = sps.normaltest(my_df['Credit_Score'])
print(f"Normality Test results - statistic: {statistic}, p-value: {p_value}")
The normaltest function tests the null hypothesis that a sample comes from a normal distribution. It returns a test statistic and a p-value.
The statistic value is the test statistic, and the p_value is the probability of obtaining a statistic at least this extreme if the data really were normally distributed.
This line prints the results of the normality test, including the test statistic and the p-value.
Interpretation:
If the p-value is less than a chosen significance level (e.g., 0.05), you may reject the null hypothesis and conclude that the data is not
normally distributed.
If the p-value is greater than the significance level, you may fail to reject the null hypothesis, suggesting that there is no strong evidence
against the assumption of normality.
The normality test results indicate that the 'Credit_Score' variable does not follow a normal distribution. Here's an interpretation of the
results:
Statistic: The normality test statistic is 70.12. This statistic is used to assess how well the data follows a normal distribution. In this case,
the higher the statistic, the more evidence there is against normality.
P-Value: The p-value associated with the normality test is very close to zero (5.95e-16). A low p-value indicates strong evidence against
the null hypothesis that the data follows a normal distribution.
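A Q-Q plot is a common visual companion to this test; a minimal sketch, assuming the statsmodels and matplotlib imports above:
# Points far from the reference line indicate departures from normality
sm.qqplot(my_df['Credit_Score'], line='s')
plt.show()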
D1. Develop a Multiple Linear Regression Model (with at least 03 Continuous Variables as Input). Describe the Results. Predict the
Dependent Variable with artificial Inputs.
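Only the prediction step is shown below; a minimal sketch of the preceding steps, consistent with the explanation that follows (the random seed and the artificial input values are assumptions):
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
np.random.seed(42)  # assumed seed for reproducibility
df = pd.DataFrame({
'Dependent_Variable': np.random.rand(100),
'Continuous_Var1': np.random.randn(100),
'Continuous_Var2': np.random.randn(100),
'Continuous_Var3': np.random.randn(100),
})
mlr_model = ols('Dependent_Variable ~ Continuous_Var1 + Continuous_Var2 + Continuous_Var3', data=df).fit()
print(mlr_model.summary())
# Artificial inputs for prediction (values assumed for illustration)
new_data = pd.DataFrame({'Continuous_Var1': [1.5], 'Continuous_Var2': [0.8], 'Continuous_Var3': [-0.3]})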
mlr_prediction = mlr_model.predict(new_data)
print("\nMultiple Linear Regression Prediction:")
print(mlr_prediction)
This part creates a DataFrame 'df' with 100 rows, where 'Dependent_Variable' is a random variable, and 'Continuous_Var1',
'Continuous_Var2', and 'Continuous_Var3' are continuous variables with random values.
This line fits the MLR model using the ordinary least squares (OLS) method. The formula 'Dependent_Variable ~ Continuous_Var1 +
Continuous_Var2 + Continuous_Var3' specifies the relationship between the dependent variable and the three continuous predictor
variables.
This line prints a summary of the MLR model, including coefficients, standard errors, t-statistics, p-values, and other relevant statistics.
This part creates a new DataFrame 'new_data' with artificial values for the continuous predictor variables. The predict method is then used
to predict the dependent variable based on these artificial inputs.
D1
Model Fit: The R-squared value is low (0.025), indicating that the model does not explain a substantial proportion of the variability in
the dependent variable. It suggests that the current set of independent variables may not be strong predictors of the dependent
variable.
Variable Significance: The p-values for the individual coefficients suggest that none of the continuous variables ('Continuous_Var1,'
'Continuous_Var2,' 'Continuous_Var3') are statistically significant predictors of the dependent variable in this model.
Prediction: The predicted value for the new set of input values is provided, but the overall model's predictive power seems limited
based on the low R-squared value.
D2. Develop a Multiple Linear Regression Model (with at least 02 Continuous Variables & 01 Categorical Variable as Input). Describe the Results. Predict the Dependent Variable with artificial Inputs.
import statsmodels.api as sm
from statsmodels.formula.api import ols
import pandas as pd
import numpy as np
# Generate an example dataset with continuous and categorical variables
np.random.seed(42)
df = pd.DataFrame({
'Dependent_Variable': np.random.rand(100),
'Continuous_Var1': np.random.randn(100),
'Continuous_Var2': np.random.randn(100),
'Categorical_Var': np.random.choice(['A', 'B', 'C'], size=100),
})
# Model with at least two continuous variables and one categorical variable as input
mlr_model = ols('Dependent_Variable ~ Continuous_Var1 + Continuous_Var2 + C(Categorical_Var)', data=df).fit()
print(mlr_model.summary())
new_data = pd.DataFrame({
'Continuous_Var1': [1.5],
'Continuous_Var2': [0.8],
'Categorical_Var': ['B']
})
# With the formula interface, C(Categorical_Var) is dummy-coded automatically when predicting
mlr_prediction = mlr_model.predict(new_data)
print(mlr_prediction)
The code uses NumPy to generate random data for a dependent variable ('Dependent_Variable'), two continuous independent variables
('Continuous_Var1' and 'Continuous_Var2'), and a categorical variable ('Categorical_Var') with three categories (A, B, C).
The ols function from the statsmodels.formula.api module is used to specify and fit the MLR model.
The formula 'Dependent_Variable ~ Continuous_Var1 + Continuous_Var2 + C(Categorical_Var)' expresses the relationship
between the dependent variable and the independent variables. The C() notation indicates that 'Categorical_Var' is a
categorical variable.
The .fit() method fits the model to the provided dataset.
The summary() method is called on the fitted model to display detailed statistics and information about the regression results.
The summary includes coefficients, standard errors, t-statistics, p-values, R-squared, and other relevant statistics.
A new DataFrame (new_data) is created with artificial values for the continuous and categorical predictor variables.
Because the model was specified through the formula interface with C(Categorical_Var), the dummy coding of the categorical variable is
handled automatically when predict is called; one category is dropped as the reference level to avoid multicollinearity. The manual
equivalent would be pd.get_dummies() with prefix='Categorical_Var' and drop_first=True.
The predict method is used to make predictions for the dependent variable based on the artificial input values.
The predicted values for the dependent variable are printed based on the artificial input values.
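For reference, the manual encoding mentioned above would look like this; a sketch assuming the same df, not needed when the formula interface is used:
# Manual dummy coding of the categorical variable (reference level dropped)
dummies = pd.get_dummies(df['Categorical_Var'], prefix='Categorical_Var', drop_first=True)
X = pd.concat([df[['Continuous_Var1', 'Continuous_Var2']], dummies], axis=1)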
Interpretation
The output from the Multiple Linear Regression (MLR) model summary provides various statistics and information that can be interpreted
to understand the relationships between the variables in the model. Here's an interpretation of some key parts of the output:
1. **Coefficients:**
- **Intercept (0.5010):** The intercept represents the estimated value of the dependent variable when all independent variables are
zero.
- **Categorical_Var[T.B] (-0.0653):** This is the change in the dependent variable associated with being in category B compared to the
reference category (assumed to be A for dummy variable coding). In this case, it suggests a decrease of 0.0653 in the dependent variable
when 'Categorical_Var' is B.
- **Categorical_Var[T.C] (-0.0500):** Similar interpretation for category C compared to the reference category.
2. **Continuous Variables:**
- **Continuous_Var1 (-0.0510):** A one-unit increase in 'Continuous_Var1' is associated with a decrease of 0.0510 in the dependent
variable.
- **Continuous_Var2 (0.0041):** A one-unit increase in 'Continuous_Var2' is associated with an increase of 0.0041 in the dependent
variable.
3. **P-values:**
- **P>|t|:** These p-values assess the statistical significance of each coefficient. A p-value less than the chosen significance level (e.g.,
0.05) indicates that the variable is statistically significant.
4. **R-squared (0.031):**
- R-squared measures the proportion of the variance in the dependent variable explained by the model. In this case, only 3.1% of the
variance in the dependent variable is explained by the model.
5. **F-statistic (0.7722):**
- The F-statistic tests the overall significance of the model. In this case, the value is relatively low, suggesting that the model may not be statistically significant.
6. **Prob (F-statistic):**
- The p-value associated with the F-statistic tests the null hypothesis that all coefficients in the model are equal to zero. A low p-value indicates that at least one coefficient is significantly different from zero.
7. **Omnibus and Jarque-Bera tests:**
- These tests assess the normality of the residuals. Low p-values may indicate departures from normality.
8. **Durbin-Watson (1.982):**
- The Durbin-Watson statistic tests for autocorrelation in the residuals. A value around 2 suggests no significant autocorrelation.
9. **Notes:**
It's important to carefully interpret each coefficient, considering its statistical significance, and keep in mind that statistical significance
does not imply practical significance. The low R-squared and F-statistic suggest that the model may not be a good fit for the data. Further
model refinement or additional variables may be necessary for better predictions.
D3. Develop a Logistic Regression Model (with at least 02 Continuous Variables as Input & 01 Categorical Variable as Output). Describe
the Results. Predict the Dependent Variable with artificial Inputs.
To develop a Logistic Regression model with at least two continuous variables as input and one categorical variable as output, you can use
the `statsmodels` library in Python. Below is an example code that demonstrates how to create a Logistic Regression model, describe the
results, and predict the dependent variable with artificial inputs:
```python
import statsmodels.api as sm
import pandas as pd
import numpy as np

np.random.seed(42)
df = pd.DataFrame({
    'Continuous_Var1': np.random.randn(100),
    'Continuous_Var2': np.random.randn(100),
})
# Binary output variable (0 or 1); the name 'Binary_Outcome' is assumed for illustration
df['Binary_Outcome'] = np.random.randint(0, 2, size=100)

# Specify and fit the Logistic Regression model (a constant is added for the intercept)
X = sm.add_constant(df[['Continuous_Var1', 'Continuous_Var2']])
logit_model = sm.Logit(df['Binary_Outcome'], X).fit()
print(logit_model.summary())

# Predict with artificial inputs
new_data = pd.DataFrame({
    'Continuous_Var1': [1.5],
    'Continuous_Var2': [0.8],
})
new_data = sm.add_constant(new_data, has_constant='add')
logit_prediction = logit_model.predict(new_data)
print(logit_prediction)
```
- The `Logit` class from `statsmodels.api` is used to specify and fit the Logistic Regression model.
- `summary()` is called on the fitted model to display detailed statistics and information about the logistic regression results.
- A new DataFrame (`new_data`) is created with artificial values for the continuous variables.
- The predicted probabilities for the dependent variable are printed based on the artificial input values.
Note: Ensure that you have the required libraries installed (`statsmodels`, `pandas`, `numpy`) before running the code. Also, in logistic
regression, the dependent variable should be binary (0 or 1). Adjust the dataset and variable names as needed for your specific case.
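If a hard class label is needed rather than a probability, a common convention (an assumption here, not part of the original code) is to threshold the predicted probability at 0.5:

```python
# Convert the predicted probability into a 0/1 class label
predicted_class = (logit_prediction >= 0.5).astype(int)
print(predicted_class)
```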
Interpretation
The output of the Logistic Regression model summary provides information about the coefficients, odds ratios, and statistical significance.
Here's an interpretation of some key parts of the output:
1. **Coefficients:**
- **Intercept:** Represents the log-odds of the dependent variable being 1 when all independent variables are zero.
- **Continuous_Var1 and Continuous_Var2:** Represent the change in the log-odds of the dependent variable associated with a one-
unit increase in each respective continuous variable.
2. **Odds Ratios:**
- **Odds Ratio for Continuous_Var1:** It represents the multiplicative change in odds for a one-unit increase in Continuous_Var1.
- **Odds Ratio for Continuous_Var2:** It represents the multiplicative change in odds for a one-unit increase in Continuous_Var2.
3. **P-values:**
- Test the null hypothesis that each coefficient is equal to zero (no effect).
- Low p-values suggest that the corresponding independent variable is statistically significant.
4. **Log-Likelihood, AIC, and BIC:**
- Log-Likelihood measures how well the model explains the observed data.
- AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) are measures of model fit, considering complexity.
5. **Pseudo R-squared:**
- Provides a measure of model fit. In logistic regression, pseudo R-squared values are used as they are different from traditional R-
squared used in linear regression.
(Logit model summary table omitted here; it reported the coefficient, standard error, z-value, and p-value for the intercept, Continuous_Var1, and Continuous_Var2.)
Interpretation:
- The intercept (-0.0216) represents the log-odds of the dependent variable being 1 when both continuous variables are zero.
- The coefficient for Continuous_Var1 (-0.4842) indicates that a one-unit increase in Continuous_Var1 is associated with a decrease in the
log-odds of the dependent variable being 1. The odds ratio can be obtained by exponentiating this coefficient.
- The coefficient for Continuous_Var2 (0.2923) indicates that a one-unit increase in Continuous_Var2 is associated with an increase in the
log-odds of the dependent variable being 1. The odds ratio can be obtained by exponentiating this coefficient.
- Pseudo R-squared is approximately 0.055, suggesting a limited ability of the model to explain the variation in the dependent variable.
- The Wald test (z-values) tests the significance of each coefficient. For example, the p-value for Continuous_Var1 is 0.080, suggesting it
may not be statistically significant at a conventional significance level (e.g., 0.05).
Remember that the interpretation of coefficients and odds ratios depends on the context of your specific problem and the nature of the
variables involved.
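As noted above, odds ratios are obtained by exponentiating the fitted coefficients; a minimal sketch, assuming the logit_model fitted in the code above:

```python
import numpy as np

# Exponentiate the log-odds coefficients to obtain odds ratios
odds_ratios = np.exp(logit_model.params)
print(odds_ratios)
```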