Scikit-Learn Cheatsheet For Machine Learning

This document provides a cheat sheet summarizing key machine learning concepts in scikit-learn, including data preprocessing techniques, supervised and unsupervised learning algorithms, model evaluation metrics, and model tuning. It lists common classification and regression algorithms such as linear regression, support vector machines, and naive Bayes. It also covers preprocessing steps such as standardization, normalization, encoding, imputation, and dimensionality reduction with PCA. Model evaluation metrics include accuracy, the classification report, MSE, and the R2 score. Model tuning is demonstrated using GridSearchCV.


Scikit-learn CheatSheet

Visit KDnuggets.com for more cheatsheets and additional learning resources.

Scikit-learn is an open-source Python library for all kinds of predictive data analysis. You can perform classification, regression, clustering, dimensionality reduction, model tuning, and data preprocessing tasks.

Loading the Data

Classification

from sklearn import datasets
X, y = datasets.load_wine(return_X_y=True)

Regression

diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

Training And Test Data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0
)
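The split above uses the default test size. If you want an explicit test fraction or class-balanced splits, train_test_split also accepts test_size and stratify arguments (standard scikit-learn options, though the original cheat sheet does not show them):

# Hold out 20% of the data; stratify keeps class proportions similar
# in the train and test sets (only meaningful for classification).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)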

Preprocessing the Data

Standardization

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_X_train = scaler.fit_transform(X_train)
scaled_X_test = scaler.transform(X_test)

Normalization

from sklearn.preprocessing import Normalizer
norm = Normalizer()
norm_X_train = norm.fit_transform(X_train)
norm_X_test = norm.transform(X_test)
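A distinction that is easy to miss in the two snippets above: StandardScaler works per feature (column), giving each feature zero mean and unit variance on the training data, while Normalizer works per sample (row), rescaling each sample to unit norm. A tiny illustration with made-up data:

import numpy as np
from sklearn.preprocessing import StandardScaler, Normalizer

X_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# Column-wise: each feature ends up with zero mean and unit variance.
print(StandardScaler().fit_transform(X_demo))

# Row-wise: each sample is rescaled to unit (L2) norm.
print(Normalizer().fit_transform(X_demo))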
Binarization

from sklearn.preprocessing import Binarizer
binary = Binarizer(threshold=0.0)
binary_X = binary.fit_transform(X)

Encoding Categorical Features

from sklearn.preprocessing import LabelEncoder
lab_enc = LabelEncoder()
y = lab_enc.fit_transform(y)

Imputer

from sklearn.impute import SimpleImputer
imp_mean = SimpleImputer(missing_values=0, strategy='mean')
imp_mean.fit_transform(X_train)
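The imputer above treats 0 as the missing-value marker. In many datasets missing entries are NaN instead; the same imputer would then be configured as follows (a common variation, not shown in the original cheat sheet):

import numpy as np
from sklearn.impute import SimpleImputer

# Replace NaN entries with the column mean.
imp_mean = SimpleImputer(missing_values=np.nan, strategy='mean')
X_train_imputed = imp_mean.fit_transform(X_train)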
Supervised Learning

Linear Regression

from sklearn.linear_model import LinearRegression
lr = LinearRegression()

Support Vector Machines

from sklearn.svm import SVC
svm_svc = SVC(kernel='linear')

Naive Bayes

from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()

Unsupervised Learning

Principal Component Analysis

from sklearn.decomposition import PCA
pca = PCA(n_components=2)

K Means

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=5, random_state=0)

Model Fitting

Supervised Learning

lr.fit(X_train, y_train)
svm_svc.fit(X_train, y_train)

Unsupervised Learning

model = pca.fit_transform(X_train)
kmeans.fit(X_train)

Prediction

Supervised Learning

y_pred = lr.predict(X_test)
y_pred = svm_svc.predict(X_test)

Unsupervised Learning

y_pred = kmeans.predict(X_test)
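One loose end from the sections above: gnb (the GaussianNB model) is instantiated but never fitted or used for prediction in the cheat sheet. It follows the same estimator API as the other supervised models, so a minimal usage sketch is:

# Fit the naive Bayes classifier and predict on the test set.
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)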
Evaluation

Supervised Learning Model

lr.score(X_test, y_test)

Accuracy Score

from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)

Classification Report

from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))

Mean Squared Error

from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, y_pred)

R2 Score

from sklearn.metrics import r2_score
r2_score(y_test, y_pred)

Unsupervised Learning Model

Adjusted Rand Index

from sklearn.metrics import adjusted_rand_score
adjusted_rand_score(y_test, y_pred)

Cross-Validation

from sklearn.model_selection import cross_val_score
cross_val_score(svm_svc, X, y, cv=5, scoring='f1_macro')

Model Tuning

from sklearn.model_selection import GridSearchCV
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
model = GridSearchCV(svm_svc, parameters)
model.fit(X_train, y_train)
print(model.best_score_)
print(model.best_estimator_)
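Putting the pieces together, here is a minimal end-to-end sketch that only stitches together the snippets from this cheat sheet (wine data, train/test split, standardization, a linear SVC, evaluation, and a grid search); exact scores will depend on your scikit-learn version:

from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# Load a classification dataset and split it.
X, y = datasets.load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize features: fit the scaler on the training set only,
# then apply the same transform to the test set.
scaler = StandardScaler()
scaled_X_train = scaler.fit_transform(X_train)
scaled_X_test = scaler.transform(X_test)

# Train and evaluate a linear SVM classifier.
svm_svc = SVC(kernel='linear')
svm_svc.fit(scaled_X_train, y_train)
y_pred = svm_svc.predict(scaled_X_test)
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

# Tune the SVM with a small grid search on the training data.
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
model = GridSearchCV(svm_svc, parameters)
model.fit(scaled_X_train, y_train)
print(model.best_score_)
print(model.best_estimator_)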

Subscribe to KDnuggets News.

Abid Ali Awan | 2022
