Interpretation of SHAP charts for the Titanic case (Feature Selection Techniques)

April 9, 2020

090420202257 https://github.com/slundberg/shap In [1]: import pandas as pd df = pd.read_csv('/home/wojciech/Pulpit/1/tit_train.csv', na_values="-1") df.head(2) Out[1]: Unnamed: 0 PassengerId Survived Pclass Name Sex Age SibSp Parch […]

Homemade loop to search for the best functions for the regression model (Feature Selection Techniques)

April 9, 2020

090420201150 In [1]: import pandas as pd df = pd.read_csv('/home/wojciech/Pulpit/1/tit_train.csv', na_values="-1") df.head(2) Out[1]: Unnamed: 0 PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare […]
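The excerpt above only shows the data loading. As a rough sketch of the idea behind the post's title — a homemade loop that exhaustively searches feature subsets for the best regression fit — the following uses scikit-learn on synthetic data (the dataset and all names here are illustrative stand-ins, not the post's own code):

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression data: 6 features, only 3 of which carry signal
X, y = make_regression(n_samples=200, n_features=6, n_informative=3, random_state=0)

best_score, best_subset = -np.inf, None
for k in range(1, X.shape[1] + 1):
    for subset in combinations(range(X.shape[1]), k):
        # Cross-validated R^2 for this candidate subset
        score = cross_val_score(LinearRegression(), X[:, list(subset)], y, cv=3).mean()
        if score > best_score:
            best_score, best_subset = score, subset

print(best_subset, round(best_score, 3))
```

Exhaustive search is only feasible for a handful of features (2^n subsets); the posts below cover greedy alternatives that scale better.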

How to calculate the probability of surviving the Titanic catastrophe_080420201050

April 9, 2020

080420201050 practical use: predict_proba In [1]: import numpy as np import pandas as pd from sklearn.model_selection import train_test_split from sklearn.ensemble import RandomForestClassifier In [2]: from catboost.datasets import […]
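The core of the post is `predict_proba`: instead of a hard class label, the classifier returns a per-class probability. A minimal sketch with scikit-learn on synthetic data (the Titanic CSV and CatBoost dataset from the excerpt are replaced here by `make_classification`):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# One row per sample, one column per class; column 1 = P(class 1), e.g. "survived"
proba = model.predict_proba(X_test)
print(proba[:3])
```

Each row of `proba` sums to 1, so thresholding column 1 at values other than 0.5 lets you trade precision against recall.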


CatBoost Step 1. CatBoostClassifier (cat_features)

April 3, 2020

030420200928 In [1]: ## colorful prints def black(text): print('\033[30m', text, '\033[0m', sep='') def red(text): print('\033[31m', text, '\033[0m', sep='') def green(text): print('\033[32m', text, '\033[0m', sep='') def […]

Feature Selection Techniques [categorical result] – Step Forward Selection

April 1, 2020

010420201017 Forward selection is an iterative method in which we start with no features in the model. In each iteration, we add the feature that […]
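As a sketch of step-forward selection for a categorical target — here via scikit-learn's `SequentialFeatureSelector` on synthetic data, which may differ from the library the post itself uses:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# 8 candidate features, 3 informative
X, y = make_classification(n_samples=200, n_features=8, n_informative=3, random_state=0)

# Start empty; greedily add the feature that most improves CV accuracy
sfs = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                n_features_to_select=3, direction='forward', cv=3)
sfs.fit(X, y)
print(sfs.get_support())   # boolean mask of the selected features
```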

Feature Selection Techniques – Recursive Feature Elimination and cross-validated selection (RFECV)

March 30, 2020

300320202100 RFECV differs from Recursive Feature Elimination (RFE) in the feature selection process in that it indicates the OPTIMAL NUMBER OF VARIABLES and not the […]
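A minimal RFECV sketch on synthetic data: unlike plain RFE, you do not fix the number of features in advance — cross-validation picks it (estimator and data here are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, n_informative=3, random_state=0)

# Recursively drop the weakest feature; CV decides where to stop
selector = RFECV(LogisticRegression(max_iter=1000), step=1, cv=5)
selector.fit(X, y)
print(selector.n_features_, selector.support_)   # optimal count + mask
```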

Feature Selection Techniques – Embedded Method (Lasso)

March 30, 2020

300320202027 Embedded methods are iterative in the sense that they take care of each iteration of the model training process and carefully extract the features that […]
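With Lasso, selection is embedded in training: the L1 penalty drives the coefficients of unhelpful features exactly to zero. A small sketch on synthetic data (a stand-in, not the post's dataset):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# LassoCV picks the regularization strength alpha by cross-validation
lasso = LassoCV(cv=5, random_state=0).fit(X, y)

# Features whose coefficients survived the L1 penalty
selected = np.where(np.abs(lasso.coef_) > 1e-6)[0]
print(selected)
```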

Feature Selection Techniques – Recursive Feature Elimination (RFE)

March 30, 2020

300320201719 It is a greedy optimization algorithm that aims to find the best-performing feature subset. It repeatedly creates models and keeps aside the best […]
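A minimal RFE sketch on synthetic data: the estimator is fitted repeatedly, and the weakest feature (by importance) is pruned each round until the requested number remains (estimator choice here is an assumption):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = make_classification(n_samples=200, n_features=8, n_informative=3, random_state=0)

# Prune one feature per round until 3 remain
rfe = RFE(RandomForestClassifier(random_state=0), n_features_to_select=3)
rfe.fit(X, y)
print(rfe.support_)    # selected features
print(rfe.ranking_)    # 1 = kept; higher = eliminated earlier
```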

Feature Selection Techniques – Backward Elimination

March 30, 2020

300320201313 In backward elimination, we start with all the features and remove the least significant feature at each iteration, which improves the performance of the […]
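A rough sketch of the backward-elimination loop, using scikit-learn's univariate F-test p-values as a simple stand-in for a full model's significance tests (the post itself may use a different significance measure):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import f_regression

X, y = make_regression(n_samples=200, n_features=6, n_informative=2,
                       noise=10.0, random_state=0)

features = list(range(X.shape[1]))   # start with ALL features
while features:
    # p-value of each remaining feature (univariate F-test as a stand-in)
    _, pvals = f_regression(X[:, features], y)
    worst = int(np.argmax(pvals))
    if pvals[worst] <= 0.05:         # everything left is significant: stop
        break
    features.pop(worst)              # drop the least significant feature

print(features)
```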

Feature Selection Techniques [numerical result] – Step Forward Selection

March 30, 2020

300320201248 Forward selection is an iterative method in which we start with no features in the model. In each iteration, we add the feature that […]
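The same step-forward idea for a numerical target simply swaps in a regressor — a sketch with scikit-learn's `SequentialFeatureSelector` (which may differ from the library used in the post), scored by cross-validated R²:

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=8, n_informative=3,
                       noise=5.0, random_state=0)

# Greedily grow the subset from empty, scoring each candidate by CV R^2
sfs_reg = SequentialFeatureSelector(LinearRegression(), n_features_to_select=3,
                                    direction='forward', cv=3)
sfs_reg.fit(X, y)
print(sfs_reg.get_support())
```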

Feature Selection Techniques – Variance Inflation Factor (VIF)

March 29, 2020

290320202006 Collinearity is the state where two variables are highly correlated and contain similar information about the variance within a given dataset. The Variance Inflation […]
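The VIF of feature i is 1/(1 − R²ᵢ), where R²ᵢ comes from regressing feature i on all the others; values well above ~5–10 flag collinearity. A self-contained sketch on synthetic data (the post may compute this via statsmodels instead):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 * 0.95 + rng.normal(scale=0.1, size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)                          # independent
X = np.column_stack([x1, x2, x3])

def vif(X, i):
    # VIF_i = 1 / (1 - R_i^2), regressing feature i on the remaining features
    others = np.delete(X, i, axis=1)
    r2 = LinearRegression().fit(others, X[:, i]).score(others, X[:, i])
    return 1.0 / (1.0 - r2)

vifs = [vif(X, i) for i in range(X.shape[1])]
print([round(v, 1) for v in vifs])   # x1 and x2 large, x3 near 1
```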

Feature Selection Techniques – Pearson correlation

March 29, 2020

290320201454 In [1]: import numpy as np import pandas as pd import seaborn as sns import matplotlib.pyplot as plt from sklearn.preprocessing import LabelEncoder, OneHotEncoder import warnings […]
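As a filter method, Pearson correlation ranks each feature by its linear correlation with the target; a simple threshold on |r| then does the selection. A sketch with pandas on synthetic data (columns and threshold are illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({'a': rng.normal(size=100)})
df['b'] = df['a'] * 2 + rng.normal(scale=0.1, size=100)   # strongly tied to 'a'
df['c'] = rng.normal(size=100)                            # pure noise
df['target'] = df['a'] + rng.normal(scale=0.5, size=100)

# Pearson correlation of every feature with the target
corr = df.corr()['target'].drop('target')
selected = corr[corr.abs() > 0.5].index.tolist()
print(corr.round(2))
print(selected)
```

Note that 'a' and 'b' are both selected even though they duplicate each other — a known weakness of univariate filters, which the VIF post above addresses.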

Feature Selection Techniques (by filter methods): numerical input, categorical output

March 28, 2020

280320200940 Source of data: https://archive.ics.uci.edu/ml/datasets/Air+Quality In this case, statistical methods are used. We always have continuous and discrete variables in the data set. This procedure […]
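For numerical inputs with a categorical output, the standard filter statistic is the ANOVA F-test. A sketch with scikit-learn's `SelectKBest` on synthetic data (the Air Quality dataset from the post is not used here):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=200, n_features=8, n_informative=3, random_state=0)

# f_classif = ANOVA F-test: numeric input, categorical output
selector = SelectKBest(score_func=f_classif, k=4)
X_new = selector.fit_transform(X, y)
print(X_new.shape, selector.get_support())
```

For categorical inputs one would switch `score_func` (e.g. to `chi2`); the choice of statistic follows the input/output variable types.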
