Feature Selection Techniques – Embedded Method (Lasso)
300320202027 Embedded methods are iterative in the sense that they take care of each iteration of the model training process and carefully extract those features which […]
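Embedded selection can be illustrated with an L1-penalised model such as Lasso; the sketch below is not the code from the post, and the scikit-learn diabetes toy dataset and the alpha value are illustrative assumptions.

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_scaled = StandardScaler().fit_transform(X)          # scale so the penalty treats features comparably

lasso = Lasso(alpha=0.1).fit(X_scaled, y)             # the L1 penalty shrinks weak coefficients to exactly zero
selected = X.columns[np.abs(lasso.coef_) > 1e-6]      # features whose coefficients survive are "selected"
print(selected.tolist())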
300320201719 It is a greedy optimization algorithm which aims to find the best performing feature subset. It repeatedly creates models and keeps aside the best […]
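One common implementation of this greedy wrapper idea is recursive feature elimination; the sketch below is an assumed illustration using scikit-learn's RFE, with an arbitrary estimator and feature count rather than the post's own choices.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
rfe = RFE(estimator=LogisticRegression(max_iter=5000), n_features_to_select=10)
rfe.fit(X, y)                                          # repeatedly refits, dropping the weakest feature each round
print(X.columns[rfe.support_].tolist())                # the best-performing subset found by the search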
300320201313 In backward elimination, we start with all the features and remove the least significant feature at each iteration, which improves the performance of the […]
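A minimal sketch of backward elimination using scikit-learn's SequentialFeatureSelector with direction='backward' (available from scikit-learn 0.24); the dataset, estimator and target subset size are illustrative assumptions, not taken from the post.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
sfs = SequentialFeatureSelector(LogisticRegression(max_iter=5000),
                                n_features_to_select=10, direction='backward')
sfs.fit(X, y)                                          # drops the least useful feature at each iteration
print(X.columns[sfs.get_support()].tolist())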
300320201248 Forward selection is an iterative method in which we start with no features in the model. In each iteration, we add the feature that […]
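The forward variant of the same sequential search, again only a sketch under the same assumptions (scikit-learn >= 0.24, illustrative data and estimator):

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
sfs = SequentialFeatureSelector(LogisticRegression(max_iter=5000),
                                n_features_to_select=5, direction='forward')
sfs.fit(X, y)                                          # starts empty and adds the most useful feature each round
print(X.columns[sfs.get_support()].tolist())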
290320202006 Collinearity is the state where two variables are highly correlated and contain similar information about the variance within a given dataset. The Variance Inflation […]
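A minimal sketch of computing the Variance Inflation Factor with statsmodels; the three-column toy DataFrame is made up for illustration and is not the post's data.

import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

X = pd.DataFrame({'x1': [1, 2, 3, 4, 5],
                  'x2': [2, 4, 6, 8, 11],              # almost collinear with x1
                  'x3': [5, 3, 8, 1, 2]})
Xc = add_constant(X)                                   # VIF is normally computed with an intercept term
vif = pd.Series([variance_inflation_factor(Xc.values, i) for i in range(Xc.shape[1])],
                index=Xc.columns)
print(vif)                                             # values above roughly 5-10 are a common collinearity warning sign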
290320201454
In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
import warnings […]
280320200940 Source of data: https://archive.ics.uci.edu/ml/datasets/Air+Quality In this case, statistical methods are used: We always have continuous and discrete variables in the data set. This procedure […]
Feel free to read the code on GitHub.
Data source: https://archive.ics.uci.edu/ml/datasets/Air+Quality
In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import […]
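As one example of such a statistical method (an assumption here, not necessarily the exact test used in the post), continuous inputs can be scored against a continuous target with f_regression; the synthetic data below stands in for the Air Quality set.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression

X, y = make_regression(n_samples=200, n_features=8, n_informative=3, random_state=0)
selector = SelectKBest(score_func=f_regression, k=3).fit(X, y)
print(np.round(selector.scores_, 1))                   # F-statistic per feature
print(selector.get_support(indices=True))              # indices of the k best-scoring features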
categorical input – categorical output 260320201223 In this case, statistical methods are used: We always have continuous and discrete variables in the data set. This […]
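For the categorical input / categorical output case, one widely used statistical test is chi-square; the sketch below (toy data and an assumed ordinal encoding, not the post's code) uses SelectKBest with chi2.

import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({'colour': ['red', 'blue', 'red', 'green', 'blue', 'red'],
                   'size':   ['S', 'M', 'L', 'S', 'M', 'L'],
                   'target': ['yes', 'no', 'yes', 'no', 'no', 'yes']})
X = OrdinalEncoder().fit_transform(df[['colour', 'size']])   # chi2 requires non-negative numeric codes
y = df['target']

selector = SelectKBest(score_func=chi2, k=1).fit(X, y)
print(dict(zip(['colour', 'size'], selector.scores_)))       # higher score = stronger association with the target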
Part 1: Determining the depth of trees using visualization
230320201052
In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as […]
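A sketch of what determining tree depth by visualization might look like: plot train and test accuracy against max_depth and read off where the curves diverge. The dataset and depth range are illustrative assumptions, not the notebook's own.

import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

depths = range(1, 15)
train_acc, test_acc = [], []
for d in depths:
    tree = DecisionTreeClassifier(max_depth=d, random_state=0).fit(X_tr, y_tr)
    train_acc.append(tree.score(X_tr, y_tr))
    test_acc.append(tree.score(X_te, y_te))

plt.plot(depths, train_acc, label='train')
plt.plot(depths, test_acc, label='test')
plt.xlabel('max_depth'); plt.ylabel('accuracy'); plt.legend(); plt.show()
# a reasonable depth is where the test curve flattens or starts to drop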
230320200907 Principal component analysis (PCA)
https://jakevdp.github.io/PythonDataScienceHandbook/05.08-random-forests.html
https://www.geeksforgeeks.org/principal-component-analysis-with-python/
In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
df = […]
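A minimal PCA sketch (illustrative dataset, component count and classifier, not the notebook's code): reduce the features and check whether the downstream cross-validated score changes compared with the same pipeline without PCA.

from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
pipe = make_pipeline(StandardScaler(), PCA(n_components=5), LogisticRegression(max_iter=5000))
print(cross_val_score(pipe, X, y, cv=5).mean())        # compare against the pipeline without the PCA step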
200320200904 In this case, the method did not improve the model. However, there are models in which the PCA method is a very important reason […]
200320200724
In [1]:
import pandas as pd
df = pd.read_csv('/home/wojciech/Pulpit/1/kaggletrain.csv')
df = df.dropna(how='any')
df.dtypes
Out[1]:
Unnamed: 0     int64
PassengerId    int64
Survived       int64
Pclass         int64
Name […]
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kruskal.html
Unless you have a large sample size and can clearly demonstrate that your data are normal, you should routinely use Kruskal–Wallis; they think it […]
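A small scipy.stats.kruskal example on made-up groups, just to show the call referenced above; the three samples are illustrative, not real data.

from scipy import stats

group_a = [2.9, 3.0, 2.5, 2.6, 3.2]
group_b = [3.8, 2.7, 4.0, 2.4]
group_c = [2.8, 3.4, 3.7, 2.2, 2.0]

stat, p = stats.kruskal(group_a, group_b, group_c)     # H-test on the ranks, no normality assumption
print(f'H = {stat:.3f}, p = {p:.3f}')                  # a small p suggests at least one group median differs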
In [1]:
import time
start_time = time.time()   ## time measurement: start the timer
print(time.ctime())
Mon Mar 9 09:36:05 2020
In [2]:
import torch
import torch.nn as […]