060420202152
https://github.com/catboost/tutorials/blob/master/python_tutorial.ipynb
CatBoostClassifier sam koduje sobie zmienne tekstowe kategoryczne na zmienne kategoryczne wyrażone numerycznie. Jeżeli sami przeprowadzimy codowanie i zakodujemy zmienne kategoryczne na format cyfrowy, wyniki naszych modeli będą takie same (przynajmniej takie jest moje doświadczenie). Aby przeprowadzić eksperyment i przetestować model CatBoostClassifier bez wskazania na zmienne kategoryczne (cat_features) oraz ze wskazaniem na zmienne musimy sami zakodoać tekstowe zmienne kategoryczne na format cyfrowy. W przeciwnym razie gdy będziemy mieli zmienne tekstowe a nie wskarzemy CatBoostClassifier że to zmienne kategoryczne, wyskoczy nam błąd.
## colorful prints
def black(text):
print('33[30m', text, '33[0m', sep='')
def red(text):
print('33[31m', text, '33[0m', sep='')
def green(text):
print('33[32m', text, '33[0m', sep='')
def yellow(text):
print('33[33m', text, '33[0m', sep='')
def blue(text):
print('33[34m', text, '33[0m', sep='')
def magenta(text):
print('33[35m', text, '33[0m', sep='')
def cyan(text):
print('33[36m', text, '33[0m', sep='')
def gray(text):
print('33[90m', text, '33[0m', sep='')
1.2 Załadowanie danych¶
inny sposób na załadowanie tych samych danych o Tytaniku.
from catboost.datasets import titanic
import numpy as np
import pandas as pd
train_df, test_df = titanic()
train_df.head()
Sprawdzam kompletność zbioru¶
metoda pokazuje tylko te zmienna, w których brakuje danych.
null_value_stats = train_df.isnull().sum(axis=0)
null_value_stats[null_value_stats != 0]
W miejcu gdzie były puste rekordy wstawiana jest wartość -777¶
train_df.fillna(-777, inplace=True)
train_df.fillna(-777, inplace=True)
Dzielimy na zmienne opisujące i wynikowe¶
X = train_df.drop('Survived', axis=1)
y = train_df.Survived
Szukamy zmiennych kategorycznych¶
Zostały wybrane takie kolumny jako kolumny zmiennych kategorycznych.
print(X.dtypes)
categorical_features_indices = np.where(X.dtypes != np.float)[0]
categorical_features_indices
Wyświetlamy co to za kolumny¶
PPS = categorical_features_indices
KOT_MIC = dict(zip(train_df, PPS))
KOT_sorted_keys_MIC = sorted(KOT_MIC, key=KOT_MIC.get, reverse=True)
for r in KOT_sorted_keys_MIC:
print (r, KOT_MIC[r])
Można też użyć mojego sposobu na identyfikację zmiennych kategorycznych. Tutaj mamy nazwiska i kabiny więc ten sposób idetyfikacji zmiennych kategorycznych nie będzie właściwy.
import numpy as np
categorical_fuX = np.where(train_df.nunique() <8) [0]
categorical_fuX
Dzielimy zbiór na zbiory treningowe i testowe¶
from sklearn.model_selection import train_test_split
X_train, X_validation, y_train, y_validation = train_test_split(X, y, train_size=0.75, random_state=42)
X_train.head(3)
Poziom zbilansowania zbioru wynikowego¶
y_train.value_counts(dropna = False, normalize=True).plot(kind='pie')
2.1 Szkolenie modelowe¶
Teraz stwórzmy sam model: poszlibyśmy tutaj z parametrami domyślnymi (ponieważ zapewniają one naprawdę dobrą linię bazową prawie przez cały czas), jedyną rzeczą, którą chcielibyśmy tutaj określić, jest parametr custom_loss, ponieważ dałoby to nam możliwość zobaczenia co się dzieje pod względem tego wskaźnika konkurencji – dokładności, a także możliwości obserwowania utraty logów, ponieważ byłoby to bardziej płynne w przypadku zestawu danych o takim rozmiarze.
- custom_loss metryka użyta podczas szkolenia, wybrane: [„accuracy”] https://catboost.ai/docs/search/?query=%27Accuracy%27
- random_seed = 42 Losowe nasiona użyte do treningu. Te losowe wartości są za każdym razem takie same.
- logging_level = ‘Silent’ Poziom logowania, aby przejść do standardowego wyjścia. „Cichy” – nie wysyłaj żadnych danych logowania na standardowe wyjście. „Verbose” – wyślij następujące dane na standardowe wyjście, a następnie pokaże w modelu. Dopasuj całą ścieżkę uczenia się. „Informacje” lub „Debugowanie” – wyświetlanie dodatkowych informacji i liczby drzew.
from catboost import CatBoostClassifier, Pool, cv
from sklearn.metrics import accuracy_score
Zdefiniowanie modelu bez deklarowania zmiennych kategorycznych ¶
Optymalizacja pod kontem powierzchni AUC.
model = CatBoostClassifier(
custom_loss=['Accuracy'],
random_seed=42,
logging_level='Silent'
)
model.fit(
X_train, y_train,
cat_features=categorical_features_indices,
eval_set=(X_validation, y_validation),
# logging_level='Verbose', # you can uncomment this for text output
plot=True
);
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘‘ +
‘‘ +
‘‘ +
‘
‘‘ +
‘‘ +
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
Jak widać, można zobaczyć, jak nasz model uczy się na podstawie pełnych wyników lub ładnych wykresów (osobiście zdecydowanie wybrałbym drugą opcję – po prostu sprawdź te wykresy: możesz na przykład powiększyć obszary zainteresowania!)
Dzięki temu możemy zobaczyć, że najlepsza wartość dokładności 0,8340 (na zestawie walidacyjnym) została osiągnięta na 157 etapie wzmocnienia.
Żeby to zobaczyć trzeba kliknąć na Accuracy i stanąć myszą na linii ciągłej (oznaczającej zmienne testowe) nie linii przerywanej(dane treningowe)Wartość accurace wysokości 0.834 osiąga u mnie przy 451 petli. To miejsce gdzie jest kropka!
Co to jest loglost?¶
Jeśli tylko przewidujesz prawdopodobieństwo dla klasy dodatniej, to funkcję utraty logarytmicznej można obliczyć dla jednej prognozy klasyfikacji binarnej ( yhat ) w porównaniu do oczekiwanego prawdopodobieństwa ( y ) w następujący sposób:
LogLoss = – ((1 – y) log (1 – yhat) + y log (yhat))¶
Obliczamy predykcję modelu¶
yhatA = model.predict(X_validation)
print(yhatA[:12])
y_validation[4]
y_train[12]
Sprawdzenie tego modelu klasyfikacji¶
# Classification Assessment
def Classification_Assessment(model ,Xtrain, ytrain, Xtest, ytest, y_pred):
import numpy as np
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.metrics import confusion_matrix, log_loss, auc, roc_curve, roc_auc_score, recall_score, precision_recall_curve
from sklearn.metrics import make_scorer, precision_score, fbeta_score, f1_score, classification_report
def green(text):
print('33[32m', text, '33[0m', sep='')
def blue(text):
print('33[34m', text, '33[0m', sep='')
print("Recall Training data: ", np.round(recall_score(ytrain, model.predict(Xtrain)), decimals=4))
print("Precision Training data: ", np.round(precision_score(ytrain, model.predict(Xtrain)), decimals=4))
print("----------------------------------------------------------------------")
print("Recall Test data: ", np.round(recall_score(ytest, model.predict(Xtest)), decimals=4))
print("Precision Test data: ", np.round(precision_score(ytest, model.predict(Xtest)), decimals=4))
print("----------------------------------------------------------------------")
print("Confusion Matrix Test data")
print(confusion_matrix(ytest, model.predict(Xtest)))
print("----------------------------------------------------------------------")
green('Valuation for test data only:')
print(classification_report(ytest, model.predict(Xtest)))
green('Valuation for test data only:')
y_pred_proba = model.predict_proba(Xtest)[::,1]
fpr, tpr, _ = metrics.roc_curve(ytest, y_pred)
auc = metrics.roc_auc_score(ytest, y_pred)
plt.plot(fpr, tpr, label='ROC (roc_auc = %0.2f)' % auc)
plt.xlabel('False Positive Rate',color='grey', fontsize = 13)
plt.ylabel('True Positive Rate',color='grey', fontsize = 13)
plt.title('Receiver operating characteristic')
plt.legend(loc="lower right")
plt.legend(loc=4)
plt.plot([0, 1], [0, 1],'r--')
plt.show()
print('roc_auc %.3f' % auc)
blue('---------------------')
AUC_train_1 = metrics.roc_auc_score(ytrain,model.predict_proba(Xtrain)[:,1])
blue('AUC_train: %.3f' % AUC_train_1)
AUC_test_1 = metrics.roc_auc_score(ytest,model.predict_proba(Xtest)[:,1])
blue('AUC_test: %.3f' % AUC_test_1)
blue('---------------------')
print("Accuracy Training data: ", np.round(accuracy_score(ytrain, model.predict(Xtrain)), decimals=4))
green("----------------------------------------------------------------------")
print("Accuracy Test data: ", np.round(accuracy_score(ytest, model.predict(Xtest)), decimals=4))
green("----------------------------------------------------------------------")
## colorful prints
def black(text):
print('33[30m', text, '33[0m', sep='')
def red(text):
print('33[31m', text, '33[0m', sep='')
def green(text):
print('33[32m', text, '33[0m', sep='')
def yellow(text):
print('33[33m', text, '33[0m', sep='')
def blue(text):
print('33[34m', text, '33[0m', sep='')
def magenta(text):
print('33[35m', text, '33[0m', sep='')
def cyan(text):
print('33[36m', text, '33[0m', sep='')
def gray(text):
print('33[90m', text, '33[0m', sep='')
blue(X_train.shape)
green(y_train.shape)
blue(X_validation.shape)
green(y_validation.shape)
Classification_Assessment(model,X_train, y_train, X_validation, y_validation, yhatA)
2.2 Walidacja krzyżowa modelu¶
Dobrze jest zweryfikować swój model, ale zweryfikować go – nawet lepiej. A także z działkami! Bez słów:
Pokazuje parametry modelu ¶
cv_params = model.get_params()
cv_params
Dodaje jeszcze jeden parametr do moedlu ¶
cv_params.update({'loss_function': 'Logloss'})
Nie wiem co to jest ¶
cv_data = cv(
Pool(X, y, cat_features=categorical_features_indices),
cv_params,
plot=True)
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘
‘‘ +
‘
‘
‘‘ +
‘
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
to nie działa¶
yhatCV = cv_data.predict(X_validation)
Teraz mamy wartości naszych funkcji strat na każdym etapie wzmocnienia uśrednione 3-krotnie, co powinno zapewnić nam dokładniejsze oszacowanie wydajności naszego modelu:
cv_data.head(2)
print('Best validation accuracy score: {:.2f}±{:.2f} on step {}'.format(
np.max(cv_data['test-Accuracy-mean']),
cv_data['test-Accuracy-std'][np.argmax(cv_data['test-Accuracy-mean'])],
np.argmax(cv_data['test-Accuracy-mean'])
))
np.max(cv_data['test-Accuracy-mean'])
np.argmax(cv_data['test-Accuracy-mean'])
print('Precise validation accuracy score: {}'.format(np.max(cv_data['test-Accuracy-mean'])))
Jak widzimy, nasze wstępne oszacowanie wydajności przy pojedynczym foldowaniu sprawdzania poprawności było zbyt optymistyczne – dlatego tak ważna jest krzyżowa weryfikacja!
odpuszczam – nie rozumiem tej sekcji¶
2.3 Stosowanie modelu¶
Wszystko, co musisz zrobić, aby uzyskać prognozy, to
predictions = model.predict(X_validation)
predictions[:15]
predictions_probs = model.predict_proba(X_validation)
predictions_probs[:5]
Ale spróbujmy uzyskać lepsze prognozy, a funkcje Catboost nam w tym pomogą.
Być może zauważyłeś, że na etapie tworzenia modelu podałem nie tylko parametr custom_loss, ale także parametr random_seed. Zostało to zrobione, aby ten notatnik był odtwarzalny – domyślnie catboost wybiera losową wartość dla seed:
model_without_seed = CatBoostClassifier(iterations=10, logging_level='Silent')
model_without_seed.fit(X, y, cat_features=categorical_features_indices)
print('Random seed assigned for this model: {}'.format(model_without_seed.random_seed_))
Zdefiniujmy niektóre parametry i utwórz Pool dla większej wygody. Pool Przechowuje wszystkie informacje o zbiorze danych (cechy, etykiety, wskaźniki cech jakościowych, wagi i wiele innych).
To taki zbiornik z parametrami modelu
params = {
'iterations': 500,
'learning_rate': 0.1,
'eval_metric': 'Accuracy',
'random_seed': 42,
'logging_level': 'Silent',
'use_best_model': False
}
Pool dla zmiennych treningowych – to nie model, to taki zbiornik
train_pool = Pool(X_train, y_train, cat_features=categorical_features_indices)
train_pool
Pool dla zmiennych testowych – to nie model, to taki zbiornik
validate_pool = Pool(X_validation, y_validation, cat_features=categorical_features_indices)
validate_pool
3.1 Korzystanie z najlepszego modelu¶
Jeśli zasadniczo masz zestaw sprawdzania poprawności, zawsze lepiej jest używać parametru use_best_model podczas treningu. Domyślnie ten parametr jest włączony. Jeśli jest włączony, wynikowy zestaw drzew zmniejsza się do najlepszej iteracji.
## -------linijka jak wywołać najlepsze parametry modelu ---------------------
model = CatBoostClassifier(**params)
model.fit(train_pool, eval_set=validate_pool)
best_model_params = params.copy()
best_model_params.update({ ## <- tutaj model wkłada 'use_best_model'
'use_best_model': True ## to nie są lepsze parametry tylko ten jeden nowy parametr
})
### ----------------------------------------------------------------------------
best_model = CatBoostClassifier(**best_model_params)
best_model.fit(train_pool, eval_set=validate_pool);
print('Simple model validation accuracy: {:.4}'.format(
accuracy_score(y_validation, model.predict(X_validation))
))
print('')
print('Best model validation accuracy: {:.4}'.format(
accuracy_score(y_validation, best_model.predict(X_validation))
))
Rozbieram powyższe na czynniki pierwsze¶
params
wyświetlam stare parametry modelu
params
wyświetlam nowe parametry modelu
best_model_params
best_model = CatBoostClassifier(**best_model_params)
best_model.fit(train_pool, eval_set=validate_pool,plot=True )
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘
‘‘ +
‘
‘
‘‘ +
‘
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
print('ZWYKŁY MODEL - Na zbiorze testowym accurace: ', accuracy_score(y_validation, model.predict(X_validation)))
print('NAJLEPSZY MODEL - Na zbiorze testowym accurace: ', accuracy_score(y_validation, best_model.predict(X_validation)))
SPRAWDZAM jaki jest ten najlepszy model najlepszy¶
y_bestPred = best_model.predict(X_validation)
Classification_Assessment(best_model,X_train, y_train, X_validation, y_validation, y_bestPred)
3.2 Wczesne zatrzymanie¶
Jeśli zasadniczo masz zestaw sprawdzania poprawności, zawsze łatwiej i lepiej jest skorzystać z wczesnego zatrzymania. Ta funkcja jest podobna do poprzedniej, ale oprócz poprawy jakości wciąż oszczędza czas.
Czas robienia modelu bez ‘earlystop’
%%time
model = CatBoostClassifier(**params)
model.fit(train_pool, eval_set=validate_pool)
Czas robienia modelu z ‘earlystop’
%%time
earlystop_params = params.copy() #<-- tradycyjne dodawanie parametrów
earlystop_params.update({
'od_type': 'Iter',
'od_wait': 40
})
earlystop_model = CatBoostClassifier(**earlystop_params)
earlystop_model.fit(train_pool, eval_set=validate_pool)
Nowe parametry ‘earlystop’:
earlystop_params
print('Simple model tree count: {}'.format(model.tree_count_))
print('Simple model validation accuracy: {:.4}'.format(
accuracy_score(y_validation, model.predict(X_validation))
))
print('')
print('Early-stopped model tree count: {}'.format(earlystop_model.tree_count_))
print('Early-stopped model validation accuracy: {:.4}'.format(
accuracy_score(y_validation, earlystop_model.predict(X_validation))
))
Dzięki temu uzyskujemy lepszą jakość w krótszym czasie.
Chociaż, jak pokazano wcześniej, prosty schemat sprawdzania poprawności nie opisuje dokładnie wyniku poza zmiennymi treningowymi (może być tendencyjny z powodu podziału zestawu danych), nadal dobrze jest śledzić dynamikę ulepszeń modelu – a zatem, jak widać z tego przykładu, jest to naprawdę dobrze jest wcześniej zatrzymać proces wzmacniania (zanim rozpocznie się nadmierne dopasowanie)
Rozkładam to na czynniki pierwsze¶
Czyli model kończy się na najlepszym uzyskanym wyniku ‘accuracy’
earlystop_model.fit(train_pool, eval_set=validate_pool, plot=True)
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘
‘‘ +
‘
‘
‘‘ +
‘
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
y_earlystop = earlystop_model.predict(X_validation)
Classification_Assessment(earlystop_model,X_train, y_train, X_validation, y_validation, y_earlystop)
3.3 Korzystanie z linii bazowej¶
Możliwe jest wykorzystanie wyników przedtreningowych (wyjściowych) do treningu.
Nie wiem po co to jest – daje słabe wyniki itd
current_params = params.copy()
current_params.update({
'iterations': 10
})
model = CatBoostClassifier(**current_params).fit(X_train, y_train, categorical_features_indices)
# Get baseline (only with prediction_type='RawFormulaVal')
baseline = model.predict(X_train, prediction_type='RawFormulaVal')
# Fit new model
model.fit(X_train, y_train, categorical_features_indices, baseline=baseline)
Znowu zmieniam parapetry dodaje jakiś parametr:’iterations’: 10
model_cp = CatBoostClassifier(**current_params)
model_cp = model_cp.fit(X_train, y_train, categorical_features_indices)
Uzyskaj linię bazową (tylko z prediction_type = ‘RawFormulaVal’)
baseline = model_cp.predict(X_train, prediction_type='RawFormulaVal')
Fit new model
model_cp.fit(X_train, y_train, categorical_features_indices, baseline=baseline)
y_pred_cp = model_cp.predict(X_validation)
Classification_Assessment(model_cp,X_train, y_train, X_validation, y_validation, y_pred_cp)
3.4 Obsługa migawek¶
Catboost obsługuje migawki. Możesz go użyć do odzyskania treningu po przerwie lub do rozpoczęcia treningu z wcześniejszymi wynikami.
params_with_snapshot = params.copy() #<-- tradycyjnie dodajemy nowe parametry
params_with_snapshot.update({
'iterations': 5,
'learning_rate': 0.5,
'logging_level': 'Verbose'
})
model = CatBoostClassifier(**params_with_snapshot).fit(train_pool, eval_set=validate_pool, save_snapshot=True)
params_with_snapshot.update({ #<-- zmieniamy ustawienia migawki
'iterations': 10,
'learning_rate': 0.1,
})
model = CatBoostClassifier(**params_with_snapshot).fit(train_pool, eval_set=validate_pool, save_snapshot=True)
3.5 Funkcja celu zdefiniowana przez użytkownika¶
Możliwe jest stworzenie własnej funkcji celu. Utwórzmy funkcję celu logloss.
przybliżenia, cele, wagi są indeksowanymi pojemnikami pływaków
(pojemniki, które mają zdefiniowane tylko len i getitem).
parametrem wag może być Brak.
Aby zrozumieć, co oznaczają te parametry, załóż, że istnieje podzbiór zestawu danych, który jest obecnie przetwarzany. Program przybliża zawiera bieżące prognozy dla tego podzbioru, cele zawierają wartości docelowe podane w zestawie danych.
Ta funkcja powinna zwrócić listę par (der1, der2), gdzie
der1 jest pierwszą pochodną funkcji straty w odniesieniu do
do przewidywanej wartości, a der2 jest drugą pochodną.
W naszym przypadku logloss jest definiowany za pomocą następującej formuły:
cel log (sigmoid (w przybliżeniu)) + (1 – cel) (1 – sigmoid (w przybliżeniu))
gdzie sigmoid (x) = 1 / (1 + e ^ (- x)).
class LoglossObjective(object):
def calc_ders_range(self, approxes, targets, weights):
# approxes, targets, weights are indexed containers of floats
# (containers which have only __len__ and __getitem__ defined).
# weights parameter can be None.
#
# To understand what these parameters mean, assume that there is
# a subset of your dataset that is currently being processed.
# approxes contains current predictions for this subset,
# targets contains target values you provided with the dataset.
#
# This function should return a list of pairs (der1, der2), where
# der1 is the first derivative of the loss function with respect
# to the predicted value, and der2 is the second derivative.
#
# In our case, logloss is defined by the following formula:
# target * log(sigmoid(approx)) + (1 - target) * (1 - sigmoid(approx))
# where sigmoid(x) = 1 / (1 + e^(-x)).
assert len(approxes) == len(targets)
if weights is not None:
assert len(weights) == len(approxes)
result = []
for index in range(len(targets)):
e = np.exp(approxes[index])
p = e / (1 + e)
der1 = (1 - p) if targets[index] > 0.0 else -p
der2 = -p * (1 - p)
if weights is not None:
der1 *= weights[index]
der2 *= weights[index]
result.append((der1, der2))
return result
model = CatBoostClassifier(
iterations=10,
random_seed=42,
loss_function=LoglossObjective(), ##<-- dodajemy własnoręcznie wymyśloną funkcję
eval_metric="Logloss"
)
# Fit model
model.fit(train_pool)
# Only prediction_type='RawFormulaVal' is allowed with custom `loss_function`
preds_raw = model.predict(X_validation, prediction_type='RawFormulaVal')
3.6 Funkcja metryczna zdefiniowana przez użytkownika¶
Możliwe jest również utworzenie własnej funkcji metrycznej. Utwórzmy funkcję metryczną logloss.
class LoglossMetric(object):
def get_final_error(self, error, weight):
return error / (weight + 1e-38)
def is_max_optimal(self):
return False
def evaluate(self, approxes, target, weight):
# approxes is a list of indexed containers
# (containers with only __len__ and __getitem__ defined),
# one container per approx dimension.
# Each container contains floats.
# weight is a one dimensional indexed container.
# target is float.
# weight parameter can be None.
# Returns pair (error, weights sum)
assert len(approxes) == 1
assert len(target) == len(approxes[0])
approx = approxes[0]
error_sum = 0.0
weight_sum = 0.0
for i in range(len(approx)):
w = 1.0 if weight is None else weight[i]
weight_sum += w
error_sum += -w * (target[i] * approx[i] - np.log(1 + np.exp(approx[i])))
return error_sum, weight_sum
model = CatBoostClassifier(
iterations=10,
random_seed=42,
loss_function="Logloss",
eval_metric=LoglossMetric()
)
# Fit model
model.fit(train_pool)
# Only prediction_type='RawFormulaVal' is allowed with custom `loss_function`
preds_raw = model.predict(X_validation, prediction_type='RawFormulaVal')
3.7 Przewidywane etapy¶
Model CatBoost ma metodę staged_predict. Pozwala iteracyjnie uzyskać prognozy dla danego zakresu drzew.
model_kot = CatBoostClassifier(iterations=10, random_seed=42, logging_level='Silent').fit(train_pool)
model_kot
określami ilość drzew
ntree_start, ntree_end, eval_period = 3, 9, 2
predictions_iterator
predictions_iterator = model.staged_predict(validate_pool, 'Probability', ntree_start, ntree_end, eval_period)
nie wiem co on teraz robi
for preds, tree_count in zip(predictions_iterator, range(ntree_start, ntree_end, eval_period)):
print('First class probabilities using the first {} trees: {}'.format(tree_count, preds[:5, 1]))
3.8 Najważniejsze cechy¶
Czasami bardzo ważne jest, aby zrozumieć, która funkcja miała największy wpływ na końcowy wynik. Aby to zrobić, model CatBoost ma metodę get_feature_importance.
Tworzy się taki model szkieletowy jak w poprzedniej metodzie
model = CatBoostClassifier(iterations=50, random_seed=42, logging_level='Silent').fit(train_pool)
Mówi się modelowi aby: ‘get_feature_importance’
feature_importances = model.get_feature_importance(train_pool)
feature_names = X_train.columns
feature_names
for score, name in sorted(zip(feature_importances, feature_names), reverse=True):
print('{}: {}'.format(name, score))
To pokazuje, że funkcje Sex i Pclass miały największy wpływ na wynik.
co ciekawe wcale zadeklarowałem w ‘train_pool’, w które zmienne są dyskretne a model sam je sobie zcyfryzował
X_train['Ticket']
3.9 Wskaźniki oceny¶¶
CatBoost ma metodę eval_metrics, która pozwala obliczyć dane metryki dla danego zestawu danych. I oczywiście je narysować 🙂
model = CatBoostClassifier(iterations=50, random_seed=42, logging_level='Silent').fit(train_pool)
eval_metrics = model.eval_metrics(validate_pool, ['AUC'], plot=True)
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘
‘‘ +
‘
‘
‘‘ +
‘
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
Można testować więcej wskaźników: https://catboost.ai/docs/search/?query=%27Accuracy%27
model = CatBoostClassifier(iterations=50, random_seed=42, logging_level='Silent').fit(train_pool)
eval_metrics = model.eval_metrics(validate_pool, ['Recall'], plot=True)
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘
‘‘ +
‘
‘
‘‘ +
‘
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
model = CatBoostClassifier(iterations=50, random_seed=42, logging_level='Silent').fit(train_pool)
eval_metrics = model.eval_metrics(validate_pool, ['Accuracy'], plot=True)
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘
‘‘ +
‘
‘
‘‘ +
‘
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
print(eval_metrics['Accuracy'][:16])
3.10 Porównanie procesów uczenia się¶
Możesz także porównać proces uczenia się różnych modeli na jednym wykresie.
model1 = CatBoostClassifier(iterations=1000, depth=1, train_dir='model_depth_1/', logging_level='Silent')
model1.fit(train_pool, eval_set=validate_pool)
model2 = CatBoostClassifier(iterations=1000, depth=5, train_dir='model_depth_5/', logging_level='Silent')
model2.fit(train_pool, eval_set=validate_pool);
from catboost import MetricVisualizer
widget = MetricVisualizer(['model_depth_1', 'model_depth_5'])
widget.start()
‘;
}
this.layout = $(‘
‘
‘
‘‘ +
‘
‘‘ +
‘
‘
‘ +
‘
‘
‘ +
‘
‘‘ +
‘
‘‘ +
‘
‘
‘‘ +
‘
‘‘ +
‘‘ +
‘
‘ +
cvAreaControls +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
‘ +
‘
‘);
$(parent).append(this.layout);
this.addTabEvents();
this.addControlEvents();
};
CatboostIpython.prototype.addTabEvents = function() {
var self = this;
$(‘.catboost-graph__tabs’, this.layout).click(function(e) {
if (!$(e.target).is(‘.catboost-graph__tab:not(.catboost-graph__tab_active)’)) {
return;
}
var id = $(e.target).attr(‘tabid’);
self.activeTab = id;
$(‘.catboost-graph__tab_active’, self.layout).removeClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart_active’, self.layout).removeClass(‘catboost-graph__chart_active’);
$(‘.catboost-graph__tab[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__tab_active’);
$(‘.catboost-graph__chart[tabid=”‘ + id + ‘”]’, self.layout).addClass(‘catboost-graph__chart_active’);
self.cleanSeries();
self.redrawActiveChart();
self.resizeCharts();
});
};
CatboostIpython.prototype.addControlEvents = function() {
var self = this;
$(‘#catboost-control-learn’ + this.index, this.layout).click(function() {
self.layoutDisabled.learn = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_learn_disabled’, self.layoutDisabled.learn);
self.redrawActiveChart();
});
$(‘#catboost-control-test’ + this.index, this.layout).click(function() {
self.layoutDisabled.test = !$(this)[0].checked;
$(‘.catboost-panel__series’, self.layout).toggleClass(‘catboost-panel__series_test_disabled’, self.layoutDisabled.test);
self.redrawActiveChart();
});
$(‘#catboost-control2-clickmode’ + this.index, this.layout).click(function() {
self.clickMode = $(this)[0].checked;
});
$(‘#catboost-control2-log’ + this.index, this.layout).click(function() {
self.logarithmMode = $(this)[0].checked ? ‘log’ : ‘linear’;
self.forEveryLayout(function(layout) {
layout.yaxis = {type: self.logarithmMode};
});
self.redrawActiveChart();
});
var slider = $(‘#catboost-control2-slider’ + this.index),
sliderValue = $(‘#catboost-control2-slidervalue’ + this.index);
$(‘#catboost-control2-smooth’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setSmoothness(enabled ? self.lastSmooth : -1);
slider.prop(‘disabled’, !enabled);
sliderValue.prop(‘disabled’, !enabled);
self.redrawActiveChart();
});
$(‘#catboost-control2-cvstddev’ + this.index, this.layout).click(function() {
var enabled = $(this)[0].checked;
self.setStddev(enabled);
self.redrawActiveChart();
});
slider.on(‘input change’, function() {
var smooth = Number($(this).val());
sliderValue.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
sliderValue.on(‘input change’, function() {
var smooth = Number($(this).val());
slider.val(isNaN(smooth) ? 0 : smooth);
self.setSmoothness(smooth);
self.lastSmooth = smooth;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.setTraceVisibility = function(trace, visibility) {
if (trace) {
trace.visible = visibility;
}
};
CatboostIpython.prototype.updateTracesVisibility = function() {
var tracesHash = this.groupTraces(),
traces,
smoothDisabled = this.getSmoothness() === -1,
self = this;
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
traces = tracesHash[train].traces;
if (this.layoutDisabled.traces[train]) {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
} else {
traces.forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
if (this.hasCVMode) {
if (this.stddevEnabled) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, true);
});
} else {
self.filterTracesOne(traces, {cv_stddev_first: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesOne(traces, {cv_stddev_last: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘learn’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, smoothed: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
self.filterTracesEvery(traces, this.getTraceDefParams({type: ‘test’, cv_avg: true, best_point: true})).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
if (smoothDisabled) {
self.filterTracesOne(traces, {smoothed: true}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘learn’]) {
self.filterTracesOne(traces, {type: ‘learn’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
if (this.layoutDisabled[‘test’]) {
self.filterTracesOne(traces, {type: ‘test’}).forEach(function(trace) {
self.setTraceVisibility(trace, false);
});
}
}
}
}
};
CatboostIpython.prototype.getSmoothness = function() {
return this.smoothness && this.smoothness > -1 ? this.smoothness : -1;
};
CatboostIpython.prototype.setSmoothness = function(weight) {
if (weight 1) {
return;
}
this.smoothness = weight;
};
CatboostIpython.prototype.setStddev = function(enabled) {
this.stddevEnabled = enabled;
};
CatboostIpython.prototype.redrawActiveChart = function() {
this.chartsToRedraw[this.activeTab] = true;
this.redrawAll();
};
CatboostIpython.prototype.redraw = function() {
if (this.chartsToRedraw[this.activeTab]) {
this.chartsToRedraw[this.activeTab] = false;
this.updateTracesVisibility();
this.updateTracesCV();
this.updateTracesBest();
this.updateTracesValues();
this.updateTracesSmoothness();
this.plotly.redraw(this.traces[this.activeTab].parent);
}
this.drawTraces();
};
CatboostIpython.prototype.addRedrawFunc = function() {
this.redrawFunc = throttle(this.redraw, 400, false, this);
};
CatboostIpython.prototype.redrawAll = function() {
if (!this.redrawFunc) {
this.addRedrawFunc();
}
this.redrawFunc();
};
CatboostIpython.prototype.addPoints = function(parent, data) {
var self = this;
data.chunks.forEach(function(item) {
if (typeof item.remaining_time !== ‘undefined’ && typeof item.passed_time !== ‘undefined’) {
if (!self.timeLeft[data.path]) {
self.timeLeft[data.path] = [];
}
self.timeLeft[data.path][item.iteration] = [item.remaining_time, item.passed_time];
}
[‘test’, ‘learn’].forEach(function(type) {
var sets = self.meta[data.path][type + ‘_sets’],
metrics = self.meta[data.path][type + ‘_metrics’];
for (var i = 0; i ‘ + parameter + ‘ : ‘ + valueOfParameter;
}
}
}
if (!hovertextParametersAdded && type === ‘test’) {
hovertextParametersAdded = true;
trace.hovertext[pointIndex] += self.hovertextParameters[pointIndex];
}
smoothedTrace.x[pointIndex] = pointIndex;
}
if (bestValueTrace) {
bestValueTrace.x[pointIndex] = pointIndex;
bestValueTrace.y[pointIndex] = self.lossFuncs[nameOfMetric];
}
if (launchMode === ‘CV’ && !cvAdded) {
cvAdded = true;
self.getTrace(parent, $.extend({cv_stddev_first: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true}, params));
self.getTrace(parent, $.extend({cv_stddev_first: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_stddev_last: true, smoothed: true}, params));
self.getTrace(parent, $.extend({cv_avg: true}, params));
self.getTrace(parent, $.extend({cv_avg: true, smoothed: true}, params));
if (type === ‘test’) {
self.getTrace(parent, $.extend({cv_avg: true, best_point: true}, params));
}
}
}
self.chartsToRedraw[key.chartId] = true;
self.redrawAll();
}
});
});
};
CatboostIpython.prototype.getLaunchMode = function(path) {
return this.meta[path].launch_mode;
};
CatboostIpython.prototype.getChartNode = function(params, active) {
var node = $(‘
if (active) {
node.addClass(‘catboost-graph__chart_active’);
}
return node;
};
CatboostIpython.prototype.getChartTab = function(params, active) {
var node = $(‘
‘);
if (active) {
node.addClass(‘catboost-graph__tab_active’);
}
return node;
};
CatboostIpython.prototype.forEveryChart = function(callback) {
for (var name in this.traces) {
if (this.traces.hasOwnProperty(name)) {
callback(this.traces[name]);
}
}
};
CatboostIpython.prototype.forEveryLayout = function(callback) {
this.forEveryChart(function(chart) {
callback(chart.layout);
});
};
CatboostIpython.prototype.getChart = function(parent, params) {
var id = params.id,
self = this;
if (this.charts[id]) {
return this.charts[id];
}
this.addLayout(parent);
var active = this.activeTab === params.id,
chartNode = this.getChartNode(params, active),
chartTab = this.getChartTab(params, active);
$(‘.catboost-graph__charts’, this.layout).append(chartNode);
$(‘.catboost-graph__tabs’, this.layout).append(chartTab);
this.traces[id] = {
id: params.id,
name: params.name,
parent: chartNode[0],
traces: [],
layout: {
xaxis: {
range: [0, Number(this.meta[params.path].iteration_count)],
type: ‘linear’,
tickmode: ‘auto’,
showspikes: true,
spikethickness: 1,
spikedash: ‘longdashdot’,
spikemode: ‘across’,
zeroline: false,
showgrid: false
},
yaxis: {
zeroline: false
//showgrid: false
//hoverformat : ‘.7f’
},
separators: ‘. ‘,
//hovermode: ‘x’,
margin: {l: 38, r: 0, t: 35, b: 30},
autosize: true,
showlegend: false
},
options: {
scrollZoom: false,
modeBarButtonsToRemove: [‘toggleSpikelines’],
displaylogo: false
}
};
this.charts[id] = this.plotly.plot(chartNode[0], this.traces[id].traces, this.traces[id].layout, this.traces[id].options);
chartNode[0].on(‘plotly_hover’, function(e) {
self.updateTracesValues(e.points[0].x);
});
chartNode[0].on(‘plotly_click’, function(e) {
self.updateTracesValues(e.points[0].x, true);
});
return this.charts[id];
};
CatboostIpython.prototype.getTrace = function(parent, params) {
var key = this.getKey(params),
chartSeries = [];
if (this.traces[key.chartId]) {
chartSeries = this.traces[key.chartId].traces.filter(function(trace) {
return trace.name === key.traceName;
});
}
if (chartSeries.length) {
return chartSeries[0];
} else {
this.getChart(parent, {id: key.chartId, name: params.chartName, path: params.path});
var plotParams = {
color: this.getNextColor(params.path, params.smoothed ? 0.2 : 1),
fillsmoothcolor: this.getNextColor(params.path, 0.1),
fillcolor: this.getNextColor(params.path, 0.4),
hoverinfo: params.cv_avg ? ‘skip’ : ‘text+x’,
width: params.cv_avg ? 2 : 1,
dash: params.type === ‘test’ ? ‘solid’ : ‘dot’
},
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
hovertext: [],
hoverinfo: plotParams.hoverinfo,
line: {
width: plotParams.width,
dash: plotParams.dash,
color: plotParams.color
},
mode: ‘lines’,
hoveron: ‘points’,
connectgaps: true
};
if (params.best_point) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
marker: {
width: 2,
color: plotParams.color
},
hovertext: [],
hoverinfo: ‘text’,
mode: ‘markers’,
type: ‘scatter’
};
}
if (params.best_value) {
trace = {
name: key.traceName,
_params: params,
x: [],
y: [],
line: {
width: 1,
dash: ‘dash’,
color: ‘#CCCCCC’
},
mode: ‘lines’,
connectgaps: true,
hoverinfo: ‘skip’
};
}
if (params.cv_stddev_last) {
trace.fill = ‘tonexty’;
}
trace._params.plotParams = plotParams;
this.traces[key.chartId].traces.push(trace);
return trace;
}
};
CatboostIpython.prototype.getKey = function(params) {
var traceName = [
params.train,
params.type,
params.indexOfSet,
(params.smoothed ? ‘smoothed’ : ”),
(params.best_point ? ‘best_pount’ : ”),
(params.best_value ? ‘best_value’ : ”),
(params.cv_avg ? ‘cv_avg’ : ”),
(params.cv_stddev_first ? ‘cv_stddev_first’ : ”),
(params.cv_stddev_last ? ‘cv_stddev_last’ : ”)
].join(‘;’);
return {
chartId: params.chartName,
traceName: traceName,
colorId: params.train
};
};
CatboostIpython.prototype.filterTracesEvery = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] !== trace._params[prop]) {
return false;
}
}
}
return true;
});
};
CatboostIpython.prototype.filterTracesOne = function(traces, filter) {
traces = traces || this.traces[this.activeTab].traces;
return traces.filter(function(trace) {
for (var prop in filter) {
if (filter.hasOwnProperty(prop)) {
if (filter[prop] === trace._params[prop]) {
return true;
}
}
}
return false;
});
};
CatboostIpython.prototype.cleanSeries = function() {
$(‘.catboost-panel__series’, this.layout).html(”);
};
CatboostIpython.prototype.groupTraces = function() {
var traces = this.traces[this.activeTab].traces,
index = 0,
tracesHash = {};
traces.map(function(trace) {
var train = trace._params.train;
if (!tracesHash[train]) {
tracesHash[train] = {
index: index,
traces: [],
info: {
path: trace._params.path,
color: trace._params.plotParams.color
}
};
index++;
}
tracesHash[train].traces.push(trace);
});
return tracesHash;
};
CatboostIpython.prototype.drawTraces = function() {
if ($(‘.catboost-panel__series .catboost-panel__serie’, this.layout).length) {
return;
}
var html = ”,
tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train)) {
html += this.drawTrace(train, tracesHash[train]);
}
}
$(‘.catboost-panel__series’, this.layout).html(html);
this.updateTracesValues();
this.addTracesEvents();
};
CatboostIpython.prototype.getTraceDefParams = function(params) {
var defParams = {
smoothed: undefined,
best_point: undefined,
best_value: undefined,
cv_avg: undefined,
cv_stddev_first: undefined,
cv_stddev_last: undefined
};
if (params) {
return $.extend(defParams, params);
} else {
return defParams;
}
};
CatboostIpython.prototype.drawTrace = function(train, hash) {
var info = hash.info,
id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
items = {
learn: {
middle: ”,
bottom: ”
},
test: {
middle: ”,
bottom: ”
}
},
tracesNames = ”;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
items[type].middle += ‘
‘
items[type].bottom += ‘
‘
tracesNames += ‘
‘
‘;
});
});
var timeSpendHtml = ‘
‘
‘
‘;
var html = ‘
‘
‘‘ +
‘
(this.getLaunchMode(info.path) !== ‘Eval’ ? timeSpendHtml : ”) +
‘
‘ +
‘
‘ +
‘
‘ +
‘
‘
‘
‘
tracesNames +
‘
‘ +
‘
items.learn.middle +
items.test.middle +
‘
‘ +
‘
items.learn.bottom +
items.test.bottom +
‘
‘ +
‘
‘ +
‘
‘;
return html;
};
CatboostIpython.prototype.updateTracesValues = function(iteration, click) {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceValues(train, tracesHash[train], iteration, click);
}
}
};
CatboostIpython.prototype.updateTracesBest = function() {
var tracesHash = this.groupTraces();
for (var train in tracesHash) {
if (tracesHash.hasOwnProperty(train) && !this.layoutDisabled.traces[train]) {
this.updateTraceBest(train, tracesHash[train]);
}
}
};
CatboostIpython.prototype.getBestValue = function(data) {
if (!data.length) {
return {
best: undefined,
index: -1
};
}
var best = data[0],
index = 0,
func = this.lossFuncs[this.traces[this.activeTab].name],
bestDiff = typeof func === ‘number’ ? Math.abs(data[0] – func) : 0;
for (var i = 1, l = data.length; i best) {
best = data[i];
index = i;
}
if (typeof func === ‘number’ && Math.abs(data[i] – func) maxLength) {
maxLength = origTrace.y.length;
}
});
for (var i = 0; i 0) {
avgTrace.x[i] = i;
avgTrace.y[i] = sum / count;
}
}
};
CatboostIpython.prototype.updateTracesCVStdDev = function() {
var tracesHash = this.groupTraces(),
firstTraces = this.filterTracesOne(tracesHash.traces, {cv_stddev_first: true}),
self = this;
firstTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed
})),
lastTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
smoothed: trace._params.smoothed,
cv_stddev_last: true
}));
if (origTraces.length && lastTraces.length === 1) {
self.cvStdDevFunc(origTraces, trace, lastTraces[0]);
}
});
};
CatboostIpython.prototype.cvStdDevFunc = function(origTraces, firstTrace, lastTrace) {
var maxCount = origTraces.length,
maxLength = -1,
count,
sum,
i, j;
origTraces.forEach(function(origTrace) {
if (origTrace.y.length > maxLength) {
maxLength = origTrace.y.length;
}
});
for (i = 0; i i) {
firstTrace.hovertext[i] += this.hovertextParameters[i];
lastTrace.hovertext[i] += this.hovertextParameters[i];
}
}
};
CatboostIpython.prototype.updateTracesSmoothness = function() {
var tracesHash = this.groupTraces(),
smoothedTraces = this.filterTracesOne(tracesHash.traces, {smoothed: true}),
enabled = this.getSmoothness() > -1,
self = this;
smoothedTraces.forEach(function(trace) {
var origTraces = self.filterTracesEvery(tracesHash.traces, self.getTraceDefParams({
train: trace._params.train,
type: trace._params.type,
indexOfSet: trace._params.indexOfSet,
cv_avg: trace._params.cv_avg,
cv_stddev_first: trace._params.cv_stddev_first,
cv_stddev_last: trace._params.cv_stddev_last
})),
colorFlag = false;
if (origTraces.length === 1) {
origTraces = origTraces[0];
if (origTraces.visible) {
if (enabled) {
self.smoothFunc(origTraces, trace);
colorFlag = true;
}
self.highlightSmoothedTrace(origTraces, trace, colorFlag);
}
}
});
};
CatboostIpython.prototype.highlightSmoothedTrace = function(trace, smoothedTrace, flag) {
if (flag) {
smoothedTrace.line.color = trace._params.plotParams.color;
trace.line.color = smoothedTrace._params.plotParams.color;
trace.hoverinfo = ‘skip’;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillsmoothcolor;
}
} else {
trace.line.color = trace._params.plotParams.color;
trace.hoverinfo = trace._params.plotParams.hoverinfo;
if (trace._params.cv_stddev_last) {
trace.fillcolor = trace._params.plotParams.fillcolor;
}
}
};
CatboostIpython.prototype.smoothFunc = function(origTrace, smoothedTrace) {
var data = origTrace.y,
smoothedPoints = this.smooth(data, this.getSmoothness()),
smoothedIndex = 0,
self = this;
if (smoothedPoints.length) {
data.forEach(function (d, index) {
if (!smoothedTrace.x[index]) {
smoothedTrace.x[index] = index;
}
var nameOfSet = smoothedTrace._params.nameOfSet;
if (smoothedTrace._params.cv_stddev_first || smoothedTrace._params.cv_stddev_last) {
nameOfSet = smoothedTrace._params.type + ‘ std’;
}
smoothedTrace.y[index] = smoothedPoints[smoothedIndex];
smoothedTrace.hovertext[index] = nameOfSet + ‘`: ‘ + smoothedPoints[smoothedIndex].toPrecision(7);
if (self.hovertextParameters.length > index) {
smoothedTrace.hovertext[index] += self.hovertextParameters[index];
}
smoothedIndex++;
});
}
};
CatboostIpython.prototype.formatItemValue = function(value, index, suffix) {
if (typeof value === ‘undefined’) {
return ”;
}
suffix = suffix || ”;
return ‘‘ + value + ‘‘;
};
CatboostIpython.prototype.updateTraceBest = function(train, hash) {
var traces = this.filterTracesOne(hash.traces, {best_point: true}),
self = this;
traces.forEach(function(trace) {
var testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
indexOfSet: trace._params.indexOfSet
}));
if (self.hasCVMode) {
testTrace = self.filterTracesEvery(hash.traces, self.getTraceDefParams({
train: trace._params.train,
type: ‘test’,
cv_avg: true
}));
}
var bestValue = self.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
if (bestValue.index !== -1) {
trace.x[0] = bestValue.index;
trace.y[0] = bestValue.best;
trace.hovertext[0] = bestValue.func + ‘ (‘ + (self.hasCVMode ? ‘avg’ : trace._params.nameOfSet) + ‘): ‘ + bestValue.index + ‘ ‘ + bestValue.best;
}
});
};
CatboostIpython.prototype.updateTraceValues = function(name, hash, iteration, click) {
var id = ‘catboost-serie-‘ + this.index + ‘-‘ + hash.index,
traces = {
learn: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘learn’})),
test: this.filterTracesEvery(hash.traces, this.getTraceDefParams({type: ‘test’}))
},
path = hash.info.path,
self = this;
[‘learn’, ‘test’].forEach(function(type) {
traces[type].forEach(function(trace) {
var data = trace.y || [],
index = typeof iteration !== ‘undefined’ && iteration -1 ? bestValue.index : ”);
$(‘#’ + id + ‘ .catboost-panel__serie_best_test_value[data-index=’ + trace._params.indexOfSet + ‘]’, self.layout)
.html(self.formatItemValue(bestValue.best, bestValue.index, ‘best ‘ + trace._params.nameOfSet + ‘ ‘));
}
});
});
if (this.hasCVMode) {
var testTrace = this.filterTracesEvery(hash.traces, this.getTraceDefParams({
type: ‘test’,
cv_avg: true
})),
bestValue = this.getBestValue(testTrace.length === 1 ? testTrace[0].y : []);
$(‘#’ + id + ‘ .catboost-panel__serie_best_iteration’, this.layout).html(bestValue.index > -1 ? bestValue.index : ”);
}
if (click) {
this.clickMode = true;
$(‘#catboost-control2-clickmode’ + this.index, this.layout)[0].checked = true;
}
};
CatboostIpython.prototype.addTracesEvents = function() {
var self = this;
$(‘.catboost-panel__serie_checkbox’, this.layout).click(function() {
var name = $(this).data(‘seriename’);
self.layoutDisabled.traces[name] = !$(this)[0].checked;
self.redrawActiveChart();
});
};
CatboostIpython.prototype.getNextColor = function(path, opacity) {
var color;
if (this.colorsByPath[path]) {
color = this.colorsByPath[path];
} else {
color = this.colors[this.colorIndex];
this.colorsByPath[path] = color;
this.colorIndex++;
if (this.colorIndex > this.colors.length – 1) {
this.colorIndex = 0;
}
}
return this.hexToRgba(color, opacity);
};
CatboostIpython.prototype.hexToRgba = function(value, opacity) {
if (value.length 0) {
out += hours + ‘h ‘;
seconds = 0;
millis = 0;
}
if (minutes && minutes > 0) {
out += minutes + ‘m ‘;
millis = 0;
}
if (seconds && seconds > 0) {
out += seconds + ‘s ‘;
}
if (millis && millis > 0) {
out += millis + ‘ms’;
}
return out.trim();
};
CatboostIpython.prototype.mean = function(values, valueof) {
var n = values.length,
m = n,
i = -1,
value,
sum = 0,
number = function(x) {
return x === null ? NaN : +x;
};
if (valueof === null) {
while (++i
3.11 Zapisywanie modelu¶
Zawsze bardzo przydatne jest zrzucenie modelu na dysk (szczególnie jeśli szkolenie zajęło trochę czasu).
model = CatBoostClassifier(iterations=10, random_seed=42, logging_level='Silent').fit(train_pool)
model.save_model('catboost_model.dump')
model = CatBoostClassifier()
model.load_model('catboost_model.dump');
hyperopt¶
Chociaż zawsze można wybrać optymalną liczbę iteracji (etapy przyspieszające) poprzez walidację krzyżową i wykresy krzywej uczenia się, ważne jest również, aby bawić się niektórymi parametrami modelu, i chcielibyśmy zwrócić szczególną uwagę na l2_leaf_reg i learning_rate.
W tej sekcji wybieramy te parametry za pomocą pakietu hyperopt.
Instalujemy to!
!pip install hyperopt¶
import hyperopt
def hyperopt_objective(params):
model = CatBoostClassifier(
l2_leaf_reg=int(params['l2_leaf_reg']),
learning_rate=params['learning_rate'],
iterations=500,
eval_metric='Accuracy',
random_seed=42,
verbose=False,
loss_function='Logloss',
)
cv_data = cv(
Pool(X, y, cat_features=categorical_features_indices),
model.get_params()
)
best_accuracy = np.max(cv_data['test-Accuracy-mean'])
return 1 - best_accuracy # as hyperopt minimises
from numpy.random import RandomState
params_space = {
'l2_leaf_reg': hyperopt.hp.qloguniform('l2_leaf_reg', 0, 2, 1),
'learning_rate': hyperopt.hp.uniform('learning_rate', 1e-3, 5e-1),
}
trials = hyperopt.Trials()
best = hyperopt.fmin(
hyperopt_objective,
space=params_space,
algo=hyperopt.tpe.suggest,
max_evals=50,
trials=trials,
rstate=RandomState(123)
)
print(best)
‘l2_leaf_reg’ Współczynnik na poziomie regularyzacji L2 funkcji kosztu. Każda wartość dodatnia jest dozwolona.
Iteracje i szybkość uczenia się (Iterations and learning rate)¶
Domyślnie CatBoost buduje 1000 drzew. Liczbę iteracji można zmniejszyć, aby przyspieszyć trening.
Gdy liczba iteracji maleje, należy zwiększyć szybkość uczenia się. Domyślnie wartość szybkości uczenia się jest definiowana automatycznie w zależności od liczby iteracji i wejściowego zestawu danych. Zmiana liczby iteracji na mniejszą wartość jest dobrym punktem wyjścia do optymalizacji.
Teraz (po znalezieniu optymalnych parametrów: ‘l2_leaf_reg’ i ‘learning_rate’, zdobądźmy wszystkie dane CV z najlepszymi parametrami:¶
model = CatBoostClassifier(
l2_leaf_reg=int(best['l2_leaf_reg']),
learning_rate=best['learning_rate'],
iterations=500,
eval_metric='Accuracy',
random_seed=42,
verbose=False,
loss_function='Logloss',
)
cv_data = cv(Pool(X, y, cat_features=categorical_features_indices), model.get_params())
print('Precise validation accuracy score: {}'.format(np.max(cv_data['test-Accuracy-mean'])))
Przypomnijmy, że przy domyślnych parametrach wynik cv wyniósł 0,8283, a zatem mamy (prawdopodobnie nieistotną statystycznie) pewną poprawę.
Prześlij na konkurs¶
Teraz zmienilibyśmy nasz dostrojony model na wszystkich danych treningowych, które mamy
model_VV.fit(X, y, cat_features=categorical_features_indices)
Na koniec przygotujmy plik zgłoszenia:
import pandas as pd
submisstion = pd.DataFrame()
#submisstion['PassengerId'] = X_train['PassengerId']
#submisstion['Survived'] = model.predict(X_train)
submisstion.to_csv('submission.csv', index=False)
Wreszcie możesz złożyć zgłoszenie w konkursie Titanic Kaggle.
Otóż to! Teraz możesz grać z CatBoost i wygrywać niektóre konkursy! 🙂

