
The dataset contains 9568 data points collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work with full load. Features consist of hourly average ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V) to predict the net hourly electrical energy output (EP) of the plant.
Combined Cycle Power Plant Data Set¶
Data Set Information:¶
A combined cycle power plant (CCPP) is composed of gas turbines (GT), steam turbines (ST) and heat recovery steam generators. In a CCPP, the electricity is generated by gas and steam turbines, which are combined in one cycle, and is transferred from one turbine to another. While the Vacuum is collected from and has an effect on the Steam Turbine, the other three ambient variables affect the GT performance.
For comparability with our baseline studies, and to allow 5×2 fold statistical tests to be carried out, we provide the data shuffled five times. For each shuffling, 2-fold CV is carried out, and the resulting 10 measurements are used for statistical testing.
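The 5×2 scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the data is synthetic stand-in data (not the plant measurements), and the helper names `five_by_two_cv_scores` and `linear_fit_predict` are hypothetical, with ordinary least squares chosen merely as a simple example model.

```python
import numpy as np

def five_by_two_cv_scores(X, y, fit_predict, n_repeats=5, seed=0):
    """Return 10 mean-squared-error values: 5 shuffles x 2 folds each."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(n_repeats):
        idx = rng.permutation(n)          # one shuffling of the data
        half = n // 2
        folds = (idx[:half], idx[half:])  # split the shuffle into two halves
        # Train on one half and test on the other, then swap the roles.
        for train, test in (folds, folds[::-1]):
            pred = fit_predict(X[train], y[train], X[test])
            scores.append(float(np.mean((y[test] - pred) ** 2)))
    return scores

def linear_fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept (example model only)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

# Synthetic stand-in for four features and one target (assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)

scores = five_by_two_cv_scores(X, y, linear_fit_predict)
print(len(scores))  # one score per (shuffle, fold) pair
```

The ten scores returned here are the "10 measurements" that a 5×2 statistical test would then compare between two models.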
We provide the data both in .ods and in .xlsx formats.
Attribute Information:¶
The averages are taken from various sensors located around the plant that record the ambient variables every second. The variables are given without normalization.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
df = pd.read_csv('/home/wojciech/Pulpit/1/Folds5x2_pp.csv')
del df['Unnamed: 0']
df.columns = ['Temperature', 'Exhaust_Vacuum', 'Ambient_Pressure', 'Relative_Humidity', 'Energy_output']
df.sample(3)
sns.set(style="ticks")
corr = df.corr()
mask = np.zeros_like(corr, dtype=bool)  # builtin bool works across NumPy versions (np.bool was removed in 1.24)
mask[np.triu_indices_from(mask)] = True
f, ax = plt.subplots(figsize=(12, 6))
cmap = sns.diverging_palette(180, 90, as_cmap=True)
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0, annot=True,
            square=True, linewidths=.9, cbar_kws={"shrink": .9})
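To make the masking step in the heatmap code easier to follow, here is a minimal sketch on a small hand-made 3×3 matrix (the values are invented for illustration and are not taken from the dataset). It shows which cells `np.triu_indices_from` marks as hidden.

```python
import numpy as np

# A small symmetric matrix standing in for a correlation matrix (made-up values).
corr = np.array([[1.0, 0.8, -0.3],
                 [0.8, 1.0, 0.1],
                 [-0.3, 0.1, 1.0]])

# Start with nothing masked, then mark the upper triangle, diagonal included.
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True

print(mask)
```

Cells where the mask is `True` are not drawn by `sns.heatmap`, so only the strictly lower triangle remains visible and each pair of variables appears once.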
df2 = pd.read_csv('/home/wojciech/Pulpit/1/bank.csv')
del df2['Unnamed: 0']
del df2['Unnamed: 0.1']
df2.head()
sns.set(style="ticks")
corr = df2.corr()
mask = np.zeros_like(corr, dtype=bool)  # builtin bool works across NumPy versions (np.bool was removed in 1.24)
mask[np.triu_indices_from(mask)] = True
f, ax = plt.subplots(figsize=(22, 10))
cmap = sns.diverging_palette(580, 10, as_cmap=True)
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=0.3, center=0.03, annot=True,
            square=True, linewidths=.9, cbar_kws={"shrink": 0.8})
Definition¶
def matrix_plot(df, title):
    """Draw a lower-triangle correlation heatmap for df with the given title."""
    sns.set(style="ticks")
    corr = df.corr()  # use the DataFrame passed in, not the global df2
    mask = np.zeros_like(corr, dtype=bool)  # builtin bool works across NumPy versions
    mask[np.triu_indices_from(mask)] = True  # hide the upper triangle and the diagonal
    f, ax = plt.subplots(figsize=(22, 10))
    #cmap = sns.diverging_palette(580, 10, as_cmap=True)
    cmap = sns.diverging_palette(180, 90, as_cmap=True)  # A different color palette
    sns.heatmap(corr, mask=mask, cmap=cmap, vmax=0.3, center=0.03, annot=True,
                square=True, linewidths=.9, cbar_kws={"shrink": 0.8})
    plt.xticks(rotation=90)
    plt.title(title, fontsize=22, color='#0c343d', alpha=0.5)
    plt.show()  # call the function; a bare plt.show does nothing

matrix_plot(df2, 'Perfect Plots: Matrix of correlation')
Definition by class¶
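class mx_plot:
    def __init__(self, df, title):
        self.df = df
        self.title = title

    def matrix(self):
        """Draw a lower-triangle correlation heatmap for the stored DataFrame."""
        sns.set(style="ticks")
        corr = self.df.corr()  # use the instance's DataFrame, not the global df2
        mask = np.zeros_like(corr, dtype=bool)  # builtin bool works across NumPy versions
        mask[np.triu_indices_from(mask)] = True  # hide the upper triangle and the diagonal
        f, ax = plt.subplots(figsize=(22, 10))
        #cmap = sns.diverging_palette(580, 10, as_cmap=True)
        cmap = sns.diverging_palette(580, 10, as_cmap=True)  # A different color palette
        sns.heatmap(corr, mask=mask, cmap=cmap, vmax=0.3, center=0.03, annot=True,
                    square=True, linewidths=.9, cbar_kws={"shrink": 0.8})
        plt.xticks(rotation=90)
        plt.title(self.title, fontsize=22, color='#0c343d', alpha=0.5)  # use the stored title
        plt.show()  # call the function; a bare plt.show does nothing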
import seaborn as sns
df=df2
title = 'Perfect Plots: Matrix of correlation'
PKP = mx_plot(df2,title)
PKP.matrix()