Perfect Plots: Joyplot

In [1]:
import joypy
import pandas as pd
import matplotlib.pyplot as plt
In [2]:
df = pd.read_csv('c:/1/mpg_ggplot2.txt')
df.head()
Out[2]:
  manufacturer model displ year cyl trans drv cty hwy fl class
0 audi a4 1.8 1999 4 auto(l5) f 18 29 p compact
1 audi a4 1.8 1999 4 manual(m5) f 21 29 p compact
2 audi a4 2.0 2008 4 manual(m6) f 20 31 p compact
3 audi a4 2.0 2008 4 auto(av) f 21 30 p compact
4 audi a4 2.8 1999 6 auto(l5) f 16 26 p compact
In [3]:
# joypy.joyplot creates its own figure, so a separate plt.figure() call would only leave an empty figure behind
fig, axes = joypy.joyplot(df, column=['hwy', 'cty'], by="class", ylim='own', figsize=(12,8), legend=True, color=['#76a5af', '#134f5c'], alpha=0.9)
fig.set_dpi(380)  # apply the intended resolution to the figure that joyplot created

# Decoration
plt.title('Joy Plot of City and Highway Mileage by Class', fontsize=32, color='#d0e0e3', alpha=0.9)
plt.rc("font", size=20)
plt.xlabel('Year 2018',  fontsize=16, color='darkred', alpha=1)
plt.ylabel('Data Scientist', fontsize=26,  color='grey', alpha=0.8)

plt.show()  # the parentheses are required; a bare plt.show only returns the function object
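Each ridge in a joyplot is a smoothed density of one column within one group. The sketch below illustrates that idea with a hand-written Gaussian kernel estimate on made-up values; the helper `gaussian_kde_1d`, the toy table, and the bandwidth of 1.5 are assumptions chosen for illustration, not the routine joypy uses internally.

```python
import numpy as np
import pandas as pd


def gaussian_kde_1d(values, grid, bandwidth):
    """Average of Gaussian bumps of width `bandwidth` centred on each value."""
    values = np.asarray(values, dtype=float)
    z = (grid[:, None] - values[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)


# Hypothetical toy data shaped like the mpg table (values are made up).
toy = pd.DataFrame({
    "class": ["compact"] * 4 + ["suv"] * 4,
    "hwy": [29, 29, 31, 30, 20, 18, 17, 19],
})

grid = np.linspace(0, 50, 501)  # shared x-axis for all ridges

# One density curve per class: this is what each ridge represents.
ridges = {
    name: gaussian_kde_1d(group["hwy"], grid, bandwidth=1.5)
    for name, group in toy.groupby("class")
}

for name, density in ridges.items():
    print(name, "peaks near", grid[density.argmax()], "mpg")
```

Stacking these curves with a vertical offset per class gives the familiar ridge layout. With `ylim='own'`, as used in this notebook, each ridge gets its own vertical scale, so ridge heights are not comparable across classes.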
 

Titanic disaster

We want to find out which passengers had a chance of surviving, depending on the group they belonged to (passenger class, sex).

Source of data: https://www.kaggle.com/shivamp629/traincsv

In [4]:
df2 = pd.read_csv('c:/1/kaggletrain.csv')
df2.head(3)
Out[4]:
  Unnamed: 0 PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
0 0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S
1 1 2 1 1 Cumings, Mrs. John Bradley (Florence Briggs Th… female 38.0 1 0 PC 17599 71.2833 C85 C
2 2 3 1 3 Heikkinen, Miss. Laina female 26.0 0 0 STON/O2. 3101282 7.9250 NaN S
In [5]:
df2['Age'].head()
Out[5]:
0    22.0
1    38.0
2    26.0
3    35.0
4    35.0
Name: Age, dtype: float64
In [6]:
AA = df2.pivot_table(index=['Name','Pclass'], columns='Sex', values='Age').reset_index()
AA.head()
Out[6]:
Sex Name Pclass female male
0 Abbing, Mr. Anthony 3 NaN 42.0
1 Abbott, Mr. Rossmore Edward 3 NaN 16.0
2 Abbott, Mrs. Stanton (Rosa Hunt) 3 35.0 NaN
3 Abelson, Mr. Samuel 2 NaN 30.0
4 Abelson, Mrs. Samuel (Hannah Wizosky) 2 28.0 NaN
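The pivot above turns the single `Age` column into one column per sex, which is the shape `joyplot` needs to draw two ridges per group. A minimal sketch on a made-up three-passenger sample (the names and ages are hypothetical) shows the effect:

```python
import pandas as pd

# Hypothetical three-passenger sample (names and ages are made up).
passengers = pd.DataFrame({
    "Name": ["Doe, Mr. John", "Doe, Mrs. Jane", "Roe, Miss. Ann"],
    "Pclass": [3, 3, 1],
    "Sex": ["male", "female", "female"],
    "Age": [40.0, 38.0, 22.0],
})

# Long to wide: one row per (Name, Pclass), one age column per sex.
wide = passengers.pivot_table(
    index=["Name", "Pclass"], columns="Sex", values="Age"
).reset_index()

print(wide)

# Each passenger has an age in exactly one of the two sex columns.
filled = wide[["female", "male"]].notna().sum(axis=1)
print(filled)
```

Note that `pivot_table` aggregates with the mean by default, so two passengers sharing the same name and class would be averaged into one row.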
In [7]:
# joypy.joyplot creates its own figure, so no separate plt.figure() call is needed
fig, axes = joypy.joyplot(AA, column=['female', 'male'], by="Pclass", ylim='own', figsize=(12,8), legend=True, color=['#f4cccc', '#0c343d'], alpha=0.4)
fig.set_dpi(380)  # apply the intended resolution to the figure that joyplot created

# Decoration (AA holds all passengers, not only casualties)
plt.title('Titanic disaster: age distribution of passengers by class', fontsize=32, color='#d0e0e3', alpha=0.9)
plt.rc("font", size=20)
plt.xlabel('Age of passengers',  fontsize=16, color='darkred', alpha=1)
#plt.ylabel('Data Scientist', fontsize=26,  color='grey', alpha=0.8)

plt.show()  # the parentheses are required; a bare plt.show only returns the function object
In [8]:
BB = df2.pivot_table(index=['Name','Survived'], columns='Sex', values='Age').reset_index()
BB.head()
Out[8]:
Sex Name Survived female male
0 Abbing, Mr. Anthony 0 NaN 42.0
1 Abbott, Mr. Rossmore Edward 0 NaN 16.0
2 Abbott, Mrs. Stanton (Rosa Hunt) 1 35.0 NaN
3 Abelson, Mr. Samuel 0 NaN 30.0
4 Abelson, Mrs. Samuel (Hannah Wizosky) 1 28.0 NaN
In [9]:
# joypy.joyplot creates its own figure, so no separate plt.figure() call is needed
fig, axes = joypy.joyplot(BB, column=['female', 'male'], by="Survived", ylim='own', figsize=(12,8), legend=True, color=['#a4c2f4', '#1c4587'], alpha=0.4)
fig.set_dpi(380)  # apply the intended resolution to the figure that joyplot created

# Decoration (ridges are grouped by Survived: 0 = died, 1 = survived)
plt.title('Titanic disaster: age distribution of passengers by survival and sex', fontsize=32, color='#d0e0e3', alpha=0.9)
plt.rc("font", size=20)
plt.xlabel('Age of passengers',  fontsize=16, color='darkred', alpha=1)
plt.ylabel('Data Scientist', fontsize=26,  color='grey', alpha=0.8)

plt.show()  # the parentheses are required; a bare plt.show only returns the function object

Alcohol consumption by continent

In [10]:
df3 = pd.read_csv('c:/1/drinksbycountry.csv')
df3.head()
Out[10]:
  Unnamed: 0 country beer_servings spirit_servings wine_servings total_litres_of_pure_alcohol continent
0 0 Afghanistan 0 0 0 0.0 Asia
1 1 Albania 89 132 54 4.9 Europe
2 2 Algeria 25 0 14 0.7 Africa
3 3 Andorra 245 138 312 12.4 Europe
4 4 Angola 217 57 45 5.9 Africa
In [11]:
# joypy.joyplot creates its own figure, so no separate plt.figure() call is needed
fig, axes = joypy.joyplot(df3, column=['beer_servings', 'spirit_servings','wine_servings'], by="continent", ylim='own', figsize=(12,8), legend=True, color=['#274e13', 'red', '#f1c232'], alpha=0.4)
fig.set_dpi(380)  # apply the intended resolution to the figure that joyplot created

# Decoration
plt.title('Alcohol consumption by continent', fontsize=32, color='#d0e0e3', alpha=0.9)
plt.rc("font", size=20)
plt.xlabel('The level of consumption',  fontsize=16, color='darkred', alpha=0.4)
#plt.ylabel('Data Scientist', fontsize=26,  color='grey', alpha=0.8)

plt.show()  # the parentheses are required; a bare plt.show only returns the function object
 

World Happiness Report

Source of data: https://data.world/promptcloud/world-happiness-report-2019

In [12]:
df4 = pd.read_csv('c:/1/WorldHappinessReport.csv')
df4.head(3)
Out[12]:
  Unnamed: 0 Country Region Happiness Rank Happiness Score Economy (GDP per Capita) Family Health (Life Expectancy) Freedom Trust (Government Corruption) Generosity Dystopia Residual Year
0 0 Afghanistan Southern Asia 153.0 3.575 0.31982 0.30285 0.30335 0.23414 0.09719 0.36510 1.95210 2015.0
1 1 Albania Central and Eastern Europe 95.0 4.959 0.87867 0.80434 0.81325 0.35733 0.06413 0.14272 1.89894 2015.0
2 2 Algeria Middle East and Northern Africa 68.0 5.605 0.93929 1.07772 0.61766 0.28579 0.17383 0.07822 2.43209 2015.0
In [13]:
df4['Year'].value_counts()
Out[13]:
2017.0    164
2016.0    164
2015.0    164
Name: Year, dtype: int64
In [14]:
CC = df4[df4['Year']==2017]
CC.head(3)
Out[14]:
  Unnamed: 0 Country Region Happiness Rank Happiness Score Economy (GDP per Capita) Family Health (Life Expectancy) Freedom Trust (Government Corruption) Generosity Dystopia Residual Year
330 330 Afghanistan Southern Asia 141.0 3.794 0.401477 0.581543 0.180747 0.106180 0.061158 0.311871 2.150801 2017.0
331 331 Albania Central and Eastern Europe 109.0 4.644 0.996193 0.803685 0.731160 0.381499 0.039864 0.201313 1.490442 2017.0
332 332 Algeria Middle East and Northern Africa 53.0 5.872 1.091864 1.146217 0.617585 0.233336 0.146096 0.069437 2.567604 2017.0
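Before plotting one year by region, it can help to check how many countries each region contributes, since a density drawn from only a handful of points is hard to interpret. The sketch below uses a made-up miniature table (the countries, regions, and values are hypothetical) to show the same year filter followed by a per-region count:

```python
import pandas as pd

# Hypothetical miniature of the happiness table (values are made up).
report = pd.DataFrame({
    "Country": ["A", "B", "C", "A", "B", "C"],
    "Region": ["North", "North", "South", "North", "North", "South"],
    "Freedom": [0.40, 0.50, 0.20, 0.45, 0.55, 0.25],
    "Year": [2016.0, 2016.0, 2016.0, 2017.0, 2017.0, 2017.0],
})

# The float year 2017.0 compares equal to the integer 2017.
# .copy() makes the subset an independent frame that is safe to modify later.
latest = report[report["Year"] == 2017].copy()

# Number of countries behind each ridge of a by="Region" plot.
per_region = latest.groupby("Region").size()
print(per_region)
```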
In [15]:
# joypy.joyplot creates its own figure, so no separate plt.figure() call is needed
fig, axes = joypy.joyplot(CC, column=['Freedom', 'Trust (Government Corruption)'], by="Region", ylim='own', figsize=(12,8), legend=True, alpha=0.4)
fig.set_dpi(380)  # apply the intended resolution to the figure that joyplot created

# Decoration
plt.title('World Happiness Report', fontsize=32, color='#d0e0e3', alpha=0.9)
plt.rc("font", size=20)
plt.xlabel('Indicator',  fontsize=16, color='darkred', alpha=0.4)
plt.ylabel('Data Scientist', fontsize=26,  color='grey', alpha=0.8)

plt.show()  # the parentheses are required; a bare plt.show only returns the function object
 

Banking marketing

Customer age distribution by marital status, split by the binary campaign outcome `y` (0 or 1).
Source of data: https://archive.ics.uci.edu/ml/machine-learning-databases/00222/

In [16]:
df5 = pd.read_csv('c:/1/bank.csv')
df5.head(3)
Out[16]:
  Unnamed: 0 Unnamed: 0.1 age job marital education default housing loan contact campaign pdays previous poutcome emp_var_rate cons_price_idx cons_conf_idx euribor3m nr_employed y
0 0 0 44 blue-collar married basic.4y unknown yes no cellular 1 999 0 nonexistent 1.4 93.444 -36.1 4.963 5228.1 0
1 1 1 53 technician married unknown no no no cellular 1 999 0 nonexistent -0.1 93.200 -42.0 4.021 5195.8 0
2 2 2 28 management single university.degree no yes no cellular 3 6 2 success -1.7 94.055 -39.8 0.729 4991.6 1

3 rows × 23 columns

In [17]:
FF = df5.pivot_table(index=['Unnamed: 0','marital'], columns='y', values='age').reset_index()
FF.head()
Out[17]:
y Unnamed: 0 marital 0 1
0 0 married 44.0 NaN
1 1 married 53.0 NaN
2 2 single NaN 28.0
3 3 married 39.0 NaN
4 4 married NaN 55.0
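Because `y` holds the integers 0 and 1, the pivoted columns are labelled with integers rather than strings, which is why the plot call that follows selects them with `column=[0,1]`. A small sketch on made-up rows (the frame `bank` and the labels `"no"`/`"yes"` are hypothetical) shows the labels and one possible way to rename them for a more readable legend:

```python
import pandas as pd

# Hypothetical three-customer sample (values are made up).
bank = pd.DataFrame({
    "marital": ["married", "single", "married"],
    "age": [44, 28, 55],
    "y": [0, 1, 1],
})

# Use the row index as a unique key, like 'Unnamed: 0' in the notebook.
wide = bank.reset_index().pivot_table(
    index=["index", "marital"], columns="y", values="age"
).reset_index()

print(wide.columns.tolist())

# The outcome columns are the integers 0 and 1, not the strings "0" and "1".
print(0 in wide.columns, "0" in wide.columns)

# Optional: give the outcome columns descriptive names (labels are assumptions).
renamed = wide.rename(columns={0: "no", 1: "yes"})
print(renamed.columns.tolist())
```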
In [20]:
# joypy.joyplot creates its own figure, so no separate plt.figure() call is needed
fig, axes = joypy.joyplot(FF, column=[0,1], by="marital", ylim='own', figsize=(12,8), legend=True, color=['#351c75', '#b4a7d6'], alpha=0.4)
fig.set_dpi(380)  # apply the intended resolution to the figure that joyplot created

# Decoration
plt.title('Customer age structure', fontsize=32, color='#d0e0e3', alpha=0.9)
plt.rc("font", size=20)
plt.xlabel('customer age',  fontsize=16, color='darkred', alpha=0.4)
plt.ylabel('Data Scientist', fontsize=26,  color='grey', alpha=0.8)

plt.show()  # the parentheses are required; a bare plt.show only returns the function object