Introduction to Seaborn - Python
introml.analyticsdojo.com
13. Introduction to Seaborn¶
13.1. Overview¶
Look at distributions
Seaborn is an alternate data visualization package.
This has been adopted from the Seaborn Documentation.Read more at https://stanford.edu/~mwaskom/software/seaborn/api.html
#This uses the same mechanisms.
%matplotlib inline
import numpy as np
import pandas as pd
from scipy import stats, integrate
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(color_codes=True)
13.2. Distribution Plots¶
Histogram with KDE
Histogram with Rugplot
import seaborn as sns, numpy as np
sns.set(); np.random.seed(0)
x = np.random.randn(100)
x
array([ 1.76405235, 0.40015721, 0.97873798, 2.2408932 , 1.86755799,
-0.97727788, 0.95008842, -0.15135721, -0.10321885, 0.4105985 ,
0.14404357, 1.45427351, 0.76103773, 0.12167502, 0.44386323,
0.33367433, 1.49407907, -0.20515826, 0.3130677 , -0.85409574,
-2.55298982, 0.6536186 , 0.8644362 , -0.74216502, 2.26975462,
-1.45436567, 0.04575852, -0.18718385, 1.53277921, 1.46935877,
0.15494743, 0.37816252, -0.88778575, -1.98079647, -0.34791215,
0.15634897, 1.23029068, 1.20237985, -0.38732682, -0.30230275,
-1.04855297, -1.42001794, -1.70627019, 1.9507754 , -0.50965218,
-0.4380743 , -1.25279536, 0.77749036, -1.61389785, -0.21274028,
-0.89546656, 0.3869025 , -0.51080514, -1.18063218, -0.02818223,
0.42833187, 0.06651722, 0.3024719 , -0.63432209, -0.36274117,
-0.67246045, -0.35955316, -0.81314628, -1.7262826 , 0.17742614,
-0.40178094, -1.63019835, 0.46278226, -0.90729836, 0.0519454 ,
0.72909056, 0.12898291, 1.13940068, -1.23482582, 0.40234164,
-0.68481009, -0.87079715, -0.57884966, -0.31155253, 0.05616534,
-1.16514984, 0.90082649, 0.46566244, -1.53624369, 1.48825219,
1.89588918, 1.17877957, -0.17992484, -1.07075262, 1.05445173,
-0.40317695, 1.22244507, 0.20827498, 0.97663904, 0.3563664 ,
0.70657317, 0.01050002, 1.78587049, 0.12691209, 0.40198936])
13.2.1. Distribution Plot (distplot)¶
Any compbination of hist, rug, kde
Note it also has in it a KDE plot included
Can manually set the number of bins
See documentation here
#Histogram
# https://seaborn.pydata.org/generated/seaborn.distplot.html#seaborn.distplot
ax = sns.distplot(x)
#Adjust number of bins for more fine grained view
ax = sns.distplot(x, bins = 20)
#Include rug and kde (no histogram)
sns.distplot(x, hist=False, rug=True);
#Kernel Density
#https://seaborn.pydata.org/generated/seaborn.rugplot.html#seaborn.rugplot
ax = sns.distplot(x, bins=10, kde=True, rug=True)
13.2.2. Box Plots¶
Break data into quartiles.
Can show distribution/ranges of different categories.
Jhguch at en.wikipedia [CC BY-SA 2.5 (https://creativecommons.org/licenses/by-sa/2.5)], from Wikimedia Commons
sns.set_style("whitegrid")
#This is data on tips (a real dataset) and our familiar iris dataset
tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")
titanic = sns.load_dataset("titanic")
#Tips is a pandas dataframe
tips.head()
total_bill | tip | sex | smoker | day | time | size | |
---|---|---|---|---|---|---|---|
0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
ax = sns.boxplot(x=tips["total_bill"])
# Notice we can see the few ouliers on right side
ax = sns.distplot(tips["total_bill"], kde=True, rug=True)
13.3. Relationship Plots¶
Pairplots to show all
Regplot for 2 continuous variables
Scatterplot for two continuous variables
Swarmplot or BoxPlot for continuous and categorical
#Notice how this works for continuous, not great for categorical
h = sns.pairplot(tips, hue="time")
g = sns.pairplot(iris, hue="species")
# Show relationship between 2 continuous variables with regression line.
sns.regplot(x="total_bill", y="tip", data=tips);
# Break down
sns.boxplot(x="day", y="total_bill", hue="time", data=tips);
#Uses an algorithm to prevent overlap
sns.swarmplot(x="day", y="total_bill", hue= "time",data=tips);
#Uses an algorithm to prevent overlap
sns.violinplot(x="day", y="total_bill", hue= "time",data=tips);
#Stacking Graphs Is Easy
sns.violinplot(x="day", y="total_bill", data=tips, inner=None)
sns.swarmplot(x="day", y="total_bill", data=tips, color="w", alpha=.5);
13.4. Visualizing Summary Data¶
Barplots will show the
#This
sns.barplot(x="sex", y="tip", data=tips);
tips
total_bill | tip | sex | smoker | day | time | size | |
---|---|---|---|---|---|---|---|
0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
5 | 25.29 | 4.71 | Male | No | Sun | Dinner | 4 |
6 | 8.77 | 2.00 | Male | No | Sun | Dinner | 2 |
7 | 26.88 | 3.12 | Male | No | Sun | Dinner | 4 |
8 | 15.04 | 1.96 | Male | No | Sun | Dinner | 2 |
9 | 14.78 | 3.23 | Male | No | Sun | Dinner | 2 |
10 | 10.27 | 1.71 | Male | No | Sun | Dinner | 2 |
11 | 35.26 | 5.00 | Female | No | Sun | Dinner | 4 |
12 | 15.42 | 1.57 | Male | No | Sun | Dinner | 2 |
13 | 18.43 | 3.00 | Male | No | Sun | Dinner | 4 |
14 | 14.83 | 3.02 | Female | No | Sun | Dinner | 2 |
15 | 21.58 | 3.92 | Male | No | Sun | Dinner | 2 |
16 | 10.33 | 1.67 | Female | No | Sun | Dinner | 3 |
17 | 16.29 | 3.71 | Male | No | Sun | Dinner | 3 |
18 | 16.97 | 3.50 | Female | No | Sun | Dinner | 3 |
19 | 20.65 | 3.35 | Male | No | Sat | Dinner | 3 |
20 | 17.92 | 4.08 | Male | No | Sat | Dinner | 2 |
21 | 20.29 | 2.75 | Female | No | Sat | Dinner | 2 |
22 | 15.77 | 2.23 | Female | No | Sat | Dinner | 2 |
23 | 39.42 | 7.58 | Male | No | Sat | Dinner | 4 |
24 | 19.82 | 3.18 | Male | No | Sat | Dinner | 2 |
25 | 17.81 | 2.34 | Male | No | Sat | Dinner | 4 |
26 | 13.37 | 2.00 | Male | No | Sat | Dinner | 2 |
27 | 12.69 | 2.00 | Male | No | Sat | Dinner | 2 |
28 | 21.70 | 4.30 | Male | No | Sat | Dinner | 2 |
29 | 19.65 | 3.00 | Female | No | Sat | Dinner | 2 |
... | ... | ... | ... | ... | ... | ... | ... |
214 | 28.17 | 6.50 | Female | Yes | Sat | Dinner | 3 |
215 | 12.90 | 1.10 | Female | Yes | Sat | Dinner | 2 |
216 | 28.15 | 3.00 | Male | Yes | Sat | Dinner | 5 |
217 | 11.59 | 1.50 | Male | Yes | Sat | Dinner | 2 |
218 | 7.74 | 1.44 | Male | Yes | Sat | Dinner | 2 |
219 | 30.14 | 3.09 | Female | Yes | Sat | Dinner | 4 |
220 | 12.16 | 2.20 | Male | Yes | Fri | Lunch | 2 |
221 | 13.42 | 3.48 | Female | Yes | Fri | Lunch | 2 |
222 | 8.58 | 1.92 | Male | Yes | Fri | Lunch | 1 |
223 | 15.98 | 3.00 | Female | No | Fri | Lunch | 3 |
224 | 13.42 | 1.58 | Male | Yes | Fri | Lunch | 2 |
225 | 16.27 | 2.50 | Female | Yes | Fri | Lunch | 2 |
226 | 10.09 | 2.00 | Female | Yes | Fri | Lunch | 2 |
227 | 20.45 | 3.00 | Male | No | Sat | Dinner | 4 |
228 | 13.28 | 2.72 | Male | No | Sat | Dinner | 2 |
229 | 22.12 | 2.88 | Female | Yes | Sat | Dinner | 2 |
230 | 24.01 | 2.00 | Male | Yes | Sat | Dinner | 4 |
231 | 15.69 | 3.00 | Male | Yes | Sat | Dinner | 3 |
232 | 11.61 | 3.39 | Male | No | Sat | Dinner | 2 |
233 | 10.77 | 1.47 | Male | No | Sat | Dinner | 2 |
234 | 15.53 | 3.00 | Male | Yes | Sat | Dinner | 2 |
235 | 10.07 | 1.25 | Male | No | Sat | Dinner | 2 |
236 | 12.60 | 1.00 | Male | Yes | Sat | Dinner | 2 |
237 | 32.83 | 1.17 | Male | Yes | Sat | Dinner | 2 |
238 | 35.83 | 4.67 | Female | No | Sat | Dinner | 3 |
239 | 29.03 | 5.92 | Male | No | Sat | Dinner | 3 |
240 | 27.18 | 2.00 | Female | Yes | Sat | Dinner | 2 |
241 | 22.67 | 2.00 | Male | Yes | Sat | Dinner | 2 |
242 | 17.82 | 1.75 | Male | No | Sat | Dinner | 2 |
243 | 18.78 | 3.00 | Female | No | Thur | Dinner | 2 |
244 rows × 7 columns
tips
total_bill | tip | sex | smoker | day | time | size | |
---|---|---|---|---|---|---|---|
0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |
5 | 25.29 | 4.71 | Male | No | Sun | Dinner | 4 |
6 | 8.77 | 2.00 | Male | No | Sun | Dinner | 2 |
7 | 26.88 | 3.12 | Male | No | Sun | Dinner | 4 |
8 | 15.04 | 1.96 | Male | No | Sun | Dinner | 2 |
9 | 14.78 | 3.23 | Male | No | Sun | Dinner | 2 |
10 | 10.27 | 1.71 | Male | No | Sun | Dinner | 2 |
11 | 35.26 | 5.00 | Female | No | Sun | Dinner | 4 |
12 | 15.42 | 1.57 | Male | No | Sun | Dinner | 2 |
13 | 18.43 | 3.00 | Male | No | Sun | Dinner | 4 |
14 | 14.83 | 3.02 | Female | No | Sun | Dinner | 2 |
15 | 21.58 | 3.92 | Male | No | Sun | Dinner | 2 |
16 | 10.33 | 1.67 | Female | No | Sun | Dinner | 3 |
17 | 16.29 | 3.71 | Male | No | Sun | Dinner | 3 |
18 | 16.97 | 3.50 | Female | No | Sun | Dinner | 3 |
19 | 20.65 | 3.35 | Male | No | Sat | Dinner | 3 |
20 | 17.92 | 4.08 | Male | No | Sat | Dinner | 2 |
21 | 20.29 | 2.75 | Female | No | Sat | Dinner | 2 |
22 | 15.77 | 2.23 | Female | No | Sat | Dinner | 2 |
23 | 39.42 | 7.58 | Male | No | Sat | Dinner | 4 |
24 | 19.82 | 3.18 | Male | No | Sat | Dinner | 2 |
25 | 17.81 | 2.34 | Male | No | Sat | Dinner | 4 |
26 | 13.37 | 2.00 | Male | No | Sat | Dinner | 2 |
27 | 12.69 | 2.00 | Male | No | Sat | Dinner | 2 |
28 | 21.70 | 4.30 | Male | No | Sat | Dinner | 2 |
29 | 19.65 | 3.00 | Female | No | Sat | Dinner | 2 |
... | ... | ... | ... | ... | ... | ... | ... |
214 | 28.17 | 6.50 | Female | Yes | Sat | Dinner | 3 |
215 | 12.90 | 1.10 | Female | Yes | Sat | Dinner | 2 |
216 | 28.15 | 3.00 | Male | Yes | Sat | Dinner | 5 |
217 | 11.59 | 1.50 | Male | Yes | Sat | Dinner | 2 |
218 | 7.74 | 1.44 | Male | Yes | Sat | Dinner | 2 |
219 | 30.14 | 3.09 | Female | Yes | Sat | Dinner | 4 |
220 | 12.16 | 2.20 | Male | Yes | Fri | Lunch | 2 |
221 | 13.42 | 3.48 | Female | Yes | Fri | Lunch | 2 |
222 | 8.58 | 1.92 | Male | Yes | Fri | Lunch | 1 |
223 | 15.98 | 3.00 | Female | No | Fri | Lunch | 3 |
224 | 13.42 | 1.58 | Male | Yes | Fri | Lunch | 2 |
225 | 16.27 | 2.50 | Female | Yes | Fri | Lunch | 2 |
226 | 10.09 | 2.00 | Female | Yes | Fri | Lunch | 2 |
227 | 20.45 | 3.00 | Male | No | Sat | Dinner | 4 |
228 | 13.28 | 2.72 | Male | No | Sat | Dinner | 2 |
229 | 22.12 | 2.88 | Female | Yes | Sat | Dinner | 2 |
230 | 24.01 | 2.00 | Male | Yes | Sat | Dinner | 4 |
231 | 15.69 | 3.00 | Male | Yes | Sat | Dinner | 3 |
232 | 11.61 | 3.39 | Male | No | Sat | Dinner | 2 |
233 | 10.77 | 1.47 | Male | No | Sat | Dinner | 2 |
234 | 15.53 | 3.00 | Male | Yes | Sat | Dinner | 2 |
235 | 10.07 | 1.25 | Male | No | Sat | Dinner | 2 |
236 | 12.60 | 1.00 | Male | Yes | Sat | Dinner | 2 |
237 | 32.83 | 1.17 | Male | Yes | Sat | Dinner | 2 |
238 | 35.83 | 4.67 | Female | No | Sat | Dinner | 3 |
239 | 29.03 | 5.92 | Male | No | Sat | Dinner | 3 |
240 | 27.18 | 2.00 | Female | Yes | Sat | Dinner | 2 |
241 | 22.67 | 2.00 | Male | Yes | Sat | Dinner | 2 |
242 | 17.82 | 1.75 | Male | No | Sat | Dinner | 2 |
243 | 18.78 | 3.00 | Female | No | Thur | Dinner | 2 |
244 rows × 7 columns
#Notice the selection of palette and how we can swap the axis.
sns.barplot(x="tip", y="day", data=tips, palette="Greens_d");
#Notice the selection of palette and how we can swap the axis.
sns.barplot(x="total_bill", y="day", data=tips, palette="Reds_d");
#Saturday is the bigger night
sns.countplot(x="day", data=tips);