Pandas visualisation
To use shellplot with pandas, first set the plotting backend:
>>> import numpy as np
>>> import pandas as pd
>>> pd.set_option("plotting.backend", "shellplot")
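If shellplot may be missing from the environment, the backend switch can be guarded. The following is a minimal sketch, not part of shellplot: the helper name `try_set_backend` is hypothetical, and it relies on pandas rejecting a backend it cannot import at the moment the option is set.

```python
import pandas as pd


def try_set_backend(name: str) -> bool:
    """Try to select a pandas plotting backend; return True on success.

    Hypothetical helper for illustration, not part of pandas or shellplot.
    """
    try:
        # pandas validates the backend when the option is set and raises
        # if the backend module cannot be found.
        pd.set_option("plotting.backend", name)
    except (ImportError, ValueError):
        return False
    return True


if __name__ == "__main__":
    if not try_set_backend("shellplot"):
        print("shellplot not available; keeping", pd.get_option("plotting.backend"))
```

When the call fails, the previously selected backend stays in place, so later `plot` calls behave as before.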
Basic plotting
The plot method is a wrapper around the shellplot.plot function. For example, we can plot a series:
>>> ts = pd.Series(np.random.randn(500), index=pd.date_range("1/1/2000", periods=500))
>>> ts = ts.cumsum()
>>> ts.plot()
31┤ +
| ++
| +
| ++ ++
| + +++ +
| + +++ +++
24┤ + ++ ++++++++
| +++ +++ ++
| ++++ ++ +++++
| ++ ++ ++++
| +++ + +
| + + + +
17┤ ++++ +++ ++
| +++++ + +++
| +++ ++ ++ +
| + ++++++ + +
| ++ ++ + +
| + +++++ ++ +
10┤ ++ + ++ ++ +
| ++ ++ + ++
| +++++ ++++ + ++
| + + ++ + ++ ++++ ++ +
| + ++ +++++ +++ + ++ ++ +
| +++ +++++ ++ ++ ++ +
3┤ ++++++++++ + ++++ + +
| +++ + ++++ + ++ +
|+++ ++ ++ + ++ +
|++++++ ++ +
| + + +++
| ++
-4┤
└┬---------------┬---------------┬---------------┬--------------┬---------------┬-
2000-01-01 2000-04-09 2000-07-17 2000-10-24 2001-01-31 2001-05-10
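The series above is random, so the plot changes on every run. A minimal sketch of a reproducible variant follows. It assumes NumPy's `default_rng` generator with an arbitrary seed of 0 in place of `np.random.randn`; neither choice comes from the original example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (arbitrary choice)
steps = rng.standard_normal(500)
index = pd.date_range("2000-01-01", periods=500)  # daily frequency by default

# The cumulative sum turns independent steps into a random walk.
ts = pd.Series(steps, index=index).cumsum()

print(ts.index[0].date(), ts.index[-1].date())
```

With the shellplot backend active, calling `ts.plot()` on this series draws the same walk each time.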
Calling plot on a dataframe creates a scatter plot of every column against the index, with one marker per column:
>>> df = pd.DataFrame(np.random.randn(500, 2), index=ts.index, columns=list("AB"))
>>> df = df.cumsum()
>>> df.plot()
| +
27┤ +++
| ++ ++ +
| ++ +++++ ++++ +
| ++ ++ ++ ++++++ +
| + +++++++ ++ ++
| +++++++++ ++
18┤ + ++ ++ + + + +
| + + + +++ + ++ +
| + +++ ++ ++ ++++ +
| +++ + +++++ ++ + ++ + ++
| +** + ++ +++ +++++ ++ +
| + ***** + +++++ ++++ ++
9┤ ++ + ++******* +++ ++ +++
| +++++ +** +*+* *** +++++
| ** ++ **+** +*** + **** +
|*** +*********** **** * ++
|**+**** ** ***** *** **
|*++**** +* * * * ** **
0┤++ +** ** ** * ** * * * **
| ** * * ** * *** ***
| ***** ** * **** *****
| **** ** * ** *
| * * ** * **
-9┤ ** ** ** ***
| * *** ** * * *
| *** ** **** **
| ** * ***** ** * * **
| ** ************
| * ** **** ** + A
-18┤ * * B
└┬---------------┬---------------┬---------------┬--------------┬---------------┬-
2000-01-01 2000-04-09 2000-07-17 2000-10-24 2001-01-31 2001-05-10
The keywords x and y select which columns to plot against each other:
>>> df["A"] = np.arange(len(df))
>>> df.plot(x="A", y="B")
B
| +
|
| ++
| ++
5┤ ++
| + + +
| + ++++ +
|+++++ ++++ +
|+++++ +++++
|++ +++++ ++
| + + + ++ ++
-1┤ + +++++
| ++ + ++ ++
| ++++ ++ + +
| ++++++ ++ +
| +++++ ++ ++
| + ++++ ++ +++
-7┤ +++ +++ ++
| ++ + + ++ +
| ++++ ++ + + + +
| +++ ++ + ++ + + +
| + ++ + + + + + + ++ + +
| + ++ + ++ + + + + ++ + ++
| ++++ ++ +++ + + ++ +++++ ++ ++
-13┤ +++ + + ++ ++++ +++++ +++ +++
| ++++ ++ +++ + +++++ + + +++
| ++++++ ++ + ++ + +++++
| + + + ++++++ + +
| + ++++ + +
| + +
-19┤ +
└┬---------------┬---------------┬---------------┬---------------┬---------------┬
0 100 200 300 400 500
A
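When adding a helper column such as a running row number, note that pandas aligns an assigned Series on index labels rather than on position. The sketch below uses made-up data and hypothetical column names (`by_position`, `by_label`) to show the difference on a date-indexed frame:

```python
import numpy as np
import pandas as pd

index = pd.date_range("2000-01-01", periods=5)
df = pd.DataFrame({"B": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=index)

# An array has no index, so it is assigned by position.
df["by_position"] = np.arange(len(df))

# A Series with the default 0..n-1 index shares no labels with the
# date index, so every value is lost during alignment.
df["by_label"] = pd.Series(range(len(df)))

print(df)
```

Only the positional column is usable as an x axis; the label-aligned one contains missing values throughout.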
Bar plots
Horizontal bar plots can be created with plot.barh:
>>> df = pd.DataFrame(np.random.randn(500, 4), columns=list("ABCD"))
>>> df.iloc[5].abs().plot.barh(figsize=(60, 17))
|---------------------------------------------------
| |
D┤ |
| |
|-----------------------------------------------------------
| |
C┤ |
| |
|-----------------------------------------------------------
| |
B┤ |
| |
|-------------------------
| |
A┤ |
| |
|-------------------------
└┬----------┬-----------┬----------┬-----------┬----------┬--
0.0 0.3 0.6 0.9 1.2 1.5
5
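The expression `df.iloc[5].abs()` selects the row at position 5 as a Series indexed by column name and takes absolute values, so each bar corresponds to one column. A small sketch of that intermediate value, using a seeded generator (an assumption; the example above uses unseeded `np.random.randn`):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily
df = pd.DataFrame(rng.standard_normal((500, 4)), columns=list("ABCD"))

row = df.iloc[5].abs()  # Series with index A..D, named after the row label

print(row.name)         # the row label
print(list(row.index))  # one entry per bar
print(row)
```

The `5` printed below the axis in the plot above is presumably this Series name.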
Histograms
Histograms can be created with plot.hist:
>>> df = pd.DataFrame(np.random.randn(10000, 1), columns=list("A"))
>>> df["A"].plot.hist(bins=10)
counts
2850┤ -------
| | |
| -------| |
| | | |
| | | |
| | | |
2280┤ | | |
| | | |
| | | |
| | | |
| | | |
| | | |
1710┤ | | |-------
| | | | |
| | | | |
| -------| | | |
| | | | | |
| | | | | |
1140┤ | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |-------
570┤ | | | | | |
| -------| | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| -------| | | | | | |-------
0┤ -------| | | | | | | | |-------
└┬-------------------┬-------------------┬-------------------┬-------------------┬
-4 -2 0 2 4
A
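To check the bar heights numerically, the bin counts can be computed directly. This sketch assumes that `bins=10` means ten equal-width bins over the data range, as in `np.histogram`; whether shellplot uses exactly the same bin edges is not confirmed here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily
df = pd.DataFrame(rng.standard_normal((10000, 1)), columns=["A"])

# counts has one entry per bin, edges has one more entry than counts.
counts, edges = np.histogram(df["A"], bins=10)

for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.2f} to {right:6.2f}: {n:5d}")
```

The counts always add up to the number of rows, which gives a quick consistency check against the y axis of the plot.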
Box plots
Box plots can be created with plot.box:
>>> df = pd.DataFrame(np.random.rand(10, 4), columns=list("ABCD"))
>>> df.plot.box(figsize=(80, 27))
|
| --------------------------------------------------
| | | | | |
D┤ |-----| | |---|
| | | | | |
| --------------------------------------------------
|
|
| ----------------
| | | | | |
C┤ |------------| | |-------------------------------|
| | | | | |
| ----------------
|
| ---------------------------------------
| | | | | |
B┤ |--------------| | |---------------|
| | | | | |
| ---------------------------------------
|
|
| -------------------------------------------
| | | | | |
A┤ |-----------| | |----------|
| | | | | |
| -------------------------------------------
|
└┬---------------┬---------------┬--------------┬---------------┬---------------┬
0.0 0.2 0.4 0.6 0.8 1.0
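A box plot summarises each column by its quartiles. The sketch below prints those numbers with pandas, assuming the conventional box layout (first quartile, median, third quartile); how shellplot places its whiskers is not specified in this section, so the minimum and maximum are shown only for orientation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily
df = pd.DataFrame(rng.random((10, 4)), columns=list("ABCD"))

# One row per quantile, one column per box.
quartiles = df.quantile([0.25, 0.5, 0.75])
extremes = df.agg(["min", "max"])

print(quartiles)
print(extremes)
```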
Scatter plots
Scatter plots can be created with plot.scatter; the color keyword names a column used to group the points:
>>> df = pd.DataFrame(np.random.rand(50, 2), columns=["a", "b"])
>>> df["c"] = df["a"] > 0.5
>>> df.plot.scatter(x="a", y="b", color="c")
b
1.0┤ *
| * *
| * *
| *
| *
| *
0.8┤ *
|
| * *
| * *
| * * * * *
| * *
0.6┤ ** *
| *
| * *
| * * +
| + + +
|
0.4┤ +
| +
| +
| +
| +
| +
0.2┤ + + +
| + + +
|
| + + +
| +
| + + + False
0.0┤ * True
└┬---------------┬---------------┬---------------┬---------------┬---------------┬
0.0 0.2 0.4 0.6 0.8 1.0
a
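The `color` column splits the points into groups, and the legend lists one marker per group. A sketch of how to inspect those groups with pandas before plotting (the seeded generator is an assumption, not part of the example above):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily
df = pd.DataFrame(rng.random((50, 2)), columns=["a", "b"])
df["c"] = df["a"] > 0.5  # boolean group label

# Number of points that fall into each legend entry.
group_sizes = df.groupby("c").size()

print(group_sizes)
```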