%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
Created by Dr. John D. Hunter, Matplotlib is one of the main scientific Python tools used for data visualization. My goal is to develop an interactive guide with succinct explanations and copy-paste-ready code for future reference. It covers bare-bones methods for building simple graphs, and function descriptions include only the relevant parameters.
n = 10
Xs = np.tile(np.arange(n), (n, 1)).T  # shape (n, n); every column is 0..n-1
Ys = np.random.random((n,n))
xs = Xs[:,0]; ys = Ys[:,0]
In Matplotlib, the Figure is the "picture frame" that holds one or more Axes, which, unintuitively, are the individual plots. When dealing with multiple figures, plt.get_fignums() returns a list of the numbers of all open figures, which can then be used to retrieve a figure via plt.figure(i). To close figures, call plt.close(), where the input can be a figure number, a Figure object, or the string "all".
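The figure bookkeeping described above can be sketched as follows. The variable names are placeholders chosen for this illustration, and the snippet assumes it is acceptable to close every open figure at the end.

```python
import matplotlib.pyplot as plt

fig_a = plt.figure()          # each call opens a new figure with the next free number
fig_b = plt.figure()
open_ids = plt.get_fignums()  # list of the numbers of all open figures

same_fig = plt.figure(fig_a.number)  # an existing number re-activates that figure
plt.close(fig_a.number)              # close a single figure by its number
ids_after_close = plt.get_fignums()

plt.close("all")                     # close everything that is still open
ids_at_end = plt.get_fignums()
```

Passing an existing number to plt.figure() does not create a new figure; it hands back the one already registered under that number and makes it current.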
By default, pyplot functions act on the current figure, i.e. the one most recently created or used (e.g. the last figure that plt.plot() drew on). To display a 2D array as an image, one can call ax.imshow(), which takes the data as its X parameter. There are a lot of parameters (see docs). ax.matshow() is a wrapper around it.
Side note: plt.plot() is a wrapper that plots on the current Axes object (retrieved via plt.gca(), see docs).
# matplotlib/pyplot.py
>>> def plot(*args, **kwargs):
...     """An abridged version of plt.plot()."""
...     ax = plt.gca()
...     return ax.plot(*args, **kwargs)
>>> def gca(**kwargs):
...     """Get the current Axes of the current Figure."""
...     return plt.gcf().gca(**kwargs)
Related functions: subplots_adjust() (docs), subplot(), add_subplot()
"""
If nrows=ncols=1, then axes is a single Axes; otherwise it is an np.ndarray of Axes.
With the default squeeze=True, a single row or column gives a 1D array, while a
larger grid gives an array of (nrows,ncols) dimension. If desired, call
axes.flatten() to convert into a 1D array for simpler indexing.
"""
nrows,ncols=1,2
fig, axes = plt.subplots(
    nrows=nrows,
    ncols=ncols,
    sharex=True, # bool or {"none", "all", "row", "col"}. If shared along a column, only the bottom subplot shows x tick labels
    sharey=True,
    figsize=(5*ncols,3*nrows) # width x height
)
axes[0].plot(xs,ys)
axes[1].plot(xs,ys)
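For a grid with more than one row and column, the returned array is 2D. A minimal sketch of the flatten() approach mentioned in the note above, using made-up data and placeholder names (fig_grid, axes_grid, demo_x) so it does not overwrite the notebook's own variables:

```python
import numpy as np
import matplotlib.pyplot as plt

demo_x = np.arange(10)

# A 2x2 grid comes back as a (2, 2) array of Axes
fig_grid, axes_grid = plt.subplots(nrows=2, ncols=2, figsize=(6, 4))

# Flatten to 1D so a single loop index reaches every panel
flat_axes = axes_grid.flatten()
for k, panel in enumerate(flat_axes):
    panel.plot(demo_x, demo_x ** k)
    panel.set_title(f"panel {k}")

fig_grid.tight_layout()
```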
# TODO: Use subplot(id) and add_subplot
# method 2
gridsize=(3,2)
fig = plt.figure(figsize=(6,4))
ax1 = plt.subplot2grid(gridsize, (0,0), colspan=2, rowspan=2)
ax2 = plt.subplot2grid(gridsize, (2,0))
ax3 = plt.subplot2grid(gridsize, (2,1))
plt.tight_layout()
ax1.plot(xs,ys); ax2.plot(xs,ys); ax3.plot(xs,ys)
# Method 2 above is equivalent to the following using GridSpecs
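A possible GridSpec version of the same 3x2 layout is sketched below. It is one way to express the layout, not necessarily the code the notebook originally had in mind; the names fig_gs, gs, ax_top, ax_bl and ax_br and the demo data are placeholders for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

demo_x = np.arange(10)
demo_y = np.random.random(10)

fig_gs = plt.figure(figsize=(6, 4))
gs = GridSpec(nrows=3, ncols=2, figure=fig_gs)

# Slicing the GridSpec plays the role of rowspan/colspan in subplot2grid
ax_top = fig_gs.add_subplot(gs[0:2, :])  # rows 0-1, both columns
ax_bl = fig_gs.add_subplot(gs[2, 0])     # bottom-left cell
ax_br = fig_gs.add_subplot(gs[2, 1])     # bottom-right cell

for panel in (ax_top, ax_bl, ax_br):
    panel.plot(demo_x, demo_y)
fig_gs.tight_layout()
```

The slice syntax makes spans explicit: gs[0:2, :] covers the same cells as subplot2grid(gridsize, (0,0), colspan=2, rowspan=2).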
The canonical way to plot is via ax.plot() (or plt.plot()). Each Axes object includes basic configuration methods such as set_title(), set_xlabel() (and set_ylabel() for the y-axis), set_xlim(xmin, xmax), legend(loc=0) (legend locations are detailed in the docs), and set_yscale(..., base=) (the keyword was named basey= in older Matplotlib releases). It may be more concise to call ax.set(...) instead (see the docs for the full list of properties).
When dealing with multiple Axes objects in one figure, call fig.tight_layout() to clean up whitespace padding.
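As a rough illustration of the two configuration styles, the sketch below sets the same properties once with individual setters and once with a single ax.set(...) call. The data and the names fig_cfg, ax_a and ax_b are invented for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

demo_x = np.arange(1, 11)
demo_y = demo_x ** 2

fig_cfg, (ax_a, ax_b) = plt.subplots(ncols=2, figsize=(8, 3))

# Style 1: one setter per property
ax_a.plot(demo_x, demo_y, label="x squared")
ax_a.set_title("Setters")
ax_a.set_xlabel("x")
ax_a.set_ylabel("y")
ax_a.set_xlim(0, 11)
ax_a.set_yscale("log")
ax_a.legend(loc=0)

# Style 2: the same properties in a single ax.set(...) call
ax_b.plot(demo_x, demo_y, label="x squared")
ax_b.set(title="ax.set", xlabel="x", ylabel="y", xlim=(0, 11), yscale="log")
ax_b.legend(loc=0)

fig_cfg.tight_layout()  # tidy the padding between the two Axes
```

Each keyword passed to ax.set() maps onto the setter of the same name, so xlim=(0, 11) is equivalent to set_xlim(0, 11).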
Next we explore how to make graphs pretty. Styling is passed as keyword arguments to ax.plot() (more parameter options, such as markeredgecolor, are in the docs):
ax.plot(x, y, color=, marker=, linestyle=, linewidth=, markersize=, label=, ...)
Basic markers include "." (point), "o" (circle), "^", ">", "<" (triangles), "d" (thin diamond), and "X" (filled x).
More in docs. Markers can also be mathtext strings written in LaTeX syntax, e.g. r'$\clubsuit$' or '$\Phi$'.
Basic linestyles are in docs. Users can define their own configuration with the template format (offset, (on_off_seq)*), where the * means we can define consecutive on-off sequences (on being the length of a dash, off being the length of a space). For example:
- (0,()) is a solid line, since we have no space
- (0,(1,1)) is a densely dotted line
- (0,(3,10,1,10,1,10)) is a dash followed by two dots (and repeated), very spaced out
There are quite a few Axes configurations in the docs. Some pertinent ones include:
- ax.yaxis.tick_right(). Moves the ticks from the left to the right side.
- ax.set_axis_off(). Turns off the axis.
In terms of fonts for labels, one can pass the additional parameters either by defining them explicitly as keyword arguments or by passing a dictionary of matplotlib.text.Text properties:
ax.set_title('My Title', fontdict={'fontsize': 8, 'fontweight': 'medium', 'fontfamily': 'fantasy', ...})
We can add grid lines to the plot (the parameters are similar to those of ax.plot()):
ax.grid(color=...,linestyle=...,)
# Create 2 plots. We will show two ways to pass settings
nplts = 5
fig,(ax,ax2) = plt.subplots(
nrows=1,
ncols=2,
sharey=True,
figsize=(2*7,5))
# Configure figure text
fig.suptitle("Global Title")
ax.set(title="Title",xlabel="x-axis",ylabel="y-axis",
xlim=(np.min(Xs[:,:nplts])-1,np.max(Xs[:,:nplts])+1),
ylim=(np.min(Ys)-0.25,np.max(Ys)+0.25))
# Attempts to set y-lim are futile since we have sharey=True for ax2
ax2.set_title("Title 2", fontsize=25)
ax2.set_xlabel("x2-axis",
fontdict={'fontsize': 15, 'fontweight': 'medium', 'fontfamily': 'fantasy'})
ax.grid(color="gray",linestyle=(0,(2,5)))
# Axes settings
colors=["red","orange","green","blue","purple"]
mkrs=[".","o","s",r'$\clubsuit$','$\Phi$']
lss=["solid","dashed","dotted","dashdot",(1,(3,5,1,5,1,5))]
mkrszs=1.5*np.arange(4,9)
labels=["Plot {}".format(i) for i in range(1,6)]
for i in range(nplts):
    ax.plot(Xs[:,i],Ys[:,i],color=colors[i],marker=mkrs[i],
            linestyle=lss[i],label=labels[i],markersize=mkrszs[i])
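# can also pass a dictionary. Here, we define a series of dictionaries,
# one per column, counting back from the last column; each label is the column index
plt2_kwargs = [
    {'color':colors[i], 'marker':mkrs[i], 'label':str(Xs.shape[1]-1-i), 'linestyle':lss[i], 'markersize':mkrszs[i]}
    for i in range(nplts)]
for i in range(nplts):
    # -1-i avoids the i=0 case, where -i would select the first column instead of the last
    ax2.plot(Xs[:,-1-i],Ys[:,-1-i],**plt2_kwargs[i])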
# Do this after plot to clean things up and add legend
ax.legend(loc=0)
ax2.legend(loc=0)
plt.tight_layout()
TODO: yaxis, xaxis configurations, ticks
To overlay a second y-axis on the same plot, simply call new_ax = ax.twinx(). The two Axes objects will share the same x-axis (the new one being scaled to match the old one) with independent y-axes.
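A minimal two-axis sketch of this, with invented data and placeholder names (fig_tw, ax_left, ax_right), might look like:

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 10, 50)

fig_tw, ax_left = plt.subplots(figsize=(5, 3))
ax_right = ax_left.twinx()  # new Axes: shared x-axis, independent y-axis on the right

# Two series on very different scales, each against its own y-axis
ax_left.plot(t, np.sin(t), color="tab:blue")
ax_right.plot(t, 100 * np.exp(t / 5), color="tab:red")

ax_left.set(xlabel="t", ylabel="sin(t)")
ax_right.set_ylabel("100 exp(t / 5)")
fig_tw.tight_layout()
```

Because the x-axis is shared, changing the x-limits on either Axes updates both, while each y-axis autoscales to its own data.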
When using $3+$ twins, we want to move the extra y-axes (spines) so that they do not overlap. There are two ways to do this. One, described in this tutorial, relies on more esoteric Matplotlib toolkits. The other requires only core Matplotlib. It's a bit more verbose, but since it doesn't require extra packages, I'll include the code here, based on this code.
In the first plot, we illustrate what happens if we do not separate multiple twins.
fig, host = plt.subplots()
# fig.subplots_adjust(right=0.75)
par1 = host.twinx(); par2 = host.twinx()
# par1, par2 share the same y-axis on the right side
p1, = host.plot([0, 1, 2], [0, 1, 2], "b", label="Density")
p2, = par1.plot([0, 1, 2], [0, 3, 2], "r", label="Temperature")
p3, = par2.plot([0, 1, 2], [50, 30, 15], "g", label="Velocity")
host.set(xlim=(0,2),ylim=(0,2),xlabel="Distance",ylabel="Density")
par1.set(ylim=(0,4),ylabel="Temperature")
par2.set(ylim=(1,65),ylabel="Velocity")
host.yaxis.label.set_color(p1.get_color())
par1.yaxis.label.set_color(p2.get_color())
par2.yaxis.label.set_color(p3.get_color())
tkw = dict(size=4, width=1.5)
host.tick_params(axis='y', colors=p1.get_color(), **tkw)
par1.tick_params(axis='y', colors=p2.get_color(), **tkw)
par2.tick_params(axis='y', colors=p3.get_color(), **tkw)
host.tick_params(axis='x', **tkw)
lines = [p1, p2, p3]
host.legend(lines, [l.get_label() for l in lines])
# Offset par2's right spine outward so it no longer overlaps par1's, then redisplay the figure
par2.spines["right"].set_position(("axes", 1.15))
fig
xs = np.linspace(-10,10,100)
sig = 1/(1 + np.exp(-xs))
_,ax = plt.subplots()
# horizontal line
ax.axhline(y=0, color="black", linestyle="--")
ax.axhline(y=0.5, color="black", linestyle=":")
ax.axhline(y=1, color="black", linestyle="--")
# vertical line
ax.axvline(color="gray") # default x=0
ax.plot(xs,sig,label=r"$\sigma(t) = \frac{1}{1 + e^{-t}}$")
ax.yaxis.tick_right()
ax.yaxis.set_label_position("right")
ax.set(xlim=(-10,10),xlabel="t",ylabel="cdf",title="Logistic distribution CDF")
ax.legend(fontsize=14,loc=0)
stackplot
rng = np.arange(50) # range
rnd = np.random.randint(0,10,size=(3, rng.size)) # random
yrs = 1950 + rng
fig, ax = plt.subplots(figsize=(5,3))
ax.stackplot(yrs, rng + rnd, labels=['USA', 'Canada', 'Mexico'])
ax.set_title('Combined debt over time')
ax.legend(loc='upper left')
ax.set_ylabel('Total debt')
ax.set_xlim(xmin=yrs[0], xmax=yrs[-1])
fig.tight_layout()
scatter
# TODO: include "cmap" in plot/imshow
# Demo 2: Scatter with Color
n = 1000
xs = np.random.randint(low=1,high=100,size=n)
ys = np.random.randn(n)
zs = np.exp(10*np.random.rand(n)+2)
ys -= ys.min()
fig, ax3 = plt.subplots()
sctr = ax3.scatter(x=xs,y=ys, c=zs,cmap='RdYlGn')
plt.colorbar(sctr, ax=ax3, format='$%d')
hist
x = np.random.randint(low=1, high=11, size=50)
y = x + np.random.randint(1,5,size=x.size)
data = np.column_stack((x,y))
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8,4))
axes[0].scatter(x=x, y=y, marker='o', c='r', edgecolor='b')
axes[0].set_title('Scatter: $x$ vs. $y$')
axes[0].set_ylabel('$y$')
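# integer-aligned bin edges; the +2 keeps the largest value inside the last bin instead of dropping it
axes[1].hist(data, bins=np.arange(data.min(), data.max() + 2), label=('x','y'))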
axes[1].legend(loc=(0.65,0.8))
axes[1].set_title('Frequencies of $x$ and $y$')
axes[1].yaxis.tick_right()
text
# Demo 4: Include Text
axes[1].text(x=0.55, y=0.8,
             s="hi there",
             horizontalalignment='center',
             transform=axes[1].transAxes,  # interpret x, y as fractions of axes[1], the Axes being annotated
             bbox=dict(facecolor='white', alpha=0.6),
             fontsize=12.5
             )
imshow
Produces a raster graphic, essentially a color-coded grid of values. A simpler wrapper function, matshow, is also available (docs).
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable
x = np.diag(np.arange(2,12))[::-1]
x[np.diag_indices_from(x)] = np.arange(2,12) # build criss-cross diagonal
x2 = np.arange(x.size).reshape(x.shape)
sides = ('left', 'right', 'top', 'bottom')
nolabels = {s : False for s in sides} # creates dictionary
nolabels.update({'label%s' % s : False for s in sides}) # add extra items with label+{}
with plt.rc_context(rc={'axes.grid': False}):
    fig, (ax1, ax2) = plt.subplots(1,2,figsize=(8,4))
    ax1.matshow(x)
    img2 = ax2.matshow(x2, cmap='RdYlGn_r')
    for ax in (ax1,ax2):
        ax.tick_params(axis='both', which='both', **nolabels)
    for i,j in zip(*x.nonzero()):
        ax1.text(j,i,x[i,j], color='w', ha='center', va='center')
    divider = make_axes_locatable(ax2)
    cax = divider.append_axes("right", size='5%', pad=0)
    plt.colorbar(img2, cax=cax, ax=[ax1,ax2])
    fig.suptitle('Heatmaps with Axes.matshow', fontsize=16)
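For comparison, a bare imshow call with an explicit colormap could look like the sketch below. The array and the names grid, fig_im, ax_im, img and cbar are made up for this illustration, and the keyword values are written out only to show settings that, to my understanding, matshow chooses by default.

```python
import numpy as np
import matplotlib.pyplot as plt

grid = np.arange(36).reshape(6, 6)  # toy 6x6 array standing in for image data

fig_im, ax_im = plt.subplots(figsize=(4, 4))
# cmap picks the color scale; origin and interpolation are spelled out explicitly
img = ax_im.imshow(grid, cmap="viridis", origin="upper", interpolation="nearest")
cbar = fig_im.colorbar(img, ax=ax_im)  # colorbar linked to the image's color scale
ax_im.set_title("imshow with a colormap")
```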