Learn how to use the Bokeh library to make beautiful interactive plots, and also save them to PDF.
We need to begin by setting up the environment, so start by creating a virtual environment:
$ python -m venv .venv
$ source .venv/bin/activate
Make a requirements.txt and place it in the root. Add bokeh to that file, and then run pip install -r requirements.txt
Let’s start by making a super simple line chart. This will show off some of the basic concepts of Bokeh.
We can start by importing the fundamentals
from bokeh.plotting import figure, show
And then we create some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]
It’s important that these lists are the same length. The next step is to create the figure.
p = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2)
Notice how we first create a figure, and then add a line to it. We also get a lot of options to customise the figure and line - we can add titles, labels and legends. Finally, we want to see this plot so we can use
show(p)
Bokeh is intended for the web, so when we run this it will open the chart on a page in your default web browser. Here’s the final chart.
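If you would rather write the chart to a file than have a browser tab open on every run, a minimal sketch using Bokeh's output_file and save functions could look like the following. The filename line_example.html is an arbitrary choice for illustration, not something from the example above.

```python
from bokeh.plotting import figure, output_file, save

x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

p = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2)

# Tell Bokeh where the HTML should go (the filename is an assumption)
output_file("line_example.html", title="Simple line example")

# save() writes the HTML file without opening a browser, unlike show()
saved_path = save(p)
print(saved_path)
```

The resulting HTML file can then be opened in any browser, and it keeps the same interactive tools as the version opened by show.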
Let’s add some more data - this is really easy to do! All we have to do is create more lines. Let’s start with some more data
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 2, 1]
y1 = [2, 3, 4, 5, 6]
y2 = [5, 4, 3, 2, 1]
And then we make more calls to line
p = figure(title="Multiple lines", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2, color="blue")
p.line(x, y1, legend_label="More Values", line_width=2, color="red")
p.line(x, y2, legend_label="Even More Values", line_width=2, color="purple")
And this is the chart it produces
So now we can see the basic workflow of Bokeh
We can customise each step of this flow quite extensively, as we will see in this article
Bokeh supports many different kinds of “glyphs” - that’s basically what Bokeh calls different items that can be displayed on the figure. Let’s explore some of these options.
p.vbar(x=x, top=y, legend_label="Bar", color="blue", width=0.5, bottom=0)
p.scatter(x, y1, legend_label="Scatter Crosses", color="red", size=16, marker="x")
p.scatter(x, y2, legend_label="Scatter Circles", color="purple", size=16)
Here we use vbar and scatter. scatter works a lot like line, but we can customise the size of the marker and the marker type. Circles and crosses are the most common, but there are others too. vbar is a little more involved - we need to set the x and top as named arguments. You can customise the width of the bars, as well as where the bottom starts (usually, you just want this to be 0).
And then when we display it we get the following chart
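To make the vbar arguments more concrete, here is a small sketch in plain Python (no Bokeh involved) of the rectangle each bar covers. It assumes the x, width, bottom and top values from the example above, and relies on each bar being centred on its x value.

```python
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 2, 1]
width = 0.5
bottom = 0

# Each bar is centred on its x value and spans half the width either side
bars = [
    {"left": xi - width / 2, "right": xi + width / 2, "bottom": bottom, "top": yi}
    for xi, yi in zip(x, y)
]

for bar in bars:
    print(bar)
```

So with width=0.5 the bar at x=1 runs from 0.75 to 1.25, which is why the bars leave gaps between integer positions.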
A really useful feature in Bokeh is annotations - they let you mark certain areas of the plot. To give a real life example of this, we’re going to plot some random data, as well as display the standard deviation of that data. Let’s start by generating those numbers and standard deviations
import random
import statistics

N = 30  # Number of random numbers to generate

x = list(range(N))
random_numbers = [random.random() for _ in range(N)]
mean = statistics.mean(random_numbers)
std_dev = statistics.stdev(random_numbers)  # Sample standard deviation, used for the annotation bounds
Now we’ll create the basic line graph - it’s the same as we did before so this should be familiar.
p = figure(title="Standard Deviation Example", x_axis_label="x", y_axis_label="y")
p.line(x, random_numbers, line_width=2, color="black")
In order to add annotations to this plot, we need to import the following
from bokeh.models import BoxAnnotation
Then we can create the annotations. Since we want to annotate the areas within and outside the standard deviation of the data, we want the “inside” region to be between the mean plus and minus the standard deviation. The middle box has two bounds - the top and bottom. The high and low box have no bottom and top bound respectively, meaning they extend to the end of the plot
low = mean - std_dev
high = mean + std_dev

low_box = BoxAnnotation(top=low, fill_alpha=0.2, fill_color="red")
mid_box = BoxAnnotation(bottom=low, top=high, fill_alpha=0.2, fill_color="green")
high_box = BoxAnnotation(bottom=high, fill_alpha=0.2, fill_color="red")

p.add_layout(low_box)
p.add_layout(mid_box)
p.add_layout(high_box)
We then simply add the three boxes to the plot with add_layout - and we get the following plot
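As a quick sanity check on the shaded regions, the sketch below counts how many values fall below, within and above one standard deviation of the mean. It uses only the standard library. The fixed seed and the choice of statistics.stdev (the sample standard deviation, rather than statistics.pstdev) are assumptions made for illustration.

```python
import random
import statistics

random.seed(42)  # Fixed seed (an arbitrary choice) so the run is repeatable

N = 30
random_numbers = [random.random() for _ in range(N)]

mean = statistics.mean(random_numbers)
std_dev = statistics.stdev(random_numbers)  # Sample standard deviation
low = mean - std_dev
high = mean + std_dev

# Count how many values land in each of the three shaded regions
counts = {"below": 0, "within": 0, "above": 0}
for value in random_numbers:
    if value < low:
        counts["below"] += 1
    elif value > high:
        counts["above"] += 1
    else:
        counts["within"] += 1

print(counts)
```

The three counts always add up to N, and the "within" count corresponds to the points inside the green box.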
An interesting plotting project we can use to show off some of Bokeh’s potential is plotting K-means. For this we need a few more dependencies, so add the following to the requirements.txt (and make sure you run pip install -r requirements.txt)
numpy
scikit-learn
Since this isn’t a K-means tutorial, we’ll skip over the details - but if you don’t know what K-means does, the basic idea is to group data into K groups. Here’s the code we’ll use for this
import numpy as np
from sklearn.cluster import KMeans

N = 3  # Number of groups (the K in K-means)

data = np.vstack(
    [
        np.random.normal(loc=(0, 0), scale=1.0, size=(100, 2)),
        np.random.normal(loc=(5, 5), scale=1.0, size=(100, 2)),
        np.random.normal(loc=(0, 5), scale=1.0, size=(100, 2)),
    ]
)

kmeans = KMeans(n_clusters=N)
pred = kmeans.fit_predict(data)
Now that we have the groups, we need to arrange the data in a more convenient way for plotting.
plotting_data = {}

for i in range(N):
    plotting_data[i] = []

for point, group in zip(data, pred):
    plotting_data[group].append(point.tolist())
This is nice and generic, so if we increase N (the number of groups) in future it still works - we are essentially making a dictionary that maps each group to a list of the coordinates in that group.
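The same grouping idea can be sketched with collections.defaultdict, which removes the need to create the empty lists up front. The points and labels below are hypothetical stand-ins for the K-means output, chosen only so the snippet runs on its own.

```python
from collections import defaultdict

# Hypothetical stand-ins for the K-means output: points and their group labels
points = [(0.1, 0.2), (5.1, 4.9), (0.3, 5.2), (4.8, 5.3), (-0.2, 0.4)]
labels = [0, 1, 2, 1, 0]

# defaultdict creates the empty list the first time a group is seen
grouped = defaultdict(list)
for point, group in zip(points, labels):
    grouped[group].append(list(point))

print(dict(grouped))
```

One difference to be aware of: a group that receives no points never appears as a key here, whereas the explicit loop over range(N) always creates an entry for every group.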
We can make the basic plot again with
p = figure(title="K-means", x_axis_label="x", y_axis_label="y")
The next thing to do is sort out the colours. For this kind of plot the best colour scheme to use would be viridis. In order to create the viridis colours we can do the following
from bokeh.palettes import Viridis256
colors = Viridis256[::len(Viridis256) // N]
This gives us a list of colours, which we can index by group number. Fortunately, scikit-learn numbers the groups from 0 to N-1, which is exactly the same format as the colours we just generated! Therefore, we can plot with the following
for k in plotting_data:
    v = plotting_data[k]

    x = [row[0] for row in v]
    y = [row[1] for row in v]

    p.scatter(x, y, legend_label="Group: {}".format(k), size=8, color=colors[k])

show(p)
And this generates the following graph
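It is worth seeing what the palette slice used for the colours actually returns. The sketch below reproduces the slicing on a stand-in list of 256 placeholder names (not real colours, and not Bokeh's palette), assuming N is 3 as in the three-cluster example.

```python
N = 3  # Number of groups

# Stand-in for a 256-entry palette such as Viridis256 (placeholder names only)
palette = [f"colour_{i}" for i in range(256)]

step = len(palette) // N
colors = palette[::step]

print(len(colors))  # Can be larger than N when 256 is not a multiple of N
print(colors[:N])   # The first N entries are the ones the plot indexes
```

With N = 3 the slice yields four entries, so the last one is simply never used. This is harmless, since the groups are numbered 0 to N-1. If you want exactly N colours, the viridis function in bokeh.palettes should be a simpler option.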
Bokeh doesn’t have a built-in way to save to PDF. However, we can export to an SVG and then convert that into a PDF plot. We need a few other dependencies to do this, so add the following to the requirements.txt (and make sure you run pip install -r requirements.txt)
svglib
reportlab
selenium
We also need to have a web browser installed. According to the docs, Firefox or Chrome will work, but I couldn’t make it work with Firefox on my Arch Linux system. I just had to install Chromium and it worked fine (sudo pacman -S chromium on Arch).
First, we need to import a few things
from bokeh.io import export_svgs
import svglib.svglib as svglib
from reportlab.graphics import renderPDF
And then I turned saving to PDF into a simple function
def save_to_pdf(p, name):
    # Step 1: Save to SVG
    p.output_backend = "svg"
    export_svgs(p, filename=name + ".svg")

    # Step 2: Read in SVG
    svglib.register_font("helvetica", "/home/fonts/Helvetica.ttf")
    svg = svglib.svg2rlg(name + ".svg")

    # Step 3: Save as PDF
    renderPDF.drawToFile(svg, name + ".pdf")
All you have to do is provide the plot and the name of the PDF (without the .pdf extension). An example usage looks like this
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 2, 1]

p = figure(title="Save in PDF", x_axis_label="x", y_axis_label="y")
p.line(x, y, line_width=2, color="blue")

save_to_pdf(p, "pdf_test")
Also, this will keep the SVG saved on your system, which is helpful as you can also use that in many places where you might want to use a PDF!
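Since the function expects the name without the .pdf extension, one possible refinement is a small helper that tolerates either form. The output_names function below is a hypothetical helper written for illustration. It is not part of Bokeh, svglib or reportlab.

```python
def output_names(name):
    """Return the (SVG, PDF) filenames for a base name.

    Hypothetical helper: it strips a trailing ".pdf" if one was supplied.
    """
    if name.lower().endswith(".pdf"):
        name = name[: -len(".pdf")]
    return name + ".svg", name + ".pdf"


print(output_names("pdf_test"))
print(output_names("pdf_test.pdf"))
```

Both calls give the same pair of filenames, so a caller who passes the full PDF name would not end up with a file called pdf_test.pdf.pdf.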
In conclusion, Bokeh is a very powerful library for creating beautiful interactive plots. When it comes to the workflow of using it just remember the four steps
The examples in this guide should be enough to get you started in most applications. There’s a huge amount of customisation which Bokeh supports, but too much to cover everything in this article. You can find the full reference here.