Introducing Bokeh in Python

Learn how to use the Bokeh library to make beautiful interactive plots, and also save them to PDF.

Indigo Curnick
October 28, 2024

We need to begin by setting up the environment, so start by creating a virtual environment (venv)

$ python -m venv .venv
$ source .venv/bin/activate 

Make a requirements.txt and place it in the root. Add bokeh to that file, and then run pip install -r requirements.txt

Making a Simple Line Chart in Bokeh

Let’s start by making a super simple line chart. This will show off some of the basic concepts of Bokeh.

We can start by importing the fundamentals

from bokeh.plotting import figure, show

And then we create some data

x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

It’s important that these lists are the same length. The next step is to create the figure.

p = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2)

Notice how we first create a figure, and then add a line to it. We also get a lot of options to customise the figure and line - we can add titles, labels and legends. Finally, we want to see this plot so we can use

show(p)

Bokeh is intended for the web, so when we run this it will open the chart on a page in your default web browser. Here’s the final chart.
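If you would rather write the chart to an HTML file without opening a browser tab (handy on a server, for instance), a minimal sketch could look like the following. It uses output_file and save from bokeh.plotting instead of show; the file name simple_line.html and the temporary directory are just assumptions for this example.

```python
import tempfile
from pathlib import Path

from bokeh.plotting import figure, output_file, save

x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

p = figure(title="Simple line example", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2)

# Hypothetical output location: a temporary directory, so nothing clutters the project
out_path = Path(tempfile.mkdtemp()) / "simple_line.html"

# Tell Bokeh where the HTML should go, then write it without launching a browser
output_file(str(out_path), title="Simple line example")
save(p)
```

You can then open the resulting HTML file in any browser and get the same interactive chart.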

Let’s add some more data - this is really easy to do! All we have to do is create more lines. Let’s start with some more data

x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 2, 1]
y1 = [2, 3, 4, 5, 6]
y2 = [5, 4, 3, 2, 1]

And then we make more calls to line

p = figure(title="Multiple lines", x_axis_label="x", y_axis_label="y")
p.line(x, y, legend_label="Values", line_width=2, color="blue")
p.line(x, y1, legend_label="More Values", line_width=2, color="red")
p.line(x, y2, legend_label="Even More Values", line_width=2, color="purple")

And this is the chart it produces

So now we can see the basic workflow of Bokeh

  1. Prepare some data, usually into lists or maybe a numpy array
  2. Create a figure
  3. Add as many series as you have data
  4. Show the plot

We can customise each step of this flow quite extensively, as we will see in this article
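To make the four steps concrete, here is a small sketch that wraps steps 2 and 3 in a helper. The function name build_line_plot and the dictionary of label to y values are our own illustration, not part of Bokeh; the helper also checks that every series has the same length as x.

```python
from bokeh.plotting import figure


def build_line_plot(x, series, title="Workflow sketch"):
    """Create a figure and add one line per entry in `series` (label -> y values)."""
    p = figure(title=title, x_axis_label="x", y_axis_label="y")
    for label, y in series.items():
        # Every series must have exactly one y value per x value
        if len(y) != len(x):
            raise ValueError(f"series {label!r} has {len(y)} points, expected {len(x)}")
        p.line(x, y, legend_label=label, line_width=2)
    return p


# Step 1: prepare the data
x = [1, 2, 3, 4, 5]
series = {
    "Values": [1, 2, 3, 2, 1],
    "More Values": [2, 3, 4, 5, 6],
    "Even More Values": [5, 4, 3, 2, 1],
}

# Steps 2 and 3: create the figure and add the series
p = build_line_plot(x, series)

# Step 4: calling show(p) from bokeh.plotting would open the plot in the browser
```

Keeping the data in a dictionary means adding another line to the chart is just one more entry.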

Mixing Glyphs in Bokeh Plots

Bokeh supports many different kinds of “glyphs” - that’s basically what Bokeh calls different items that can be displayed on the figure. Let’s explore some of these options.

p.vbar(x=x, top=y, legend_label="Bar", color="blue", width=0.5, bottom=0)
p.scatter(x, y1, legend_label="Scatter Crosses", color="red", size=16, marker="x")
p.scatter(x, y2, legend_label="Scatter Circles", color="purple", size=16)

Here we use vbar and scatter. scatter works a lot like line, but we can customise the size of the marker and the marker type. Circles and crosses are the most common, but there are others too. vbar is a little more involved - we need to set the x and top as named arguments. You can customise the width of the bars, as well as where the bottom starts (usually, you just want this to be 0).
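If you’re curious which other markers exist, one way to look is the MarkerType enumeration in bokeh.core.enums. The sketch below lists the names and draws a handful of them, one row per marker; the five markers picked here are just an arbitrary selection for illustration.

```python
from bokeh.core.enums import MarkerType
from bokeh.plotting import figure

# Names that the `marker` argument of scatter accepts
markers = list(MarkerType)

x = [1, 2, 3, 4, 5]
chosen = ["circle", "x", "square", "triangle", "diamond"]

p = figure(title="Marker types", x_axis_label="x", y_axis_label="row")

# Draw each chosen marker type on its own horizontal row
for row, marker in enumerate(chosen):
    p.scatter(x, [row] * len(x), marker=marker, size=16, legend_label=marker)
```

Printing markers shows the full list for the Bokeh version you have installed.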

And then when we display it we get the following chart

Using Annotations to Mark Standard Deviations

A really useful feature in Bokeh is annotations - they let you mark certain areas of the plot. To give a real life example of this, we’re going to plot some random data, as well as display the standard deviation of that data. Let’s start by generating those numbers and standard deviations

import random
import statistics

N = 30  # Number of random numbers to generate

x = [i for i in range(N)]
random_numbers = [random.random() for _ in range(N)]
mean = statistics.mean(random_numbers)
std_dev = statistics.stdev(random_numbers)  # sample standard deviation, used for the boxes below

Now we’ll create the basic line graph - it’s the same as we did before so this should be familiar.

p = figure(title="Standard Deviation Example", x_axis_label="x", y_axis_label="y")
p.line(x, random_numbers, line_width=2, color="black")

In order to add annotations to this plot, we need to import the following

from bokeh.models import BoxAnnotation

Then we can create the annotations. Since we want to annotate the areas within and outside one standard deviation of the mean, the “inside” region lies between the mean minus and the mean plus the standard deviation. The middle box has two bounds - the top and bottom. The low and high boxes have no bottom and top bound respectively, meaning they extend to the edge of the plot

low = mean - std_dev
high = mean + std_dev

low_box = BoxAnnotation(top=low, fill_alpha=0.2, fill_color="red")
mid_box = BoxAnnotation(bottom=low, top=high, fill_alpha=0.2, fill_color="green")
high_box = BoxAnnotation(bottom=high, fill_alpha=0.2, fill_color="red")

p.add_layout(low_box)
p.add_layout(mid_box)
p.add_layout(high_box)

We then simply add the three boxes to the plot with add_layout - and we get the following plot
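To see what the three boxes represent numerically, here is a small standard-library sketch that sorts each value into the same three regions. It assumes the sample standard deviation (statistics.stdev), and the classify helper and the fixed list of values are made up for this example.

```python
import statistics
from collections import Counter

# Fixed example data so the result is reproducible
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(values)
std_dev = statistics.stdev(values)  # assumption: sample standard deviation

low = mean - std_dev   # top edge of the lower red box
high = mean + std_dev  # bottom edge of the upper red box


def classify(value, low, high):
    """Return which of the three regions a value falls into."""
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


counts = Counter(classify(v, low, high) for v in values)
print(counts)
```

Most of the values land in the “within” region, which corresponds to the green box on the plot.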

K-Means Plot

An interesting plotting project we can use to show off some of Bokeh’s potential is plotting K-means. For this we need a few more dependencies, so add the following to the requirements.txt (and make sure you run pip install -r requirements.txt)

numpy
scikit-learn

Since this isn’t a K-means tutorial, we’ll skip over the details - but if you don’t know what K-means does, the basic idea is to group data into K groups. Here’s the code we’ll use for this

import numpy as np
from sklearn.cluster import KMeans

data = np.vstack(
    [
        np.random.normal(loc=(0, 0), scale=1.0, size=(100, 2)),
        np.random.normal(loc=(5, 5), scale=1.0, size=(100, 2)),
        np.random.normal(loc=(0, 5), scale=1.0, size=(100, 2)),
    ]
)

kmeans = KMeans(n_clusters=3)
pred = kmeans.fit_predict(data)

Now we have the groups we need to group the data in a more convenient way for us to plot.

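N = kmeans.n_clusters  # number of groups (3 here)

plotting_data = {}

# Start every group with an empty list of points
for i in range(N):
    plotting_data[i] = []

# Append each point to the list belonging to its predicted group
for point, group in zip(data, pred):
    plotting_data[group].append(point.tolist())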

This is nice and generic, so if we increase the number of clusters, N, in future it still works - we are essentially building a dictionary that maps each group to a list of the coordinates in that group.
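The same grouping idea can be sketched with only the standard library, which makes it easy to check by hand. The points and labels below are invented stand-ins for data and pred, and we use a defaultdict instead of pre-filling the dictionary; note that a defaultdict only creates a key once a point with that label appears.

```python
from collections import defaultdict

# Invented stand-ins for the real data and the K-means predictions
points = [(0.1, 0.2), (5.0, 5.1), (0.3, 4.9), (4.8, 5.2), (-0.2, 0.1)]
labels = [0, 1, 2, 1, 0]

# Map each group label to the list of points assigned to it
groups = defaultdict(list)
for point, label in zip(points, labels):
    groups[label].append(list(point))

for label in sorted(groups):
    print(label, groups[label])
```

Each key is a group number, and each value is the list of [x, y] pairs in that group, which is exactly the shape we need for plotting.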

We can make the basic plot again with

p = figure(title="K-means", x_axis_label="x", y_axis_label="y")

The next thing to do is sort out the colours. For this kind of plot the best colour scheme to use would be viridis. In order to create the viridis colours we can do the following

from bokeh.palettes import Viridis256
colors = Viridis256[::len(Viridis256) // N]

This gives us a list of colours, which we can index into. Fortunately, scikit-learn numbers the groups from 0 to N-1, so each group number can be used directly as an index into the list of colours we just generated! Therefore, we can plot with the following
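One thing to be aware of is that slicing with a step can return slightly more than N colours, which is harmless here since we only index the first N. As an alternative sketch, bokeh.palettes also provides a viridis function that returns exactly the requested number of colours; N = 3 below is just the value assumed for this example.

```python
from bokeh.palettes import Viridis256, viridis

N = 3  # assumed number of groups for this example

# The slicing approach: take every (256 // N)-th colour
sliced = Viridis256[:: len(Viridis256) // N]

# The helper function: ask for exactly N colours spread across the palette
evenly_spaced = viridis(N)

print(len(sliced), len(evenly_spaced))
```

Either list works with colors[k] in the plotting loop, as long as it has at least N entries.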

for k in plotting_data:
    v = plotting_data[k]

    x = [row[0] for row in v]
    y = [row[1] for row in v]

    p.scatter(x, y, legend_label="Group: {}".format(k), size=8, color=colors[k])

show(p)

And this generates the following graph

Save Plots to PDF with Bokeh

Bokeh doesn’t have a built in way to save to PDF. However, we can export to an SVG and then convert that into a PDF plot. We need a few other dependencies to do this, so add the following to the requirements.txt (and make sure you run pip install -r requirements.txt)

svglib
reportlab
selenium

We also need to have a web browser installed. According to the docs, Firefox or Chrome will work, but I couldn’t make it work with Firefox on my Arch Linux system. I just had to install Chromium and it worked fine (sudo pacman -S chromium on Arch).

First, we need to import a few things

from bokeh.io import export_svgs
import svglib.svglib as svglib
from reportlab.graphics import renderPDF

And then I turned saving to PDF into a simple function

def save_to_pdf(p, name):
    # Step 1: Save to SVG
    p.output_backend = "svg"
    export_svgs(p, filename=name + ".svg")

    # Step 2: Read in SVG
    svglib.register_font("helvetica", "/home/fonts/Helvetica.ttf")
    svg = svglib.svg2rlg(name + ".svg")

    # Step 3: Save as PDF
    renderPDF.drawToFile(svg, name + ".pdf")

All you have to do is provide the plot and the name of the PDF (without the .pdf extension). An example usage looks like this

x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 2, 1]

p = figure(title="Save in PDF", x_axis_label="x", y_axis_label="y")
p.line(x, y, line_width=2, color="blue")

save_to_pdf(p, "pdf_test")

Also, this will keep the SVG saved on your system, which is helpful as you can also use that in many places where you might want to use a PDF!

Conclusion

In conclusion, Bokeh is a very powerful library for creating beautiful interactive plots. When it comes to the workflow of using it just remember the four steps

  1. Prepare some data, usually into lists or maybe a numpy array
  2. Create a figure
  3. Add as many series as you have data
  4. Show the plot

The examples in this guide should be enough to get you started in most applications. There’s a huge amount of customisation which Bokeh supports, but too much to cover everything in this article. You can find the full reference here.
