Django, Bokeh and HTMX - Data Driven Bar Charts
In this post, we will create a dynamic, database-driven bar chart for country GDP data. We will use Django for the backend, Bokeh to create bar charts representing the GDP data, and HTMX to dynamically filter the year for which data is shown.
The associated video for this tutorial can be found below:
Objectives
In this post, we will learn how to:
- Create bar charts from database data using the Bokeh visualization library
- Render Bokeh-driven charts in Django templates
- Use HTMX to dynamically update the rendered charts
- Add basic Bokeh styling directives to our charts
- Load data from a JSON file into the database, via a custom Django management command
Project Setup
Starter code is available on GitHub. Clone the repository and install the requirements with the following commands:
git clone https://github.com/bugbytes-io/django-htmx-bokeh
cd django-htmx-bokeh
pip install -r requirements.txt
The starter code contains a data folder, which contains a gdp.json file with GDP data for regions and countries in the world (from the 1960s up until 2016). This data will be our source data for the bar charts, and to use this data in Django, we'll perform two steps:
- Create a Django model that can represent this data in the database
- Create a custom management command to read in the JSON file and insert the rows into our database, via Django's ORM.
We will start by creating the model that'll represent our GDP data.
Creating GDP Model
A single record in our gdp.json file looks like this:
{"Country Code": "ARB", "Country Name": "Arab World", "Value": 25760683041.0857, "Year": 1968}
We are going to create a model that contains these four keys, in our models.py file. Add the following code:
from django.db import models


class GDP(models.Model):
    country = models.CharField(max_length=100)
    country_code = models.CharField(max_length=4)
    year = models.PositiveSmallIntegerField()
    gdp = models.FloatField()

    def __str__(self):
        return self.country
Each row in our database will be a GDP entry for a particular country, in a particular year. Let's make the migrations and create the underlying table in our database:
python manage.py makemigrations
python manage.py migrate
With the table created, we are now going to write a management script that'll load all the country data from the JSON file into the database.
Loading Data With Management Script
Within the starter code, the skeleton for a management command is provided in the gdp/management/commands/populate.py file. We are going to replace the code from the starter repository with the following code, in order to load the data from the JSON file:
import json
from itertools import dropwhile

from django.conf import settings
from django.core.management.base import BaseCommand

from gdp.models import GDP


class Command(BaseCommand):
    help = 'Load GDP data from the JSON file'

    def handle(self, *args, **kwargs):
        # Add GDP objects, if there are none in the DB
        if not GDP.objects.count():
            datafile = settings.BASE_DIR / 'data' / 'gdp.json'
            # read in data from the JSON file
            with open(datafile, 'r') as f:
                data = json.load(f)
            # skip the regional aggregates that precede the first country
            data = dropwhile(lambda x: x['Country Name'] != 'Afghanistan', data)
            gdps = []
            for d in data:
                gdps.append(GDP(
                    country=d['Country Name'],
                    country_code=d['Country Code'],
                    gdp=d['Value'],
                    year=d['Year']
                ))
            GDP.objects.bulk_create(gdps)
We first build the path to our data file from settings.BASE_DIR, then load its contents with json.load(), which gives us a Python list of dictionaries (one per record).
Since every record in our data file before Afghanistan is not a country but regional data (e.g. Europe or Arab World), we need to remove those records from the data. One way to do this is to use the itertools.dropwhile() function, which drops records until it reaches the first one whose country name is Afghanistan.
The remaining records are then converted to GDP model instances, added to a list, and finally inserted into the database with a single GDP.objects.bulk_create(gdps) call.
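To see how dropwhile() behaves in isolation, here is a minimal sketch using a small, made-up list of records (the sample values are assumptions, not rows from gdp.json):

```python
from itertools import dropwhile

# Hypothetical sample: two regional aggregates followed by two countries.
records = [
    {"Country Name": "Arab World", "Year": 2016},
    {"Country Name": "World", "Year": 2016},
    {"Country Name": "Afghanistan", "Year": 2016},
    {"Country Name": "Albania", "Year": 2016},
]

# Skip records while the predicate is True; once it is False, keep everything.
countries = list(dropwhile(lambda r: r["Country Name"] != "Afghanistan", records))
names = [r["Country Name"] for r in countries]
print(names)
```

Note that this approach depends on the ordering of the file: once the first matching record is reached, every later record is kept without being checked again.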
To run the command, execute the following in the terminal:
python manage.py populate
This should populate the database with the relevant records from the JSON file.
Adding View and Template for Bokeh Chart
Now that the database is populated, we can get to work writing the code that will create and render our bar chart. We are going to allow users to select a number of countries N (which defaults to 10), and allow them to select a year (which defaults to the most recent year in the dataset). We will then show a bar chart with the top N countries by GDP for that year.
We have already installed the Bokeh Python library into our environment. However, Bokeh also comes with some client-side JavaScript dependencies that are required in order to make the charts interactive. Thus, we need to include some scripts in our base.html file (the base template). You can find these scripts in the Bokeh docs, on the embedding page. At the time of writing, the most recent versions of these scripts (v2.4.0) are shown below. Add these just before the closing </body> tag in the base.html file.
<script src="https://cdn.bokeh.org/bokeh/release/bokeh-2.4.0.min.js"
crossorigin="anonymous"></script>
<script src="https://cdn.bokeh.org/bokeh/release/bokeh-widgets-2.4.0.min.js"
crossorigin="anonymous"></script>
<script src="https://cdn.bokeh.org/bokeh/release/bokeh-tables-2.4.0.min.js"
crossorigin="anonymous"></script>
<script src="https://cdn.bokeh.org/bokeh/release/bokeh-gl-2.4.0.min.js"
crossorigin="anonymous"></script>
<script src="https://cdn.bokeh.org/bokeh/release/bokeh-mathjax-2.4.0.min.js"
crossorigin="anonymous"></script>
This will include the necessary JavaScript for our index.html template, which extends the base template. This template, in the starter code, is very basic - we'll improve that in a minute. But firstly, let's extend the basic view that renders this template to get the data we need from the database, and create the Bokeh bar chart.
On the template, we want to let the user select a year and a count (i.e. how many bars/countries to show, for a given year). We also want to provide defaults of the most recent year, and 10 for the count. Add the following code to the view:
from django.db.models import Max


def index(request):
    # get the year from GET request, or default to the maximum year in the data
    max_year = GDP.objects.aggregate(max_yr=Max('year'))['max_yr']
    year = request.GET.get('year', max_year)
    # get the number of countries to show in the bar chart - default to 10
    count = int(request.GET.get('count', 10))
    ...
We use the ORM's .aggregate() method to get the maximum value in the year column of the database. This serves as the default if no year is supplied in the GET request.
Similarly, we read the number of countries to show in the bar chart from the GET request, with a default of 10.
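Because query-string values arrive as strings and may be empty or malformed, it can help to parse them defensively. The following is only a sketch: parse_chart_params is a hypothetical helper (not part of the starter code), and a plain dict stands in for request.GET.

```python
def parse_chart_params(query, max_year, default_count=10):
    """Return (year, count) as ints, falling back to the defaults.

    `query` is any dict-like object of strings, standing in for request.GET.
    """
    try:
        year = int(query.get("year", max_year))
    except (TypeError, ValueError):
        year = max_year
    try:
        count = int(query.get("count", default_count))
    except (TypeError, ValueError):
        count = default_count
    # Keep count at 1 or more, since it is later used as a slice bound.
    return year, max(count, 1)


print(parse_chart_params({"year": "2000", "count": "5"}, max_year=2016))
print(parse_chart_params({"count": ""}, max_year=2016))
```

With a helper like this, an empty count field (for example, when the user clears a number input) falls back to the default instead of raising a ValueError.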
Let's now get the top count GDP objects from the database for that year:
from django.db.models import Max


def index(request):
    # get the year from GET request, or default to the maximum year in the data
    max_year = GDP.objects.aggregate(max_yr=Max('year'))['max_yr']
    year = request.GET.get('year', max_year)
    # get the number of countries to show in the bar chart - default to 10
    count = int(request.GET.get('count', 10))
    # extract data for that year for top N
    gdps = GDP.objects.filter(year=year).order_by('gdp').reverse()[:count]
We chain a few ORM methods here. First, we filter the records down to only those for the chosen year. Then we order by the gdp field, which sorts in ascending order by default (smallest GDP first).
Since we are interested in the countries with the largest GDPs, we call the .reverse() method to flip the order of the QuerySet (i.e. from largest to smallest). Finally, we slice the resulting QuerySet to get the top count records.
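The same filter, sort, and slice steps can be sketched in plain Python. The rows below are made-up stand-ins for GDP objects, not real data:

```python
# Hypothetical rows standing in for GDP objects (values are made up).
rows = [
    {"country": "A", "year": 2016, "gdp": 3.0},
    {"country": "B", "year": 2016, "gdp": 9.0},
    {"country": "C", "year": 2015, "gdp": 7.0},
    {"country": "D", "year": 2016, "gdp": 5.0},
]
count = 2

# Equivalent of .filter(year=2016)
in_year = [r for r in rows if r["year"] == 2016]
# Equivalent of .order_by('gdp').reverse(): descending by gdp
ordered = sorted(in_year, key=lambda r: r["gdp"], reverse=True)
# Equivalent of [:count]
top = ordered[:count]
top_names = [r["country"] for r in top]
print(top_names)
```

In Django itself, order_by('-gdp') is a shorter way to request descending order and should give the same result as order_by('gdp').reverse().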
Now that we have the relevant GDP data, we can create the bar chart using Bokeh. Add the following code to the view:
from django.db.models import Max
from django.shortcuts import render
from bokeh.embed import components
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource

from gdp.models import GDP


def index(request):
    # get the year from GET request, or default to the maximum year in the data
    max_year = GDP.objects.aggregate(max_yr=Max('year'))['max_yr']
    year = request.GET.get('year', max_year)
    # get the number of countries to show in the bar chart - default to 10
    count = int(request.GET.get('count', 10))
    # extract data for that year for top N
    gdps = GDP.objects.filter(year=year).order_by('gdp').reverse()[:count]
    # extract country names and GDPs
    country_names = [d.country for d in gdps]
    country_gdps = [d.gdp for d in gdps]
    # define column data source
    cds = ColumnDataSource(data=dict(country_names=country_names, country_gdps=country_gdps))
    fig = figure(x_range=country_names, height=500, title=f"Top {count} GDPs ({year})")
    # create the bar chart
    fig.vbar(x='country_names', top='country_gdps', width=0.8, source=cds)
    # generate the <script> and <div> needed to embed the chart
    script, div = components(fig)
    context = {
        'script': script,
        'div': div
    }
    return render(request, 'index.html', context)
We extract the names of the countries and their GDP values using list comprehensions, and then pass this data to a Bokeh ColumnDataSource. This is a construct that provides the data to the glyphs in your charts, and uses the provided dictionary's keys as its column names. This is somewhat similar to a relational database table with columns, or a Pandas DataFrame.
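The column-oriented layout can be illustrated without Bokeh. The sketch below uses a hypothetical rows_to_columns helper to pivot row dictionaries into the kind of dict-of-lists that is passed as the data argument:

```python
def rows_to_columns(rows, fields):
    """Pivot a list of row dicts into a dict of equal-length column lists."""
    return {field: [row[field] for row in rows] for field in fields}


# Made-up rows; in the view these values come from the GDP queryset.
rows = [
    {"country": "A", "gdp": 9.0},
    {"country": "B", "gdp": 5.0},
]
data = rows_to_columns(rows, ["country", "gdp"])
print(data)
```

Each key becomes a column name that glyph methods can reference by string, and the columns should all have the same length.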
We then create a figure() object, passing some attributes such as x_range (the data range for the horizontal dimension of the plot) and the figure's height and title. The figure in Bokeh has methods that can be used to create a variety of shapes and charts, including:
- fig.vbar(): vertical bar chart (used in our example)
- fig.hbar(): horizontal bar chart
- fig.line(): line chart
- fig.scatter(): scatter plot
Other utilities can be used to create heatmaps, histograms, pie charts, and much more.
We are creating a vertical bar chart with fig.vbar(), providing our ColumnDataSource to this function and referencing its keys for both the x and top keyword arguments.
Finally, we call the components() function and pass the figure object to it. This generates some JavaScript and an HTML <div> that are then included in the context passed to our template.
Let's now modify our index.html template and add the div and the script.
{% extends 'base.html' %}
{% block content %}
<p class="lead">GDP by country</p>
{{ div|safe }}
{{ script|safe }}
{% endblock %}
It's very simple - we just reference the context variables in the template, and that should be enough to render the chart!
Let's run the development server and see how this looks - you should see something similar to below:
This shows the top 10 countries' GDPs for the year 2016. We're going to fix a few styles in this chart - for example, to avoid overlap of the country names on the X axis, and to make the title pop a bit more. So, let's add the following lines to the view, just after creating the figure object.
import math
from bokeh.models import ColumnDataSource, NumeralTickFormatter


def index(request):
    ...
    fig = figure(x_range=country_names, height=500, title=f"Top {count} GDPs ({year})")
    fig.title.align = 'center'
    fig.title.text_font_size = '1.5em'
    fig.yaxis[0].formatter = NumeralTickFormatter(format="$0.0a")
    fig.xaxis.major_label_orientation = math.pi / 4
    ...
The resulting chart should be a little better, stylistically, as shown below.
The orientation of the X-axis labels prevents the overlap we had previously, and the chart's title stands out more.
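To get a feel for what these two settings do, here is a rough pure-Python approximation. The abbreviate_dollars function is a hypothetical stand-in that assumes the "$0.0a" format means one decimal place plus a k/m/b/t magnitude suffix; Bokeh's own formatter may round differently at the boundaries.

```python
import math


def abbreviate_dollars(value):
    """Roughly mimic the "$0.0a" format: one decimal plus a magnitude suffix."""
    for threshold, suffix in ((1e12, "t"), (1e9, "b"), (1e6, "m"), (1e3, "k")):
        if abs(value) >= threshold:
            return f"${value / threshold:.1f}{suffix}"
    return f"${value:.1f}"


print(abbreviate_dollars(18_600_000_000_000))
print(abbreviate_dollars(2_500_000))
# The label orientation is given in radians; pi / 4 is a 45 degree tilt.
print(math.degrees(math.pi / 4))
```

Abbreviated tick labels keep the Y axis readable when values run into the trillions, and the 45 degree tilt gives long country names room on the X axis.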
We're now going to modify the frontend to allow the user to select the year they are interested in, and also allow the user to customize the number of countries (bars) shown in the chart, via HTMX requests.
Adding HTMX-driven Inputs
Let's allow the user to select any year from the minimum year in the data to the most recent year. We will add a context variable called years containing this range, as well as two more variables:
- year_selected, which indicates the year the current chart shows
- count, which tells the template the number of bars shown in the chart
We will need these later.
We can add these with the following code:
from django.db.models import Max, Min


def index(request):
    # Get min and max year from the data
    max_year = GDP.objects.aggregate(max_yr=Max('year'))['max_yr']
    min_year = GDP.objects.aggregate(min_yr=Min('year'))['min_yr']
    # GET values are strings; cast to int so that the template can compare
    # year_selected against the integers in years
    year = int(request.GET.get('year', max_year))
    ...
    context = {
        'script': script,
        'div': div,
        'years': range(min_year, max_year + 1),  # +1 to include the final year!
        'year_selected': year,  # the year for which we currently show data
        'count': count,  # the number of bars to show
    }
    return render(request, 'index.html', context)
Now, let's modify the index.html template to render a <select> element with a list of all the years. We're also going to create two columns using Bootstrap: one for the chart, and one for the <select> element.
{% extends 'base.html' %}
{% block content %}
<p class="lead">GDP by country</p>
<div class="row">
    <div id="barchart" class="col-10">
        {% include 'partials/gdp-bar.html' %}
    </div>
    <div class="col-2">
        <select id="select-year" class="custom-select" name="year" autocomplete="off">
            {% for year in years %}
                <option value="{{year}}">{{year}}</option>
            {% endfor %}
        </select>
    </div>
</div>
{% endblock %}
We also include a new partial template inside the barchart div, which we'll create in a second. The <select> element iterates over all years in the context and adds an <option> element for each one. The gdp/templates/partials/gdp-bar.html template should now be created, with the following code:
{{ div|safe }}
{{ script|safe }}
Pretty simple! We only create this so we can easily return a new chart when HTMX makes a request, and we can render this simple template.
Let's hook up the HTMX attributes now! Modify the <select> element to the following:
<select id="select-year"
        class="custom-select"
        name="year"
        autocomplete="off"
        hx-get="{% url 'index' %}"
        hx-target="#barchart">
    {% for year in years %}
        <option value="{{year}}"
            {% if year_selected == year %} selected {% endif %}>{{year}}</option>
    {% endfor %}
</select>
We set up a GET request to our existing view, and specify that the returned HTML should be swapped into the div with ID barchart. We also add the selected attribute to the <option> element that matches the year_selected value passed to the context.
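One detail worth checking: values read from request.GET are strings, whereas range() yields integers, so the equality test in the template only succeeds when year_selected has been converted to an int in the view. A small sketch of the difference, using made-up values:

```python
years = range(2014, 2017)   # ints, like the years context variable
raw = "2016"                # a string, as request.GET.get('year') would return

# Comparing the raw string never matches an int from the range.
matches_raw = [y for y in years if y == raw]
# After casting, exactly one year matches and can be marked as selected.
matches_cast = [y for y in years if y == int(raw)]
print(matches_raw, matches_cast)
```

When no year is supplied, the default comes from the database and is already an integer, which is why the problem only shows up when a year is passed in the URL.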
The existing endpoint should return only the fragment if it's an HTMX request, so modify the final lines of the view to the following:
def index(request):
    ...
    if request.htmx:
        return render(request, 'partials/gdp-bar.html', context)
    return render(request, 'index.html', context)
If it's an HTMX request, we return only the partial containing the chart. This will allow us to swap the HTML returned into the target, replacing the chart dynamically.
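Under the hood, htmx marks its requests with an HX-Request: true header, and request.htmx reflects that header. The sketch below mimics the branch with plain dictionaries; is_htmx and choose_template are hypothetical helpers for illustration, not django-htmx functions.

```python
def is_htmx(headers):
    """True when the request carries htmx's HX-Request: true header."""
    return headers.get("HX-Request") == "true"


def choose_template(headers):
    # Hypothetical helper mirroring the branch at the end of the view.
    return "partials/gdp-bar.html" if is_htmx(headers) else "index.html"


print(choose_template({"HX-Request": "true"}))
print(choose_template({}))
```

A normal browser navigation carries no such header, so it still receives the full page.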
For this to work, we will add the django-htmx package to our project. Specifically, in settings.py:
- add django_htmx to INSTALLED_APPS
- add django_htmx.middleware.HtmxMiddleware to MIDDLEWARE
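As a sketch, the relevant parts of settings.py might look like the fragment below. The other entries are placeholders standing in for whatever the starter project already lists; only the two django_htmx lines are the additions described above.

```python
# settings.py (fragment). Entries other than the django_htmx ones are
# placeholders for whatever the project already contains.
INSTALLED_APPS = [
    "django.contrib.staticfiles",
    "django_htmx",  # enables the django-htmx integration
    "gdp",
]

MIDDLEWARE = [
    "django.middleware.common.CommonMiddleware",
    "django_htmx.middleware.HtmxMiddleware",  # sets request.htmx on each request
]
```

If the package is not already installed in your environment, it is available from PyPI under the name django-htmx.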
Now, the frontend should update the chart without refreshing the page whenever a new year is selected from this list, as shown below.
We can add another input to our index.html template, to allow the user to modify the number of countries/bars, as shown below.
{% extends 'base.html' %}
{% block content %}
<p class="lead">GDP by country</p>
<div class="row">
    <div id="barchart" class="col-10">
        {% include 'partials/gdp-bar.html' %}
    </div>
    <div class="col-2">
        <label>Year</label>
        <select id="select-year"
                class="custom-select"
                name="year"
                autocomplete="off"
                hx-get="{% url 'index' %}"
                hx-target="#barchart">
            {% for year in years %}
                <option value="{{year}}"
                    {% if year_selected == year %} selected {% endif %}>{{year}}</option>
            {% endfor %}
        </select>
        <hr/>
        <label>Count</label>
        <input type="number"
               id="count"
               name="count"
               autocomplete="off"
               value="{{count}}"
               hx-get="{% url 'index' %}"
               hx-target="#barchart"/>
    </div>
</div>
{% endblock %}
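The number input has the same HTMX attributes as the <select> element, and passes a count GET parameter to the backend. We set its value attribute to the count passed to the context. Note that, as written, each input sends only its own value, so changing one control resets the other to its default on the server; htmx's hx-include attribute is one way to send both values together.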
This should now allow the user to change the number of bars in the chart, as below: