Introduction to Bokeh
Bokeh is another visualization library. It is not built on Matplotlib and renders its output directly to HTML5 and JavaScript. Bokeh also offers interactive capabilities that can be run from a web server or from within a framework such as Django.
Let's start by importing some of the core libraries into our code.
Let's generate some random data so that we can plot it.
Unlike Matplotlib, we need to instantiate a Bokeh figure object by calling the figure function. From there we can call any of the many glyph methods available to us in Bokeh.
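As a minimal sketch of the three steps above (imports, random data, then a figure with glyphs), the following uses NumPy for the random values. The seed, the variable names, and the choice of line and scatter glyphs are illustrative assumptions, not the original notebook code.

```python
import numpy as np
from bokeh.plotting import figure, show

# Reproducible random data (the seed and the sample size are arbitrary choices).
rng = np.random.default_rng(42)
x = np.arange(20)
y = rng.random(20)

# Create the figure first, then add glyphs to it.
p = figure(title="Random data", x_axis_label="x", y_axis_label="y")
p.line(x, y, line_width=2)
p.scatter(x, y, size=8, alpha=0.6)

# show(p)  # uncomment to render the plot in a browser or notebook
```

Each glyph call adds one renderer to the figure, so several glyphs can be layered on the same axes.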
Reference: https://ainfographics.wordpress.com/2017/12/01/python-bokeh-cheat-sheet/
With the basics out of the way, let's review a demo from the Bokeh quickstart. Let's import the autompg data from the sample data.
https://github.com/bokeh/bokeh/blob/branch-2.3/bokeh/sampledata/_data/auto-mpg.csv
Break down the AutoMPG data
With the data imported, let's group it by yr and apply some aggregation functions to it. It is important to note that the data comes in as a Pandas DataFrame.
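The grouping step might look like the sketch below. To keep it self-contained it uses a small hypothetical DataFrame named cars with invented values; in the notebook the DataFrame would instead come from bokeh.sampledata.autompg. The column names yr, mpg, and origin are assumed to match the sample data.

```python
import pandas as pd

# Hypothetical stand-in for the autompg DataFrame (values are invented).
cars = pd.DataFrame({
    "yr":     [70, 70, 70, 71, 71, 71],
    "mpg":    [18.0, 15.0, 24.0, 27.0, 25.0, 14.0],
    "origin": [1, 1, 3, 3, 2, 1],
})

# Group by model year, then aggregate the mpg column.
grouped = cars.groupby("yr")
mpg = grouped["mpg"]
avg = mpg.mean()   # mean mpg per year
std = mpg.std()    # sample standard deviation per year
```

Both avg and std are Pandas Series indexed by yr, which makes them easy to line up later when plotting.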
Separate the data into groups
With the aggregation completed, let's slice the years from the data and then split the cars up by country of origin.
With the data prepared, let's plot it using additional glyphs available to us in Bokeh. Below you will see the use of a vertical bar (vbar, where the top and bottom are defined by +/-1 standard deviation from the mean) and three other shape-based glyphs (square, diamond, circle).
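A sketch of the split and the plot described above follows. The DataFrame is again a hypothetical stand-in with invented values, and the origin coding (1, 2, 3) and the legend labels are assumptions made for illustration. The markers are drawn with scatter(marker=...), which is used here in place of the square(), diamond(), and circle() glyph methods.

```python
import pandas as pd
from bokeh.plotting import figure

# Hypothetical stand-in for the autompg DataFrame (values are invented).
cars = pd.DataFrame({
    "yr":     [70, 70, 70, 71, 71, 71, 72, 72, 72],
    "mpg":    [18.0, 26.0, 24.0, 19.0, 30.0, 31.0, 13.0, 23.0, 28.0],
    "origin": [1, 2, 3, 1, 2, 3, 1, 2, 3],
})

mpg = cars.groupby("yr")["mpg"]
avg, std = mpg.mean(), mpg.std()
years = avg.index.tolist()

# Split by (assumed) origin code.
american = cars[cars["origin"] == 1]
european = cars[cars["origin"] == 2]
japanese = cars[cars["origin"] == 3]

p = figure(title="MPG by year")

# Bars spanning one standard deviation either side of the yearly mean.
p.vbar(x=years, bottom=avg - std, top=avg + std, width=0.8,
       fill_alpha=0.2, line_color=None, legend_label="MPG 1 stddev")

# One marker shape per origin group.
p.scatter(x=american["yr"], y=american["mpg"], marker="square",
          size=10, alpha=0.5, color="blue", legend_label="American")
p.scatter(x=european["yr"], y=european["mpg"], marker="diamond",
          size=10, alpha=0.5, color="green", legend_label="European")
p.scatter(x=japanese["yr"], y=japanese["mpg"], marker="circle",
          size=10, alpha=0.5, color="red", legend_label="Japanese")
```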
Introduce ColumnDataSource
ColumnDataSource maps data into a dictionary-like format, making it easier for Bokeh to process. Every column within a ColumnDataSource must contain the same number of elements. Bokeh is optimized to consume a ColumnDataSource when drawing visuals for viewing through web browsers (HTML5 and JS).
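To make the dictionary-like structure concrete, here is a minimal sketch that builds a ColumnDataSource from a plain dict. The column names x and y and their values are arbitrary.

```python
from bokeh.models import ColumnDataSource

# Each key is a column name; each value is a sequence of the same length.
source = ColumnDataSource(data={
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
})

# The columns can be read back much like a dictionary.
column_names = sorted(source.column_names)
first_y = source.data["y"][0]
```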
Reference: https://docs.bokeh.org/en/latest/docs/user_guide/data.html
With a better understanding of how ColumnDataSource works, we can now easily convert a Pandas DataFrame into a ColumnDataSource by passing it to the ColumnDataSource constructor. For gridplots (similar to facets in ggplot and grid plots in Seaborn), the same data source can be passed to each plot. What makes ColumnDataSource easier to work with in Bokeh is that you can identify the data source with the "source=" argument and reference the column headers by name in the glyph method call.
Now that we have a sense of the data types, let's move on to using ColumnDataSource in a visualization.
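The sketch below shows one way this could look: a DataFrame is converted to a ColumnDataSource, and two plots in a gridplot share it through source=. The DataFrame contents and the hp column are hypothetical values chosen for illustration.

```python
import pandas as pd
from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Hypothetical data; in the notebook this would be the autompg DataFrame.
df = pd.DataFrame({
    "yr":  [70, 71, 72, 73],
    "mpg": [18.0, 21.0, 19.0, 17.0],
    "hp":  [130, 110, 120, 150],
})

# Passing the DataFrame directly turns each column into a source column.
source = ColumnDataSource(df)

# Glyphs refer to columns by name once source= is given.
left = figure(title="MPG by year")
left.scatter(x="yr", y="mpg", size=10, source=source)

right = figure(title="Horsepower by year")
right.scatter(x="yr", y="hp", size=10, source=source)

# Arrange both plots side by side in a single row.
grid = gridplot([[left, right]])
```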
Bokeh Styling Guide Reference: https://docs.bokeh.org/en/latest/docs/user_guide/styling.html
Reference: Bokeh Quickstart
Bokeh also provides us a way to filter the data prior to visualization. To do that, you use CDSView and IndexFilter. When creating the CDSView, you can pass an IndexFilter into the filters attribute. This limits which indexes are used in the visualization.
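A minimal sketch of index filtering follows, with arbitrary data and indices. The keyword accepted by CDSView differs between Bokeh releases (filters together with source in 2.x, a single filter in 3.x), so the try/except here is an assumption-driven way to cover both and is not part of the original notebook.

```python
from bokeh.models import CDSView, ColumnDataSource, IndexFilter
from bokeh.plotting import figure

source = ColumnDataSource(data={
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],
})

# Keep only rows 0, 2, and 4 of the source.
index_filter = IndexFilter(indices=[0, 2, 4])

try:
    # Bokeh 3.x style: a single filter, no source argument.
    view = CDSView(filter=index_filter)
except AttributeError:
    # Bokeh 2.x style: a list of filters plus the source.
    view = CDSView(source=source, filters=[index_filter])

p = figure(title="Index-filtered points")
p.scatter(x="x", y="y", size=12, source=source, view=view)
```

The source itself is left unchanged; only the view decides which rows the glyph draws.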
Reference: https://www.tutorialspoint.com/bokeh/bokeh_filtering_data.htm
Instead of filtering by index, you can also leverage a BooleanFilter to identify which data entries are to be plotted.
With the boolean filter configured, we can use that filter in the CDSView to plot only the data points of interest, in this case data where the year is greater than or equal to 1980.
Reference: https://www.tutorialspoint.com/bokeh/bokeh_filtering_data.htm Note: The example in the above link uses a line plot; CDSView filtering does not work with connected glyphs such as lines.
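The boolean variant can be sketched as below. The year and mpg columns and their values are hypothetical, and the try/except again covers the keyword difference between Bokeh 2.x (filters) and 3.x (filter), which is an addition for illustration.

```python
from bokeh.models import BooleanFilter, CDSView, ColumnDataSource
from bokeh.plotting import figure

source = ColumnDataSource(data={
    "year": [1978, 1979, 1980, 1981, 1982],
    "mpg":  [24.0, 25.0, 33.0, 30.0, 32.0],
})

# One True/False entry per row: True means the row is plotted.
booleans = [bool(year >= 1980) for year in source.data["year"]]
boolean_filter = BooleanFilter(booleans=booleans)

try:
    view = CDSView(filter=boolean_filter)                    # Bokeh 3.x style
except AttributeError:
    view = CDSView(source=source, filters=[boolean_filter])  # Bokeh 2.x style

p = figure(title="Years from 1980 onward")
p.scatter(x="year", y="mpg", size=12, source=source, view=view)
```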
Finally, below is an example of how you can use Bokeh widgets in your chart. The key with these widgets is to create a callback that adjusts the visualization, and then to attach that callback to the widget for the property being watched, here value. To access the widget's value from inside the callback, use cb_obj together with the target property, in this case cb_obj.value. Calling source.change.emit() updates the figure.
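One possible shape for such a widget setup is sketched here, using a Slider and a CustomJS callback. The data, the power-curve formula, and the slider range are illustrative assumptions; the callback body is JavaScript that runs in the browser.

```python
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, Slider
from bokeh.plotting import figure

x = [i * 0.1 for i in range(50)]
source = ColumnDataSource(data={"x": x, "y": list(x)})

p = figure(title="y = x^power")
p.line("x", "y", source=source, line_width=2)

# JavaScript callback: read the slider value from cb_obj, recompute y,
# then notify Bokeh that the source has changed.
callback = CustomJS(args={"source": source}, code="""
    const power = cb_obj.value;
    const x = source.data.x;
    const y = source.data.y;
    for (let i = 0; i < x.length; i++) {
        y[i] = Math.pow(x[i], power);
    }
    source.change.emit();
""")

slider = Slider(start=0.1, end=4, value=1, step=0.1, title="power")
slider.js_on_change("value", callback)

# Stack the slider above the plot.
layout = column(slider, p)
```

Because the callback is plain JavaScript, the resulting page stays interactive even when saved as a standalone HTML file, with no Python server required.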