Geospatial – News

Dataviz with OpenLayers: let’s plot some graphs!

14 January 2020

Introduction

Web mapping is great because we get nice interactive maps, smoothly zooming in/out and showing us data with slick styling and interactivity. Only sometimes just looking at the data is not enough: what is the point of staring at thousands of points spread on a map, without more ways of understanding what they mean? For instance, if these points each represent an event in time, maybe having a way to show only the ones in a given time range would be of interest… there you have it: dataviz comes into play!

What next?

Admittedly, this introduction was a bit over-simplistic. What is usually called dataviz may include many fields and questions, from data preparation and processing to final visualization. In this article we will take a look at the Airship UI components made by Carto, and how to couple these to an OpenLayers map.

More specifically, we will use a time series chart to explore a rich dataset of earthquakes in its temporal dimension. This is what the chart looks like:

[Image: the Airship time series chart]

And a live example here:

https://carto.com/developers/airship/examples/#example-time-series-bridge

As you can see, Carto did a pretty good job! The chart is nice and tidy, allows animating in time, selecting a range, etc. It is all based on d3.js which we all know and love.

Now, Carto provides so-called bridge components to easily connect this kind of widget (which has absolutely no knowledge of any map) to a Mapbox GL JS map, as demonstrated in the live example above.

How awesome would it be to write our own bridge, for OpenLayers this time? Let me tell you: pretty damn awesome. And maybe less of a headache than one might think!

Start small… or not

Let’s commit to a first achievable goal: set up a super simple OpenLayers map with a dataset of timestamped events in it, possibly a lot of them.

Setting up a JavaScript application from scratch is boring at best, so I’m going to skip most of it and focus on the interesting bits. For reference, this is a quick recap of what was used for the project structure:

Now, these are the first three files we need for the project:

public/index.html

<html>
<head>
  <title>OL Airship bridge test</title>
  <link href="https://fonts.googleapis.com/css?family=Open+Sans&display=swap" rel="stylesheet">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/openlayers/openlayers.github.io@master/en/v6.1.1/css/ol.css" type="text/css">
  <style>
    html, body {
      height: 100%;
      width: 100%;
      margin: 0;
      padding: 0;
      font-family: "Open Sans";
      font-size: 14px;
    }
    #map {
      position: absolute;
      width: 100%;
      height: 100%;
    }
  </style>
</head>
<body>
  <div id="map"></div>
  <script src="main.js"></script>
</body>
</html>

src/app.js

import Map from 'ol/Map'
import View from 'ol/View'
import TileLayer from 'ol/layer/Tile'
import XYZ from 'ol/source/XYZ'
import { fromLonLat } from 'ol/proj'

export function init() {
  const view = new View({
    center: fromLonLat([-122.297374, 37.355579]),
    zoom: 5.55,
  })
  const olMap = new Map({
    view,
    target: 'map',
    layers: [
      new TileLayer({
        source: new XYZ({
          urls: [
            'https://a.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png',
            'https://b.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png',
            'https://c.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png'
          ],
          crossOrigin: 'anonymous',
        }),
      })
    ],
  })
}

src/index.js

import { init } from './app'

window.addEventListener('load', init)

Simple enough! That should give us a simple interactive map centered on California (earthquakes, remember?) with a sweet grayscale base layer.

[Image: the base map centered on California]

Now, let’s put some data into this. For the following part I took my inspiration from this live example from kepler.gl showing more than 50,000 earthquakes in California, styled by magnitude. The dataset used in this map is quite rich, so it will be a good opportunity to test the performance of our experiment. The example also already showcases a time slider widget, so we can compare it with the result we eventually obtain!

Ok, let’s load the data, parse it (it’s a CSV file) and put it in a vector layer:

src/app.js

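// add imports
import WebGLPointsLayer from 'ol/layer/WebGLPoints'
import VectorSource from 'ol/source/Vector'
import Feature from 'ol/Feature'
import Point from 'ol/geom/Point'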

  // ...

  // create style & vector layer
  const layerStyle = {
    symbol: {
      symbolType: 'circle',
      size: [
        'interpolate',
        ['linear'],
        ['get', 'magnitude'],
        2.5, 4,
        5, 20,
      ],
      color: [
        'case',
        ['<', ['get', 'depth'], 0], 'rgb(223,22,172)',
        'rgb(223,113,7)',
      ],
      opacity: 0.5,
    }
  }
  const vectorLayer = new WebGLPointsLayer({
    source: new VectorSource({ attributions: 'USGS', }),
    style: layerStyle,
  })

  // ...

  // put it in the map
  const olMap = new Map({
    view,
    target: 'map',
    layers: [
      // ...
      vectorLayer
    ],
  })

  // ...

  // load CSV data & put features in the vector source
  fetch('https://raw.githubusercontent.com/uber-web/kepler.gl-data/master/earthquakes/data.csv')
    .then(response => response.text())
    .then(csv => {
      const features = []
      let prevIndex = csv.indexOf('\n') + 1 // scan past the header line
      let curIndex

      while ((curIndex = csv.indexOf('\n', prevIndex)) !== -1) {
        const line = csv.slice(prevIndex, curIndex).split(',')
        prevIndex = curIndex + 1

        const coords = fromLonLat([parseFloat(line[2]), parseFloat(line[1])])

        features.push(
          new Feature({
            date: new Date(line[0]),
            // depth and magnitude are decimal values: parseInt would truncate them
            depth: parseFloat(line[3]),
            magnitude: parseFloat(line[4]),
            geometry: new Point(coords),
            eventId: parseInt(line[11], 10),
          })
        )
      }

      vectorLayer.getSource().addFeatures(features)
    })

The result should be:

[Image: the map with the earthquake points layer]

A WebGL-accelerated vector layer with all points, nice! Notice how the layer has a style where the size varies according to magnitude, and a color which is violet in case the earthquake was recorded as “above ground” (i.e. depth < 0), which is most likely an artifact of the data.

To learn more about WebGL layers and their styles in OpenLayers, take a look at this example (these features are still experimental and not part of the official API yet). To keep it short, let’s just say these styles are plain JSON objects which can contain expressions quite similar to the Mapbox Style Spec ones. These expressions allow reading feature attributes, comparing values, interpolating between steps, etc. Quite powerful, but keep in mind they are still experimental and subject to change.
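To get a feel for how such expressions resolve to values, here is a toy evaluator written in plain JavaScript. It is only a sketch for illustration: OpenLayers actually compiles expressions into shader code, and the `evaluateExpression` function is a made-up name that supports just the four operators used in our style (`get`, `<`, `case` and a linear `interpolate`).

```javascript
// Toy evaluator for a few style expression operators (illustration only,
// this is not how OpenLayers processes them internally)
function evaluateExpression(expr, properties) {
  if (!Array.isArray(expr)) return expr // literals evaluate to themselves
  const [operator, ...args] = expr
  const evaluate = arg => evaluateExpression(arg, properties)

  switch (operator) {
    case 'get':
      return properties[args[0]]
    case '<':
      return evaluate(args[0]) < evaluate(args[1])
    case 'case': {
      // pairs of (condition, output) followed by a fallback output
      for (let i = 0; i + 1 < args.length; i += 2) {
        if (evaluate(args[i])) return evaluate(args[i + 1])
      }
      return evaluate(args[args.length - 1])
    }
    case 'interpolate': {
      // args: [['linear'], input, stop0, output0, stop1, output1, ...]
      const input = evaluate(args[1])
      const stops = args.slice(2)
      if (input <= stops[0]) return stops[1]
      for (let i = 2; i < stops.length; i += 2) {
        if (input <= stops[i]) {
          const ratio = (input - stops[i - 2]) / (stops[i] - stops[i - 2])
          return stops[i - 1] + ratio * (stops[i + 1] - stops[i - 1])
        }
      }
      return stops[stops.length - 1]
    }
    default:
      throw new Error('Unsupported operator: ' + operator)
  }
}

// the two expressions used in the layer style
const sizeExpression = ['interpolate', ['linear'], ['get', 'magnitude'], 2.5, 4, 5, 20]
const colorExpression = ['case', ['<', ['get', 'depth'], 0], 'rgb(223,22,172)', 'rgb(223,113,7)']

const quake = { magnitude: 3.75, depth: 12 }
console.log(evaluateExpression(sizeExpression, quake)) // 12
console.log(evaluateExpression(colorExpression, quake)) // rgb(223,113,7)
```

A magnitude of 3.75 sits halfway between the two stops (2.5 and 5), so the size ends up halfway between 4 and 20.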

Next step: adding some widget!

Everybody loves aggregations

Airship UI components are available as WebComponents. Adding them to the project is straightforward in our case: we will simply add the necessary CSS and JS files in the HTML template.

Let’s modify our code to do just that:

public/index.html

<html>
<head>
  <title>OL Airship bridge test</title>
  ...
  <link rel="stylesheet" href="https://libs.cartocdn.com/airship-style/v2.1.1/airship.css">
  <script src="https://libs.cartocdn.com/airship-components/v2.1.1/airship.js"></script>
  <script src="https://libs.cartocdn.com/airship-bridge/v2.1.1/asbridge.js"></script>
  <style>
    html, body {
      height: 100%;
      width: 100%;
      margin: 0;
      padding: 0;
      font-family: "Open Sans";
      font-size: 14px;
    }
    #map {
      position: absolute;
      width: 100%;
      height: 100%;
    }
    /* new styling for the widget container */
    .dataviz-container {
      position: absolute;
      background-color: white;
      box-shadow: 0px 2px 4px rgba(0, 0, 0, 0.11);
      right: 1.6em;
      bottom: 1.6em;
      padding: 0.8em;
      width: 40%;
    }
  </style>
</head>
<body>
  <div id="map"></div>
  <!-- this contains the dataviz widget and is styled to be positioned above the map -->
  <div class="dataviz-container">
    <!-- this is the Airship web component -->
    <as-time-series-widget
      animated
      responsive
      heading="Animation"
      time-format="%x"
    >
    </as-time-series-widget>
  </div>
  <script src="main.js"></script>
</body>
</html>

The widget should show up but be completely empty! That’s OK, we haven’t given it any data. Its API reference is here. It actually works just like a histogram widget (reference here) but with added functionality. A first step would be to give the widget some data on the features in the source, in the form of an array of objects looking like { start: startTime, end: endTime, value: amountOfFeatures }.
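To make that shape concrete, here is a minimal sketch that buckets a plain array of timestamps, without any OpenLayers involved. The `bucketize` helper and the sample values are hypothetical and only there for illustration; it computes the bucket index of each value directly, which is one possible approach among others.

```javascript
// Hypothetical helper: splits [min, max] into equal-width buckets
// and counts how many values fall into each of them
function bucketize(values, bucketCount) {
  if (values.length === 0) return []
  let min = Infinity
  let max = -Infinity
  for (const value of values) {
    if (value < min) min = value
    if (value > max) max = value
  }
  const width = (max - min) / bucketCount
  const buckets = []
  for (let i = 0; i < bucketCount; i++) {
    buckets.push({ start: min + i * width, end: min + (i + 1) * width, value: 0 })
  }
  for (const value of values) {
    // the maximum value would land one past the end, so clamp it to the last bucket
    const index =
      width === 0 ? 0 : Math.min(Math.floor((value - min) / width), bucketCount - 1)
    buckets[index].value++
  }
  return buckets
}

// five made-up events spread over four days
const day = 24 * 60 * 60 * 1000
const firstDay = Date.UTC(2020, 0, 1)
const timestamps = [0, 1, 1.5, 3, 4].map(offset => firstDay + offset * day)

const sampleBuckets = bucketize(timestamps, 4)
console.log(sampleBuckets.map(bucket => bucket.value)) // [ 1, 2, 0, 2 ]
```

Each bucket covers exactly one day here, and the array returned has the { start, end, value } shape the widget expects.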

Let’s create a new file with a utility to compute such data from a collection of features:

src/aggregation.js

/**
 * Generates a list of buckets describing the distribution of features.
 * @param {Feature[]} features
 * @param {string} attributeName
 * @param {number} bucketCount
 * @returns {{start: number, end: number, value: number}[]}
 */
export function generateBuckets(features, attributeName, bucketCount) {
  let buckets
  let min = Infinity
  let max = -Infinity
  for (let i = 0; i < features.length; i++) {
    const attr = features[i].get(attributeName)
    if (attr > max) max = attr.valueOf()
    if (attr < min) min = attr.valueOf()
  }
  buckets = new Array(bucketCount).fill(0).map((value, index, arr) => {
    const ratioStart = index / arr.length
    const ratioEnd = (index + 1) / arr.length
    return {
      start: ratioStart * (max - min) + min,
      end: ratioEnd * (max - min) + min,
      value: 0,
    }
  })

  // count features
  for (let i = 0; i < features.length; i++) {
    const attr = features[i].get(attributeName)
    for (let j = 0; j < buckets.length; j++) {
      if (attr >= buckets[j].start && attr <= buckets[j].end) {
        buckets[j].value++
        break
      }
    }
  }

  return buckets
}

And then use this nice tool right after the features get loaded:

src/app.js

import { generateBuckets } from './aggregation'

  // ...

  // reference the widget using a query
  const timeWidget = document.querySelector('as-time-series-widget')

  // load map data
  fetch('https://raw.githubusercontent.com/uber-web/kepler.gl-data/master/earthquakes/data.csv')
    .then(response => response.text())
    .then(csv => {

      // ...

      vectorLayer.getSource().addFeatures(features)

      // generate the widget data
      timeWidget.data = generateBuckets(features, 'date', 20)
    })

[Image: the time series widget populated with data]

This widget does work well! For instance, selecting a time range is super easy, simply a matter of clicking and dragging on it:

[Image: selecting a time range in the widget]

Hold on… Nothing is happening on the map when I do this! That makes sense: we update the widget once with static data from the map but haven’t really built a bridge per se, so there is no interaction between the widget and the map. Let’s address this in the next section.

Bridging the gap

Now, what are our options for dynamically hiding/showing points based on the selected time range in the widget?

  • Adding/removing features in real time: considering how OpenLayers has to rebuild rendering buffers on each addition/deletion in the vector source, this would rapidly become impractical; besides, it would require constantly looping over the features and creating a new array, probably putting a lot of burden on the garbage collector. Clearly not the best way!
  • Changing the style of each feature to reflect their state: sadly, handling individual feature styles is not possible with the WebGL-accelerated layer; it would be possible with a standard vector layer, but this wouldn’t allow us to have smooth rendering of the points like we have now. Still not good!
  • Defining a filter expression on the points layer style to exclude features out of the range: that sounds better! Let’s try that.

Remember the style we gave initially to the points layer? We are going to use the variables and filter options to have a bit more control on it:

src/app.js

  const layerStyle = {
    // variables can be mutated by others!
    variables: {
      min: -Infinity,
      max: Infinity,
    },
    // if this resolves to false, the features are not drawn
    filter: ['between', ['get', 'date'], ['var', 'min'], ['var', 'max']],
    symbol: {
      // ...
    },
  }

Again, we are using expressions, this time the between operator which does pretty much what it says. Notice how we use the get operator to read a feature attribute, but the var operator to read from the variables dictionary.

Now, mutating layerStyle.variables should update the points visibility while keeping good performance (i.e. not rebuilding rendering buffers).
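The idea can be sketched in plain JavaScript, outside of any rendering concern: a filter reads its bounds from a shared object every time it runs, so mutating that object is enough to change the outcome. This is only an illustration with made-up events; the `passesFilter` function mimics the `between` expression above and assumes its bounds are inclusive.

```javascript
// shared, mutable bounds (same role as layerStyle.variables)
const variables = { min: -Infinity, max: Infinity }

// mimics ['between', ['get', 'date'], ['var', 'min'], ['var', 'max']]
// assumption: both bounds are inclusive
function passesFilter(properties) {
  return properties.date >= variables.min && properties.date <= variables.max
}

// three made-up events
const events = [
  { eventId: 1, date: Date.UTC(2019, 0, 15) },
  { eventId: 2, date: Date.UTC(2019, 5, 1) },
  { eventId: 3, date: Date.UTC(2019, 11, 24) },
]

function visibleIds() {
  return events.filter(passesFilter).map(event => event.eventId)
}

console.log(visibleIds()) // [ 1, 2, 3 ]

// narrowing the range only requires mutating the variables
variables.min = Date.UTC(2019, 3, 1)
variables.max = Date.UTC(2019, 8, 30)
console.log(visibleIds()) // [ 2 ]
```

Note that the list of events itself is never touched, which is exactly why this approach is cheap.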

src/bridge.js

import { generateBuckets } from './aggregation'
/**
 * @param {Map} map
 * @param {VectorSource} source
 * @param {TimeSeriesWidget} widget
 * @param {string} attributeName
 * @param {function(min, max):void} updateSelection
 */
export function bindMapToTimeWidget(map, source, widget, attributeName, updateSelection) {
  let playing = false
  let currentSelection = null

  widget.data = generateBuckets(source.getFeatures(), attributeName, 20)

  // bind time widget to layer style
  widget.addEventListener('selectionInput', event => {
    currentSelection = event.detail === null ? null : event.detail.selection
    if (currentSelection !== null) {
      updateSelection(currentSelection[0], currentSelection[1])
    } else {
      updateSelection(null, null)
    }
  })
}

This utility is super simple for now. It feeds the widget with the content of the vector source only once, and calls an updateSelection callback when the selected time range changes. Note that we take in a map argument which we don’t use yet… that will come later.

We can then use this in our main file instead of generating the buckets ourselves:

src/app.js

import { bindMapToTimeWidget } from './bridge'

// ...

  // load map data
  fetch('https://raw.githubusercontent.com/uber-web/kepler.gl-data/master/earthquakes/data.csv')
    .then(response => response.text())
    .then(csv => {
      // ...

      vectorLayer.getSource().addFeatures(features)

      // we update the vector style variables on time selection change, and voila!
      bindMapToTimeWidget(
        olMap,
        vectorLayer.getSource(),
        timeWidget,
        'date',
        (min, max) => {
          layerStyle.variables.min = min || -Infinity
          layerStyle.variables.max = max || Infinity
          olMap.render()
        }
      )
    })

Sure enough, this is working fine!

[Image: points on the map filtered by the selected time range]

Notice how we call olMap.render() after a time selection change? Without this, the map would not re-render automatically and nothing would update visually. See, OpenLayers does not redraw the map continuously: it only does so when the view is animating, or when something in the map changed (a source, a layer, etc.).
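This "render on demand" behaviour can be pictured with a tiny sketch. It is not OpenLayers code, just an illustration of the general pattern: asking for a render schedules at most one frame, however many times it is requested. The `createRenderer` name is hypothetical, and `schedule` stands in for what would be `requestAnimationFrame` in a browser.

```javascript
// Hypothetical render-on-demand helper:
// `draw` does the actual drawing, `schedule` queues a callback for the next frame
function createRenderer(draw, schedule) {
  let frameRequested = false
  return {
    render() {
      if (frameRequested) return // a frame is already pending, nothing to do
      frameRequested = true
      schedule(() => {
        frameRequested = false
        draw()
      })
    },
  }
}

// usage with a manual frame queue standing in for the browser
const pendingFrames = []
let drawCount = 0
const renderer = createRenderer(
  () => {
    drawCount++
  },
  frame => pendingFrames.push(frame)
)

renderer.render()
renderer.render()
renderer.render() // three requests, still a single pending frame

while (pendingFrames.length > 0) pendingFrames.shift()()
console.log(drawCount) // 1
```

If nobody asks for a render, nothing is ever drawn, which is why our callback has to request one explicitly after mutating the style variables.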

Our small app is now looking and feeling good, no one would deny that. Still, there is one step further we can take: updating the graph according to what is inside the viewport. This is something commonly seen in dataviz applications, as using the viewport as a spatial filter is both intuitive and easily done. Only, recomputing the buckets 60 times per second and more importantly filtering out features out of view is going to be taxing on the CPU… Fortunately, we have an amazing tool for that: WebWorkers.

Workers, unite!

This section is going to be the last and probably the most complex. For those who are not interested in this, I invite you to take a look at the final source code and the live example.

Back on track: WebWorkers, which we will improperly refer to as workers from now on, are objects created with a given source code, which will run in a separate thread and as such not block the execution of the main thread. Both threads will communicate using messages. Simple!

Simple, only… we want a worker to compute the features in the current view extent as well as regenerate the graph buckets, but the worker will not have access to the vector source as it is part of the main thread! We will have to somehow transfer the features to the worker so that it has a mirror copy of the vector source in the map and use it for its computations.

OK, the first step is writing the worker code. This will not be that hard actually: the worker will expect two kinds of messages, either one containing a list of features (in GeoJSON) to initialize its internal collection, or one containing an extent to do its computations on. The first one will have a type property of features, and the second one buckets.
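Before looking at the real thing, this small protocol can be modelled as a plain function, which makes it easy to reason about without spinning up an actual worker. This is a simplified sketch under a few assumptions: points are plain { lon, lat } records instead of GeoJSON features, the reply only carries the number of matching points where the real worker would send buckets, and `createMessageHandler` is a made-up name.

```javascript
// Illustration only: the two message types handled by a plain function
function createMessageHandler() {
  const points = [] // mirror of the data held by the main thread

  return function handleMessage(message) {
    switch (message.type) {
      case 'features':
        // fill the internal collection, nothing to send back
        for (const point of message.features) points.push(point)
        return null
      case 'buckets': {
        // extent follows the [minX, minY, maxX, maxY] convention
        const [minLon, minLat, maxLon, maxLat] = message.extent
        const inExtent = points.filter(
          p => p.lon >= minLon && p.lon <= maxLon && p.lat >= minLat && p.lat <= maxLat
        )
        return { type: 'buckets', featureCount: inExtent.length }
      }
      default:
        throw new Error('Unknown message type: ' + message.type)
    }
  }
}

const handleMessage = createMessageHandler()
handleMessage({
  type: 'features',
  features: [
    { lon: -122.3, lat: 37.4 },
    { lon: -118.2, lat: 34.1 },
    { lon: -115.5, lat: 32.7 },
  ],
})
const reply = handleMessage({ type: 'buckets', extent: [-123, 33, -117, 38] })
console.log(reply) // { type: 'buckets', featureCount: 2 }
```

In an actual worker, the returned value would be sent back with `self.postMessage` instead of being returned.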

src/aggregation.worker.js

import GeoJSON from 'ol/format/GeoJSON'
import VectorSource from 'ol/source/Vector'
import { generateBuckets } from './aggregation'

// this should be a mirror of the vector source in the main thread
let vectorSource = new VectorSource()

// this is used to decode the GeoJSON received from the main thread
const geojson = new GeoJSON({
  dataProjection: 'EPSG:4326',
  featureProjection: 'EPSG:4326',
})

self.onmessage = event => {
  const type = event.data.type
  const data = event.data

  switch (type) {
    case 'features':
      // here we’re simply filling our source with features
      vectorSource.addFeatures(geojson.readFeaturesFromObject(data.features))
      break
    case 'buckets':
      // this is the aggregation part: we’re looking for features
      // in the given extent and recomputing the buckets on these
      const features = vectorSource.getFeaturesInExtent(data.extent)
      const buckets = generateBuckets(features, data.attributeName, 20)

      // then we’re sending back the buckets to the main thread
      self.postMessage({
        type: 'buckets',
        buckets,
      })
      break
  }
}

Great! Let’s modify our bridge utility a bit to leverage this obedient worker:

src/bridge.js

import { throttle } from 'throttle-debounce'
import AggregationWorker from './aggregation.worker'
import GeoJSON from 'ol/format/GeoJSON'
import { transformExtent } from 'ol/proj'

// used to encode features before sending them to the worker
const geojson = new GeoJSON({
  featureProjection: 'EPSG:3857',
  dataProjection: 'EPSG:4326',
})

/**
 * @param {Map} map
 * @param {VectorSource} source
 * @param {TimeSeriesWidget} widget
 * @param {string} attributeName
 * @param {function(min, max):void} updateSelection
 */
export function bindMapToTimeWidget(
  map,
  source,
  widget,
  attributeName,
  updateSelection
) {
  const worker = new AggregationWorker()

  // send full source content to worker
  worker.postMessage({
    type: 'features',
    features: geojson.writeFeaturesObject(source.getFeatures()),
  })

  // worker messages trigger a `message` event
  worker.addEventListener('message', event => {
    const type = event.data.type
    const data = event.data
    switch (type) {
      // the worker is sending us buckets! give them to the time series widget
      case 'buckets':
        widget.data = data.buckets
        if (!widget.backgroundData.length) {
          widget.backgroundData = widget.data
        }
    }
  })

  function updateTimeWidget() {
    worker.postMessage({
      type: 'buckets',
      extent: transformExtent(
        map.getView().calculateExtent(),
        'EPSG:3857',
        'EPSG:4326'
      ),
      attributeName,
    })
  }
  
  // do the initial computation
  updateTimeWidget()

  // on view change ask the worker to recompute the buckets
  // note: this is throttled so that the message is not sent more than once every 500ms
  map
    .getView()
    .on(['change:center', 'change:resolution'], throttle(500, updateTimeWidget))

  let currentSelection = null

  // bind time widget to layer style
  widget.addEventListener('selectionInput', event => {
    currentSelection = event.detail === null ? null : event.detail.selection
    if (currentSelection !== null) {
      updateSelection(currentSelection[0], currentSelection[1])
    } else {
      updateSelection(null, null)
    }
  })
}

Easy enough! Now the graph updates in real time when the view changes, and with no locking (although the recomputation can take up to 300ms).
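For those wondering what throttling actually does, here is a simplified sketch of the principle. It is not the implementation of the throttle-debounce package (which offers more, such as trailing calls): this `throttleLeading` function is a hypothetical, leading-edge only variant, with an injectable clock so that its behaviour is easy to observe.

```javascript
// Simplified leading-edge throttle: calls arriving less than `delay` ms
// after the last executed call are dropped
function throttleLeading(delay, callback, now = Date.now) {
  let lastCall = -Infinity
  return function throttled(...args) {
    const time = now()
    if (time - lastCall < delay) return // too soon, drop this call
    lastCall = time
    callback.apply(this, args)
  }
}

// usage with a fake clock instead of real time
let fakeTime = 0
let executedCalls = 0
const throttledUpdate = throttleLeading(
  500,
  () => {
    executedCalls++
  },
  () => fakeTime
)

for (const time of [0, 100, 499, 500, 900, 1000]) {
  fakeTime = time
  throttledUpdate()
}
console.log(executedCalls) // 3
```

Out of six calls, only the ones at 0, 500 and 1000 ms go through, which is how the worker is spared from being flooded while the view is moving.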

[Image: the chart updating as the view changes]

Notice how we have access to the worker class using import AggregationWorker from './aggregation.worker'? This is using a Webpack utility called worker-loader which hides a bit of the complexity from us. Behind the scenes, the worker code is actually squashed into a very long string (we could also compile it as a separate JS file but that may cause other issues).

Then it is just a matter of calling const worker = new AggregationWorker() and the worker is up and running! We could also have several identical workers organized in a pool and use a different one each time to prevent tasks stacking up in the worker thread.
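Such a pool could look like the following sketch. This is not part of the final source code: `createWorkerPool` is a hypothetical helper that hands each message to the next worker in turn, and the example uses stand-in objects instead of real workers so it can run anywhere.

```javascript
// Hypothetical round-robin pool: each message goes to the next worker in turn
function createWorkerPool(workers) {
  let nextIndex = 0
  return {
    // every worker needs its own copy of the features
    broadcast(message) {
      for (const worker of workers) worker.postMessage(message)
    },
    postMessage(message) {
      const worker = workers[nextIndex]
      nextIndex = (nextIndex + 1) % workers.length
      worker.postMessage(message)
    },
  }
}

// usage with stand-in workers that simply record what they receive
const fakeWorkers = [0, 1, 2].map(() => {
  const received = []
  return { received, postMessage: message => received.push(message) }
})
const pool = createWorkerPool(fakeWorkers)

pool.broadcast({ type: 'features', features: [] })
for (let i = 0; i < 4; i++) {
  pool.postMessage({ type: 'buckets', request: i })
}
console.log(fakeWorkers.map(worker => worker.received.length)) // [ 3, 2, 2 ]
```

With real workers, a `message` listener would have to be registered on each of them, and keeping one copy of the features per worker has a memory cost to weigh against the gain.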

Well, looking back at where we started, the result is not so bad, is it? Performance should be more than acceptable and the interactivity of the graph allows for interesting analysis. Now, considering we could use these utilities for any kind of data source (well… provided it contains only points!), and with a bit more work on the styling, there sure are many possibilities to explore. Dataviz is also about being creative with the raw material and turning a big chunk of static data into a powerful interactive tool for analysis and decision making.

Sadly… This is where this tutorial ends. Again, I invite you to take a look at the final source code and the live example which contains a bit more than what we have seen here, namely: animation of the time range and usage of the backgroundData property on the graph.

If you have any questions or suggestions, please feel free to get in contact with us! Thanks a lot for reading and happy coding!