Wednesday, October 11, 2006

Sparklines: display 10K data points in one-half of a square inch

Last week I was able to attend Edward Tufte's course in Chicago on data and information visualization. ET is out promoting his new book, Beautiful Evidence. Much of my prior experience is in visualization, so I found a lot of what he had to say really interesting. Since I have been working on the relaunch of MyOutdoors.net, one topic in particular caught my interest: sparklines.

Sparklines are a visual way of displaying thousands of data points in a really small space. A typical use is showing stock trends over long periods of time. Because we implemented elevation profiles in the site redesign, sparklines were a perfect match for displaying a small elevation profile for each of our individual journal entries. For an example of the way we use them, check out http://myoutdoors.net/entry.php?id=2039. It's a hike that my brother Brian did across three different peaks in Wyoming. The sparkline is displayed in the lower-right corner of the page, right next to a link to a larger graph of the elevation data. It shows the general shape of the elevation profile in great detail, in about one-half of a square inch. You can click on the elevation link to see the full-size graph as well. Here's what it looks like:


We used the Sparkline PHP library to implement the sparklines. It was incredibly easy, and we recommend it to anyone. One thing I didn't like, however, was that it seemed to work primarily with integer indices on the x-axis (I couldn't figure out whether it was supposed to work with floats). Because we graph elevation along the route of a journal entry, and the waypoints fall at float-valued distances, we needed to interpolate the elevation between waypoints.

Note that we could have just mapped each waypoint to an integer index in the image. However, many of our journal entries have far more than 100 waypoints because they come straight from GPX data, and the sparkline is only 100 pixels wide. Because the Sparkline PHP library uses gd, we didn't want to draw a line on the server for every pair of consecutive waypoints. (When you use RenderResampled in the sparkline library, it creates a larger image and scales it down for anti-aliasing, so this could get really time consuming.) We want to display sparklines for every journal entry, so with a lot of people on the site this could cause performance issues (no idea if that will actually happen).

So anyway, if you want to create sparklines from a large number of data points, doing something along these lines might help performance on a high-traffic site. (I never actually measured it; it just seemed like this would be faster than drawing every line in software, because of the line drawing itself and probably the memory access pattern.) It's not very difficult, but we thought we'd provide the code here in case anyone has similar needs:

Notes:

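$distance is an array of float-valued distances from the start to each waypoint
$elevation is an array of float-valued elevations at each waypoint
$width is the width in pixels of the sparkline; we use a size of 100x20, so $height is 20
$rows is the number of waypoints, and $total_dist is the distance at the last waypoint
$min and $max are the lowest and highest elevations, and $min_dist and $max_dist are the distances at which they occur
$sparkline is a line sparkline object from the Sparkline PHP library, created beforehand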
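
Several of the values used in the code below have to be computed before the loop runs. Here is a minimal sketch of one way to derive them from the two input arrays. The sample data is made up for illustration, and the variable names are simply the ones the code below expects; this is not necessarily how the site computes them.

```php
<?php
// Hypothetical sample data: cumulative distance and elevation per waypoint.
$distance  = [0.0, 0.5, 1.2, 2.0, 3.1];
$elevation = [2100.0, 2250.5, 2190.0, 2400.0, 2320.0];

$rows       = count($distance);      // number of waypoints
$total_dist = $distance[$rows - 1];  // distance at the last waypoint
$dist_index = 0;                     // the loop starts in the first interval

// Lowest and highest elevations.
$min = min($elevation);
$max = max($elevation);

// Distances at which they occur (first occurrence if there are ties).
$min_dist = $distance[array_search($min, $elevation)];
$max_dist = $distance[array_search($max, $elevation)];
```

With this sample data, the low point is the very first waypoint and the high point is the fourth one, so the red and green markers would land at the left edge and roughly two-thirds of the way across the sparkline.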




// Start in the first waypoint interval.
$dist_index = 0;

// For each pixel in the sparkline, compute the elevation at that pixel.
for ($i = 0; $i < $width; $i++) {
    $cur_dist = ($i / $width) * $total_dist;

    // Advance the index into our distance array.
    // This determines which waypoints we're between.
    // The first condition keeps $dist_index + 1 inside the array.
    while ($dist_index < $rows - 2 && $cur_dist > $distance[$dist_index + 1]) {
        $dist_index++;
    }

    // Compute the fractional position within this interpolation interval,
    // guarding against zero-length intervals (repeated waypoints).
    $span = $distance[$dist_index + 1] - $distance[$dist_index];
    $val = ($span > 0) ? ($cur_dist - $distance[$dist_index]) / $span : 0;

    // Interpolate the elevation at the pixel location.
    $elev = $elevation[$dist_index] +
        ($val * ($elevation[$dist_index + 1] - $elevation[$dist_index]));

    $sparkline->SetData($i, $elev);
}

// Mark the lowest point (red), the highest point (green),
// and the start and end of the route (gray).
$sparkline->SetFeaturePoint(($min_dist / $total_dist) * $width, $min + 1, "red", 3);
$sparkline->SetFeaturePoint(($max_dist / $total_dist) * $width, $max - 1, "green", 3);
$sparkline->SetFeaturePoint(0, $elevation[0], "gray", 3);
$sparkline->SetFeaturePoint($width - 1, $elevation[$rows - 1], "gray", 3);

$sparkline->SetLineSize(4);
$sparkline->RenderResampled($width, $height);
$sparkline->Output();
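
If you want to try the interpolation on its own, without gd or the Sparkline library installed, the loop can be pulled out into a standalone function. This is only a sketch: resample_profile() is a hypothetical helper name, not part of the library, and it assumes at least two waypoints with distances in increasing order.

```php
<?php
/**
 * Resample an elevation profile to one value per pixel using
 * linear interpolation between the surrounding waypoints.
 * Hypothetical helper; assumes count($distance) >= 2.
 */
function resample_profile(array $distance, array $elevation, int $width): array
{
    $last = count($distance) - 1;
    $total_dist = $distance[$last];
    $dist_index = 0;
    $samples = [];

    for ($i = 0; $i < $width; $i++) {
        $cur_dist = ($i / $width) * $total_dist;

        // Move forward until $cur_dist lies inside the current interval,
        // never stepping past the final interval.
        while ($dist_index < $last - 1 && $cur_dist > $distance[$dist_index + 1]) {
            $dist_index++;
        }

        // A zero-length interval (repeated waypoint) would divide by zero.
        $span = $distance[$dist_index + 1] - $distance[$dist_index];
        $val = ($span > 0) ? ($cur_dist - $distance[$dist_index]) / $span : 0.0;

        $samples[] = $elevation[$dist_index]
            + $val * ($elevation[$dist_index + 1] - $elevation[$dist_index]);
    }

    return $samples;
}

// Three waypoints (up 100, then back down) resampled to four pixels.
$samples = resample_profile([0.0, 1.0, 2.0], [100.0, 200.0, 100.0], 4);
```

Each entry of the returned array corresponds to one x position in the sparkline, so it could be fed to SetData in a simple loop. Note that, like the original loop, the last pixel samples slightly before the end of the route rather than exactly at the final waypoint.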

1 comment:

Anonymous said...

You should have a look on the Excel sparkline tool MicroCharts at www.bonavistasystems.com.
Great Stuff!