Ten Rules for Coding with D3

Anyone familiar with JavaScript who has tried their hand at D3 knows that coding in it is a little, well, different. For instance, take this snippet of code included in the dummy example I wrote for the UW-Madison Geography 575 lab my students were just assigned:

var provinces = map.selectAll(".provinces")
    .data(topojson.feature(europe, europe.objects.FranceProvinces).features)
    .enter()
    .append("g")
    .attr("class", "provinces")
    .append("path")
    .attr("class", function(d) { return d.properties.adm1_code })
    .attr("d", path)
    .style("fill", function(d) {
        return choropleth(d, colorize);
    })
    .on("mouseover", highlight)
    .on("mouseout", dehighlight)
    .on("mousemove", moveLabel)
    .append("desc")
    .text(function(d) {
        return choropleth(d, colorize);
    });

Now, someone who is familiar with jQuery or Leaflet will probably recognize the method chaining, and some of the methods may even look familiar. But what’s really going on here is somewhat more complex than the syntax lets on. This fall, I’ve had to put a lot of attention into figuring out how to teach this powerful data visualization library to Cartography majors, many of whom had never written a line of JavaScript before taking the class. Fortunately, I’ve had some great tools at my disposal, including Scott Murray’s excellent book, Mike Bostock’s thorough API documentation, and the awesome D3 Examples Gallery. In making use of these resources, it has come to my attention that there’s a set of unwritten but generally agreed-upon conventions for D3 code that go beyond those of ordinary JavaScript. I’ve also decided that there are a few practices that may not be used universally by D3 programmers but help make the workings of the code more clear for newbies, and therefore should become standard practice. Finally, while teaching this week, I found myself inventing a bit of terminology and combining it with other words defined by Mike Bostock to describe D3 coding to students. It dawned on me that sharing my own set of D3 rules via a blog post might be useful to others who are in the process of making heads or tails of the library, so I humbly offer these up as suggestions.

D3 Code Rules

Chain syntax is not a new term; it refers to the syntax pioneered by jQuery that allows you to piggyback methods in sequence. D3 raises method chaining to an art form, resulting in chains that can get quite long and unwieldy. As Scott Murray puts it, “Both I and your optometrist highly recommend putting each method on its own indented line.” As in the code above, this formatting practice is used universally in the examples posted to the D3 Gallery to make the code neat and understandable.
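To see that the line breaks are purely for the reader’s benefit, here is a minimal sketch using nothing but native array methods (no D3 involved). The helper names ascending and timesTen are my own, made up for illustration.

```javascript
var values = [3, 1, 2];

// Hypothetical helpers, defined only for this illustration.
function ascending(a, b) { return a - b; }
function timesTen(n) { return n * 10; }

// Everything on one line: legal, but hard to scan.
var oneLine = values.slice().sort(ascending).map(timesTen).join(", ");

// The same chain, one method per indented line.
var indented = values
    .slice()            // copy the array so the original is left alone
    .sort(ascending)    // [1, 2, 3]
    .map(timesTen)      // [10, 20, 30]
    .join(", ");        // "10, 20, 30"

console.log(oneLine === indented); // true
```

JavaScript treats the two versions identically; only the second one lets you comment, reorder, or delete a single step without hunting for it.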

Rule 1: Put each new method on its own indented line.


When writing the lab tutorial, I took to calling these chunks of chained methods code blocks or just blocks, which makes sense given a) their nice rectangular gestalt and b) Bostock’s bl.ocks.org site, a viewer for code saved on Gist. I recognize that Bostock may have meant “blocks” as a synonym for “Gists,” that is, whole snippets of sharable code; but I think it works better as a term for the segments of chained methods within the code. Since he didn’t explicitly define what a block is, I am taking the liberty to do so in the way that’s most useful to me.

Two things about blocks have already been conceptual snags for my students. The one I expected and hopefully inoculated them against was misplaced semicolons. Since JavaScript is conveniently sloppy and lets you get away with not placing a semicolon at the end of a statement in unminified code, beginners tend to think that semicolon placement doesn’t really matter. One of my most common errors in writing D3 is to tack more methods on to the end of a block I finished earlier and accidentally forget to move the semicolon, which of course breaks the code because now you have orphan methods that don’t reference anything. For instance:

var provinces = map.selectAll(".provinces")
    .data(topojson.feature(europe, europe.objects.FranceProvinces).features)
    .enter()
    .append("g")
    .attr("class", "provinces"); //SEMICOLON FAIL
    .append("path")
    .attr("class", function(d) { return d.properties.adm1_code });

The code above will not run at all: the semicolon ends the statement on the line where the provinces class is assigned, so the parser reaches .append("path") with no operand in front of it and throws a syntax error.
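The same mistake can be reproduced without D3. The sketch below is only an illustration: it stores two versions of a plain array chain as strings and asks the JavaScript engine whether each one parses. The parses helper is hypothetical, something I wrote for this example.

```javascript
// Hypothetical helper: returns true if the source parses as a function body.
function parses(source) {
    try {
        new Function(source); // parses the source without running it
        return true;
    } catch (e) {
        if (e instanceof SyntaxError) {
            return false;
        }
        throw e; // anything else is unexpected
    }
}

var brokenSource = [
    "var result = [1, 2, 3]",
    "    .map(function(n) { return n * 2; }); // SEMICOLON FAIL",
    "    .filter(function(n) { return n > 2; });",
    "return result;"
].join("\n");

var fixedSource = [
    "var result = [1, 2, 3]",
    "    .map(function(n) { return n * 2; })",
    "    .filter(function(n) { return n > 2; });",
    "return result;"
].join("\n");

console.log(parses(brokenSource)); // false
console.log(parses(fixedSource));  // true
```

Because this is a parse error, none of the script runs, including the lines above the stray semicolon that look perfectly fine.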

Rule 2: If your code breaks, look for a wayward semicolon.


The second conceptual snag, which I anticipated less, is that students struggle to see what the methods actually reference, and even to tell which methods belong to D3 versus native JavaScript or some other library. It’s true that lots of these methods (.on, .append, .attr, etc.) are written the same way in multiple code libraries. I’ve found myself explaining that you have to reach backwards through the sequence of methods to find the original operand (the thing being operated on) and determine how it was created or selected. Understanding the flow of the script is one of the hardest things for a beginning web developer to learn, and stepping through the code forwards and backwards is a good way to become more familiar with it. (One of the most popular mini-assignments I give my students is to comment every line of a code snippet.) It’s like the game Mousetrap, or any other Rube Goldberg machine for those of a different generation. Find the first stimulus in the reaction chain and you should be able to see whether that operand starts with d3, $, L, or just a plain JavaScript object/value. That, in turn, determines which methods are available to manipulate the object.

Rule 3: The methods depend on how the operand was created.


In D3, the operand is often either a selection or a new element. Selection is a D3 term defined by Bostock as “an array of [markup] elements pulled from the current document.” A new element is a markup element added to the document. .select and .selectAll create a new selection (.select puts only one element in the array), while .append and .insert create new elements. The methods that follow an operand and do things to it Bostock calls operators. Thus, a code block may contain several operands, with each operator referencing the most recently selected or created element, e.g.:

var provinces = map.selectAll(".provinces") //FIRST OPERAND--SELECTION
    .data(topojson.feature(europe, europe.objects.FranceProvinces).features) //OPERATOR ON SELECTION
    .enter() //OPERATOR ON SELECTION
    .append("g") //SECOND OPERAND--NEW ELEMENT
    .attr("class", "provinces"); //OPERATOR ON NEW ELEMENT

This can result in confusion if too many new elements are created in a single block. It is a good idea to create only one new element with each block, so you know what the variable assigned to the block is referencing and can easily access it again without creating a new selection. You can always pick up the selection and add on to it in a new block.

The code above violates this principle; I wrote it before I had solidified my own practices. So let’s fix it:

var provinces = map.selectAll(".provinces") //SELECTION
    .data(topojson.feature(europe, europe.objects.FranceProvinces).features)
    .enter()
    .append("g") //NEW ELEMENT
    .attr("class", "provinces")
    .append("path") //NEW ELEMENT
    .attr("class", function(d) { return d.properties.adm1_code })
    .attr("d", path)
    .style("fill", function(d) {
        return choropleth(d, colorize);
    })
    .on("mouseover", highlight)
    .on("mouseout", dehighlight)
    .on("mousemove", moveLabel)
    .append("desc") //NEW ELEMENT
    .text(function(d) {
        return choropleth(d, colorize);
    });

…changes to…

var provinces = map.selectAll(".provinces") //SELECTION
    .data(topojson.feature(europe, europe.objects.FranceProvinces).features)
    .enter()
    .append("g") //NEW ELEMENT
    .attr("class", "provinces");

var provincesPath = provinces.append("path") //NEW ELEMENT
    .attr("class", function(d) { return d.properties.adm1_code })
    .attr("d", path)
    .style("fill", function(d) {
        return choropleth(d, colorize);
    })
    .on("mouseover", highlight)
    .on("mouseout", dehighlight)
    .on("mousemove", moveLabel);

var provincesDesc = provincesPath.append("desc") //NEW ELEMENT
    .text(function(d) {
        return choropleth(d, colorize);
    });

Sure, it’s a little longer, but now we have three variables instead of one, each referencing its own set of elements. All three blocks build on the same data join, and since it started with .selectAll, the methods in each block apply once per element, using the data given to the selection in the first block; appended elements inherit the data of their parents (see this page of the API for more info on selections; or read the simplified explanation in the book).

Rule 4: Create only one new element (or element set) per block.


Notice that I assigned each block to its own variable, which I didn’t have to do for the code to work at this stage. Again, the variable will reference the last operand (selection or new element) in the block if operators are called on it in the future. I find that assigning each block a variable makes it easier to reference the operand as needed, both in future code and in tutorials that explain the code. In this sense, the variable each block is assigned to functions as the name of the block. For instance, if I am working with a student having difficulties, I can say something like, “take a look at your provinces block” or “check the syntax of the provincesPath block.”
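Why the variable ends up naming the last operand is easier to see in a stripped-down model. The sketch below is not D3; MiniSelection is a hypothetical stand-in I made up that copies just two behaviors of a real selection: .attr hands back the same operand, while .append hands back the newly created one.

```javascript
// Hypothetical miniature of a selection; it only tracks a tag name,
// attributes, and children. It is not part of D3.
function MiniSelection(tag) {
    this.tag = tag;
    this.attrs = {};
    this.children = [];
}

// Operator: changes the operand and returns the same operand.
MiniSelection.prototype.attr = function(name, value) {
    this.attrs[name] = value;
    return this;
};

// Creates a new operand: later operators in the chain apply to the child.
MiniSelection.prototype.append = function(tag) {
    var child = new MiniSelection(tag);
    this.children.push(child);
    return child;
};

var map = new MiniSelection("svg");

// One long block: the variable names only the last operand, the path.
var everything = map.append("g")
    .attr("class", "provinces")
    .append("path")
    .attr("class", "provincesPath");

// One new element per block: each operand keeps its own name.
var provinces = map.append("g")
    .attr("class", "provinces");

var provincesPath = provinces.append("path")
    .attr("class", "provincesPath");

console.log(everything.tag);    // "path"
console.log(provinces.tag);     // "g"
console.log(provincesPath.tag); // "path"
```

In the long version the g element still exists, but nothing in the script is named after it, so getting back to it would take a fresh selection.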

Rule 5: Assign each block to a logical variable (the block’s ‘name’).


It sometimes happens that you need to create a new selection of elements that were placed in the document or otherwise reference those elements for styling with CSS. If you have a lot of elements being created by D3, inspecting the document can get confusing. To keep things consistent between the various parts of the DOM, I usually assign each new element a class name that is the same as the name of the block that creates it. That way, I know where the elements I create are coming from in the code.

One of the blocks above (the provinces block) does this; the other two new blocks do not. In the case of the provincesPath block, I needed to assign unique class names to each element in the array based on the data, as those class names are used later in the code to link these path elements to other elements in other graphics. At the time I wrote it, I didn’t think to give it two class names (separated by a space), but that is a logical solution. The desc element set probably should also get a class, now that it’s in its own block. Let me fix these issues now:

var provincesPath = provinces.append("path")
    .attr("class", function(d) { 
        return d.properties.adm1_code + " provincesPath"; //ADDED A SECOND CLASS
    })
    .attr("d", path)
    .style("fill", function(d) {
        return choropleth(d, colorize);
    })
    .on("mouseover", highlight)
    .on("mouseout", dehighlight)
    .on("mousemove", moveLabel);

var provincesDesc = provincesPath.append("desc")
    .text(function(d) {
        return choropleth(d, colorize);
    })
    .attr("class", "provincesDesc"); //NEW CLASS

Rule 6: Assign each new element a class name identical to the block name.


Using element classes (as opposed to ids) is especially important with D3, since you need multiple elements with identical names to use .selectAll and create a multiple-element selection. But what about using .selectAll to create an empty selection? An empty selection (again, Bostock’s term, though poorly explained in the API) happens when .selectAll is given a selector that does not yet match any elements in the document. It is one of the more cognitively challenging concepts in D3: nothing is added to the document at this point, but the empty selection gives .data and .enter a starting point from which to set up placeholders for elements-to-be. The provinces block above starts by creating an empty selection; it applies the “.provinces” selector, which does not match any existing elements at the time .selectAll is called. The elements (new <g> tags) are actually created three lines down and assigned their class name on the line below that. So why bother feeding .selectAll a selector in the first place? It actually does work to omit the selector, i.e.:

var provinces = map.selectAll(".provinces")

//WORKS THE SAME AS

var provinces = map.selectAll()

//WHEN CREATING EMPTY SELECTIONS

The problem comes when you call this method inside a function that may be used both to create new elements and to reset the matching elements if they already exist. Without the selector, you’ll be stuck just creating more identical elements rather than grabbing any existing ones from the document to manipulate. Aside from this “just in case” scenario, there is something to be said here once again for human semantics: the selector links the .selectAll statement visually to the elements that will be created later in the block.
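Here is a rough model of that “just in case” scenario in plain JavaScript. Everything in it is hypothetical: fakeDocument stands in for the DOM, selectAllByClass for .selectAll, and drawProvinces for a function wrapping the provinces block. It assumes elements and data are matched up by index, and it leaves out everything else a real data join does.

```javascript
// Hypothetical stand-in for the document: a flat list of element records.
var fakeDocument = [];

// Stand-in for .selectAll: with a class name it returns the matching
// records; without one it always returns an empty selection.
function selectAllByClass(className) {
    if (!className) {
        return [];
    }
    return fakeDocument.filter(function(el) {
        return el.className === className;
    });
}

// Stand-in for .data(...).enter().append("g"): only data values with no
// existing element at the same index get a new element.
function drawProvinces(data, className) {
    var existing = selectAllByClass(className);
    data.forEach(function(d, i) {
        if (i < existing.length) {
            existing[i].datum = d; // reuse the element that is already there
        } else {
            fakeDocument.push({ tag: "g", className: "provinces", datum: d });
        }
    });
    return fakeDocument.length;
}

var afterFirstDraw = drawProvinces(["a", "b"], "provinces"); // 2 elements
var afterRedraw = drawProvinces(["a", "b"], "provinces");    // still 2
var afterBlindDraw = drawProvinces(["a", "b"]);              // 4: duplicates
```

With the selector, the second call finds the two existing elements and creates nothing new; without it, every call starts from an empty selection and piles on another set.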

Rule 7: Always pass the block’s name as a class selector to the .selectAll method, even when creating an empty selection.


Making groovy visualizations is all about how you style the elements on the page. The great advantage of D3 is that it gives you massive power to dynamically assign and modify the positioning, size, color, effects, animations, etc. of the elements you use it to create based on the data you pass to it. In many instances, though, there may be some elements that do not need to be modified after they are created, and others for which it is helpful to have a default style that can be overridden by user interaction. In these instances, it makes sense just to assign the element(s) a class and use CSS to create some static styles. For instance:

//IN THE SCRIPT

var countries = map.append("path")
    .datum(topojson.feature(europe, europe.objects.EuropeCountries))
    .attr("class", "countries")
    .attr("d", path);

//IN A CSS STYLESHEET

.countries {
    fill: #fff;
    stroke: #ccc;
    stroke-width: 2px;
}

Rule 8: Assign static or default styles using a CSS stylesheet.


When styling dynamically, of course, you want to assign styles in your code blocks. SVG graphics can be styled by passing the style rules as either attributes or in-line CSS styles. You might think (as I did when I started) that passing the styles as individual attributes would take precedence over in-line CSS rules assigned to a single style attribute, but in fact it’s the other way around. For instance:

    .style("fill", function(d) {
        return choropleth(d, colorize);
    })

//OVERRIDES

    .attr("fill", function(d) {
        return choropleth(d, colorize);
    })

Things can get confusing if you assign a style rule as a style in one place and then try to re-assign it as an attribute in another. Thus, it’s best to pick one or the other, and style generally seems more appropriate to me. Note that this does not apply to element x/y positions or path d strings, which are only available as attributes.
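The precedence at work can be summed up in a few lines. This is a simplified model, not a real CSS engine: effectiveFill is a hypothetical function, and it ignores !important, selector specificity, and inheritance. What it does capture is that an inline style beats a stylesheet rule, and that a stylesheet rule in turn beats a presentation attribute.

```javascript
// Hypothetical model of where an SVG element's fill comes from.
// `sources` may contain any of: inlineStyle, stylesheet, attribute.
function effectiveFill(sources) {
    if (sources.inlineStyle !== undefined) {
        return sources.inlineStyle; // set with .style("fill", ...)
    }
    if (sources.stylesheet !== undefined) {
        return sources.stylesheet;  // set with a rule such as .countries { fill: #fff; }
    }
    if (sources.attribute !== undefined) {
        return sources.attribute;   // set with .attr("fill", ...)
    }
    return "black";                 // the initial value of fill in SVG
}

var attrOnly = effectiveFill({ attribute: "red" });
var attrAndCss = effectiveFill({ attribute: "red", stylesheet: "#fff" });
var allThree = effectiveFill({ attribute: "red", stylesheet: "#fff", inlineStyle: "blue" });

console.log(attrOnly);   // "red"
console.log(attrAndCss); // "#fff"
console.log(allThree);   // "blue"
```

The middle case is the one that tends to bite: a fill set with .attr appears to do nothing as soon as a default style for that class exists in the stylesheet.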

Rule 9: For dynamic inline style rules, use .style instead of .attr.


Through all of these recommendations, I haven’t really touched on the data that is going into the element creation and manipulation. D3 works with data in the form of arrays. The combination of .select and .datum executes the operators following it once, treating the data passed to .datum as a single data point (or datum). The combination of .selectAll, .data, and .enter primes the selection to execute the following operators once for each value in the array that is passed to .data.
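The difference between the two combinations comes down to how many times the operator functions get called. A minimal sketch, using two hypothetical helpers of my own (they are not D3 functions) and strings as stand-ins for map features:

```javascript
// Like .select(...).datum(value): the operator runs once and is handed
// the whole value, whatever it is.
function applyToDatum(datum, operator) {
    return [operator(datum, 0)];
}

// Like .selectAll(...).data(array).enter(): the operator runs once for
// each item in the array, receiving the item and its index.
function applyToData(data, operator) {
    return data.map(function(d, i) {
        return operator(d, i);
    });
}

var features = ["north", "south", "east"]; // stand-ins for real features

function lengthOf(d) {
    return d.length;
}

var once = applyToDatum(features, lengthOf);   // [3]: d was the whole array
var perItem = applyToData(features, lengthOf); // [5, 5, 4]: d was each string
```

The same function(d) sees very different things in the two cases, which is worth checking with console.log(d) whenever an operator returns something unexpected.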

The three main data types for single values in JavaScript are Number (e.g., 42), String (e.g., “the answer to life, the universe, and everything”), and Boolean (e.g., true or false). As a weakly typed language, JavaScript doesn’t make you declare the data type of variables and lets you play fast and loose with the different data types. But since the outcome of certain operations differs depending on the data type, it’s best to pay close attention to what type your data actually are (console.log(typeof d) will tell you), and convert them explicitly before use if necessary.
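A quick illustration of what can go wrong, using made-up values of the kind you get when numbers are read in from a text file and arrive as strings:

```javascript
// Hypothetical rows: every value is a string, even the numeric ones.
var rows = [
    { id: "a", value: "9" },
    { id: "b", value: "10" },
    { id: "c", value: "100" }
];

console.log(typeof rows[0].value); // "string"

// With strings, + concatenates and sort() compares character by character.
var stringSum = rows[0].value + rows[1].value; // "910"
var stringMax = rows
    .map(function(d) { return d.value; })
    .sort()
    .pop(); // "9"

// Force-type the values once, before anything else uses them.
rows.forEach(function(d) {
    d.value = Number(d.value);
});

var numberSum = rows[0].value + rows[1].value; // 19
var numberMax = Math.max.apply(null, rows.map(function(d) {
    return d.value;
})); // 100
```

Neither of the string results throws an error, which is exactly why they are dangerous: the map simply gets colored by the wrong numbers.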

Rule 10: Make sure the data are the correct type for the operations using them.


There is lots more that can be said—and hopefully will be said—about coding with D3. For instance, I haven’t even mentioned generator functions—functions that return other functions—which deserve a whole blog post to themselves. These rules and terms are suggestions, but I realize every developer has their own style and there could even be logic errors in mine. I don’t really care whether you start using what I’ve defined here. Rather, my take-home message is this: we should be thinking not only in terms of how to make sense of D3 ourselves, but also how to teach it to others in a logical and consistent fashion. I am sure I’ll come up with more ideas about this over the next few weeks of teaching experience, and I hope that others add theirs on as well.
