
Visualizing global exports and imports of conventional arms

Documentation of a data visualization research process for the United Nations Register of Conventional Arms (UNROCA)

Project Website: https://www.unroca.org

By Alexa Steinbrück. August 14, 2017

 

When I was working as an IT consultant for the United Nations, I developed a data visualization concept for the United Nations Register of Conventional Arms (UNROCA). The work was part of the overall redesign of the UNROCA platform, for which I served as the main developer (frontend and backend) and project lead. The project was realized between September 2015 and July 2017.

The final data visualizations were implemented using the data visualization library D3 and Angular.js. Details regarding the implementation will be discussed in the last section.

 

Here I am going to retrace parts of the creative-iterative process we underwent in pursuit of the ideal way to visualize UNROCA's data on the web.





What is UNROCA?

A database, a political mechanism, an instrument of the United Nations

According to SIPRI, the Stockholm International Peace Research Institute, UNROCA is "the key international mechanism of official transparency on arms transfers".

UNROCA was established by a United Nations General Assembly resolution in 1991 and has since been managed by the United Nations Office for Disarmament Affairs (UNODA), based at the Headquarters of the United Nations in New York.

UNROCA was founded to "enhance confidence, promote stability, help States to exercise restraint, ease tensions and strengthen regional and international peace and security" and "to prevent the excessive and destabilizing accumulation of arms (...) in order to promote stability and strengthen regional or international peace and security".

It was established when inter-State warfare had, for a long time, been considered the main threat to peace and security. It focuses on transfers of those major conventional weapons that typically can be used in offensive military operations carried out across international borders. Source: "UNODA occasional papers"

The term "conventional arms" refers to weapons that are not weapons of mass destruction (e.g. nuclear, biological or chemical). UNROCA covers 7 categories of arms, which are deemed the most lethal ones in the sector of Conventional Arms:

Battle tanks
Armoured combat vehicles
Large-calibre artillery systems
Combat aircraft (includes both manned and unmanned aerial vehicles, aka drones)
Attack helicopters (includes both manned and unmanned aerial vehicles, aka drones)
Warships
Missiles and missile launchers
 

In 2003, countries decided that small arms could be added to the Register. Many countries now include small arms and light weapons (SALW) in their yearly reports.

Member states are also invited to submit information on their military holdings and procurement from domestic production of major conventional weapons. States report voluntarily once a year.

The data dimensions

The core of the UNROCA database is data on arms exports and imports in the "heavy arms" sector.

A curious consequence of the states' voluntary participation is the discrepancies between the reports of exporters and importers. Since arms trade is a bilateral business, every trade can be seen from two sides: the exporter's side and the importer's side. Do both countries report the same numbers? UNROCA has been called a monitoring tool, but it must be emphasized that it is by no means intended to blame countries. Discrepancies can have many reasons.

Not all countries report in an equally comprehensive way. Some countries have never reported to UNROCA. However, this doesn't mean a total lack of information regarding those countries. We can use the data that the trading partners of those countries submitted.

Data dimensions
 

I presented the above graphic to the client to illustrate how many dimensions are inside the UNROCA dataset and to suggest that there are numerous possibilities to unfold and visualize them.

I quickly realized that thinking in terms of dimensions was quite intuitive to me, but rather confusing to the client. I am going to stick to this term nevertheless because it is quite common in the field of statistics and data visualization literature.

If you look at the dataset "Exports (Heavy Arms)" alone, it looks high-dimensional:

  • 1st Dimension: Country that reported
  • 2nd Dimension: Year of a transaction
  • 3rd Dimension: Trading partner
  • 4th Dimension: Type of weapon
  • 5th Dimension: Number of items

And in the strict sense the following could also be included:

  • 6th Dimension: Comments/Details on the transfer
  • 7th Dimension: Discrepancy: What the trading partner reported as an import
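To make the list more tangible, here is a minimal sketch of how a single record with these seven dimensions could be modelled. The field names and values are made up for illustration; they are not UNROCA's actual schema or real data.

```javascript
// Hypothetical shape of one export record; field names and values are
// illustrative only, not UNROCA's actual schema or real data.
const exampleRecord = {
  reporter: "Country A",      // 1st dimension: country that reported
  year: 2015,                 // 2nd dimension: year of the transaction
  partner: "Country B",       // 3rd dimension: trading partner
  category: "Battle tanks",   // 4th dimension: type of weapon
  items: 12,                  // 5th dimension: number of items
  comment: "Second-hand",     // 6th dimension: comments/details on the transfer
  partnerReportedItems: 10    // 7th dimension: what the partner reported as an import
};

// In this model, the number of dimensions is simply the number of fields.
const dimensionCount = Object.keys(exampleRecord).length;
```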
 

The challenge of multi-dimensional data visualization. Intermediate thoughts.

Of course there are datasets that have more dimensions, up to thousands of dimensions (e.g. statistical models used in Machine Learning). But for data visualization, 7 dimensions are definitely already a challenge!

Often, a good strategy is to focus on specific dimensions and to "curate" the data in order to answer certain questions about the data (or to tell a story). Often it is just not necessary to expose all dimensions or to provide "unfiltered" access to all dimensions.

Remember that those data visualizations are going to live inside a web browser which offers the magic skills of interactivity which the print medium doesn't have. This means we don't have to stick to a two-dimensional canvas only! Interactive graphics facilitate exploration.

The questions

Instead of speaking about data in terms of dimensions, it may be more beneficial to speak in terms of questions. There are many questions one could ask about UNROCA's data. For example:

 

How many battle tanks did Germany export last year?
To which countries and what were the exact models?
Where do most of a country's heavy weapons come from?
Where do countries in conflict get their weapons from?
How did the trade relations between two countries change over the years?
Who are the biggest exporters of missile launchers today?
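Seen from the data side, each of these questions is a filter plus an aggregation. A minimal sketch for the first two questions, assuming hypothetical field names and made-up sample records (not real UNROCA data):

```javascript
// Made-up sample records with hypothetical field names.
const records = [
  { reporter: "A", year: 2016, partner: "B", category: "Battle tanks", items: 10 },
  { reporter: "A", year: 2016, partner: "C", category: "Battle tanks", items: 5 },
  { reporter: "A", year: 2016, partner: "B", category: "Warships", items: 1 },
  { reporter: "A", year: 2015, partner: "B", category: "Battle tanks", items: 7 }
];

// Keep only the records for one reporter, year and weapon category.
function select(data, reporter, year, category) {
  return data.filter(
    r => r.reporter === reporter && r.year === year && r.category === category
  );
}

// "How many battle tanks did country A export in 2016?"
function totalExported(data, reporter, year, category) {
  return select(data, reporter, year, category)
    .reduce((sum, r) => sum + r.items, 0);
}

// "To which countries?"
function partnersOf(data, reporter, year, category) {
  return [...new Set(select(data, reporter, year, category).map(r => r.partner))];
}

const tanks2016 = totalExported(records, "A", 2016, "Battle tanks");
const tankPartners2016 = partnersOf(records, "A", 2016, "Battle tanks");
```

Every additional question fixes or frees a different set of dimensions, which is why the dimension view and the question view describe the same thing.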

 

Those questions could be addressed by the data which UNROCA provides, but do they have to be addressed? What exactly needs to be made visible in the data? The data can be presented in a way that makes some questions easier to answer than others.

Some questions are more relevant than others. Some are just not covered by the mandates that a UN project is subject to and others are too politically sensitive. So which questions do we want to answer, are we allowed to answer (and how far can we stretch a mandate)?

Brute force (to the rescue)

I must confess, I had not previously been involved with military and defense topics and I have no firsthand experience in trading arms. Hence, I figured that the best way to approach the topic would be rigorously mechanical:

I compiled a list of all questions you can possibly ask about the data, regardless of whether it would make sense to ask such a question. In computer science, this approach to problem solving is known as brute force, and this is what machines do when they have no smart strategy. Sometimes we as humans can also decide to do it this mechanical way.
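The mechanical spirit of such a list can be sketched in a few lines, under the assumption that every combination of dimensions counts as one candidate question. This only illustrates the brute-force idea with shorthand dimension names; it is not the procedure that was actually used to compile the list.

```javascript
// Shorthand names for the first five dimensions (labels are illustrative).
const dimensions = ["reporter", "year", "partner", "category", "items"];

// Enumerate every non-empty subset of the dimensions using bitmasks:
// bit i of the mask decides whether dimension i is part of the subset.
function allCombinations(list) {
  const result = [];
  for (let mask = 1; mask < (1 << list.length); mask++) {
    result.push(list.filter((_, i) => (mask & (1 << i)) !== 0));
  }
  return result;
}

// Each combination is one candidate question, sensible or not.
const candidates = allCombinations(dimensions);
```

With five dimensions this already yields 2^5 - 1 = 31 combinations, which shows why a human filter is needed afterwards.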

I presented the list to the client and asked them to select the questions that they deemed relevant and to delete the ones that were irrelevant to them. The discussion that emerged from this list was actually an excellent introduction to the topic for me. It gave me a feeling for the political sensitivity of certain angles and topics, which helped me find appropriate visual solutions and types of visualization.

During this meeting I understood: First and foremost we need to show who the trading partners of a particular country are. Then, what exactly, when and how much was traded. Furthermore, it is interesting to see how the import or export of a specific type of weapon has changed over time for a particular country.

To wrap up, we have two main focuses:

  • trading partners
  • yearly dynamics

... and a side aspect (which doesn't need to be spotlighted):

  • discrepancies

Prioritizing dimensions

Now that we know which dimensions we can focus on, it might be a good idea to split our data visualization effort into two stages:

  • One focusing on the trading partners.
  • Another one, focusing on the yearly dynamics.

Visualizing connections with trading partners

Now we can finally get our hands dirty, make some wild sketches and mess around with quick prototypes!

Why not a map!

If you think about connections between countries it is quite natural to think of a map in the first place. Lines could be drawn on top of countries to symbolize a trade connection. But this isn't all the information we need to convey! There are those other dimensions, e.g. the number of items that have been traded, the year dimension, and the different types of weapons. So how much information can we put into this special type of visualization?




 

Graphical features of the line (thickness, color, stroke) could signify another data dimension, e.g. the quantitative aspect of a trade relation (number of items traded). This approach shows its limitations quite early. The visual result would appear messy because the lines would hide other countries. Also, relations between neighboring states would be de-emphasized, while relations between states that are geographically further apart would receive more importance.

Data visualization legend Edward Tufte hit the nail on the head:

„Maps wrongly equate the visual importance of each country with its geographic area (...). Our visual impression of the data is entangled with the circumstance of geographic boundaries, shapes and areas (...).“

What we need is more abstraction


 

What about this? It’s like a many-to-many chart.

But this is really just a bold simplification. In reality there are far too many states, not just those 7 on each side like in the sketch. Endless vertical scrolling could result in a bad user experience, because one could easily lose the sense of the big picture.

[Comment: There are actually examples out there where it works quite well - For example Carlo Zapponi's global migration visualization "people move.in"]

And how to deal with those other dimensions (year, type of weapon, number of items)?

We could bring in the year dimension with a slider, with which users can select the year they are interested in. The type-of-weapon dimension could be a simple drop-down menu. And what about the other data dimensions, e.g. the amount dimension? This one could be mapped to the thickness of the line. But still, it's quite a graphically messy mapping.

But I have deeper doubts about this type of visualization: Is it really logical to have two lists? Actually we have just one list of countries and their connections among each other. So what if we let 2 different regional groups (e.g. Western and Eastern) face each other? Would this type of visualization be a natural fit? Actually no, because states of the same regional group have trade relations among themselves. Mathematically speaking, it's not a bipartite graph.
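The objection can be checked mechanically: a two-column layout only fits if every trade link crosses from one column to the other. A minimal sketch with made-up countries, groups and links (it tests one given split into two groups, not bipartiteness in general):

```javascript
// Made-up group assignment and trade links, purely for illustration.
const group = { A: "West", B: "West", C: "East", D: "East" };
const links = [
  ["A", "C"],
  ["B", "D"],
  ["A", "B"] // a trade within the same group
];

// Collect all links whose two ends sit in the same group.
function linksWithinGroups(groupOf, edges) {
  return edges.filter(([from, to]) => groupOf[from] === groupOf[to]);
}

const offending = linksWithinGroups(group, links);

// The two-column layout only works if no such link exists.
const fitsTwoColumns = offending.length === 0;
```

As soon as a single link stays inside one group, it has no natural place in a layout where lines run from the left list to the right list.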

Does a circular layout fit the data?


 

Circles are good for visualizing relations between one single list of items.
As a side effect we economize some space here and make it more compact.

Unfortunately now we have again the same problem that we had with the geographical map:
This layout visually de-emphasizes connections between countries that are neighbours on the circle and over-emphasizes connections between countries that sit on opposite sides.
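The distortion is easy to quantify: for items spaced evenly on a circle, the length of a connecting line depends only on how far apart the two items happen to be placed, not on the data. A small sketch, assuming 36 countries on a circle of radius 100 (both numbers are arbitrary):

```javascript
// Length of the straight line between two of n items spaced evenly on a circle.
// The chord for a central angle t on a circle of radius r is 2 * r * sin(t / 2).
function chordLength(radius, n, i, j) {
  const angle = (2 * Math.PI * Math.abs(i - j)) / n;
  return 2 * radius * Math.sin(angle / 2);
}

const neighbours = chordLength(100, 36, 0, 1); // adjacent positions
const opposite = chordLength(100, 36, 0, 18);  // opposite positions
```

The line between opposite positions spans the full diameter, while the one between neighbours is more than ten times shorter, even if both stand for the same number of traded items.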

And also another challenge remains: How to represent those other data dimensions, most importantly the number of items traded?

The big picture is too big

Maybe it is just too much to try to give the big picture right away and to deal with all countries at once! What if we set up the website as "country profiles", meaning that we focus on 1 specific country at a time? This subtracts one dimension, namely the "reporting country" dimension.


 

The country we are focusing on could be placed in the middle of the circle, reaching out to the perimeter of the circle where the trading partners are situated.

Doing it this way, we have an equal distance to all other countries (which is fair and doesn't distort the data). To add some more simplicity, we can decide to show not all countries, but only those to which a connection exists. This reduces the number of countries and makes more space available for the representation of other data dimensions.

Let's try to find a mapping for the amount dimension: Instead of mapping it to the thickness of the connection line, we map it to the arc length that a country takes up on the circle contour:
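This mapping rule boils down to giving each trading partner a share of the circle that is proportional to its share of the total. A minimal sketch with made-up amounts and hypothetical field names:

```javascript
// Made-up amounts per trading partner.
const partners = [
  { name: "B", items: 30 },
  { name: "C", items: 10 },
  { name: "D", items: 20 }
];

// Give each partner a slice of the available angle (in radians)
// proportional to its share of the total number of items.
function arcLayout(data, fullAngle = 2 * Math.PI) {
  const total = data.reduce((sum, d) => sum + d.items, 0);
  let start = 0;
  return data.map(d => {
    const sweep = (d.items / total) * fullAngle;
    const arc = { name: d.name, startAngle: start, endAngle: start + sweep };
    start += sweep;
    return arc;
  });
}

const arcs = arcLayout(partners);
```

Passing a smaller `fullAngle` would distribute the partners over a partial arc instead of the whole circle.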


 

But there's a potential problem with this visualization: It can be misinterpreted as a pie chart!

A variant: The Retina diagram

We don't use the whole circle, but just a partial arc of it. Again, the arc portion of each country reflects the amount dimension of each trade connection.


 

In order to reduce the number of countries, we can group them by regional groups and have a "Retina" per regional group:


 

Overall Problem/Topic: The large number of "categorical/ordinal choices" (in our case: countries)

Possible solutions:

  • Arrange it vertically because this is the only dimension that is potentially endless in a web page world
  • Reduce the number of choices by:
    • Grouping it (e.g. into regional groups)
    • Cutting away the clutter: Just show the relevant data (countries that have a real connection)

What I keep having in mind all the time:
Do we really have to represent connections as connections (lines, strokes, links)? Isn't it enough to just list them and call them connections? How much does a visual representation need to match the archetypical mental models that people have? The above ideas are still not satisfying enough. And a big point is that we haven't prototyped them using real data. The sketches on paper represent visualizations for ideal data, not for the real data.

Stuck!

Alright, let’s admit - We are stuck here. Let’s approach it from another angle. Let’s ignore the trading partner dimension for a moment and start with visualizing the volume of import/export of a specific country first!

Things are flowing in and out, a direct comparison of import and export is now possible.
Mapping rule: Length of bars symbolizes volume/amount of traded arms

Let's group the countries into regional groups, and offer the possibility to "zoom in".



 

Clicking on different parts of the visualization ("Hot Areas") gets you one dimension deeper into it.

"Hot Area" 1:
A click on the "Region" Hot Area lets you zoom into a region, "exploding" it into its countries:



 

"Hot Area" 2:
A click on the bar reveals the different weapon categories:



 

"Hot Area" 3:
A click on the extensions on the left and right of each bar shows the trading partners (imports or exports respectively):

   
 

But! What's happening here is a sneaky change of mapping rules:

  • The first visualization maps the amount to the graphical property of line length
  • The second visualization maps the amount to the graphical property of line thickness

Changing mapping rules is confusing for the reader. Also: how do we deal with the other dimensions?

1. A slider to select a year

2. A select box to toggle between the importer's and the exporter's view ("discrepancy switch")

Okay, so far so good. The details haven't been worked out yet, but the overall concept seems stable.

Let's present the ideas (sketches) to the client. Feedback! Iterations!

Insights:

  • Don't compare countries with each other
  • Don't compare regional groups with each other
  • It's really country-based, not regional-group-based
  • User perspective: The typical user wants information about 1 specific country

Conclusion:

We should restructure the website more like country profiles. That's an idea that already came up at the beginning.
From a data visualization perspective, this subtracts one dimension from the dataset (the reporting country). Exposing one dimension of a dataset as its own web page (a country profile) actually demonstrates that there is an overlap between the fields of website layout, information architecture and data visualization.

The good old table (is dataviz too)!

Finally we went back to the good old table, back to raw numbers - no visualizations! The table is still the best visual solution, because:

  • We have a dataset with a wide range of values (bad for visualization, better for raw numbers)
  • The trading partner dimension (qualitative) can be quite lengthy, therefore a vertical table column is the most space-efficient layout inside a website. Websites are made to be scrolled after all.

Insights:

Sometimes it's better to go one step back. There is absolutely no obligation to be fancy or graphically mind-blowing. The top goal should be to enable the best understanding. If a simple table can do the job best, then it has to be this table. I like to cite Dieter Rams' 10th principle of good design here: "Good design is as little design as possible"

Visualizing the discrepancies

There are often discrepancies between the numbers that a country and its trading partners report. How can we visually present 2 numbers that are in a sense opposed to each other? We don't want to emphasize the discrepancies too much (Remember: "we don't want to blame countries"), but rather show them as two sides of a coin, with just a slight bias towards one side:

One visual solution could be to give one side a stronger contrast and make the other side slightly graphically dimmed:

Let's add a "How to read this" illustration to the website to explain those 2 opposing numbers to the reader:

Visualizing temporal dynamics

The type of data we have here is a time series. Time series are very frequently plotted as line charts. But let's hold on for a second: What is the nature of the data? It is not a continuum: A number from year x isn't the number from year x-1 plus a certain amount. This would be the case for "Military holdings": The military holding of a certain year is the holding of the year before plus or minus an amount. There a line chart would be appropriate, but not for the dataset of Exports/Imports. A bar chart would be better, and given that we have 7 weapon categories, a stacked bar chart would be the logical conclusion. But stacked bar charts are hard to read, because only the bottom segment of each bar sits on a common baseline. Better: Blob-Pauses-Blobs (like a heart rate or something)
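The difference between the two kinds of yearly numbers can be sketched in code. Holdings behave like a running total, so each value builds on the previous one. Export and import numbers are independent per-year counts with no such link. The numbers below are made up:

```javascript
// Made-up net changes of a country's holdings, one entry per year.
const yearlyNetChange = [5, 0, 3, -2];

// Holdings of a year = holdings of the year before plus or minus an amount.
function holdingsOverTime(initial, changes) {
  const result = [];
  let current = initial;
  for (const change of changes) {
    current += change;
    result.push(current);
  }
  return result;
}

// A connected series: drawing a line between the points is meaningful.
const holdings = holdingsOverTime(100, yearlyNetChange);

// Made-up export counts per year: each value stands on its own, and a
// year without trade is simply 0, not a continuation of the year before.
const exportsPerYear = [12, 0, 0, 40];
```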

Let’s ignore the different types of weapons for a moment. Then it would be one horizontal line with blobs on it:

Now the question is: How do we expose the Type-of-weapon dimension? How can we show the different types of weapons? That's easy: We just plot them underneath each other.

What we get then is what I call a "bubble chart". Another name is "punchcard chart". This type of chart is a kind of scatter plot, but with qualitative values on the y-axis.
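The layout rule of such a chart is simple: the year decides the column, the weapon category decides the row. A minimal sketch, assuming a grid of 40px cells and hypothetical field names (the cell size is an arbitrary choice, not the value used on the website):

```javascript
// The 7 UNROCA categories, one row each, in the order listed earlier.
const categories = [
  "Battle tanks",
  "Armoured combat vehicles",
  "Large-calibre artillery systems",
  "Combat aircraft",
  "Attack helicopters",
  "Warships",
  "Missiles and missile launchers"
];

// Place a bubble in the centre of its grid cell:
// x comes from the year (quantitative), y from the category (qualitative).
function bubblePosition(record, firstYear, colWidth = 40, rowHeight = 40) {
  const row = categories.indexOf(record.category);
  return {
    x: (record.year - firstYear) * colWidth + colWidth / 2,
    y: row * rowHeight + rowHeight / 2
  };
}

const pos = bubblePosition({ year: 2014, category: "Warships" }, 2012);
```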

Defining the mapping rules

We don't map linearly. This is a compromise in favor of legibility. This is what would happen if we mapped linearly:


 

The compromise is: The bubble radius is just a hint, the written numbers on the labels are the real values. The graphical mapping is not "reliable" in this sense. There is a double representation: bubble radius AND number. This is not redundant; it helps an intuitive understanding of the data, while access to reliable and correct numbers is always possible. Another compromise for readability: The minimum bubble size is 5px, so that the surface can provide a colored background for the number.
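One way such a non-linear mapping could look is a square-root scale with a floor. This is a sketch under assumptions, not the mapping used on the website: the square root, the 30px upper bound and reading the 5px minimum as a radius are all chosen for the sake of illustration.

```javascript
const MIN_RADIUS = 5;  // minimum size from the text, read here as a radius (assumption)
const MAX_RADIUS = 30; // assumed upper bound for this sketch

// Square-root mapping (an assumption): large values are compressed,
// small values are lifted to the minimum so the label stays readable.
function bubbleRadius(items, maxItems) {
  if (items <= 0) return 0; // nothing traded, no bubble
  const scaled = Math.sqrt(items / maxItems) * MAX_RADIUS;
  return Math.max(MIN_RADIUS, Math.min(MAX_RADIUS, scaled));
}

const small = bubbleRadius(1, 10000);     // 0.3px before clamping
const medium = bubbleRadius(2500, 10000); // a quarter of the maximum value
const large = bubbleRadius(10000, 10000);
```

Because of the clamping, a bubble for 1 item and a bubble for 100 items can end up with the same radius, which is exactly why the written number has to stay on the label.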

Now we have still some remaining dimensions to deal with:

  • Who are the trading partners?
  • Transaction details
  • Discrepancies (reports of the export country vs. the import country)

Let's start with the latter (the discrepancies)

Visualizing the discrepancies (again)

The idea: We plot the 2 bubbles on top of each other in different colors (a main color, and a shaded color). If there is a discrepancy between the numbers a „corona“ shows up.
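The stacking logic behind this idea can be sketched as follows. The function and field names are hypothetical, and "main" and "shaded" are placeholders for whatever colors are actually used:

```javascript
// Build the two bubbles for one cell of the chart: one for the number the
// reporting country submitted, one for what its trading partners submitted.
function stackBubbles(own, partner) {
  const layers = [
    { source: "reporting country", items: own, color: "main" },
    { source: "trading partner", items: partner, color: "shaded" }
  ];
  // Draw the larger value first, so the smaller bubble sits on top
  // and the difference stays visible as a ring (the "corona").
  layers.sort((a, b) => b.items - a.items);
  return { layers, hasCorona: own !== partner };
}

const match = stackBubbles(10, 10);
const mismatch = stackBubbles(10, 14);
```

When both numbers agree, the two bubbles have the same size and the upper one simply covers the lower one.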

Examples of bubblecharts for different countries

Let's add a "How to read this" illustration to the website to explain those 2 opposing numbers to the reader:

Expanding the trading partner dimension

Now we need to show a breakdown of each number, because it represents an accumulated number: Which trading partners were involved? And what exactly was traded (details)?


 

First idea: The bubble could blow up to something like a donut chart which shows the breakdown of the accumulative number by which trading partner accounts for which quantity. But:

  • We still need to go one dimension deeper (= details of the transaction other than amount number)
  • We would need a legend to explain the country-color-code. Too much complexity.

Seems like there is too much data to be packed inside the visual space that the donut provides us with.

Second idea: Use a modal!

A (rather big) modal pop-up is a more reliable piece of UI.


 

When a bubble is clicked, a (Bootstrap) modal pops up to show the breakdown by trading partners and all transaction details, including a link to the final report. We are going one dimension deeper into the data, but we keep the context within reach (dimmed in the background).
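The data preparation for such a modal is essentially a group-by: the transfers behind one bubble are grouped by trading partner, and the details are kept alongside. A minimal sketch with made-up records and hypothetical field names:

```javascript
// Made-up transfers behind one bubble (one category, one year).
const transfers = [
  { partner: "B", items: 10, comment: "Model X" },
  { partner: "C", items: 5, comment: "Model Y" },
  { partner: "B", items: 2, comment: "Model Z" }
];

// Group the transfers by trading partner, summing the items
// and collecting the details for display in the modal.
function breakdownByPartner(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row.partner)) {
      groups.set(row.partner, { partner: row.partner, total: 0, details: [] });
    }
    const group = groups.get(row.partner);
    group.total += row.items;
    group.details.push(row.comment);
  }
  return [...groups.values()];
}

const breakdown = breakdownByPartner(transfers);

// The partner totals add up to the accumulated number shown in the bubble.
const bubbleTotal = breakdown.reduce((sum, g) => sum + g.total, 0);
```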

Are we done?

Now we've made all dimensions of the UNROCA dataset "Arms Exports/Imports (Heavy weapons)" available.
By organizing the website as country profiles we subtracted one dimension from the dataset, reducing complexity.
By splitting into 2 main visualizations (One focusing on trading partners, the other one on yearly dynamics) we provided two different points of action for the user, exploring the same data from different angles, with increasing level of detail.

Technical aspects

  • JavaScript libraries: d3.js, angular.js, datamaps.js, bootstrap.js
  • SVG for vector graphics (e.g. bubble chart)
  • Map technology: TopoJSON
  • Prototyping tools: Sketch-app & Adobe Illustrator
  • Big printing on paper for wireframes

The Upshot: What I learned during this project

  • Sketching prototypes on paper is great to get feedback from clients fast
  • Building prototypes using real data is even better, because only then you can know if you have scale/variation problems and other things. Something can look great in a handmade sketch, but with real data it would most often look dramatically different! "You can't evaluate a visualization absent the data" (M. Bostock)
  • There's no obligation to do fancy visualization. Simple is most often better: more effective and easier to understand! "Good design is as little design as possible" (D. Rams)
  • Do more user testing: Does the visualization communicate? Is the interface intuitive?