Project Website: https://www.unroca.org
By Alexa Steinbrück. August 14, 2017
When I was working as an IT consultant for the United Nations, I developed a data visualization concept for the United Nations Register of Conventional Arms (UNROCA). The work was part of the overall redesign of the UNROCA platform, for which I served as main developer (frontend and backend) and project lead. The project was realized between September 2015 and July 2017.
The final data visualizations were implemented using the data visualization library D3 and Angular.js. Details regarding the implementation will be discussed in the last section.
Here I am going to retrace parts of the creative-iterative process we underwent in pursuit of the ideal way to visualize UNROCA's data on the web.
A database, a political mechanism, an instrument of the United Nations
According to SIPRI, the Stockholm International Peace Research Institute, UNROCA is "the key international mechanism of official transparency on arms transfers".
UNROCA was established by a United Nations General Assembly resolution in 1991 and has since been managed by the United Nations Office for Disarmament Affairs (UNODA), based at the Headquarters of the United Nations in New York.
UNROCA was founded to "enhance confidence, promote stability, help States to exercise restraint, ease tensions and strengthen regional and international peace and security" and "to prevent the excessive and destabilizing accumulation of arms (...) in order to promote stability and strengthen regional or international peace and security".
It was established when inter-State warfare had, for a long time, been considered the main threat to peace and security. It focuses on transfers of those major conventional weapons that typically can be used in offensive military operations carried out across international borders. Source: "UNODA occasional papers"
The term "conventional arms" refers to weapons that are not weapons of mass destruction (e.g. nuclear, biological or chemical). UNROCA covers 7 categories of arms, which are deemed the most lethal ones in the sector of Conventional Arms:
Battle tanks
Armoured combat vehicles
Large-calibre artillery systems
Combat aircraft (includes both manned and unmanned aerial vehicles, aka drones)
Attack helicopters (includes both manned and unmanned aerial vehicles, aka drones)
Warships
Missiles and missile launchers
In 2003, countries decided that small arms could be added to the Register. Many countries now include small arms and light weapons (SALW) in their yearly reports.
Member states are also invited to submit information on their military holdings and procurement from domestic production of major conventional weapons. States report voluntarily once a year.
The core of the UNROCA database is data on arms exports and imports in the sector of "heavy arms".
A curious consequence of the states' voluntary participation is the discrepancies between the reports of exporters and importers. Since arms trade is a bilateral business, every trade can be seen from two sides: the exporter's side and the importer's side. Do both countries report the same numbers? Do they match? UNROCA has been called a monitoring tool, but it must be emphasized that it is by no means intended to blame countries. Discrepancies can have many reasons.
Not all countries report in an equally comprehensive way. Some countries have never reported to UNROCA. However, this doesn't mean a total lack of information regarding those countries. We can use the data that the trading partners of those countries submitted.
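To make the two-sided nature of the reports concrete, here is a minimal sketch of how both views of a transfer could be paired up. It is not the UNROCA implementation: the record layout, the function name `pair_reports` and the sample numbers are all assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical record layout (the real UNROCA schema is not shown in this article):
# (reporter, direction, partner, year, category, quantity). The numbers are invented.
reports = [
    ("A", "export", "B", 2015, "Battle tanks", 10),
    ("B", "import", "A", 2015, "Battle tanks", 8),
    ("A", "export", "C", 2015, "Warships", 2),  # C never submitted a report
]

def pair_reports(reports):
    """Pair both views of each transfer, keyed by (exporter, importer, year, category)."""
    paired = defaultdict(lambda: {"exporter_says": None, "importer_says": None})
    for reporter, direction, partner, year, category, quantity in reports:
        if direction == "export":
            key, side = (reporter, partner, year, category), "exporter_says"
        else:
            key, side = (partner, reporter, year, category), "importer_says"
        # Sum up, in case a country reports several transfers under the same key.
        paired[key][side] = (paired[key][side] or 0) + quantity
    return dict(paired)

paired = pair_reports(reports)
for key, views in paired.items():
    print(key, views)
```

A `None` on one side means that this country did not report the transfer. Country C still shows up in the result because its trading partner A reported the trade.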
I presented the above graphic to the client to illustrate how many dimensions are inside the UNROCA dataset and to suggest that there are numerous possibilities to unfold and visualize them.
I quickly realized that thinking in terms of dimensions was quite intuitive to me, but rather confusing to the client. I am going to stick to this term nevertheless because it is quite common in the field of statistics and data visualization literature.
If you look at the dataset "Exports (Heavy Arms)" alone, it looks high-dimensional:
And in the strict sense the following could also be included:
Of course there are datasets with far more dimensions, up to thousands of them (e.g. statistical models used in Machine Learning). But for data visualization, 7 dimensions are definitely already a challenge!
Often, a good strategy is to focus on specific dimensions and to "curate" the data in order to answer certain questions about the data (or to tell a story). Often it is just not necessary to expose all dimensions or to provide "unfiltered" access to all dimensions.
Remember that those data visualizations are going to live inside a web browser which offers the magic skills of interactivity which the print medium doesn't have. This means we don't have to stick to a two-dimensional canvas only! Interactive graphics facilitate exploration.
Instead of speaking about data in terms of dimensions, it may be more beneficial to speak in terms of questions. There are many questions one could ask about UNROCA's data. For example:
How many battle tanks did Germany export last year?
To which countries and what were the exact models?
Where do most heavy weapons come from in a country?
Where do countries in conflict get their weapons from?
How did the trade relations between two countries change over the years?
Who are the biggest exporters of missile launchers today?
Those questions could be addressed by the data which UNROCA provides, but do they have to be addressed? What exactly needs to be made visible in the data? The data can be presented in a way that makes some questions easier to answer and others harder.
Some questions are more relevant than others. Some are just not covered by the mandates that a UN project is subject to and others are too politically sensitive. So which questions do we want to answer, are we allowed to answer (and how far can we stretch a mandate)?
I must confess that I had not previously been involved with military and defense topics, and I have no firsthand experience in trading arms. Hence, I figured that the best way to approach the topic would be rigorously mechanical:
I compiled a list of all questions you can possibly ask about the data, regardless of whether it would make sense to ask such a question. In computer science, this approach to problem solving is known as brute force, and it is what machines do when they have no smart strategy. Sometimes we as humans can also decide to do it this mechanical way.
I presented the list to the client and asked them to select the questions they deemed relevant and to delete the ones that were irrelevant to them. The discussion that emerged from this list was actually an excellent introduction to the topic for me. It gave me a feeling for the political sensitivity of certain angles and topics, which helped me find appropriate visual solutions and types of visualization.
During this meeting I understood: First and foremost we need to show who the trading partners of a particular country are. Then, what exactly, when and how much was traded. Furthermore, it is interesting to see how the import or export of a specific type of weapon has changed over time for a particular country.
To wrap up, we have two main focuses:
... and a side effect aspect (which doesn't need to be spotlighted):
Now that we know which dimensions we can focus on, it might be a good idea to split our data visualization effort into two stages:
Now we can finally get our hands dirty, make some wild sketches and mess around with quick prototypes!
If you think about connections between countries it is quite natural to think of a map in the first place. Lines could be drawn on top of countries to symbolize a trade connection. But this isn't all the information we need to convey! There are those other dimensions, e.g. the number of items that have been traded, the year dimension, and the different types of weapons. So how much information can we put into this special type of visualization?
Graphical features of the line (thickness, color, stroke) could signify another data dimension, e.g. the quantitative aspect of a trade relation (number of items traded). This approach shows its limitations quite early. The visual result would appear messy, because the lines would hide other countries. In addition, relations between neighboring states would be de-emphasized, while relations between states that are geographically further apart would receive more importance.
Data visualization legend Edward Tufte hit the nail on the head:
„Maps wrongly equate the visual importance of each country with its geographic area (...). Our visual impression of the data is entangled with the circumstance of geographic boundaries, shapes and areas (...).“
What about this? It’s like a many-to-many chart.
But this is really just a bold simplification. In reality there are far too many states, not just those 7 on each side like in the sketch. Endless vertical scrolling could result in a bad user experience, because one could easily lose the sense of the big picture.
[Comment: There are actually examples out there where it works quite well - For example Carlo Zapponi's global migration visualization "people move.in"]
And how to deal with those other dimensions (year, type of weapon, number of items)?
We could bring in the year dimension with a slider, with which users can select the year they are interested in. The type-of-weapon dimension could be a simple drop-down menu. And what about the other data dimensions, e.g. the amount dimension? This one could be mapped to the thickness of the line. But still, it's quite a graphically messy mapping.
But I have deeper doubts about this type of visualization: Is it really logical to have two lists? Actually we have just one list of countries and their connections among each other. So what if we let 2 different regional groups (e.g. Western and Eastern) face each other. Would this type of visualization be a natural fit? Actually no, because states of the same regional group have trade relations among them. Mathematically speaking, it’s not a bipartite graph.
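The two-list objection can be checked mechanically. The following sketch assumes a hypothetical list of trade pairs and a hypothetical mapping from country to regional group (all names invented for this example). It only tests whether one given split into two lists works. It is not a general bipartiteness test.

```python
# Invented example data: W1 and W2 belong to one group, E1 to the other.
group_of = {"W1": "Western", "W2": "Western", "E1": "Eastern"}
trades = [("W1", "E1"), ("W2", "E1"), ("W1", "W2")]

def crosses_groups_only(trades, group_of):
    """Return True if every trade connects countries from different groups."""
    return all(group_of[a] != group_of[b] for a, b in trades)

def trades_within_groups(trades, group_of):
    """List the trades that a two-list layout could not draw as a left-to-right link."""
    return [(a, b) for a, b in trades if group_of[a] == group_of[b]]

print(crosses_groups_only(trades, group_of))   # False
print(trades_within_groups(trades, group_of))  # [('W1', 'W2')]
```

As soon as a single trade stays inside one group, as W1 to W2 does here, the two facing lists can no longer represent the data faithfully.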
Circles are good for visualizing relations between one single list of items.
As a side effect we economize some space here and make it more compact.
Unfortunately now we have again the same problem that we had with the geographical map:
This layout visually de-emphasizes connections that are neighbours and over-emphasizes connections of countries that are on opposite sides.
And also another challenge remains: How to represent those other data dimensions, most importantly the number of items traded?
Maybe it is just too much to try to give the big picture right away and to deal with all countries at once! What if we set up the website as "country profiles", meaning that we focus on 1 specific country at a time? This subtracts one dimension, namely the "reporting country" dimension.
The country we are focusing on could be placed in the middle of the circle, reaching out to the perimeter of the circle where the trading partners are situated.
Doing it this way we have an equal distance to all other countries (which is fair and doesn’t distort the data). To add some more simplicity, we can decide to show not all countries, but only those to which a connection exists. This reduces the number of countries and makes more space available for the representation of other data dimensions.
Let's try to find a mapping for the amount dimension: Instead of mapping it to the thickness of the connection line, we map it to the arc length that a country takes up on the circle contour:
But there's a potential problem with this visualization: It can be misinterpreted as a pie-chart?!
We don't use the whole circle, but just a partial arc of it. Again, the portion of the arc that each country takes up reflects the amount dimension of each trade connection.
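The arc mapping can be written down in a few lines. This is only a sketch of the idea, not the code behind the sketches: the function name `arc_segments`, the angle range and the sample amounts are assumptions chosen for illustration.

```python
def arc_segments(amounts, start_deg=0.0, end_deg=120.0):
    """Split the arc [start_deg, end_deg] among trading partners in proportion to the amount traded."""
    # Only partners with an actual connection get a segment.
    connected = {partner: amount for partner, amount in amounts.items() if amount > 0}
    total = sum(connected.values())
    if total == 0:
        return {}
    span = end_deg - start_deg
    segments = {}
    angle = start_deg
    for partner, amount in connected.items():
        width = span * amount / total
        segments[partner] = (angle, angle + width)
        angle += width
    return segments

# Invented amounts: partner D has no connection and gets no segment.
print(arc_segments({"B": 30, "C": 10, "D": 0}))
# {'B': (0.0, 90.0), 'C': (90.0, 120.0)}
```

Because the segments share one partial arc instead of a full circle, the result reads less like a pie chart while still encoding the amount as arc length.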
In order to reduce the number of countries, we can group them by regional groups and have a "Retina" per regional group:
Possible solutions:
What I am having in mind all the time:
Do we really have to represent connections as connections (lines, strokes, links)? Isn't it enough to just list them and call them connections? How much does a visual representation need to match the archetypical mental models that people have? The above ideas are still not satisfying enough. And one big caveat is that we haven't prototyped them using real data. The sketches on paper represent visualizations for ideal data, not for the real data.
Alright, let’s admit it: we are stuck here. Let’s approach it from another angle. Let’s ignore the trading partner dimension for a moment and start with visualizing the volume of import/export of a specific country first!
Things are flowing in and out, a direct comparison of import and export is now possible.
Mapping rule: Length of bars symbolizes volume/amount of traded arms
Let's group the countries into regional groups, and offer the possibility to "zoom in".
Clicking on different parts of the visualization ("Hot Areas") gets you one dimension deeper into it.
"Hot Area" 1:
A click on the "Region" Hot Area lets you zoom into a region, "exploding" the countries:
"Hot Area" 2:
A click on the bar reveals the different weapon categories.
"Hot Area" 3:
A click on the extensions on the left and right of each bar shows trading partners (import or exports respectively):
But! What's happening here is a sneaky change of mapping rules:
Changing mapping rules is confusing for the reader. Also: How do we deal with the other dimensions?
1. A slider to select a year
2. A select box to toggle between the importers vs. the exporters view ("discrepancy switch")
Okay, so far so good. The details haven't been worked out yet, but the overall concept seems stable.
Insights:
Conclusion:
We should restructure the website more like country profiles. That's an idea that already came up at the beginning.
From a data visualization perspective, this subtracts one dimension from the dataset (the reporting country). Exposing one dimension of a dataset (the reporting country) as its own web page (a country profile) actually demonstrates that there is an overlap between the fields of website layout, information architecture and data visualization.
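In data terms, a country profile is simply a slice of the dataset with the reporting country held fixed. Here is a minimal sketch, assuming a hypothetical record layout with invented values (the field names are not taken from the actual UNROCA database):

```python
# Invented records; "reporter" is the dimension that the country profile fixes.
records = [
    {"reporter": "A", "partner": "B", "year": 2015, "category": "Battle tanks", "quantity": 10},
    {"reporter": "A", "partner": "C", "year": 2016, "category": "Warships", "quantity": 2},
    {"reporter": "B", "partner": "A", "year": 2015, "category": "Battle tanks", "quantity": 8},
]

def country_profile(records, reporter):
    """Keep one reporter's records and drop the now-constant 'reporter' field."""
    return [
        {key: value for key, value in record.items() if key != "reporter"}
        for record in records
        if record["reporter"] == reporter
    ]

for row in country_profile(records, "A"):
    print(row)
```

Every row of the profile has one field fewer than the original records, which is exactly the dimension that moved from the chart into the page structure.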
Finally we went back to the good old table, back to raw numbers - no visualizations! The table is still the best visual solution, because:
Insights:
Sometimes it's better to take a step back. There is absolutely no obligation to be fancy or graphically mind-blowing. The top goal should be to enable the best possible understanding. If a simple table can do the job best, then it has to be this table. I like to cite Dieter Rams's tenth principle of good design here: „Good design is as little design as possible“
There are often discrepancies between the numbers that a country and its trading partners report. How to visually present 2 numbers that are kind of opposed to each other? We don’t want to emphasize the discrepancies too much (Remember: „we don’t want to blame countries“), but rather show them as two sides of a coin and just slightly biased:
One visual solution could be to give one side a stronger contrast and make the other side slightly graphically dimmed:
Let's add a "How to read this" illustration to the website to explain those 2 opposing numbers to the reader:
The type of data we have here is a time series. Time series are very frequently plotted as line charts. But let's hold on for a second: What is the nature of the data? It is not a continuum: A number from year x isn't the number from year x-1 plus a certain amount. This would be the case for "Military holdings": The military holding of a certain year is the holding of the year before plus or minus an amount. There a line chart would be appropriate, but not for the dataset of Exports/Imports.
A bar chart would be better, and given that we have 7 weapon categories, a stacked bar chart would be the logical conclusion. But stacked bar charts are hard to read, because segments that don't share a common baseline are difficult to compare. Better: blobs, pauses, blobs (like a heart rate or something).
Let’s ignore the different types of weapons for a moment. Then it would be one horizontal line with blobs on it:
Now the question is: How do we expose the Type-of-weapon dimension? How can we show the different types of weapons? That's easy: We just plot them underneath each other.
What we get then is what I call a „Bubble chart“. Another name for it is „Punchcard chart“. This type of chart is kind of a „Scatter plot“, but with qualitative values on the y-axis.
We don’t map linearly. This is a compromise in favor of legibility. This kind of stuff would happen if we map linearly:
The compromise is: The bubble radius is just a hint, while the written numbers on the labels are the real values. The graphical mapping is not „reliable“ in this sense. There is a double representation: bubble radius AND number. It is not redundant, because it helps an intuitive understanding of the data. Access to reliable and correct numbers is always possible. Another compromise for readability: The minimum bubble size is 5px, so the surface can provide a colored background for the number.
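A non-linear mapping with a lower bound could look like the sketch below. The text above only says that the mapping is not linear and that the minimum size is 5px. The square-root scale, the maximum radius and the reading of "size" as radius are assumptions of this example, not a description of the final implementation.

```python
import math

MIN_RADIUS_PX = 5.0   # minimum size from the text, interpreted here as a radius
MAX_RADIUS_PX = 30.0  # assumed upper bound, chosen for this sketch

def bubble_radius(quantity, max_quantity):
    """Map a quantity to a radius with an assumed square-root scale, clamped at the minimum."""
    if quantity <= 0:
        return 0.0  # nothing traded, no bubble
    scaled = MAX_RADIUS_PX * math.sqrt(quantity / max_quantity)
    return max(MIN_RADIUS_PX, scaled)

for quantity in (1, 2500, 10000):
    print(quantity, bubble_radius(quantity, max_quantity=10000))
# 1 5.0
# 2500 15.0
# 10000 30.0
```

The clamp is the reason the radius can only be a hint: a quantity of 1 and a quantity of 100 would both be drawn at 5px here, so the label has to carry the exact number.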
Now we have still some remaining dimensions to deal with:
Let's start with the latter (the discrepancies)
The idea: We plot the 2 bubbles on top of each other in different colors (a main color, and a shaded color). If there is a discrepancy between the numbers a „corona“ shows up.
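The overlay can be reduced to two radii per data point. The sketch below assumes a plain square-root radius function with an arbitrary scale factor. The function names and sample numbers are hypothetical, and the decision of which side gets the main color is deliberately left to the rendering code.

```python
import math

def radius(quantity, scale=3.0):
    """Assumed square-root mapping; the scale factor is arbitrary."""
    return scale * math.sqrt(quantity)

def overlay(own_report, partner_report):
    """Return the radii of the two stacked bubbles and whether a corona is visible."""
    inner = radius(min(own_report, partner_report))
    outer = radius(max(own_report, partner_report))
    # The ring between inner and outer radius is the corona.
    return {"inner": inner, "outer": outer, "corona": own_report != partner_report}

print(overlay(16, 16))  # {'inner': 12.0, 'outer': 12.0, 'corona': False}
print(overlay(16, 25))  # {'inner': 12.0, 'outer': 15.0, 'corona': True}
```

When both countries report the same number, the two bubbles coincide and no ring appears. The ring grows as the reports drift apart.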
Let's add a "How to read this" illustration to the website to explain those 2 opposing numbers to the reader:
Now we need to show a breakdown of each number, because it represents an accumulated number: Which trading partners were involved? And what exactly was traded (details)?
First idea: The bubble could blow up to something like a donut chart which shows the breakdown of the accumulative number by which trading partner accounts for which quantity. But:
Seems like there is too much data to be packed inside the visual space that the donut provides us with.
Second idea: Use a modal!
When a bubble is clicked, a (bootstrap) modal pops up to show the breakdown by trading partners and all transaction details including a link to the final report. We are going one dimension deeper into the data, but we keep the context within reach (dimmed in the background).
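The data behind such a modal is a grouping of the transactions that add up to the number on the bubble. Here is a minimal sketch with invented transactions. The field names and the function `breakdown_by_partner` are assumptions for illustration, not the production code.

```python
from collections import defaultdict

# Invented transactions behind one bubble (one country, one year, one weapon category).
transactions = [
    {"partner": "B", "quantity": 6, "description": "model X"},
    {"partner": "C", "quantity": 3, "description": "model Y"},
    {"partner": "B", "quantity": 4, "description": "model Z"},
]

def breakdown_by_partner(transactions):
    """Group transactions by trading partner, largest total first."""
    groups = defaultdict(lambda: {"total": 0, "items": []})
    for transaction in transactions:
        group = groups[transaction["partner"]]
        group["total"] += transaction["quantity"]
        group["items"].append(transaction)
    return sorted(groups.items(), key=lambda pair: pair[1]["total"], reverse=True)

for partner, group in breakdown_by_partner(transactions):
    print(partner, group["total"], [item["description"] for item in group["items"]])
# B 10 ['model X', 'model Z']
# C 3 ['model Y']
```

The partner totals add up to the accumulated number shown on the bubble, so the modal and the chart stay consistent with each other.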
Are we done?
Now we've made all dimensions of the UNROCA dataset "Arms Exports/Imports (Heavy weapons)" available.
By organizing the website as country profiles we subtracted one dimension from the dataset, reducing complexity.
By splitting into 2 main visualizations (one focusing on trading partners, the other on yearly dynamics), we provided two different entry points for the user to explore the same data from different angles, with an increasing level of detail.