Example plots with country-level data
The previous post mentioned the BuRd theme and ColorBrewer. Here are some possible uses of both in a series of plots with cross-sectional country-level data. The code downloads World Bank WDI estimates of the total fertility rate and of GDP per capita (PPP, current international dollars), averages them over 2008-2010 for each country, and then adds UN region names to the data.
// package dependencies
ssc install wbopendata
ssc install kountry
// WDI data
wbopendata, indicator(SP.DYN.TFRT.IN; NY.GDP.PCAP.PP.CD) year(2008:2010) long clear
collapse ///
(mean) fr = sp_dyn_tfrt_in ///
(mean) gdpc = ny_gdp_pcap_pp_cd ///
, by(countrycode)
// geo indicators
kountry countrycode, from(iso3c) geo(un)
encode GEO, gen(region)
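Before plotting, it can help to check how the region coding turned out. The lines below are a minimal sketch that assumes the variables created above (countrycode, GEO, region): any code that kountry could not convert, which may include World Bank aggregates, ends up with an empty GEO and therefore a missing region.

```stata
// frequency of each UN region, including unmatched codes (missing region)
tab region, mi
// list the codes that did not receive a region
list countrycode if mi(region), clean noobs
```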
Geographical regions make it easy to plot the data as small multiples. I also often find it useful to look at a spine plot, a simple kind of mosaic plot, to diagnose how seriously missing data threatens the representativeness of the sample.



// package dependencies
ssc install spineplot
// small multiples
gr hbox gdpc, over(region, sort(1) des) mark(1, ms(i) mlab(countrycode) mlabp(0)) name(boxes, replace)
hist fr, bin(4) by(region, total) name(bins, replace)
// missing data
gen full = !mi(fr, gdpc)
spineplot full region
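The spine plot shows proportions; the underlying counts can be read off a two-way table. This is only a sketch that reuses the full and region variables defined above.

```stata
// complete cases (full = 1) by region, with row percentages;
// the mi option keeps observations with no region in the table
tab region full, row mi
```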
Going a bit further with regression results, several graphs are useful for running diagnostics. The first one shown below plots the residuals against the fitted values and overlays a LOESS (lowess) smoother. The second one shows the linear fit, with the larger residuals (above .3 in absolute value) drawn as markers weighted by their absolute size.


// residuals
gen loggdpc = ln(gdpc)
reg fr loggdpc
predict r, resid
predict yhat
// residuals-versus-fitted values, plus LOESS
sc r yhat, mlab(countrycode) yline(0) ms(i) mlabp(0) || lowess r yhat, ///
name(residuals_loess, replace)
// linear fit with residually weighted points
sc fr loggdpc if abs(r) > .3 [w = abs(r)], ms(O) mc(gs14) mfc(gs12) || ///
lfit fr loggdpc || ///
sc fr loggdpc, ms(i) mlab(countrycode) mlabc(gs6) mlabp(0) legend(off) ///
name(residuals_rvf, replace)
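For a quick look without country labels or a smoother, Stata's built-in postestimation command rvfplot draws the same residuals-versus-fitted plot. The sketch below re-runs the regression so that it stands on its own; the graph name residuals_builtin is just an illustrative choice.

```stata
// built-in residuals-versus-fitted plot after the same regression
reg fr loggdpc
rvfplot, yline(0) name(residuals_builtin, replace)
```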
Last, a map of the residuals can also be informative if there is suspicion of spatial dependence in the error term:

// package dependencies
ssc install spmap
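// map of residuals (caution with intervals)
// world-d and world-c are assumed to be the attribute and coordinate files of a world map,
// with a countrycode variable in world-d to merge on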
merge 1:1 countrycode using world-d, keep(match master) gen(mapmerge)
spmap r using world-c, id(_ID) clmethod(boxplot) ///
fcolor(RdYlGn) ndocolor(gs12) ndfcolor(gs14) ocolor(none ..) ///
legstyle(1) legend(ring(1) pos(3)) ///
name(residuals_map, replace)
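Countries that do not find a match in the map data silently drop off the map, so it is worth checking the merge. A minimal sketch, assuming the mapmerge variable created by the merge above:

```stata
// merge results: 1 = in the data only (not on the map), 3 = matched
tab mapmerge
// countries with a residual but no polygon to draw it on
list countrycode if mapmerge == 1 & !mi(r), clean noobs
```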
Country-level data is an ideal candidate for plot tweaks such as showing marker labels instead of point markers. With survey data, there would be more work to do at the level of the data itself, and text labels would have to be taken from aggregate measures like relative frequencies or averages, which makes it harder to plot the data quickly and efficiently.