This might be of interest to a few readers: network plots with ggplot2 (via 339 députés sur Twitter | Polit’bistro : des politiques, du café — shameless plug)
Francis Smart recently pointed to an important difference between R and Stata from a teaching perspective, which has to do with the additional learning costs of vectorization in R over the single-dataset orientation of Stata.
Stata makes it easy to manipulate variable names, as in a dataset with three variables for social expenditure called party1 party2 party3. This naming pattern is common in preprocessed empirical datasets.
// recode 999 as missing in every variable matching party*
mvdecode party*, mv(999)
Furthermore, Stata works like an accountant’s book: all variables belong to the same data object, which never needs to be called by name after loading. This naturally rules out a lot of possibilities, compensated in part by macros and scalars.
// store a list of regressors in a local macro
loc regressors "age sex"
Macros in particular then combine with loops like the forval and foreach commands to allow more complex data processing. At that level of use, the software is flexible enough for most applied data cleaning.
// divide socx1, socx2 and socx3 by one million
forval i = 1/3 {
    replace socx`i' = socx`i' / 10^6
}
To access matrix notation, the Stata user needs to move to Mata syntax, while R immediately lets the user manipulate objects through vectorization. Thinking in these terms is more demanding because there are more possibilities for errors, starting with calls to undeclared objects.
I teach both R and Stata. My experience with social science students is that the additional learning costs of R syntax need to be matched by other benefits to become worthwhile to them. To me, these benefits lie primarily in the more diverse array of data that R gives access to.
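For comparison, here is a minimal R sketch of the same recoding and rescaling steps. The data frame d, its values and the use of 999 as a missing-value code for the socx variables are assumptions made up for illustration; only the variable names socx1 to socx3 come from the Stata loop above.

```r
# Hypothetical data frame: three spending variables, with 999 assumed
# to be the missing-value code (illustrative values only).
d <- data.frame(
  socx1 = c(2.5e6, 999, 4.0e6),
  socx2 = c(1.0e6, 3.0e6, 999),
  socx3 = c(999, 5.0e6, 6.0e6)
)

# Select columns by name pattern, much like the socx* wildcard in Stata.
vars <- grep("^socx", names(d), value = TRUE)

# Recode 999 to NA in all selected columns at once (compare mvdecode).
d[vars][d[vars] == 999] <- NA

# Rescale all selected columns in one vectorized step (compare forval).
d[vars] <- d[vars] / 10^6

# Note: typing socx1 / 10^6 on its own would fail with an
# "object not found" error, because socx1 only exists inside d.
```

The data object has to be named at every step, which is exactly where students used to Stata's single implicit dataset tend to trip.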
This will probably reach the syllabus soon after getting to print.
R Notebook with rCharts (by Ramnath Vaidyanathan) — and I suspect that this is only the beginning. Visualization is more and more interesting these days. Hat tip to KJH for linking to the video.
By using Excel, which was never designed for scientific research, they institutionalized mouse clicks and other untraceable actions into a scientific workflow, which must be avoided since it makes explaining to others (and to oneself) how to replicate the findings next to impossible and too easily introduces inadvertent mistakes.
Period. The replication was carried out with R, and additional analysis (easily found online) was done with Stata.
Victoria Stodden at What the Reinhart & Rogoff Debacle Really Shows: Verifying Empirical Results Needs to be Routine — The Monkey Cage
From Patrick Burns’s presentation on the R Inferno. Interesting if you want some historical notes about the software.
Using R for causal inference in a study of expensive public policy decisions (by Jeromy Anglim, via)
Something’s wrong. The release notes for the latest version of RStudio state that error output is now shown “in distinct color for TextMate theme (and some others)”, so I was hoping for a change here, because students get confused by universal red ink.
“In this article we will show how the models look like, what kind of tools we used to build and visualise those and also providing a demo web application where anyone could compile a similar plot with a decent amount of annotations with a single click” (the straight-to-the-Web version of an exercise that we ran in class — via Olimpic predictions - from an R web service provider’s point of view | rapporter).