Today, we rolled out an alpha version of a new RSocrata feature that exports and archives data from Socrata open data portals. The new version is currently on the “nightly” branch of the RSocrata repo on GitHub. Below, we outline how the new
export.socrata function works as of version 1.8.0-3.
To try it, you’ll need R and the
devtools package, which can install RSocrata directly from the “nightly” branch.
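As a rough sketch of that install step, the block below wraps the call in a small helper so that nothing is downloaded until you run it. The helper name install_rsocrata_nightly is hypothetical, and the repository slug "Chicago/RSocrata" is an assumption about where the repo lives on GitHub; devtools::install_github() and its ref argument are part of devtools.

```r
# One-time setup if devtools is not installed yet:
# install.packages("devtools")

# Hypothetical helper: installs RSocrata from the "nightly" branch.
# The slug "Chicago/RSocrata" is assumed; adjust it if the repo lives elsewhere.
install_rsocrata_nightly <- function() {
  devtools::install_github("Chicago/RSocrata", ref = "nightly")
}

# Run these interactively when you are ready to install:
# install_rsocrata_nightly()
# packageVersion("RSocrata")   # check which version was installed
```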
A new function,
export.socrata(), lets you download tabular data, maps, and attachment datasets by specifying the root domain, such as
export.socrata("https://data.cityofchicago.org"). Files are downloaded into a subdirectory of your working directory, named after the target portal’s URL. For instance, the output of
export.socrata("https://data.cityofchicago.org") will be saved to
R’s default working directory is rarely a convenient place for an archive, so first switch to a more suitable location, such as your documents folder, your home directory, or a network drive, with the
setwd() command. Future iterations will allow for this to be set from within the
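A minimal sketch of the working-directory step might look like the following. The folder name "socrata-archive" is hypothetical, and the sketch uses a temporary directory only so it can run anywhere; in practice you would point archive_dir at a permanent location. The export call is left commented out because it needs the nightly build and network access.

```r
# Hypothetical archive location. tempdir() is used only for illustration;
# replace it with e.g. path.expand("~") or a network drive for real use.
archive_dir <- file.path(tempdir(), "socrata-archive")
dir.create(archive_dir, recursive = TRUE, showWarnings = FALSE)

# setwd() returns the previous working directory, so it can be restored later.
previous_dir <- setwd(archive_dir)

# With the nightly build installed, the export would be started here:
# RSocrata::export.socrata("https://data.cityofchicago.org")

# Inspect whatever has been written so far, then switch back.
downloaded <- list.files(archive_dir, recursive = TRUE)
setwd(previous_dir)
```

Restoring the previous directory at the end keeps the rest of your R session unaffected by the export.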
Tabular data (e.g., CSVs) are downloaded and compressed as
gz files. Windows users will need a compatible program, such as 7-Zip, to uncompress them. For now, maps are exported as KML files and are not compressed.
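If you only need the data inside R, base R can read a gzip-compressed CSV directly, without uncompressing it first. The sketch below writes a tiny stand-in file and reads it back; the file name and its contents are made up for illustration and do not come from a real export.

```r
# Build a small stand-in for a downloaded file (hypothetical name and data).
demo_dir <- tempfile("socrata-demo-")
dir.create(demo_dir)
gz_path <- file.path(demo_dir, "abcd-1234_example.csv.gz")

con <- gzfile(gz_path, "w")
write.csv(data.frame(id = 1:2, name = c("a", "b")), con, row.names = FALSE)
close(con)

# gzfile() decompresses on the fly, so read.csv() sees plain CSV text.
restored <- read.csv(gzfile(gz_path), stringsAsFactors = FALSE)
```

The same read.csv(gzfile(...)) call should work on any gzip-compressed CSV, on Windows as well as other platforms.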
Each file is timestamped with the time its download began and is named after the dataset’s unique Socrata “four-by-four” identifier. Timestamps use the local time zone of the computer performing the download.
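To illustrate the naming idea, here is a sketch that combines a four-by-four with a download start time. The exact file-name layout that export.socrata uses is not described here, so the separator, the timestamp format, and the helper names is_four_by_four and archive_name are all assumptions made for this example.

```r
# A four-by-four is four alphanumeric characters, a dash, and four more.
is_four_by_four <- function(id) {
  grepl("^[a-z0-9]{4}-[a-z0-9]{4}$", id)
}

# Hypothetical naming scheme: <four-by-four>_<start time>.<extension>
archive_name <- function(four_by_four, started_at = Sys.time(), ext = "csv.gz") {
  stopifnot(is_four_by_four(four_by_four))
  # With the default Sys.time(), format() renders the local time zone.
  stamp <- format(started_at, "%Y-%m-%d_%H%M%S")
  paste0(four_by_four, "_", stamp, ".", ext)
}

# Fixed start time in UTC so the example is reproducible.
started <- as.POSIXct("2017-04-21 14:30:00", tz = "UTC")
example_name <- archive_name("abcd-1234", started)
```

Because the timestamp reflects the local clock, archives created on machines in different time zones are not directly comparable by file name alone.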
The export.socrata feature is in alpha and will be improved before it is submitted to CRAN. Some of the functionality may change in future iterations. As always, we would appreciate any feedback, either on our issues page or by email at firstname.lastname@example.org. Developers are also welcome to contribute directly to the project.