Last year, Ed Summers wrote a post called “Inside Out Library” arguing that rather than trying to pull in data from all over the world to present to local users, libraries should be finding local data and making it accessible, both for their local community and for the rest of the world. Rather than compete with Google, libraries should focus on their strength: an authentic connection with, and knowledge of, local users and communities.
I’ve been thinking about this idea for a while, and one application that seems obvious to me, but which I don’t see many libraries taking up, is getting involved with open civic data efforts. Cities, counties, states and even countries around the world are (slowly) starting to embrace the idea of publishing budget data, crime data, land-holding data, and a multitude of other datasets in publicly accessible and consumable ways. Governments see open data as an excellent opportunity to promote growth and commerce, spur entrepreneurship and innovation, and meet transparency objectives.
Now, for all the best intentions of providing transparency by publishing this data, many things remain murky. Some datasets are published in their raw, system-specific format, making them hard to understand for those not already initiated in the workings of that civic body. Trying to make connections across datasets is difficult at best, as labels vary wildly between datasets, let alone between different agencies/departments within the same civic jurisdiction. This is only meant as the mildest of rebukes for these civic data publishers. For civic bodies with already stretched budgets, getting data out is hard enough; getting it into some (as yet nonexistent) standard is a Sisyphean task.
So where can libraries fit into this? Where they always have: promoting information literacy (civic information in this case) for the community, and providing stewardship for information’s discovery and long-term preservation. And maybe in new places: offering advice to civic bodies about ways to organize data so it can be easily discovered by users, advocating for the use of data standards across datasets, and maybe teaching users the technical skills required to understand civic data and transform it into information.
I think there are lots of possibilities for libraries to get involved here. Just a few off the top of my head:
- Run a class on how to pull the civic datasets into Excel and manipulate them.
- Take the next step and teach users the basics of OpenRefine to extract the data they need.
- Create documentation about the datasets showing where they overlap, guides explaining the types of data, and other “dejargonification” to make it more comprehensible for the public.
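As a rough sketch of what that last kind of documentation could look like in machine-readable form, here is a small crosswalk that maps each dataset's own column labels to shared, plain-language names. Everything in it is an assumption for illustration: the dataset names, the column labels, and the helper functions `harmonize` and `shared_fields` are hypothetical, not taken from any real civic data portal.

```python
# Hypothetical crosswalk: for each dataset, map the publisher's own
# column label to a shared, plain-language field name.
# All labels below are invented for illustration.
CROSSWALK = {
    "budget": {"DEPT_CD": "department", "FY": "fiscal_year", "AMT": "amount"},
    "land": {"Owning Dept": "department", "PARCEL_ID": "parcel_id", "Acres": "acres"},
}


def harmonize(rows, mapping):
    """Rename each row's keys using the mapping; unknown labels are kept as-is."""
    return [
        {mapping.get(label, label): value for label, value in row.items()}
        for row in rows
    ]


def shared_fields(crosswalk):
    """Return the plain-language field names that every dataset has in common."""
    field_sets = [set(mapping.values()) for mapping in crosswalk.values()]
    return set.intersection(*field_sets)


# A single made-up budget row, as it might come out of csv.DictReader.
budget_rows = [{"DEPT_CD": "Parks", "FY": "2014", "AMT": "1200"}]

print(harmonize(budget_rows, CROSSWALK["budget"]))
print(shared_fields(CROSSWALK))
```

The shared field names are where two datasets overlap and can be connected, so a crosswalk like this doubles as a guide for the public and as a starting point for anyone trying to join the data.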
There is already a small but growing movement of civic hackers, data journalists, and non-profit organizations working to realize the benefits of open civic data, and libraries can certainly make a contribution to that effort.