Wednesday, May 29, 2013

Free e-book on Data Science with R

A new book by Jeffrey Stanton of the Syracuse University School of Information Studies, An Introduction to Data Science, is now available for free download. The book, developed for Syracuse's Certificate for Data Science, is available under a Creative Commons License as a PDF (20 MB) or as an interactive eBook from iTunes.
The book begins with the following clear definition of Data Science:
Data Science refers to an emerging area of work concerned with the collection, preparation, analysis, visualization, management and preservation of large collections of information. Although the name Data Science seems to connect most strongly with areas such as databases and computer science, many different kinds of skills - including non-mathematical skills - are needed.

Direct link to ebook page:

28 May 2013 - Registry of Research Data Repositories launched

An increasing number of universities and research organisations are starting to build research data repositories to allow permanent access in a trustworthy environment to data sets resulting from research at their institutions. Due to varying disciplinary requirements, the landscape of research data repositories is very heterogeneous. This makes it difficult for researchers, funding bodies, publishers, and scholarly institutions to select an appropriate repository for storage of research data or to search for data.

The registry allows the easy identification of appropriate research data repositories, both for data producers and users. The registry covers research data repositories from all academic disciplines. Information icons display the principal attributes of a repository, allowing users to identify the functionalities and qualities of a data repository. These attributes can be used for multi-faceted searches, for instance to find a repository for geoscience data using a Creative Commons license.

By April 2013, 338 research data repositories had been indexed in the registry. Of these, 171 are described by a comprehensive vocabulary, which was developed with the involvement of the data repository community.

The registry search can be found at:
The information icons are explained at:

Repository operators can suggest their infrastructures for listing in the registry via a simple application form. The project team reviews the proposed repositories and then lists them in the registry. A repository is indexed once it meets the minimum requirements: the mode of access to the data and the repository, as well as the terms of use, must be clearly explained on the repository pages.

The registry is funded by the German Research Foundation (DFG). Project partners are the Library and Information Services (LIS) of the GFZ German Research Centre for Geosciences, the Berlin School of Library and Information Science at the Humboldt-Universität zu Berlin and the KIT Library at the Karlsruhe Institute of Technology (KIT). The project cooperates with the German Initiative for Network Information (DINI). With their expertise in information infrastructures, the three partners guarantee the future sustainability of the registry.

Detailed information can be found in the following PeerJ preprint:

Tuesday, May 28, 2013

New Report: Digital Content: What's Next? - ALA Office for Information Technology Policy

May 22, 2013 - OITP released Digital Content: What’s Next?, a new American Libraries digital supplement that explores concepts recently introduced to the publishing arena, including self-publishing, digital preservation, ebook archiving and libraries as book publishers.

Friday, May 10, 2013

White House announcement on Open Data

On Thursday, May 9, 2013 "...President Obama signed an Executive Order directing historic steps to make government-held data more accessible to the public and to entrepreneurs and others as fuel for innovation and economic growth. The Executive Order declares that information is a valuable resource and strategic asset for the Nation. We couldn’t agree more.

Under the terms of the Executive Order and a new Open Data Policy released today by the Office of Science and Technology Policy and the Office of Management and Budget, all newly generated government data will be required to be made available in open, machine-readable formats, greatly enhancing their accessibility and usefulness, while ensuring privacy and security."

Open Data Policy

Executive Order