Kratz J and Strasser C (2014)
Data publication consensus and controversies [v1; ref status: approved with reservations 1, http://f1000r.es/3ag].
F1000Research 2014, 3:94 (doi: 10.12688/f1000research.4264)
Abstract
The movement to bring datasets into the scholarly
record as first-class research products (validated, preserved, cited,
and credited) has been inching forward for some time, but now the pace
is quickening. As data publication venues proliferate,
significant debate continues over formats, processes, and terminology.
Here, we present an overview of data publication initiatives underway
and the current conversation, highlighting points of consensus and
issues still in contention. Data publication implementations
differ in a variety of factors, including the kind of documentation,
the location of the documentation relative to the data, and how the data
is validated. Publishers may present the data as supplemental material
to a journal article, with a descriptive “data
paper,” or independently. Complicating the situation, different
initiatives and communities use the same terms to refer to distinct but
overlapping concepts. For instance, to virtually everyone, the term
“published” means that the data is publicly available and citable,
but it may or may not imply that the data has been
peer-reviewed. In turn, what is meant by data peer review is far from
defined; standards and processes encompass the full range employed in
reviewing the literature, plus some novel variations. Basic
data citation is a point of consensus, but the general agreement on the
core elements of a dataset citation frays if the data is dynamic or
part of a larger set. Even as data publication is being defined, some
are looking past publication to other metaphors,
notably “data as software,” for solutions to the more stubborn
problems.