My collaborator, Dr. Mercè Crosas, who is Director of Data Science at the Institute for Quantitative Social Science (IQSS) at Harvard, presented this talk as part of the Program on Information Science Brown Bag Series.
- Trusted data repositories that guarantee long-term access
- Mechanisms for formal data citation
- Sufficient information to understand and reuse the data (e.g. metadata, documentation, code)
It is interesting to consider the extent to which current tools and practices support these pillars. There have been many recent efforts focused on trusted repositories and data citation, and many of the projects described by Dr. Crosas promise to advance the state of the practice in these areas.
Determining what information is sufficient for understanding and reuse, and making that information easy to extract from the research process, to record in structured ways, and to expose and present to different audiences, seems to be an area with many opportunities and few broad solutions. It is also interesting to consider the extent to which the workflows for publishing data integrate with the workflows for managing data during research, prior to publication. I discuss some of these tools and integration points in a previous post…