According to the Registry of Research Data Repositories (re3data.org), a data repository is a subtype of a sustainable information infrastructure which provides long-term storage and access to research data that is the basis for a scholarly publication. Research data means information objects generated by scholarly projects, for example through experiments, measurements, surveys, or interviews.
In other words, a data repository provides long-term storage for the data that supports scholarly publications. Data repositories are institutional efforts to provide sustainable preservation of the data created by researchers. They ensure research data remains accessible beyond the life of a grant, a research project, or an individual career.
Many data repositories exist today. Some will fit your needs better than others. Here are some tips on selecting a data repository for your research:
Is the repository a reputable source? Check to see if it is endorsed by a funding agency, scholarly journal, professional society, library, or if it is listed in the Registry of Research Data Repositories. Publishing your data, like publishing an article, is best done with a reputable partner that is backed by an institution or your research community.
Depositing your data in a repository that is unsustainable defeats the point of depositing it. This is why it is important to make sure your repository has the support of an institution, community, or funder. You’ll want to ensure the repository you select will provide access to your data for well over 5 years. Many repositories also have preservation plans, as well as contingency plans for the unlikely event that funding ceases. Don’t be afraid to ask about these plans.
One of the primary reasons to deposit your data in a repository is to obtain a unique identifier that others can use to cite your data. This service increases the visibility of your data within the scholarly literature and allows researchers to find it later on. Ensure your data repository offers a DOI (digital object identifier), handle, or another unique identifier.
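As a rough illustration of why these identifiers matter for citation, the sketch below turns a DOI or Handle into a link that a reader can follow. It relies on the public resolvers at doi.org and hdl.handle.net; the function name `identifier_url` and the sample identifiers are made up for this example and do not point to real data sets.

```python
def identifier_url(identifier: str) -> str:
    """Return a resolvable link for a DOI or Handle (illustrative helper)."""
    identifier = identifier.strip()
    lowered = identifier.lower()

    # Accept the common "doi:" and "hdl:" prefixes used in citations.
    if lowered.startswith("doi:"):
        return "https://doi.org/" + identifier[4:].strip()
    if lowered.startswith("hdl:"):
        return "https://hdl.handle.net/" + identifier[4:].strip()

    # DOI names always begin with the directory indicator "10.".
    if identifier.startswith("10."):
        return "https://doi.org/" + identifier

    # Assumption for this sketch: anything else is treated as a Handle.
    return "https://hdl.handle.net/" + identifier


# Hypothetical identifiers, used only to show the shape of the output.
print(identifier_url("doi:10.1234/example-dataset"))
print(identifier_url("12345/67890"))
```

A link of this form keeps working even if the repository later reorganizes its website, which is what makes it suitable for a data citation.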
Another way to think about visibility is to ask whether researchers in your field use a given repository. Some disciplines have an agreed-upon repository that everyone uses and knows about. Ensure that you’re putting your data where the appropriate researchers are likely to find it (and hopefully use it).
The usability of a data repository is also important in ensuring that others will be able to access your data. Unfortunately, not all repositories have the funding to create great web interfaces with simple, intuitive interactions. However, if your peers are unable to find and download your data, sharing it will be far less effective. A usable data repository should allow users to easily upload, download, and cite data sets.
Some data repositories have really great features like integrations with Open Science Framework, GitHub, or other commercial storage solutions. While these features may not be the keystones of providing long-term access to your data, they can help you share your data more frequently and effectively. Additionally, an author dashboard (a place you can view statistics, like downloads, on your data sets) or easy-to-understand licensing, like Creative Commons, can make your life a little easier. Lastly, you’ll want to review the upload and storage limits. Some repositories offer limited free storage before a fee is charged. Be sure to look over each data repository’s features and compare them with those of similar services.
Most data repositories are able to handle most formats; however, this doesn’t always guarantee that they’ll be able to work with your data. Be sure to take a look at the repository’s documentation to ensure it can store the data you’ve generated. In addition, see if the repository can generate previews or provide other user interactions with your data. While these features are not essential from a preservation perspective, they do help users understand and access your data.
For additional help selecting a data repository you can contact us or review the following site and materials:
OpenBU is the institutional repository for all creative and scholarly research outputs of Boston University. BU Research Data is an archive collection within OpenBU for digital research data generated by the university’s faculty, researchers, students, alumni, and staff. OpenBU provides long-term digital preservation and open access to data. All data in the collection are curated to increase potential for access and are assigned permanent links (Handles).
For help with using OpenBU, please contact us!
There are a number of data repositories available to scholars beyond BU. But there are a few things to keep in mind when you deposit your data.
A “general” data repository is subject independent and will have data from many fields. General data repositories are often well-known solutions with large user communities. General repositories are great places to store all your data because they tend to have robust features (like simple GitHub integration) and strong institutional backing, and they are indexed by major search engines like Google and Bing. However, the downside of general repositories is that, because they contain a lot of everything, users might have more difficulty finding your work.
A few examples of general data repositories are:
Many subject-specific data repositories exist today. Unlike a general data repository, discipline-based repositories can be very specific and well-known within a particular field. This can be both a good thing and a bad thing. On the upside, if your field has a specific repository, your data will likely be seen by the right people, increasing its chance for reuse and further influence. The downside is that researchers outside of that discipline might not know where to look for your data. Generally speaking, if a subject-specific data repository exists for your research, it is a good idea to use it.
Finding, listing, and keeping up with all the repositories in existence is best done by directories. A few we recommend: