Data Repositories

Data repositories are platforms that hold data, organize it in a logical way, and can make it available for reuse. They are used by research communities to share and discover data.

There are three main types of data repositories:

  • Disciplinary repositories focus on a particular area of research or type of data. They often have requirements for data formats, documentation, and metadata. You can find disciplinary data repositories by asking your peers, checking relevant journals for recommendations, or searching re3data, a registry of research data repositories.
  • Multidisciplinary/generalist repositories are not focused on a particular field and typically accept all types of data. Some examples of multidisciplinary repositories include FRDR, Dryad, Zenodo, and figshare.
  • Institutional data repositories are generalist repositories provided by a specific institution. U of T Dataverse in Borealis is the University of Toronto’s institutional data repository. It accepts data from research conducted at or under the auspices of the University of Toronto. For more information, see About U of T Dataverse.

Selecting a data repository

Different data repositories offer different services and functions. Some things to consider before selecting a repository include:

  • Funder or journal requirements. Some funders, journals, or publishers may require or recommend that data be deposited in specific repositories. 
  • Disciplinary research data. Depositing your data in a disciplinary repository can make it easier for other researchers in your field to find and use it. It may also be common or expected practice in your field to deposit data in certain repositories.
  • Persistent identifiers (PIDs). Most repositories provide datasets with a persistent identifier, such as a DOI, that makes it easier for others to cite your work. Check what type of PID the repository offers and whether it meets your needs.
  • Access restrictions. Some repositories allow you to apply access restrictions to certain files or whole datasets. This means that you (or in some cases, the repository) can review and approve requests to access the data before it can be viewed or downloaded. 
  • Data licensing. Different repositories will allow you to apply different types of licenses. Check to see what options are available and ensure they work for you. Common examples of licenses that can be applied to data include Creative Commons and Open Data Commons.
  • Cost. Some repositories are free to use while others may have an associated cost. Reviewing potential repositories at the beginning of your project will allow you to budget accordingly. U of T Dataverse is available to U of T researchers at no cost. 
  • Retention and preservation. Repositories may differ in how long they will keep your data and what preservation actions they perform to ensure it remains accessible and usable. 
  • Ability to update datasets. Some repositories allow you to update your dataset yourself (e.g., adding new versions, updating metadata) while others do not.
  • Curation services. Some repositories add value to your data by curating it to meet deposit standards. Curation activities vary but may include metadata creation, keyword assignment, file validation, and indexing data for discoverability. These actions can make your data easier for others to find, understand, and reuse.
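
As an illustration of the persistent identifier consideration above: a DOI can be turned into a stable link by placing it after the https://doi.org/ resolver address, which is what makes it convenient for citation. The Python sketch below shows one way this might be done. The function name doi_to_url, the simplified pattern used to recognize a DOI, and the sample DOI are all assumptions made for illustration and do not come from any particular repository.

```python
import re

# Simplified shape of a DOI (an assumption for this sketch):
# "10.", a 4-9 digit registrant code, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver link for a DOI string (hypothetical helper)."""
    cleaned = doi.strip()
    # Drop prefixes that are often pasted in front of a DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a recognizable DOI: {doi!r}")
    return f"https://doi.org/{cleaned}"


# The DOI below is a made-up placeholder, not a real dataset.
print(doi_to_url("doi:10.1234/EXAMPLE/ABC123"))
```

Because the link is built from the identifier rather than from the repository's web address, a citation written this way is intended to keep working even if the repository reorganizes its site.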

Resources

Library services
The library provides support for:

  • Selecting a data repository
  • U of T Dataverse in Borealis, the institutional data repository

External resources