U of T Dataverse FAQs

1. Is there somewhere I can practice using U of T Dataverse?
Yes! If you want to practice using U of T Dataverse, you can use the demo repository. The demo repository lets you practice creating a dataverse, adding data, organizing your data, and seeing how things will look when published. We recommend using a dummy dataset in the demo repository.

2. What file types can I add? 
U of T Dataverse supports the uploading of any file type, including documents, images, video, audio, tabular files, compressed files, and more. We also encourage you to include a README file and documentation to help users navigate and reuse your data.

3. Why can’t I find the “Add Data” button?
The Add Data button is located on the U of T Dataverse home page, to the right of the search bar. If you cannot find it, make sure you are logged in and that you are in the U of T Dataverse collection and not the main Borealis repository. If you still cannot find the Add Data button once you're in the U of T Dataverse collection, please contact rdm@utoronto.ca.

4. When is a DOI assigned?
A DOI is assigned once you publish your dataset. If you need a DOI before you're ready to make your data available, you can publish the dataset with access restrictions applied.

5. Why is my dataverse collection/dataset not showing up in U of T Dataverse? 
The most likely reason your dataverse collection or dataset is not showing up is that it is unpublished. Your unpublished dataverse collections and datasets are visible to you when you are logged in, but they are not visible to anyone else. For more information, go to publishing your data.

6. Why do my files convert to .tab files when I upload them?
When you upload tabular file types (e.g., XLS, SPSS, CSV), U of T Dataverse creates a .tab version of each file so it can be read and used in Data Explorer. This process does not damage the original file, and users can download the file in both the original and .tab formats.

7. Why do I get a “Tabular Ingest Error” when I try to upload my dataset?
The most likely reason you’ve received a "Tabular Ingest Error" is a formatting issue with your original file, such as commas within cells or inconsistent column headers. 

If you receive a "Tabular Ingest Error" but do not want to reformat your cells, you can simply ignore this message. Users will not get an error message after your dataset is published and will still be able to access and download your data. However, you will not be able to use the Data Explorer Tool or Data Curation Tool if your files are not converted to the .tab format. 
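If you would rather find the formatting problem before uploading again, a quick local check can help. The following is a minimal sketch, assuming your file is a comma-separated CSV; the function name find_tabular_issues is hypothetical, and the checks only look for the two problems mentioned above (stray commas that shift cells, and inconsistent headers). It does not reproduce the actual ingest rules used by U of T Dataverse.

```python
import csv
import io
from collections import Counter


def find_tabular_issues(text):
    """Return a list of human-readable problems found in CSV text.

    Hypothetical helper: it flags blank or duplicate column headers and
    records whose cell count differs from the header row, which is the
    usual symptom of an unquoted comma inside a cell.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    issues = []
    header = [name.strip() for name in rows[0]]

    # Header checks: every column needs a unique, non-blank name.
    for name, count in Counter(header).items():
        if name == "":
            issues.append("blank column header")
        elif count > 1:
            issues.append(f"duplicate column header: {name!r}")

    # Record checks: each record should have as many cells as the header.
    for record_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            issues.append(
                f"record {record_no}: expected {len(header)} cells, found {len(row)}"
            )
    return issues
```

To check a file on disk, read it first, for example find_tabular_issues(open("mydata.csv", newline="").read()), where mydata.csv is a placeholder name. An empty list means none of these particular problems were found; wrapping cells that contain commas in double quotes usually clears the cell-count warnings.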

8. What is the Data Curation Tool and how do I use it?
The Data Curation Tool allows a dataset administrator, curator, or contributor to add and edit variable-level metadata. For example, you can add information about weighted variables or how a variable was collected. Adding this metadata helps make your data easier to understand, interpret, and reuse. For more information on the Data Curation Tool, go to the Borealis User Guide.

9. What is Data Explorer and how do I use it?
Data Explorer allows users to visualize and analyze data contained in tabular data files (.tab). Users can use this tool to cross-tabulate data and view summary statistics and charts. For more information on Data Explorer, go to the Borealis User Guide.

10. I have a large number of files to add. Is there a way to batch upload them?
Yes! You can use DVUploader to batch upload files and automate parts of your deposit workflow. DVUploader is a command-line bulk uploader that uses the existing Dataverse application programming interface (API) to upload files from a specified directory into a specified dataset. For more information on DVUploader, visit the advanced guide.
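To give a sense of what a bulk upload through the API involves, here is a minimal sketch that only plans the requests and sends nothing. It assumes the "add a file to a dataset" endpoint described in the Dataverse native API documentation; the function name plan_uploads, the server URL, and the DOI shown in the usage note are placeholders, not values taken from U of T Dataverse. This is not how DVUploader itself is implemented.

```python
from pathlib import Path
from urllib.parse import urlencode


def plan_uploads(server_url, dataset_doi, directory):
    """Return (file path, request URL) pairs for every file under directory.

    Hypothetical helper: it builds the URL for the native API call that adds
    a file to a dataset, but it does not contact the server. A real request
    would also need your API token (sent in the X-Dataverse-key header) and
    the file itself as a multipart form upload.
    """
    query = urlencode({"persistentId": dataset_doi})
    url = f"{server_url.rstrip('/')}/api/datasets/:persistentId/add?{query}"

    # Walk the directory recursively and keep regular files only.
    files = sorted(p for p in Path(directory).rglob("*") if p.is_file())
    return [(path, url) for path in files]
```

For example, plan_uploads("https://dataverse.example.org", "doi:10.5072/FK2/EXAMPLE", "my_project_files") lists one planned request per file. Reviewing a plan like this before running a bulk uploader is a simple way to confirm that the directory contains only the files you intend to deposit.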

11. How do I add large files (over 5 GB) to U of T Dataverse?
U of T Dataverse can currently accept individual files of up to 5 GB. If you need to deposit files that are larger than 5 GB, you can compress them into a ZIP or TAR file. Note that U of T Dataverse automatically unzips the first level of compression when you add your data files, so if you are depositing files larger than 5 GB you will need to double zip your file(s), that is, place the ZIP file inside a second ZIP file.
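Double zipping can be done with any archive tool. As one illustration, the following minimal sketch uses Python's standard zipfile module; the function name double_zip and the file names in the usage note are hypothetical. The inner archive is compressed, and the outer archive simply stores it, since compressing an already compressed file gains little.

```python
import zipfile
from pathlib import Path


def double_zip(source, outer_zip):
    """Wrap one file in a ZIP archive, then wrap that archive in a second ZIP.

    Hypothetical helper: when the outer archive is unzipped once, the inner
    ZIP archive remains intact as a single file.
    """
    source = Path(source)
    outer_zip = Path(outer_zip)
    inner_zip = outer_zip.with_name(outer_zip.stem + "_inner.zip")

    # First level: compress the original file.
    with zipfile.ZipFile(inner_zip, "w", zipfile.ZIP_DEFLATED) as inner:
        inner.write(source, arcname=source.name)

    # Second level: store the inner archive as-is inside the outer archive.
    with zipfile.ZipFile(outer_zip, "w", zipfile.ZIP_STORED) as outer:
        outer.write(inner_zip, arcname=inner_zip.name)

    inner_zip.unlink()  # remove the temporary inner archive from disk
    return outer_zip
```

Calling double_zip("survey_data.csv", "deposit.zip") would produce deposit.zip containing deposit_inner.zip, which in turn contains survey_data.csv.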

We are currently working to increase the file size limit. If you are working with a particularly large dataset, we recommend contacting rdm@utoronto.ca before uploading your files.

12. What does it mean to add an embargo to my dataset?
Adding an embargo to your dataset or to specific data files means those files will not be accessible until the embargo date has passed. On the date set, the files will be released (publicly or with restricted access as set) automatically by the system. 

You may choose to apply an embargo to indicate that a dataset exists and will be available at a specified future date. The metadata record of your published dataset, including its DOI, will be available, but users will not be able to preview, access, or request access to the embargoed files until the date has passed. This can allow researchers to satisfy journal or funder requirements to publish research data while also protecting their data and intellectual property rights for a set period of time. Note that once you apply an embargo, you will not be able to change the embargo end date yourself. If you need to update this information, contact rdm@utoronto.ca. Go to embargos for more information.

13. Should I put all files from a project in one dataset, or should I break it down into multiple datasets?
There are a number of factors that may influence how you decide to organize your files, including:

  • What makes the most sense for the data, or for someone using the data? If the datasets are related and would be used together, then it may make most sense to keep them together. If the datasets may be useful independently and you want to include more detailed descriptions of each, then you may want to separate them.
  • Is linking to the data from a publication (or elsewhere) a priority? If it is, then you may want to think about how you want to reference or refer to it. For example, if you were writing a data availability statement, deciding if you wanted to provide one DOI or multiple DOIs may be a determining factor. If you want to include multiple DOIs, your data would need to be organized into multiple Datasets (note that you can provide individual citations for data files within a Dataset, but they will all have the same DOI).
  • Do your data files have different authors? If yes, you may want to create datasets based on authorship to ensure contributions are properly acknowledged.

For more information on how to structure a dataset, go to structuring your data.

14. How should I reference my data in a publication?
Different publications may have different guidelines around how to reference data. This information can usually be found on the publication’s website under “submission guidelines” or “guide for authors”. Typically, you would include the following information: 

  • Author(s)
  • Year
  • Dataset title
  • Version (if applicable)
  • Repository name
  • Persistent identifier

U of T Dataverse uses DOIs as persistent identifiers, which are assigned to all published datasets. 

If a publication does not specify how your data should be referenced, you can use the recommended citation automatically generated by U of T Dataverse. This citation block can be found in the blue box at the top of the dataset homepage. In addition to the citation block provided, you can also download the XML, RIS, or BIB file for ingestion into your citation manager.
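If you need to assemble a reference by hand, the elements listed above can be joined into a single string. The following is a minimal sketch, assuming a simple comma-separated layout; the function name format_data_citation and the ordering are illustrative only and do not represent the citation format generated by U of T Dataverse or required by any particular publication.

```python
def format_data_citation(authors, year, title, repository, identifier, version=None):
    """Join the usual data citation elements into one comma-separated string.

    Hypothetical helper: the layout is an illustration, so always check the
    publication's own guidelines before submitting.
    """
    parts = ["; ".join(authors), str(year), f'"{title}"']
    if version:
        # Version is optional and is included only when supplied.
        parts.append(version)
    parts.extend([repository, identifier])
    return ", ".join(parts)
```

For instance, calling the function with the placeholder values ["Doe, Jane"], 2024, "Example Survey Data", "U of T Dataverse", "https://doi.org/10.5072/FK2/EXAMPLE", and version="V1" returns a single line containing all six elements in the order listed above.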

15. How do I link datasets to publications?
You can link publications and other related outputs through the dataset-level metadata in U of T Dataverse. Specifically, you can use the Related Publication, Related Material, and Related Dataset fields to enter this information. Each entry can be a formatted citation, a URL, or one of a variety of identifiers (e.g., DOI, ISSN, arXiv, Handle).

16. Can I reuse, add to, and make derivatives of other data in U of T Dataverse?
This will depend on the license and/or terms of access applied to the data you are using, which can be found in the Terms tab of a specific dataset. If you plan on using a dataset created by someone else, you will need to ensure the license allows you to create and share derivative products.

Note that some licenses require that anything derived from the data be shared under the same open license, which will determine the license you can apply to your own dataset.