cryo-data/cryo-data

Introduction

  • This project provides a data meta-portal (not a meta-data portal) for cryospheric data.
  • A meta-portal is a data portal that hosts no data itself; instead, it links to the third-party portals where the data resides.
  • Our goal is a single point of access for all data you might want to download.

Instructions

Install datalad

To access the data, you need to install datalad.
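Before going further, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not an official installer: `check_tool` is a hypothetical helper name, and the list of tools assumes the usual datalad setup, in which datalad relies on git and git-annex. See the datalad documentation for installation options on your platform.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# check_tool is a hypothetical helper, not part of datalad.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# datalad relies on git and git-annex, so check all three.
for tool in git git-annex datalad; do
    check_tool "$tool" || true
done
```

Any tool reported as missing needs to be installed before the commands below will work.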

Install cryo-data

First use

To install the complete cryo-data dataset as ‘empty’ placeholder files (no file contents are downloaded yet):

datalad install -r https://github.com/cryo-data/cryo-data

You can then download the actual data for an individual file or product with:

datalad get path/to/file                            # get one file
datalad get -r author/someone_yyyy                  # get all files associated with one paper
datalad get -r cryo-data/data_provider/NSIDC/0642/  # get all files from provider dataset

See https://docs.datalad.org/en/stable/generated/man/datalad-get.html for more details.
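If you want several products at once, the get commands above can be wrapped in a small loop. This is a sketch under stated assumptions: `run`, `get_subset`, and the `DRY_RUN` variable are hypothetical names introduced here, and the paths are the placeholder paths from this README. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command instead of running it.
# Set DRY_RUN=0 to actually download. DRY_RUN is a hypothetical variable.
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Fetch file contents for every path given as an argument.
get_subset() {
    for path in "$@"; do
        run datalad get -r "$path"
    done
}

# Placeholder paths taken from the examples above.
get_subset author/someone_yyyy cryo-data/data_provider/NSIDC/0642/
```

Printing the plan first is a cautious choice, since some provider datasets may be large.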

Updates

The following command merges in all new updates; the -r option applies this recursively to all child datasets.

datalad update --merge -r .

Remove downloaded files

To free up space after downloading the data file contents, use datalad drop. See https://docs.datalad.org/en/stable/generated/man/datalad-drop.html.
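As a rough illustration, the sketch below drops the contents of one dataset directory and reports its size before and after. `size_kb` and `drop_and_report` are hypothetical helper names. It assumes you run it from inside the installed cryo-data tree, that datalad is on your PATH, and that the path is a dataset directory, so that `du` also counts the annexed content stored under that dataset.

```shell
#!/bin/sh
# Report the disk usage of a path in kilobytes.
size_kb() {
    du -sk "$1" | cut -f1
}

# Drop file contents under a dataset directory and report the size change.
# Assumes datalad is installed and the path is a dataset directory.
drop_and_report() {
    before=$(size_kb "$1")
    datalad drop -r "$1"
    after=$(size_kb "$1")
    echo "$1: ${before} KB -> ${after} KB"
}

# Example with a placeholder path from this README:
# drop_and_report author/someone_yyyy
```

Dropped files remain in the tree as placeholders and can be fetched again later with datalad get.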

Advanced usage

  • Any datalad dataset can be a child of another dataset.
  • Any datalad child can be pruned from the tree/folder structure and treated as a top-level dataset.

You can exploit the latter of these two features to check out only a subset of cryo-data. The following clones one author and all NSIDC data:

datalad install -r https://github.com/cryo-data/author/someone_yyyy
datalad install -r https://github.com/cryo-data/author/data_provider/NSIDC 

Note, however, that install and clone take up very little disk space. Only get downloads the file contents, so you can install everything and then get just the subset you want, as shown under First use.

Contribute

See ./CONTRIBUTING.org

Cite

Datalad

@article{halchenko_2021,
  author    = {Halchenko, Yaroslav and Meyer, Kyle and Poldrack, Benjamin and Solanky, Debanjum and
                  Wagner, Adina and Gors, Jason and MacFarlane, Dave and Pustina, Dorian and Sochat,
                  Vanessa and Ghosh, Satrajit and Mönch, Christian and Markiewicz, Christopher J. and
                  Waite, Laura and Shlyakhter, Ilya and de la Vega, Alejandro and Hayashi, Soichi
                  and Häusler, Christian Olaf and Poline, Jean-Baptiste and Kadelka, Tobias and
                  Skytén, Kusti and Jarecka, Dorota and Kennedy, David and Strauss, Ted and Cieslak,
                  Matt and Vavra, Peter and Ioanas, Horea-Ioan and Schneider, Robin and Pflüger,
                  Mika and Haxby, James V. and Eickhoff, Simon B. and Hanke, Michael},
  title     = {DataLad: distributed system for joint management of code, data, and their
                  relationship},
  journal   = {Journal of Open Source Software},
  year      = 2021,
  volume    = 6,
  number    = 63,
  pages     = 3262,
  month     = {Jul},
  ISSN      = {2475-9066},
  url       = {http://dx.doi.org/10.21105/joss.03262},
  DOI       = {10.21105/joss.03262},
  publisher = {The Open Journal}}
