datalad.api.publish

datalad.api.publish(path=None, dataset=None, to=None, since=None, missing='fail', force=False, transfer_data='auto', recursive=False, recursion_limit=None, git_opts=None, annex_opts=None, annex_copy_opts=None, jobs=None)

Publish a dataset to a known sibling.

This makes the last saved state of a dataset available to a sibling or special remote data store of a dataset. Any target sibling must already exist and be known to the dataset.

Optionally, publication can be limited to change sets relative to a particular point in the version history of a dataset (e.g. a release tag). By default, the state of the local dataset is evaluated against the last known state of the target sibling. Publication is only attempted if there was a change compared to the reference state, which speeds up processing of large collections of datasets. Evaluation with respect to a particular “historic” state is only supported in conjunction with a specified reference dataset. Change sets are also evaluated recursively, i.e. only those subdatasets are published for which a change was recorded that is reflected in the current state of the top-level reference dataset. See the “since” option for more information.
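As an illustration of the paragraph above, the following is a minimal sketch of limiting publication to changes since a given point in history. The wrapper name publish_changes_since and its publish_func argument are hypothetical and not part of DataLad; publish_func exists only so the call can be inspected without DataLad installed. The check for a missing dataset mirrors the statement that a historic reference state requires a specified reference dataset.

```python
def publish_changes_since(dataset, sibling, since, publish_func=None):
    """Publish `dataset` and its changed subdatasets relative to `since`.

    Hypothetical wrapper around datalad.api.publish, for illustration only.
    """
    if dataset is None:
        # A historic reference state needs an explicit reference dataset.
        raise ValueError("'since' requires an explicit reference dataset")
    if publish_func is None:
        # Imported lazily; requires DataLad to be installed.
        from datalad.api import publish as publish_func
    # recursive=True so that change sets are also evaluated in subdatasets.
    return publish_func(dataset=dataset, to=sibling, since=since, recursive=True)
```

A call might look like publish_changes_since(ds, "origin", "v1.0"), where ds is a Dataset instance and the sibling name "origin" and the tag "v1.0" are placeholders.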

Only publication of saved changes is supported. Any unsaved changes in a dataset (hierarchy) have to be saved before publication.

Note

Power-user info: This command uses git push and git annex copy to publish a dataset. Publication targets are either configured remote Git repositories or git-annex special remotes (if they support data upload).
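To make the note above more concrete, here is a rough sketch of the two kinds of commands involved. It is an approximation under stated assumptions, not the exact command lines DataLad assembles: the helper name approximate_commands is hypothetical, and any additional options DataLad may pass are left out.

```python
def approximate_commands(sibling, paths=()):
    """Return argv lists that roughly correspond to the two publication steps.

    Hypothetical helper; the real invocations may carry further options.
    """
    # Step 1: push the Git history to the sibling.
    push = ["git", "push", sibling]
    # Step 2: copy annexed file content to the sibling.
    copy = ["git", "annex", "copy", "--to", sibling, *paths]
    return push, copy
```

The returned lists could be passed to subprocess.run inside a repository to reproduce the general idea by hand.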

Parameters:
  • path (sequence of str or None, optional) – path(s) that may point to file handle(s) to publish, including their actual content, or to subdataset(s) to be published. Publishing a file handle with its data implicitly also publishes the (sub)dataset it belongs to. The path ‘.’ is treated specially: it is passed on to subdatasets if recursive is also given. [Default: None]
  • dataset (Dataset or None, optional) – specify the (top-level) dataset to be published. If no dataset is given, the datasets are determined based on the input arguments. [Default: None]
  • to (str or None, optional) – name of the target sibling. If no name is given, an attempt is made to identify the target based on the dataset’s configuration (i.e. a configured tracking branch, or a single sibling that is configured for publication). [Default: None]
  • since (str or None, optional) – when publishing dataset(s), specifies the commit (treeish, tag, etc.) from which to look for changes in order to decide whether publication is necessary for this dataset and for which of its children. If an empty argument is provided, the state previously published to that remote/sibling (for the current branch) is used. [Default: None]
  • missing ({'fail', 'inherit', 'skip'}, optional) – action to perform if a sibling does not exist in a given dataset. By default, the run fails (‘fail’). With ‘inherit’, ‘create-sibling’ with ‘--inherit-settings’ is used to create the sibling on the remote. With ‘skip’, it is simply skipped. [Default: ‘fail’]
  • force (bool, optional) – enforce publication activities (git push etc.) regardless of whether the analysis indicates that they are needed. [Default: False]
  • transfer_data ({'auto', 'none', 'all'}, optional) – ADDME. [Default: ‘auto’]
  • recursive (bool, optional) – if set, recurse into potential subdatasets. [Default: False]
  • recursion_limit (int or None, optional) – limit recursion into subdatasets to the given number of levels. [Default: None]
  • git_opts (str or None, optional) – option string to be passed to git calls. [Default: None]
  • annex_opts (str or None, optional) – option string to be passed to git annex calls. [Default: None]
  • annex_copy_opts (str or None, optional) – option string to be passed to git annex copy calls. [Default: None]
  • jobs (int or None or {'auto'}, optional) – how many parallel jobs (where possible) to use. [Default: None]
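
The parameter list above can be exercised with a small sketch like the following. The helpers checked_publish_kwargs and publish_checked are hypothetical and not part of DataLad; they only check the two arguments whose allowed values are listed above before handing everything to datalad.api.publish. The import is done lazily so the validation part runs without DataLad installed.

```python
# Choice sets copied from the parameter list above.
MISSING_CHOICES = ("fail", "inherit", "skip")
TRANSFER_DATA_CHOICES = ("auto", "none", "all")


def checked_publish_kwargs(to=None, missing="fail", transfer_data="auto",
                           recursive=False, recursion_limit=None):
    """Validate a few arguments and return them as a keyword dict.

    Hypothetical helper, for illustration only.
    """
    if missing not in MISSING_CHOICES:
        raise ValueError(f"missing must be one of {MISSING_CHOICES}, got {missing!r}")
    if transfer_data not in TRANSFER_DATA_CHOICES:
        raise ValueError(
            f"transfer_data must be one of {TRANSFER_DATA_CHOICES}, got {transfer_data!r}"
        )
    return {
        "to": to,
        "missing": missing,
        "transfer_data": transfer_data,
        "recursive": recursive,
        "recursion_limit": recursion_limit,
    }


def publish_checked(dataset, **kwargs):
    """Call datalad.api.publish with validated arguments."""
    # Imported lazily; requires DataLad to be installed.
    from datalad.api import publish
    return publish(dataset=dataset, **checked_publish_kwargs(**kwargs))
```

For example, publish_checked(ds, to="origin", recursive=True, missing="skip") would publish a Dataset instance ds and its subdatasets to a sibling named "origin" (a placeholder name), skipping datasets in which that sibling does not exist.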