datalad.api.annotate_paths

datalad.api.annotate_paths(path=None, dataset=None, recursive=False, recursion_limit=None, action=None, unavailable_path_status='', unavailable_path_msg=None, nondataset_path_status='error', force_parentds_discovery=True, force_subds_discovery=True, force_no_revision_change_discovery=True, force_untracked_discovery=True, modified=None)

Analyze and act upon input paths

Given paths (or more generally location requests) are inspected and annotated with a number of properties. A list of recognized properties is provided below.

Input paths for this command can either be un-annotated (raw) path strings, or already (partially) annotated paths. In the latter case, further annotation is limited to yet-unknown properties, and is potentially faster than initial annotation.

Recognized path properties

“action”
label of the action that triggered the path annotation
“annexkey”
annex key for the content of a file
“logger”
logger for reporting a message
“message”
message (plus possible string expansion arguments)
“orig_request”
original input by which a path was determined
“parentds”
path of dataset containing the annotated path (superdataset for subdatasets)
“path”
absolute path that is annotated
“process_content”
flag that content underneath the path is to be processed
“process_updated_only”
flag that only known dataset components are to be processed
“raw_input”
flag whether this path was given as raw (non-annotated) input
“refds”
path of a reference/base dataset the annotated path is part of
“registered_subds”
flag whether a dataset is known to be a true subdataset of parentds
“revision”
the recorded commit for a subdataset in a superdataset
“revision_descr”
a human-readable description of revision
“source_url”
URL a dataset was installed from
“staged”
flag whether a path is known to be “staged” in its containing dataset
“state”
state indicator for a path in its containing dataset (clean, modified, absent (also for files), conflict)
“status”
action result status (ok, notneeded, impossible, error)
“type”
nature of the path (file, directory, dataset)
“url”
registered URL for a subdataset in a superdataset

In the case of enabled modification detection the results may contain additional properties regarding the nature of the modification. See the documentation of the diff command for details.
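As a rough illustration of the property list above, an annotated path can be pictured as a dictionary keyed by those property names. The sketch below does not call DataLad: the records, their values, and the helper `group_by_parentds` are all hypothetical and only show how such dictionaries might be consumed.

```python
from collections import defaultdict

# Hypothetical annotated-path records. The keys come from the property
# list above; every value is invented for illustration.
annotated = [
    {"path": "/data/study/sub-01/anat.nii.gz", "type": "file",
     "parentds": "/data/study", "state": "modified", "raw_input": True},
    {"path": "/data/study/code", "type": "dataset",
     "parentds": "/data/study", "registered_subds": True, "state": "clean"},
    {"path": "/data/other/readme.txt", "type": "file",
     "parentds": "/data/other", "state": "clean", "raw_input": True},
]


def group_by_parentds(records):
    """Group annotated paths by the dataset that contains them."""
    groups = defaultdict(list)
    for rec in records:
        # "parentds" may be missing on partially annotated paths.
        groups[rec.get("parentds")].append(rec["path"])
    return dict(groups)


by_ds = group_by_parentds(annotated)
for ds, paths in by_ds.items():
    print(ds, len(paths))
```

Grouping by "parentds" is only one possible use; the same pattern applies to any other property, such as "state" or "type".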

Parameters:
  • path (sequence of str or None, optional) – path to be annotated. [Default: None]
  • dataset (Dataset or None, optional) – an optional reference/base dataset for the paths. [Default: None]
  • recursive (bool, optional) – if set, recurse into potential subdatasets. [Default: False]
  • recursion_limit (int or None, optional) – limit recursion into subdatasets to the given number of levels. [Default: None]
  • action (str or None, optional) – an “action” property value to include in the path annotation. [Default: None]
  • unavailable_path_status (str or None, optional) – a “status” property value to include in the annotation for paths that are underneath a dataset, but do not exist on the filesystem. [Default: ‘’]
  • unavailable_path_msg (str or None, optional) – a “message” property value to include in the annotation for paths that are underneath a dataset, but do not exist on the filesystem. [Default: None]
  • nondataset_path_status (str or None, optional) – a “status” property value to include in the annotation for paths that are not underneath any dataset. [Default: ‘error’]
  • force_parentds_discovery (bool, optional) – Flag to disable reports of parent dataset information for any path, in particular dataset root paths. Disabling saves on command run time, if this information is not needed. [Default: True]
  • force_subds_discovery (bool, optional) – Flag to disable reporting type=’dataset’ for subdatasets, even when they are not installed, or their mount point directory doesn’t exist. Disabling saves on command run time, if this information is not needed. [Default: True]
  • force_no_revision_change_discovery (bool, optional) – Flag to disable discovery of changes which were not yet committed. Disabling saves on command run time, if this information is not needed. [Default: True]
  • force_untracked_discovery (bool, optional) – Flag to disable discovery of untracked changes. Disabling saves on command run time, if this information is not needed. [Default: True]
  • modified (str or bool or None, optional) – comparison reference specification for modification detection. This can be (mostly) anything that git diff understands (commit, treeish, tag, etc). See the documentation of datalad diff --revision for details. Unmodified paths will not be annotated. If a requested path was not modified but some content underneath it was, then the request is replaced by the modified paths and those are annotated instead. This option can be used with True as an argument to test against changes that have been made, but have not yet been staged for a commit. [Default: None]
  • on_failure ({'ignore', 'continue', 'stop'}, optional) – behavior to perform on failure: ‘ignore’: any failure is reported, but does not cause an exception; ‘continue’: if any failure occurs, an exception will be raised at the end, but processing of other actions will continue for as long as possible; ‘stop’: processing will stop on the first failure and an exception is raised. A failure is any result with status ‘impossible’ or ‘error’. The raised exception is an IncompleteResultsError that carries the result dictionaries of the failures in its failed attribute. [Default: ‘continue’]
  • proc_post – Like proc_pre, but procedures are executed after the main command has finished. [Default: None]
  • proc_pre – DataLad procedure to run prior to the main command. The argument is a list of lists with procedure names and optional arguments. Procedures are called in the order they are given in this list. It is important to provide the respective target dataset to run a procedure on as the dataset argument of the main command. [Default: None]
  • result_filter (callable or None, optional) – if given, each to-be-returned status dictionary is passed to this callable, and is only returned if the callable’s return value does not evaluate to False or a ValueError exception is raised. If the given callable supports **kwargs it will additionally be passed the keyword arguments of the original API call. [Default: None]
  • result_renderer ({'default', 'json', 'json_pp', 'tailored'} or None, optional) – format of return value rendering on stdout. [Default: None]
  • result_xfm ({'paths', 'relpaths', 'datasets', 'successdatasets-or-none', 'metadata'} or callable or None, optional) – if given, each to-be-returned result status dictionary is passed to this callable, and its return value becomes the result instead. This is different from result_filter, as it can perform arbitrary transformation of the result value. This is mostly useful for top-level command invocations that need to provide the results in a particular format. Instead of a callable, a label for a pre-crafted result transformation can be given. [Default: None]
  • return_type ({'generator', 'list', 'item-or-list'}, optional) – return value behavior switch. If ‘item-or-list’, a single value is returned instead of a one-item return value list, or a list in case of multiple return values. None is returned in case of an empty list. [Default: ‘list’]
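The result_filter and on_failure descriptions above can be illustrated without running DataLad. The following is a minimal sketch under stated assumptions: `is_failure` and `only_modified_files` are hypothetical helper names, and the sample result dictionaries are invented rather than real command output.

```python
def is_failure(result):
    """Per the on_failure description: a failure is any result
    with status 'impossible' or 'error'."""
    return result.get("status") in ("impossible", "error")


def only_modified_files(result, **kwargs):
    """Candidate result_filter callable: keep only modified files.

    Accepting **kwargs means the keyword arguments of the original
    API call would also be passed in, as described for result_filter.
    """
    return result.get("type") == "file" and result.get("state") == "modified"


# Invented result dictionaries, for illustration only.
sample_results = [
    {"path": "/data/study/a.txt", "type": "file",
     "state": "modified", "status": "ok"},
    {"path": "/data/study/b.txt", "type": "file",
     "state": "clean", "status": "notneeded"},
    {"path": "/tmp/outside.txt", "type": "file",
     "state": "modified", "status": "error"},
]

kept = [r["path"] for r in sample_results if only_modified_files(r)]
failed = [r["path"] for r in sample_results if is_failure(r)]
print("kept:", kept)
print("failed:", failed)
```

A callable like `only_modified_files` could then be supplied as the result_filter argument of the command, while `is_failure` mirrors the condition that on_failure reacts to.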