Python module reference
This module reference extends the manual with a comprehensive overview of the functionality built into DataLad. Each module in the package is documented with a general summary of its purpose and a list of the classes and functions it provides.
High-level user interface
Dataset operations
|
Representation of a DataLad dataset/repository |
|
Create a new dataset from scratch. |
|
Create a dataset sibling on a UNIX-like machine accessible via a local shell or SSH |
|
Create a dataset sibling on GitHub.com (or an enterprise deployment). |
|
Create a dataset sibling at a GitLab site |
|
Create a dataset sibling on a GOGS site |
|
Create a dataset sibling on a Gitea site |
|
Create a dataset sibling on a GIN site (with content hosting) |
|
Create a sibling for a dataset in a RIA store |
|
Drop content of individual files or entire (sub)datasets |
|
Get any dataset content (files/directories/subdatasets). |
|
Install one or many datasets from remote URL(s) or local PATH source(s). |
|
Push a dataset to a known sibling. |
|
Remove components from datasets |
|
Save the current state of a dataset |
|
Report on the state of dataset content. |
|
Update a dataset from a sibling. |
|
Unlock file(s) of a dataset |
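As a rough illustration of how several of the dataset operations above combine in the Python API, the following sketch creates a dataset, adds a file, saves it, and queries its state. It assumes that DataLad and git-annex are installed and that a Git identity is configured; the helper name `demo_dataset_roundtrip` and the file name `notes.txt` are made up for this example.

```python
from pathlib import Path


def demo_dataset_roundtrip(path):
    """Create a dataset at `path`, save one file, and return status records.

    Hypothetical helper for illustration only; it is not part of DataLad.
    """
    # Imported lazily so this module can be loaded without DataLad installed.
    import datalad.api as dl

    # Create a new dataset from scratch; the call returns a Dataset object.
    ds = dl.create(path=str(path))

    # Add some content, then record the new state of the dataset.
    (Path(ds.path) / "notes.txt").write_text("hello\n")
    ds.save(message="Add notes")

    # Report on the state of the dataset content as a list of result records.
    return ds.status(result_renderer="disabled")
```

Calling `demo_dataset_roundtrip("/tmp/demo")` on a machine with a working installation should return one result record (a dictionary) per reported path.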
Reproducible execution
|
Run an arbitrary shell command and record its impact on a dataset. |
|
Re-execute previous datalad run commands. |
|
Run prepared procedures (DataLad scripts) on a dataset |
Plumbing commands
|
Clean up after DataLad (possible temporary files etc.) |
|
Obtain a dataset (copy) from a URL or local directory |
|
Copy files and their availability metadata from one dataset to another. |
|
Create test (meta-)dataset. |
|
Report differences between two states of a dataset (hierarchy) |
|
Download content |
|
Run a command or Python code on the dataset and/or each of its sub-datasets. |
|
Manage sibling configuration |
|
Run command on remote machines via SSH. |
|
Report subdatasets and their properties. |
Miscellaneous commands
|
Add content of an archive under git-annex control. |
|
Add basic information about DataLad datasets to a README file |
|
Create and update a dataset from a list of URLs. |
|
Find repository dates that are more recent than a reference date. |
|
Get and set dataset, dataset-clone-local, or global configuration |
|
Export the content of a dataset as a TAR/ZIP archive. |
|
Export an archive of a local annex object store for the ORA remote. |
|
Export the content of a dataset as a ZIP archive to figshare |
|
Configure a dataset to never put some content into the dataset's annex |
|
Display shell script for enabling shell completion for DataLad. |
|
Generate a report about the DataLad installation and configuration |
Support functionality
Class that starts a subprocess and keeps it around to communicate with it via stdin. |
|
Constants for DataLad |
|
Logging setup and utilities, including progress reporting |
|
Internal low-level interface to Git repositories |
|
Interface to git-annex by Joey Hess. |
|
Various handlers/functionality for different types of files (e.g. for archives). |
|
Support functionality for extension development |
|
Base classes for custom git-annex remotes (e.g. extraction from archives). |
|
Custom remote to get content from archives present under annex |
|
Thread-based subprocess execution with stdout and stderr passed to protocol objects |
|
Base class of a protocol to be used with the DataLad runner |
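The idea of thread-based subprocess execution with output handed to protocol objects can be sketched with the standard library alone. This is a conceptual illustration, not DataLad's runner implementation; `CollectingProtocol`, `data_received`, and `run_with_protocol` are hypothetical names chosen for this example.

```python
import subprocess
import sys
import threading


class CollectingProtocol:
    """Hypothetical protocol object that stores output per stream."""

    def __init__(self):
        self.chunks = {"stdout": [], "stderr": []}

    def data_received(self, stream_name, data):
        # Called from a reader thread for every line of output.
        self.chunks[stream_name].append(data)

    def result(self):
        # Join the collected byte chunks and decode them per stream.
        return {
            name: b"".join(parts).decode()
            for name, parts in self.chunks.items()
        }


def _pump(stream, stream_name, protocol):
    # Read until EOF so the child never blocks on a full pipe.
    for line in iter(stream.readline, b""):
        protocol.data_received(stream_name, line)
    stream.close()


def run_with_protocol(cmd, protocol):
    """Run `cmd`, feed its output to `protocol`, and return the exit code."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # One reader thread per stream, so both pipes are drained concurrently.
    threads = [
        threading.Thread(target=_pump, args=(proc.stdout, "stdout", protocol)),
        threading.Thread(target=_pump, args=(proc.stderr, "stderr", protocol)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return proc.wait()
```

Keeping the stream handling in a separate protocol object means the same execution code can serve callers that want to capture output, discard it, or process it line by line.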
Configuration management
Test infrastructure
Miscellaneous utilities to assist with testing |
|
Helper to provide heavy load on stdout and stderr |
Command interface
High-level interface generation |
Command line interface infrastructure
Call a command interface |
|
This is the main() CLI entrypoint |
|
Components to build the parser instance for the CLI |
|
Render results in a terminal |