Renku Command Line
The base command for interacting with the Renku platform.
renku (base command)
To list the available commands, either run renku with no parameters or execute renku help:
$ renku help
Usage: renku [OPTIONS] COMMAND [ARGS]...
Check common Renku commands used in various situations.
Options:
--version Print version number.
--config PATH Location of client config files.
--config-path Print application config path.
--path <path> Location of a Renku repository. [default: .]
--renku-home <path> Location of Renku directory. [default: .renku]
-h, --help Show this message and exit.
Commands:
# [...]
Configuration files
Depending on your system, the configuration files used by the Renku command line may be stored in a different folder. By default, the following rules are used:
- macOS: ~/Library/Application Support/Renku
- Unix: ~/.config/renku
- Windows: C:\Users\<user>\AppData\Roaming\Renku
If you are unsure where to look for the configuration file, you can display its path by running renku --config-path.
You can specify a different location via the RENKU_CONFIG environment variable or the --config command line option. If both are specified, the --config option value is used. For example:
$ renku --config ~/renku/config/ init
instructs Renku to store the configuration files in your ~/renku/config/ directory when running the init command.
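The lookup order described above can be sketched in Python. This is only an illustration of the documented precedence (the --config option, then RENKU_CONFIG, then the platform default), not Renku's actual code. The function names default_config_dir and resolve_config_dir are hypothetical, and the Windows branch assumes the home directory is C:\Users\<user>.

```python
import os
import sys
from pathlib import Path


def default_config_dir(platform=sys.platform, home=None):
    """Return the default directory from the list above (hypothetical helper)."""
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "Renku"
    if platform.startswith("win"):
        # Assumption: the home directory is C:\Users\<user>.
        return home / "AppData" / "Roaming" / "Renku"
    return home / ".config" / "renku"


def resolve_config_dir(cli_option=None, environ=None,
                       platform=sys.platform, home=None):
    """Apply the documented precedence: --config, RENKU_CONFIG, default."""
    environ = os.environ if environ is None else environ
    if cli_option:
        # The --config option wins when both are given.
        return Path(cli_option).expanduser()
    if environ.get("RENKU_CONFIG"):
        return Path(environ["RENKU_CONFIG"]).expanduser()
    return default_config_dir(platform, home)


if __name__ == "__main__":
    print(resolve_config_dir())
```

Passing environ and home explicitly keeps the sketch easy to test without touching the real environment.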
renku init
Create an empty Renku project or reinitialize an existing one.
Starting a Renku project
If you have an existing directory which you want to turn into a Renku project, you can type:
$ cd ~/my_project
$ renku init
or:
$ renku init ~/my_project
This creates a new subdirectory named .renku that contains all the necessary files for managing the project configuration.
renku datasets
Work with datasets in the current repository.
Manipulating datasets
Creating an empty dataset inside a Renku project:
$ renku dataset create my-dataset
Adding data to the dataset:
$ renku dataset add my-dataset http://data-url
This will copy the contents of data-url to the dataset and add it to the dataset metadata.
renku run
Track provenance of data created by executing programs.
Capture command line execution
To track the execution of your command line script, prefix the command and its arguments with renku run. The command will detect:
- arguments (flags),
- string and integer options,
- input files if linked to existing files in the repository,
- output files if modified or created while running the command.
Note
If there are uncommitted changes, the command fails. Run git status to see details.
Detecting input files
An argument is identified as an input file only if its path matches an existing file in the repository. There are several situations in which the detection might not work as expected:
- If the file is modified during the execution, then it is stored as an output;
- If the path points to a directory, then it is stored as a string option.
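The detection rules above can be illustrated with a short Python sketch. It is not Renku's implementation: classify_argument is a hypothetical helper, and the set of paths modified during the run is passed in by hand rather than being detected.

```python
import os
import tempfile


def classify_argument(arg, repo_root, modified_paths=frozenset()):
    """Classify one command line argument using the rules listed above."""
    full_path = os.path.join(repo_root, arg)
    if arg in modified_paths:
        # A file modified during the execution is stored as an output.
        return "output"
    if os.path.isfile(full_path):
        # Only a path matching an existing file counts as an input.
        return "input"
    try:
        int(arg)
        return "integer option"
    except ValueError:
        # Directories and anything else fall back to a string option.
        return "string option"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as repo:
        open(os.path.join(repo, "data.csv"), "w").close()
        os.mkdir(os.path.join(repo, "results"))
        for arg in ["data.csv", "results", "42", "missing.txt"]:
            print(arg, "->", classify_argument(arg, repo))
```

Checking the modified set first mirrors the first rule in the list: a file that exists but is changed by the command is treated as an output, not an input.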
Detecting output files
Any file which is modified or created after the execution will be added as an output. If the program does not have any output file, you can specify the --no-output option.
There might be situations where an existing output file is not changed when the command is executed with different parameters. The execution then ends with an error: Error: There are not any detected outputs in the repository. To resolve it, first remove any proposed input file from the list.
Detecting standard streams
Programs often expect their input on the standard input stream. This is detected and recorded in the tool specification when the program is invoked as renku run cat < A.
Similarly, standard output and standard error can both be redirected when invoking a command:
$ renku run grep "test" B > C 2> D
Note
Detecting inputs and outputs from pipes (|) is not supported.
renku log
Show provenance of data created by executing programs.
File provenance
Unlike the traditional file history format, which shows previous revisions of the file, this format presents tool inputs together with their revision identifiers.
A * character shows which lineage the specific file belongs to.
A @ character in the graph lineage means that the corresponding file does not have any inputs and the history starts there.
When called without file names, it shows the history of the most recently created files.
With the --revision <refname> option, the output is shown as it was in the specified revision.
Provenance examples
renku log --revision HEAD~5
- Show the history of files that have been created or modified 5 commits ago.
renku log B
- Show the history of file B since its last creation or modification.
renku log --revision e3f0bd5a D E
- Show the history of files D and E as it looked in the commit e3f0bd5a.
renku status
Show status of data files created in the repository.
Inspecting a repository
Displays paths of outputs which were generated from newer input files and paths of files that have been used in different versions.
The first group of paths lists what needs to be recreated by running renku update. See the section about renku update for more.
The paths mentioned in the output are made relative to the current directory if you are working in a subdirectory (this is on purpose, to help with cutting and pasting into other commands). They also contain the first 8 characters of the corresponding commit identifier after the # (hash). If the file was imported from another repository, the short name of that repository is shown together with the filename before @.
renku update
Update outdated files created by the “run” command.
Recreating outdated files
The information about dependencies for each file in the repository is generated from information stored in the underlying Git repository.
A minimal dependency graph is generated for each outdated file stored in the repository. It means that only the necessary steps will be executed and the workflow used to orchestrate these steps is stored in the repository.
Assume the following history for the file H exists.
      C---D---E
     /         \
A---B---F---G---H
The first example shows the situation where D is modified and files E and H become outdated.
      C--*D*--(E)
     /          \
A---B---F---G---(H)

** - modified
() - needs update
In this situation, you can effectively do one of two things:
Recreate a single file by running:
$ renku update E
Update all files by simply running:
$ renku update
Note
If there are uncommitted changes, the command fails. Run git status to see details.
Pre-update checks
In the next example, files A or B are modified, hence the majority of dependent files must be recreated.
        (C)--(D)--(E)
       /             \
*A*--*B*--(F)--(G)--(H)
To avoid excessive recreation of a large portion of files which could have been affected by a simple change of an input file, consider specifying a single file (e.g. renku update G). See also renku status.
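The two examples above can be reproduced with a small Python sketch over the same history graph. This only illustrates the idea of outdated files and a minimal set of steps. Renku derives the real graph from the Git repository, and the names EDGES, outdated, ancestors and steps_for are hypothetical.

```python
from collections import deque

# Each file maps to the files generated from it (the graph drawn above).
EDGES = {
    "A": ["B"],
    "B": ["C", "F"],
    "C": ["D"],
    "D": ["E"],
    "E": ["H"],
    "F": ["G"],
    "G": ["H"],
    "H": [],
}


def outdated(edges, modified):
    """Return every file downstream of a modified file."""
    seen = set()
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen - set(modified)


def ancestors(edges, target):
    """Return every file that the target depends on, directly or not."""
    parents = {}
    for node, children in edges.items():
        for child in children:
            parents.setdefault(child, []).append(node)
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def steps_for(edges, modified, target):
    """Outdated files that must be recreated to bring the target up to date."""
    needed = ancestors(edges, target) | {target}
    return sorted(outdated(edges, modified) & needed)


if __name__ == "__main__":
    print(sorted(outdated(EDGES, {"D"})))
    print(sorted(outdated(EDGES, {"A", "B"})))
    print(steps_for(EDGES, {"A", "B"}, "G"))
```

In this sketch, restricting the update to G after A or B change selects only F and G, whereas a full update would touch all six dependent files.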
renku workflow
Workflow operations.