Commandline Tools Usage

sefara-select

Select fields from a sefara collection.

To select all fields, just give the path to the collection:

    sefara-select my_datasets.sefara.py

To get the ‘name’ and ‘path’ fields from a resource collection called “collection.sefara.py”:

    sefara-select collection.sefara.py name path

Fields are interpreted as Python expressions and evaluated in a context that includes the resource’s attributes as local variables. For example:

    sefara-select collection.sefara.py "name.lower()" "os.path.abspath(path)"

In csv output, a header row is included by default when more than one field is selected. To customize the label of a column, give the field in the form “LABEL: EXPRESSION”. For example:

    sefara-select rc.py "resource: name" "full_path: os.path.abspath(path)"
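The field syntax described above can be pictured with a small Python sketch. This is not sefara's implementation: `split_label` and `evaluate_field` are hypothetical helpers, the resource is modelled as a plain dict, and the rule that an unlabelled field uses its own expression text as the label is an assumption made for illustration.

```python
import os


def split_label(field):
    """Split 'LABEL: EXPRESSION' into (label, expression).

    Assumption: a field without a label uses the expression text as its label.
    """
    label, sep, expression = field.partition(":")
    if sep and label.strip().isidentifier():
        return label.strip(), expression.strip()
    return field, field


def evaluate_field(expression, attributes):
    # The resource's attributes are supplied as local variables; `os` is
    # passed as a global so expressions like os.path.abspath(path) work.
    return eval(expression, {"os": os}, dict(attributes))


# Hypothetical resource, modelled here as a dict of attributes.
resource = {"name": "Patient_A", "path": "data/a.bam"}
fields = ["resource: name.lower()", "full_path: os.path.abspath(path)"]

row = {}
for field in fields:
    label, expression = split_label(field)
    row[label] = evaluate_field(expression, resource)
print(row)
```

Each label becomes a column name and each expression is evaluated once per resource, which is roughly what the header row and the data rows of the csv output correspond to.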

collection

Resource collection path or URL. Specify ‘-’ for stdin.

field

Expressions to select from each resource. Specify one or more times.

-h, --help

show this help message and exit

-f <filter>, --filter <filter>

Filter expression. Can be specified multiple times; the result is the intersection of the filters.
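As a rough illustration of what “intersection of the filters” means, the sketch below keeps only the resources for which every filter is true. It assumes that filters are Python expressions evaluated against the resource's attributes in the same way as fields; `matches_all` and the sample resources are hypothetical.

```python
def matches_all(filters, attributes):
    # A resource passes only if every filter expression evaluates truthy.
    return all(eval(expr, {}, dict(attributes)) for expr in filters)


# Hypothetical resources, modelled as dicts of attributes.
resources = [
    {"name": "a", "kind": "tumor", "size": 10},
    {"name": "b", "kind": "normal", "size": 50},
    {"name": "c", "kind": "tumor", "size": 80},
]
filters = ["kind == 'tumor'", "size > 20"]

selected = [r["name"] for r in resources if matches_all(filters, r)]
print(selected)
```

Under these assumptions, giving two filters is equivalent to giving one filter that joins them with `and`.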

--transform <transform>

Path to Python file with transform function to run. Can be specified multiple times.

--no-environment-transforms

Do not run transforms configured in environment variables.

--format <format>

Output format. Default: csv.

--header <header>

Whether to output a header row in csv format. Defaults to ‘on’ if more than one field is selected, ‘off’ otherwise.
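The default described above can be sketched with Python's standard `csv` module. `write_rows` is a hypothetical helper, not part of sefara; passing `header=None` stands in for leaving the option unset.

```python
import csv
import io


def write_rows(labels, rows, header=None):
    # header=None mimics the described default:
    # on when more than one field is selected, off otherwise.
    if header is None:
        header = len(labels) > 1
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        writer.writerow(labels)
    writer.writerows(rows)
    return buffer.getvalue()


one_field = write_rows(["name"], [["a"], ["b"]])
two_fields = write_rows(["name", "path"], [["a", "/x/a"], ["b", "/x/b"]])
print(one_field)
print(two_fields)
```

A single selected field therefore yields a bare list of values, which is convenient for piping into other shell tools.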

--out <out>

Output file. Default: stdout.

--all-fields

Select all fields.

--stop-on-error

If an error occurs processing a resource, quit. By default, the problematic field is set to None, and any errors are printed at the end.

--skip-errors

If an error occurs processing a resource, silently skip that resource.

--if-error <if_error>

How to handle exceptions raised in expression evaluation. If ‘raise’, the script will halt with a traceback. If ‘skip’, the problematic resource(s) will be silently omitted from the result. If ‘none’, the Python None value will be silently used in place of the expression. Default: raise.
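The three modes can be illustrated with a short sketch. It only mirrors the behaviour described above; `select` is a hypothetical function and resources are modelled as plain dicts.

```python
def select(expression, resources, if_error="raise"):
    if if_error not in ("raise", "skip", "none"):
        raise ValueError("unknown if_error mode: %r" % (if_error,))
    results = []
    for attributes in resources:
        try:
            value = eval(expression, {}, dict(attributes))
        except Exception:
            if if_error == "raise":
                raise       # halt with a traceback
            if if_error == "skip":
                continue    # omit this resource from the result
            value = None    # 'none': use None in place of the expression
        results.append(value)
    return results


# The second resource lacks 'size', so evaluating the expression fails for it.
resources = [{"name": "a", "size": 4}, {"name": "b"}]

skipped = select("size * 2", resources, if_error="skip")
with_none = select("size * 2", resources, if_error="none")
print(skipped, with_none)
```

With ‘skip’ the output has fewer rows than the collection has resources, while ‘none’ keeps one row per resource.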

sefara-dump

Read and then write a sefara resource collection, possibly into a different format or after applying filters or transforms.

As an experimental feature, the resources can be mutated with Python code specified with the --code argument. For example:

    sefara-dump data.py --code 'name = name.upper()'

would convert all the resource names to upper case.

Multiple code arguments can be given, and any new variables defined become attributes of the resources:

    sefara-dump data.py --code 'upper_name = name.upper()' 'lower_name = name.lower()'

This would add two new attributes to each resource, upper_name and lower_name.

Note that the input file is never modified. The --code argument only affects the output.
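One way to picture how code lines turn into new attributes is the sketch below. It is an illustration only, not sefara's implementation: `apply_code` is a hypothetical helper and the resource is modelled as a dict whose keys act as local variables.

```python
def apply_code(attributes, code_lines):
    # Work on a copy so the input is left untouched.
    namespace = dict(attributes)
    for line in code_lines:
        # Each line runs with the attributes as local variables;
        # any name it assigns is stored back into the namespace.
        exec(line, {}, namespace)
    return namespace


original = {"name": "sample_one"}
updated = apply_code(
    original,
    ["upper_name = name.upper()", "lower_name = name.lower()"],
)
print(updated)
```

Because the lines run in order against the same namespace, a later line can use a variable defined by an earlier one.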

collection

Resource collection path or URL. Specify ‘-’ for stdin.

-h, --help

show this help message and exit

-f <filter>, --filter <filter>

Filter expression. Can be specified multiple times; the result is the intersection of the filters.

--transform <transform>

Path to Python file with transform function to run. Can be specified multiple times.

--no-environment-transforms

Do not run transforms configured in environment variables.

--format <format>

Output format.

--out <out>

Output file. Default: stdout.

--indent <indent>

Number of spaces for indentation in output. Default: 4.

--code <code>

Code to run in the context of each resource. Any new variables defined become attributes of each resource. Any number of arguments may be specified, each giving one line of code.

sefara-check

Validate a sefara resource collection and report the results.

Routines for validating resources are called “checkers”, and are usually specified in the SEFARA_CHECKER environment variable. Additional checkers may be specified with the --checker argument to this script.

collection

Resource collection path or URL. Specify ‘-’ for stdin.

-h, --help

show this help message and exit

-f <filter>, --filter <filter>

Filter expression. Can be specified multiple times; the result is the intersection of the filters.

--transform <transform>

Path to Python file with transform function to run. Can be specified multiple times.

--no-environment-transforms

Do not run transforms configured in environment variables.

--checker <checker>

Path to checker to run. Can be specified multiple times.

--no-environment-checkers

Run only the checkers explicitly specified. Do not run checkers configured in environment variables.

-v, --verbose
-q, --quiet

Print only a summary of errors.

--width <width>

Line width. Default: 100.

sefara-env

Give current values of sefara environment variables.

-h, --help

show this help message and exit