Commandline Tools Usage
sefara-select
Select fields from a sefara collection.

To select all fields, just give the path to the collection:

    sefara-select my_datasets.sefara.py

To get the 'name' and 'path' fields from a resource collection called "collection.sefara.py":

    sefara-select collection.sefara.py name path

Fields are interpreted as Python expressions and evaluated in a context that includes the resource's attributes as local variables. For example:

    sefara-select collection.sefara.py "name.lower()" "os.path.abspath(path)"

In csv output, a header row is included by default when more than one field is selected. To customize the label of a field, give the field in the form "LABEL: EXPRESSION". For example:

    sefara-select rc.py "resource: name" "full_path: os.path.abspath(path)"
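The evaluation model described above can be sketched in plain Python. This is an illustration only, not sefara's implementation: resources are modeled as plain dicts, and the helper names `parse_field` and `select` are hypothetical. The label-splitting rule is also a simplifying assumption.

```python
import os


def parse_field(spec):
    """Split 'LABEL: EXPRESSION' into (label, expression).

    A spec without a label uses the expression itself as the label.
    Assumption for this sketch: the text before the first colon counts
    as a label only if it is a plain identifier.
    """
    label, sep, expression = spec.partition(":")
    if sep and label.strip().isidentifier():
        return label.strip(), expression.strip()
    return spec.strip(), spec.strip()


def select(resources, specs):
    """Evaluate each field expression against each resource's attributes."""
    fields = [parse_field(spec) for spec in specs]
    rows = []
    for attributes in resources:
        # The attributes act as local variables; 'os' is made available
        # so expressions such as os.path.basename(path) work.
        rows.append({
            label: eval(expression, {"os": os}, dict(attributes))
            for label, expression in fields
        })
    return rows


# Hypothetical resources, modeled as dicts of attributes.
resources = [
    {"name": "Patient_A", "path": "/data/a.bam"},
    {"name": "Patient_B", "path": "/data/b.bam"},
]
rows = select(resources, ["resource: name.lower()", "os.path.basename(path)"])
print(rows[0])
# {'resource': 'patient_a', 'os.path.basename(path)': 'a.bam'}
```

Because expressions are passed to `eval`, they run with the full power of Python, so field expressions should only come from trusted input.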
collection
    Resource collection path or URL. Specify '-' for stdin.

field
    Expressions to select from each resource. Specify one or more times.

-h, --help
    Show this help message and exit.

-f <filter>, --filter <filter>
    Filter expression. Can be specified multiple times; the result is the intersection of the filters.

--transform <transform>
    Path to Python file with transform function to run. Can be specified multiple times.

--no-environment-transforms
    Do not run transforms configured in environment variables.

--format <format>
    Output format. Default: csv.

--header <header>
    Whether to output a header row in csv format. Defaults to 'on' if more than one field is selected, 'off' otherwise.

--out <out>
    Output file. Default: stdout.

--all-fields
    Select all fields.

--stop-on-error
    If an error occurs processing a resource, quit. By default, the problematic field is set to None, and any errors are printed at the end.

--skip-errors
    If an error occurs processing a resource, silently skip that resource.

--if-error <if_error>
    How to handle exceptions raised in expression evaluation. If 'raise', the script will halt with a traceback. If 'skip', the problematic resource(s) will be silently omitted from the result. If 'none', the Python None value will be silently used in place of the expression. Default: raise.
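The three --if-error modes can be sketched as follows. This is a minimal illustration under stated assumptions, not sefara's code: resources are plain dicts, and `evaluate_rows` is a hypothetical helper.

```python
def evaluate_rows(resources, expressions, if_error="raise"):
    """Evaluate expressions per resource, handling failures per if_error."""
    if if_error not in ("raise", "skip", "none"):
        raise ValueError("if_error must be 'raise', 'skip', or 'none'")
    rows = []
    for attributes in resources:
        row = []
        skipped = False
        for expression in expressions:
            try:
                value = eval(expression, {}, dict(attributes))
            except Exception:
                if if_error == "raise":
                    raise  # halt with a traceback
                if if_error == "skip":
                    skipped = True  # omit the whole resource
                    break
                value = None  # 'none': substitute None for this field
            row.append(value)
        if not skipped:
            rows.append(row)
    return rows


# Hypothetical resources; the second lacks a 'size' attribute,
# so evaluating 'size * 2' against it raises NameError.
resources = [
    {"name": "a", "size": 10},
    {"name": "b"},
]
expressions = ["name", "size * 2"]

print(evaluate_rows(resources, expressions, if_error="skip"))
# [['a', 20]]
print(evaluate_rows(resources, expressions, if_error="none"))
# [['a', 20], ['b', None]]
```

With `if_error="raise"`, the same call stops at the second resource with a `NameError`.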
sefara-dump
Read and then write a sefara resource collection, possibly into a different format or after applying filters or transforms.

As an experimental feature, the resources can be mutated with Python code specified with the --code argument. For example:

    sefara-dump data.py --code 'name = name.upper()'

would convert all the resource names to upper case. Multiple code arguments can be given, and any new variables defined become attributes of the resources:

    sefara-dump data.py --code 'upper_name = name.upper()' 'lower_name = name.lower()'

This would add two new attributes to each resource, upper_name and lower_name.

Note that the input file is never modified. The --code argument only affects the output.
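The behavior of --code described above can be sketched with `exec`. This is an illustration under assumptions, not sefara's implementation: resources are plain dicts, and `apply_code` is a hypothetical helper.

```python
import copy


def apply_code(resources, code_lines):
    """Return mutated copies of the resources; the input is left untouched."""
    results = []
    for attributes in resources:
        namespace = copy.deepcopy(attributes)
        for line in code_lines:
            # Each line runs with the resource's attributes as local
            # variables; assignments create or overwrite attributes.
            exec(line, {}, namespace)
        results.append(namespace)
    return results


# Hypothetical input collection with a single resource.
original = [{"name": "Sample1"}]
mutated = apply_code(
    original,
    ["upper_name = name.upper()", "lower_name = name.lower()"],
)
print(mutated)
# [{'name': 'Sample1', 'upper_name': 'SAMPLE1', 'lower_name': 'sample1'}]
print(original)
# [{'name': 'Sample1'}]
```

Working on a deep copy mirrors the note that the input is never modified: only the returned (output) resources carry the new attributes.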
collection
    Resource collection path or URL. Specify '-' for stdin.

-h, --help
    Show this help message and exit.

-f <filter>, --filter <filter>
    Filter expression. Can be specified multiple times; the result is the intersection of the filters.

--transform <transform>
    Path to Python file with transform function to run. Can be specified multiple times.

--no-environment-transforms
    Do not run transforms configured in environment variables.

--format <format>
    Output format.

--out <out>
    Output file. Default: stdout.

--indent <indent>
    Number of spaces for indentation in output. Default: 4.

--code <code>
    Code to run in the context of each resource. Any new variables defined become attributes of each resource. Any number of arguments may be specified, each giving one line of code.
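The "intersection of the filters" wording used for --filter means a resource is kept only if every filter expression holds for it. A minimal sketch, assuming resources are plain dicts and filters are Python expressions evaluated against their attributes (`apply_filters` is a hypothetical helper, not part of sefara):

```python
def apply_filters(resources, filters):
    """Keep resources for which every filter expression is truthy."""
    return [
        attributes for attributes in resources
        if all(eval(expr, {}, dict(attributes)) for expr in filters)
    ]


# Hypothetical resources for illustration.
resources = [
    {"name": "a", "kind": "tumor", "size": 5},
    {"name": "b", "kind": "normal", "size": 9},
    {"name": "c", "kind": "tumor", "size": 1},
]
selected = apply_filters(resources, ["kind == 'tumor'", "size > 3"])
print([r["name"] for r in selected])
# ['a']
```

With an empty filter list, `all()` over no conditions is true, so every resource passes through unchanged.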
sefara-check
Validate a sefara resource collection and report the results.

Routines for validating resources are called "checkers", and are usually specified in the SEFARA_CHECKER environment variable. Additional checkers may be specified with the --checker argument to this script.
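The idea of running several checkers over a collection and gathering their findings can be sketched as below. The checker interface shown here (a function that takes a list of resource dicts and returns `(index, message)` pairs) is a hypothetical one invented for illustration; it is not sefara's actual checker interface, and all function names are assumptions.

```python
def check_has_name(resources):
    """Hypothetical checker: every resource needs a non-empty 'name'."""
    return [
        (index, "missing or empty 'name'")
        for index, attributes in enumerate(resources)
        if not attributes.get("name")
    ]


def check_path_extension(resources):
    """Hypothetical checker: any 'path' attribute should end in '.bam'."""
    return [
        (index, "unexpected extension: " + attributes["path"])
        for index, attributes in enumerate(resources)
        if "path" in attributes and not attributes["path"].endswith(".bam")
    ]


def run_checkers(resources, checkers):
    """Run every checker and collect (checker name, index, message)."""
    problems = []
    for checker in checkers:
        for index, message in checker(resources):
            problems.append((checker.__name__, index, message))
    return problems


# Hypothetical resources; the second one fails both checks.
resources = [
    {"name": "a", "path": "/data/a.bam"},
    {"name": "", "path": "/data/b.txt"},
]
problems = run_checkers(resources, [check_has_name, check_path_extension])
for checker_name, index, message in problems:
    print("resource %d: %s (%s)" % (index, message, checker_name))
```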
collection
    Resource collection path or URL. Specify '-' for stdin.

-h, --help
    Show this help message and exit.

-f <filter>, --filter <filter>
    Filter expression. Can be specified multiple times; the result is the intersection of the filters.

--transform <transform>
    Path to Python file with transform function to run. Can be specified multiple times.

--no-environment-transforms
    Do not run transforms configured in environment variables.

--checker <checker>
    Path to checker to run. Can be specified multiple times.

--no-environment-checkers
    Run only the checkers explicitly specified. Do not run checkers configured in environment variables.

-v, --verbose

-q, --quiet
    Print only a summary of errors.

--width <width>
    Line width. Default: 100.