command line script: pydabu
pydabu has a few subcommands:
- analyse_data_structure: Analyse the data structure of a directory tree.
- check_nasa_ames_format: This command checks a file in the nasa ames format.
- check_netcdf_file: This command checks a file in the format netCDF. It uses the CF Checker: https://github.com/cedadev/cf-checker
- check_file_format: This command checks the file formats in a directory tree.
- common_json_format: This command reads the given json file and writes it in a common format to stdout.
- create_data_bubble: This command creates a data bubble in the given directory.
- check_data_bubble: This command checks a data bubble in the given directory.
- listschemas: This command lists the provided and used json schemas.
- data_bubble2jsonld: This command reads the data bubble (.dabu.json and .dabu.schema) and creates a json-ld data bubble (.dabu.json-ld and .dabu.json-ld.schema).
These commands are explained in more detail in the following (help output):
pydabu is a script to check a data bubble.
usage: pydabu [-h]
{analyse_data_structure,check_nasa_ames_format,check_netcdf_file,check_file_format,common_json_format,create_data_bubble,check_data_bubble,listschemas,data_bubble2jsonld}
...
Positional Arguments
subparser_name | Possible choices: analyse_data_structure, check_nasa_ames_format, check_netcdf_file, check_file_format, common_json_format, create_data_bubble, check_data_bubble, listschemas, data_bubble2jsonld. There are different sub-commands with their own flags. |
Sub-commands:
analyse_data_structure
see also: analyse_data_structure_output.schema
For more help: pydabu analyse_data_structure -h
pydabu analyse_data_structure [-h] [-output_format f] [-directory d [d ...]]
Named Arguments
-output_format | Possible choices: human_readable, json, json1. Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. Default: json1 |
-directory | Set the directory to use. You can also give a list of directories separated by space. Default: . (the current directory) |
check_nasa_ames_format
This command checks a file in the nasa ames format.
pydabu check_nasa_ames_format [-h] [-output_format f] -file f [f ...]
Named Arguments
-output_format | Possible choices: human_readable, json, json1. Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. Default: json1 |
-file | Set the file(s) to use. |
check_netcdf_file
This command checks a file in the format netCDF. It uses the CF Checker: https://github.com/cedadev/cf-checker
pydabu check_netcdf_file [-h] [-output_format f] -file f [f ...]
Named Arguments
-output_format | Possible choices: human_readable, json, json1. Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. Default: json1 |
-file | Set the file(s) to use. |
check_file_format
see also: dabu.schema
This command checks the file formats. In a first step, the data structure is analysed as the command “analyse_data_structure” does. Each file is then checked by a tool chosen by its file extension. For the file extension “.nc” the command check_netcdf_file is used.
pydabu check_file_format [-h] [-output_format f] [-directory d [d ...]]
[-skip_creating_checksums] [-checksum_from_file f]
Named Arguments
-output_format | Possible choices: human_readable, json, json1. Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. Default: json1 |
-directory | Set the directory to use. You can also give a list of directories separated by space. Default: . (the current directory) |
-skip_creating_checksums | Skip creating checksums, which could take a while. Default: False |
-checksum_from_file | Try to get checksums from the given file. |
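The extension-based choice of a checking tool described for check_file_format can be pictured roughly as follows. This is only an illustrative sketch, not pydabu's actual implementation: the names `check_netcdf`, `no_check`, `choose_checker`, `check_file` and the `CHECKERS` table are hypothetical, and only the mapping from “.nc” to a netCDF check is taken from the text. Comparing extensions case-insensitively is also an assumption.

```python
import os


def check_netcdf(path):
    """Hypothetical stand-in for the work done by check_netcdf_file."""
    return {"file": path, "checker": "check_netcdf_file"}


def no_check(path):
    """Fallback used when no checker is registered for the extension."""
    return {"file": path, "checker": None}


# Assumed lookup table; only the ".nc" entry is mentioned in the text.
CHECKERS = {".nc": check_netcdf}


def choose_checker(path):
    """Return the checker registered for the file extension of path."""
    # Assumption: extensions are compared case-insensitively.
    extension = os.path.splitext(path)[1].lower()
    return CHECKERS.get(extension, no_check)


def check_file(path):
    """Run the checker chosen for path and return its result."""
    return choose_checker(path)(path)
```

A table lookup like this keeps the dispatch in one place, so supporting a further file type would only mean adding one entry to the table.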
common_json_format
This command reads the given json file and writes it in a common format to stdout.
pydabu common_json_format [-h] -file f [f ...] [-indent i]
Named Arguments
-file | Set the file(s) to use. |
-indent | In the output the elements will be indented by this number of spaces. Default: 4 |
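As a rough idea of what such a normalisation can look like, the following sketch re-serialises JSON text with a fixed indentation. It assumes that “common format” means a consistent indent and sorted keys; the text above does not specify this, so pydabu's output may differ. The function name `to_common_json` is hypothetical.

```python
import json


def to_common_json(text, indent=4):
    """Parse JSON text and re-serialise it with uniform formatting.

    Assumption: a fixed indent and sorted keys; pydabu may differ.
    """
    data = json.loads(text)
    return json.dumps(data, indent=indent, sort_keys=True)
```

Calling `to_common_json` on two differently formatted but equivalent inputs yields the same string, which makes the results easy to compare with a plain text diff.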
create_data_bubble
see also: dabu.schema and dabu_requires.schema
This command creates a data bubble in the given directory. The data is generated with the command “check_file_format” from the data in the directory. Although the resulting files are not a data management plan, you can enhance them to become one.
pydabu create_data_bubble [-h] -directory d [d ...] [-indent i]
[-skip_creating_checksums] [-checksum_from_file f]
[-dabu_instance_file f] [-dabu_schema_file f]
Named Arguments
-directory | Set the directory to use. You can also give a list of directories separated by space. |
-indent | In the output the elements will be indented by this number of spaces. Default: 4 |
-skip_creating_checksums | Skip creating checksums, which could take a while. Default: False |
-checksum_from_file | Try to get checksums from the given file. |
-dabu_instance_file | Gives the name of the file describing the content of a data bubble. If this file already exists an error is raised. The name is relative to the given directory. Default: .dabu.json |
-dabu_schema_file | Gives the name of the file describing the necessary content of a data bubble. If this file already exists an error is raised. The name is relative to the given directory. Default: .dabu.schema |
check_data_bubble
This command checks a data bubble in the given directory. The data bubble should be created with “pydabu create_data_bubble” and manually enhanced. Instead of this script you can also use your preferred tool to check a json instance (e. g. .dabu.json) against a json schema (e. g. .dabu.schema) – see examples.
pydabu check_data_bubble [-h] -directory d [d ...] [-dabu_instance_file f]
[-dabu_schema_file f]
Named Arguments
-directory | Set the directory to use. You can also give a list of directories separated by space. |
-dabu_instance_file | Gives the name of the file describing the content of a data bubble. The name is relative to the given directory. Default: .dabu.json |
-dabu_schema_file | Gives the name of the file describing the necessary content of a data bubble. The name is relative to the given directory. Default: .dabu.schema |
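To illustrate the idea of checking a json instance against a json schema with your own tooling, here is a deliberately small sketch. It only looks at the top-level `required` keyword of the schema and is not a full JSON Schema validator; in practice a dedicated validator (for example the Python package `jsonschema`) would be the better choice. The function names `missing_required_keys` and `check_directory` are hypothetical; the file names are the defaults listed above.

```python
import json
import os


def missing_required_keys(instance, schema):
    """Return the top-level keys the schema requires but the instance lacks."""
    required = schema.get("required", [])
    return [key for key in required if key not in instance]


def check_directory(directory, instance_file=".dabu.json",
                    schema_file=".dabu.schema"):
    """Load both files relative to directory and report missing keys."""
    with open(os.path.join(directory, instance_file), encoding="utf-8") as f:
        instance = json.load(f)
    with open(os.path.join(directory, schema_file), encoding="utf-8") as f:
        schema = json.load(f)
    return missing_required_keys(instance, schema)
```

An empty list from `check_directory` means that no required top-level key is missing; it says nothing about types or nested content, which a real validator would also check.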
listschemas
see also: Provided and used json schemas
This command lists the provided and used json schemas.
pydabu listschemas [-h] [-output_format f]
Named Arguments
-output_format | Possible choices: simple, json. Set the output format to use. simple lists the json schemas in lines. json leads to a json output. Default: simple |
data_bubble2jsonld
This command reads the data bubble (.dabu.json and .dabu.schema) and creates a json-ld data bubble (.dabu.json-ld and .dabu.json-ld.schema). If you are fine with these new files, you should delete the old ones yourself.
pydabu data_bubble2jsonld [-h] -directory d [d ...] [-indent i]
[-dabu_instance_file f] [-dabu_schema_file f]
[-dabu_jsonld_instance_file f]
[-dabu_jsonld_schema_file f] [-vocabulary v]
[-cachefilename f] [-cachefilepath p] [-author p]
Named Arguments
-directory | Set the directory to use. You can also give a list of directories separated by space. |
-indent | In the output the elements will be indented by this number of spaces. Default: 4 |
-dabu_instance_file | Gives the name of the file describing the content of a data bubble. The name is relative to the given directory. Default: .dabu.json |
-dabu_schema_file | Gives the name of the file describing the necessary content of a data bubble. The name is relative to the given directory. Default: .dabu.schema |
-dabu_jsonld_instance_file | Gives the name of the file describing the content of a data bubble as json-ld. If this file already exists an error is raised. The name is relative to the given directory. Default: .dabu.json-ld |
-dabu_jsonld_schema_file | Gives the name of the file describing the necessary content of a data bubble with json-ld. If this file already exists an error is raised. The name is relative to the given directory. Default: .dabu.json-ld.schema |
-vocabulary | Possible choices: schema.org. Sets the vocabulary to use. At the moment only schema.org is implemented. Default: schema.org |
-cachefilename | We need data from schema.org. If you set cachefilename to an empty string, nothing is cached. If the file name ends with a common extension for compression, this compression is used (e. g.: .gz, .lzma, .xz, .bz2). The file is created in the cachefilepath (see this option). Default: schemaorg-current-https.jsonld.bz2 |
-cachefilepath | This path is used for the cachefilename. If necessary, this directory will be created (not the directory tree!). Default: /tmp/json_schema_from_schema_org_runner |
-author | Sets the author of the data bubble. If not given, it is not added to the dabu_jsonld_instance_file. Anyway the dabu_jsonld_schema_file will require it. You can just give a string or any json object. |
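The extension-based choice of compression described for -cachefilename can be sketched with the Python standard library. This shows the general technique under stated assumptions and is not pydabu's code; `open_cache_file` and the `OPENERS` table are hypothetical names, and treating “.lzma” as the legacy LZMA container is an assumption.

```python
import bz2
import functools
import gzip
import lzma

# Map the extensions named in the text to standard-library openers.
# Assumption: ".lzma" means the legacy LZMA container (FORMAT_ALONE).
OPENERS = {
    ".gz": gzip.open,
    ".lzma": functools.partial(lzma.open, format=lzma.FORMAT_ALONE),
    ".xz": lzma.open,
    ".bz2": bz2.open,
}


def open_cache_file(filename, mode="rt"):
    """Open filename in text mode, compressed if its extension says so.

    mode must be a text mode such as "rt" or "wt".
    """
    for extension, opener in OPENERS.items():
        if filename.endswith(extension):
            return opener(filename, mode, encoding="utf-8")
    # No known compression extension: fall back to a plain text file.
    return open(filename, mode, encoding="utf-8")
```

With such a helper the calling code reads and writes the cache in the same way regardless of whether the chosen file name implies compression.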
You can view the json output for example in firefox, e. g. in bash:
output=$(tempfile --suffix='.json'); pydabu analyse_data_structure -output_format json > $output && firefox $output; sleep 3; rm $output
output=$(tempfile --suffix='.json'); pydabu check_netcdf_file -f $(find . -iname '*.nc') -output_format json > $output && firefox $output; sleep 3; rm $output
Author: Daniel Mohr Date: 2021-07-01 License: GNU GENERAL PUBLIC LICENSE, Version 3, 29 June 2007.