command line script: pydabu
pydabu has a few subcommands:
- analyse_data_structure: Analyse the data structure of a directory tree.
- check_nasa_ames_format: This command checks a file in the NASA Ames format.
- check_netcdf_file: This command checks a file in the format netCDF. It uses the CF Checker: https://github.com/cedadev/cf-checker
- check_file_format: This command checks the file formats in a directory tree.
- common_json_format: This command reads the given json file and writes it in a common format to stdout.
- create_data_bubble: This command creates a data bubble in the given directory.
- check_data_bubble: This command checks a data bubble in the given directory.
- listschemas: This command lists the provided and used json schemas.
- data_bubble2jsonld: This command reads the data bubble (.dabu.json and .dabu.schema) and creates a json-ld data bubble (.dabu.json-ld and .dabu.json-ld.schema).
These commands are explained in more detail in the following (help output):
pydabu is a script to check a data bubble.
usage: pydabu [-h]
              {analyse_data_structure,check_nasa_ames_format,check_netcdf_file,check_file_format,common_json_format,create_data_bubble,check_data_bubble,listschemas,data_bubble2jsonld}
              ...
Positional Arguments
| subparser_name | Possible choices: analyse_data_structure, check_nasa_ames_format, check_netcdf_file, check_file_format, common_json_format, create_data_bubble, check_data_bubble, listschemas, data_bubble2jsonld. There are different sub-commands with their own flags. | 
Sub-commands:
analyse_data_structure
see also: analyse_data_structure_output.schema
For more help: pydabu analyse_data_structure -h
pydabu analyse_data_structure [-h] [-output_format f] [-directory d [d ...]]
Named Arguments
| -output_format | Possible choices: human_readable, json, json1 Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. default: json1 Default: [‘json1’] | 
| -directory | Set the directory to use. You can also give a list of directories separated by space. default: . Default: [‘.’] | 
check_nasa_ames_format
This command checks a file in the NASA Ames format.
pydabu check_nasa_ames_format [-h] [-output_format f] -file f [f ...]
Named Arguments
| -output_format | Possible choices: human_readable, json, json1 Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. default: json1 Default: [‘json1’] | 
| -file | Set the file(s) to use. | 
check_netcdf_file
This command checks a file in the format netCDF. It uses the CF Checker: https://github.com/cedadev/cf-checker
pydabu check_netcdf_file [-h] [-output_format f] -file f [f ...]
Named Arguments
| -output_format | Possible choices: human_readable, json, json1 Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. default: json1 Default: [‘json1’] | 
| -file | Set the file(s) to use. | 
check_file_format
see also: dabu.schema
This command checks the file formats. As a first step, the data structure is analysed in the same way as the command “analyse_data_structure” does. Each file is then checked by a tool chosen by its file extension. For the file extension “.nc” the command check_netcdf_file is used.
pydabu check_file_format [-h] [-output_format f] [-directory d [d ...]]
                         [-skip_creating_checksums] [-checksum_from_file f]
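The selection of a checker by file extension can be sketched as follows. This is only an illustration of the idea, not pydabu's implementation: the names CHECKERS and choose_checkers are hypothetical, only the “.nc” entry is taken from the description above, and comparing extensions case-insensitively is an assumption.

```python
import os

# Only the ".nc" entry comes from the documentation above; any further
# entries would be assumptions.
CHECKERS = {".nc": "check_netcdf_file"}


def choose_checkers(directory):
    """Map each file below `directory` to a checker name (or None)."""
    result = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # Assumption: extensions are compared case-insensitively.
            extension = os.path.splitext(name)[1].lower()
            result[os.path.relpath(path, directory)] = CHECKERS.get(extension)
    return result
```

Files without a known extension map to None here, which leaves open how such files are reported.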
Named Arguments
| -output_format | Possible choices: human_readable, json, json1 Set the output format to use. human_readable gives a nice json output with skipped data. json is the normal json output. json1 is the full data with nice output like human_readable. default: json1 Default: [‘json1’] | 
| -directory | Set the directory to use. You can also give a list of directories separated by space. default: . Default: [‘.’] | 
| -skip_creating_checksums | Skip creating checksums, which could take a while. Default: False | 
| -checksum_from_file | Try to get checksums from the given file. | 
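Creating checksums takes a while because every file has to be read completely. The following sketch shows this with Python's hashlib. The hash algorithm (sha256 here) and the layout of the checksum file (lines of the form "digest  filename", as written by the coreutils tool sha256sum) are assumptions; the documentation above specifies neither, and both function names are hypothetical.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    # Assumption: sha256; the algorithm used by pydabu is not stated here.
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_checksum_file(path):
    """Parse lines like '<digest>  <filename>' into {filename: digest}."""
    # Assumption: sha256sum-style layout; lines without two fields are skipped.
    checksums = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(None, 1)
            if len(parts) == 2:
                digest, name = parts
                checksums[name.lstrip("*")] = digest
    return checksums
```

Looking up an existing digest with read_checksum_file avoids reading large data files again, which is the motivation behind an option like -checksum_from_file.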
common_json_format
This command reads the given json file and writes it in a common format to stdout.
pydabu common_json_format [-h] -file f [f ...] [-indent i]
Named Arguments
| -file | Set the file(s) to use. | 
| -indent | In the output the elements will be indented by this number of spaces. Default: [4] | 
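What a “common format” can look like is sketched below with the json module of the Python standard library. The function name common_json is hypothetical, and sorting the keys is an assumption made here so that equivalent inputs give identical output; the documentation above only states that the elements are indented.

```python
import json


def common_json(text, indent=4):
    """Re-serialise JSON text with fixed indentation and sorted keys."""
    # Assumption: sorted keys; the source only documents the indentation.
    return json.dumps(json.loads(text), indent=indent, sort_keys=True)
```

Two files that differ only in whitespace or key order then produce the same text, which makes them easy to compare with tools like diff.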
create_data_bubble
see also: dabu.schema and dabu_requires.schema
This command creates a data bubble in the given directory. The data is generated with the command “check_file_format” from the data in the directory. Although the resulting files are not a data management plan, you can enhance them to become one.
pydabu create_data_bubble [-h] -directory d [d ...] [-indent i]
                          [-skip_creating_checksums] [-checksum_from_file f]
                          [-dabu_instance_file f] [-dabu_schema_file f]
Named Arguments
| -directory | Set the directory to use. You can also give a list of directories separated by space. | 
| -indent | In the output the elements will be indented by this number of spaces. Default: [4] | 
| -skip_creating_checksums | Skip creating checksums, which could take a while. Default: False | 
| -checksum_from_file | Try to get checksums from the given file. | 
| -dabu_instance_file | Gives the name of the file describing the content of a data bubble. If this file already exists an error is raised. The name is relative to the given directory. Default: [‘.dabu.json’] | 
| -dabu_schema_file | Gives the name of the file describing the necessary content of a data bubble. If this file already exists an error is raised. The name is relative to the given directory. Default: [‘.dabu.schema’] | 
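The behaviour “if this file already exists an error is raised” protects a manually enhanced data bubble from being overwritten. One way to get this behaviour in Python is the exclusive creation mode "x" of open. This is a sketch of the technique, not necessarily how pydabu does it; the function name write_new_json is hypothetical.

```python
import json
import os


def write_new_json(directory, name, data, indent=4):
    """Write `data` as JSON to directory/name; fail if the file exists."""
    path = os.path.join(directory, name)
    # Mode "x" raises FileExistsError instead of overwriting the file.
    with open(path, "x", encoding="utf-8") as handle:
        json.dump(data, handle, indent=indent)
        handle.write("\n")
    return path
```

Calling write_new_json twice with the same name raises FileExistsError on the second call and leaves the first file untouched.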
check_data_bubble
This command checks a data bubble in the given directory. The data bubble should be created with “pydabu create_data_bubble” and manually enhanced. Instead of this script you can also use your preferred tool to check a json instance (e. g. .dabu.json) against a json schema (e. g. .dabu.schema) – see examples.
pydabu check_data_bubble [-h] -directory d [d ...] [-dabu_instance_file f]
                         [-dabu_schema_file f]
Named Arguments
| -directory | Set the directory to use. You can also give a list of directories separated by space. | 
| -dabu_instance_file | Gives the name of the file describing the content of a data bubble. The name is relative to the given directory. Default: [‘.dabu.json’] | 
| -dabu_schema_file | Gives the name of the file describing the necessary content of a data bubble. The name is relative to the given directory. Default: [‘.dabu.schema’] | 
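The workflow described above (create the data bubble, enhance it manually, then check it) can be driven from Python, for example in a test script. The sketch below only uses the sub-commands and flags documented on this page; the helper names pydabu_command and run_if_available are hypothetical, and the meaning of pydabu's exit status is not documented here, so the return code is passed on without interpretation.

```python
import shutil
import subprocess


def pydabu_command(subcommand, directory,
                   instance=".dabu.json", schema=".dabu.schema"):
    """Build the argument list for create_data_bubble or check_data_bubble."""
    return ["pydabu", subcommand, "-directory", directory,
            "-dabu_instance_file", instance, "-dabu_schema_file", schema]


def run_if_available(command):
    """Run `command` and return its exit status, or None if not installed."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command).returncode


# Typical order of calls:
#   run_if_available(pydabu_command("create_data_bubble", "mydata"))
#   ... edit mydata/.dabu.json and mydata/.dabu.schema by hand ...
#   run_if_available(pydabu_command("check_data_bubble", "mydata"))
```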
listschemas
see also: Provided and used json schemas
This command lists the provided and used json schemas.
pydabu listschemas [-h] [-output_format f]
Named Arguments
| -output_format | Possible choices: simple, json Set the output format to use. simple lists the json schemas line by line. json leads to a json output. default: simple Default: [‘simple’] | 
data_bubble2jsonld
This command reads the data bubble (.dabu.json and .dabu.schema) and creates a json-ld data bubble (.dabu.json-ld and .dabu.json-ld.schema). If you are fine with these new files, you should delete the old ones yourself.
pydabu data_bubble2jsonld [-h] -directory d [d ...] [-indent i]
                          [-dabu_instance_file f] [-dabu_schema_file f]
                          [-dabu_jsonld_instance_file f]
                          [-dabu_jsonld_schema_file f] [-vocabulary v]
                          [-cachefilename f] [-cachefilepath p] [-author p]
Named Arguments
| -directory | Set the directory to use. You can also give a list of directories separated by space. | 
| -indent | In the output the elements will be indented by this number of spaces. Default: [4] | 
| -dabu_instance_file | Gives the name of the file describing the content of a data bubble. The name is relative to the given directory. Default: [‘.dabu.json’] | 
| -dabu_schema_file | Gives the name of the file describing the necessary content of a data bubble. The name is relative to the given directory. Default: [‘.dabu.schema’] | 
| -dabu_jsonld_instance_file | Gives the name of the file describing the content of a data bubble as jsonld. If this file already exists an error is raised. The name is relative to the given directory. default: .dabu.json-ld Default: [‘.dabu.json-ld’] | 
| -dabu_jsonld_schema_file | Gives the name of the file describing the necessary content of a data bubble with json-ld. If this file already exists an error is raised. The name is relative to the given directory. default: .dabu.json-ld.schema Default: [‘.dabu.json-ld.schema’] | 
| -vocabulary | Possible choices: schema.org Sets the vocabulary to use. At the moment only schema.org is implemented. default: schema.org Default: [‘schema.org’] | 
| -cachefilename | We need data from schema.org. If you set cachefilename to an empty string, nothing is cached. If the file name ends with a common extension for compression, this compression is used (e. g.: .gz, .lzma, .xz, .bz2). The file is created in the cachefilepath (see this option). default: “schemaorg-current-https.jsonld.bz2” Default: [‘schemaorg-current-https.jsonld.bz2’] | 
| -cachefilepath | This path is used for the cachefilename. If necessary, this directory will be created (not the directory tree!). default: “/tmp/json_schema_from_schema_org_runner” Default: [‘/tmp/json_schema_from_schema_org_runner’] | 
| -author | Sets the author of the data bubble. If not given, it is not added to the dabu_jsonld_instance_file. Anyway the dabu_jsonld_schema_file will require it. You can just give a string or any json object. | 
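The cache options combine two ideas: the compression is chosen from the extension of cachefilename, and only the last component of cachefilepath is created, not the whole directory tree. Both can be sketched with the Python standard library. This is an illustration under these assumptions and not pydabu's code; OPENERS and open_cache are hypothetical names.

```python
import bz2
import functools
import gzip
import lzma
import os

# Extension -> function that opens a file with the matching compression.
OPENERS = {
    ".gz": gzip.open,
    ".lzma": functools.partial(lzma.open, format=lzma.FORMAT_ALONE),
    ".xz": lzma.open,
    ".bz2": bz2.open,
}


def open_cache(cachefilepath, cachefilename, mode="rt"):
    """Open the cache file in text mode; return None if caching is off."""
    if cachefilename == "":
        return None  # empty string: nothing is cached
    if not os.path.isdir(cachefilepath):
        # os.mkdir creates a single directory; unlike os.makedirs it
        # fails with FileNotFoundError if the parent does not exist.
        os.mkdir(cachefilepath)
    extension = os.path.splitext(cachefilename)[1]
    opener = OPENERS.get(extension, open)  # no known extension: plain file
    return opener(os.path.join(cachefilepath, cachefilename), mode,
                  encoding="utf-8")
```

With the default name schemaorg-current-https.jsonld.bz2 the extension is .bz2, so in this sketch the file is read and written through bz2.open.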
You can view the json output for example in firefox, e. g. in bash:
output=$(tempfile --suffix='.json'); pydabu analyse_data_structure -output_format json > $output && firefox $output; sleep 3; rm $output
output=$(tempfile --suffix='.json'); pydabu check_netcdf_file -f $(find . -iname '*.nc') -output_format json > $output && firefox $output; sleep 3; rm $output
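The same idea works without the tempfile utility, which is not available on every system. The following Python sketch stores the output of a command in a temporary file and opens it in the default web browser. The function names save_output and view_output are hypothetical, and in contrast to the bash lines above the temporary file is not removed automatically.

```python
import pathlib
import subprocess
import tempfile
import webbrowser


def save_output(command, suffix=".json"):
    """Run `command` and store its standard output in a temporary file."""
    with tempfile.NamedTemporaryFile(mode="wb", suffix=suffix,
                                     delete=False) as handle:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, stdout=handle, check=True)
        return handle.name


def view_output(command):
    """Save the output of `command` and open it in the default browser."""
    path = save_output(command)
    webbrowser.open(pathlib.Path(path).as_uri())
    return path  # the caller should remove this file when done


# Example call:
#   view_output(["pydabu", "analyse_data_structure", "-output_format", "json"])
```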
Author: Daniel Mohr Date: 2021-07-01 License: GNU GENERAL PUBLIC LICENSE, Version 3, 29 June 2007.