# Usage Guide

NeXusCreator can generate NeXus definitions (`.nxd` or YAML) from a variety of inputs and convert them into NeXus HDF5 files (`.nxs`). This guide shows the common CLI flows and illustrates what happens under the hood.

## Installation Recap

```bash
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install .
```

This provides both the `nexuscreator` and `nxc` entry points. You can also invoke the module directly using `python3 -m nexuscreator`.

## Choose Options by Relevance

Use this quick map before running commands:

- Always relevant: `-i/--input`, `-o/--output_path`, `-b/--beamline_name`
- Converting to `.nxs` (`-n`): `-n/--nexus_definition`, `-f/--file_per_scan`, `-I/--icat_proposal_number`, `--auto-generate-nxd`, `--pair-dta-raw`
- Generating definitions (`-g`): `-g/--generate_nexus_definition`, `-t/--template`, `--single-file`, `--multi-file`, `--yaml`, `--hdf5-option`
- Directory/batch work: `-r/--recursive`, `--glob`, `--glob-spec`, `--glob-dta`, `--dry-run`, `--summary-only`, `--no-group-dta-folders`
- Metadata/schema placement: `--metadata-csv`, `--jsonld-structure`, `--nxdl-root`, `--app-def`, `--export-vars-csv`, `-d/--dictionary`, `-D/--debug`
- Validation/value export: `--validate`, `--export-values-csv`, `--export-values-prefix`, `--csv-delimiter`

## Generate a NeXus Definition

```bash
# From a SPEC file
nexuscreator -g out.nxd -i data.spec

# Produce a single-scan template for later multi-scan conversion
nexuscreator -g template.nxd -i data.spec -t

# Batteries workflow (DTA/DAT folder)
nexuscreator -g out.nxd -i /path/to/folder -b batteries
```

Add `--yaml` to build YAML templates instead of `.nxd`.

When generating from directories you can choose how many outputs to emit:

```bash
# Single combined .nxd for the entire folder
nexuscreator -g out.nxd -i /path/to/folder --single-file

# One .nxd per supported input file
nexuscreator -g out_prefix_ -i /path/to/folder --multi-file

# YAML variants
nexuscreator -g out_dir/ -i /path/to/folder --single-file --yaml
nexuscreator -g out_dir/ -i /path/to/folder --multi-file --yaml
```

When targeting HDF5 or NeXus inputs, pick how placeholders should be generated:

```bash
# Links mode (default for generation): keep references to datasets
nexuscreator -g out.nxd -i data.nxs --hdf5-option links

# Extract mode: replace values with placeholders and emit a variable library
nexuscreator -g out.nxd -i data.nxs --hdf5-option extract
```

## Convert to NeXus HDF5

```bash
# Single output from SPEC
nexuscreator -n def.nxd -i data.spec -o out.nxs

# One file per scan plus a master with external links
nexuscreator -n def.nxd -i data.spec -o out.nxs -f

# Batteries or single DTA/DAT
nexuscreator -n def.nxd -i /path/to/folder -b batteries -o out.nxs
nexuscreator -n def.nxd -i data.dta -o out.nxs

# Single DTA + matching RAW sibling as one output
nexuscreator -n def.nxd -i EIS_CH_#1_#1.dta -o out.nxs --pair-dta-raw

# Diamond B18 XAFS (ASCII or NeXus input)
nexuscreator -g diamond.nxd -i "data/cdi-ddi/.../ascii/263814_PtSn_OCA_1.dat"
nexuscreator -n diamond.nxd -i "data/cdi-ddi/.../nexus/263814_PtSn_OCA_1.nxs" -o out.nxs

# MPES workflow
nexuscreator -g mpes.nxd -i /path/to/mpes_data -b mpes
nexuscreator -n mpes.nxd -i /path/to/mpes_data -b mpes -o out.nxs
```

Generation mode also supports pairing for a single `.dta` input:

```bash
nexuscreator -g pair.nxd -i EIS_CH_#1_#1.dta --pair-dta-raw
```

To inject ICAT proposal folders or auto-generate definitions during conversion:

```bash
# Append ICAT proposal number to the output path
# Outputs are placed under /proposal_12345/
nexuscreator -n def.nxd -i data.spec -o ./outputs -I 12345

# Directory conversion with auto-generated definitions per file
nexuscreator -n def.nxd -i /path/to/folder -o ./out/ --auto-generate-nxd
```

## Batteries Electrochemical Analysis

Add `--batteries-analysis` to any batteries conversion to enrich the output with derived electrochemical datasets: state-of-charge (SoC), current/voltage curves, and EIS metrics (`libraries/eis_processing.py`).

```bash
# Generate definition with EIS structure
nexuscreator -g eis.nxd -i /path/to/batteries_folder -b batteries

# Convert with full electrochemical analysis
nexuscreator -n eis.nxd -i /path/to/batteries_folder -b batteries -o out.nxs \
  --batteries-analysis
```

Without `--batteries-analysis`, the batteries parser produces the raw DTA/DAT variable library only. With it, additional datasets (charge, state-of-charge, IV curves) are computed and injected automatically.

### DTA/RAW pair templates

When generating a paired DTA/RAW template, optionally supply the user name to avoid an interactive prompt:

```bash
nexuscreator -g pair.nxd -i EIS_CH_#1_#1.dta --pair-dta-raw --user-name "Jane Smith"
```

If `--user-name` is omitted and the template requires a user field, the CLI prompts once and caches the answer for the remainder of the run.

## MPES Workflow

The MPES plugin handles multi-photon emission spectroscopy HDF5 data. Activate it with `-b mpes`:

```bash
# Generate a NeXus definition from MPES data
nexuscreator -g mpes.nxd -i /path/to/mpes_data -b mpes

# Convert to NeXus HDF5
nexuscreator -n mpes.nxd -i /path/to/mpes_data -b mpes -o out.nxs
```

The plugin uses `libraries/mpes_utils.py` to locate the HDF5 file within the input path and `parsers/mpes_parser.py` to extract the variable library. The generator emits a definition via `generators/mpes_to_nexus.py`.

## Directory Scans and Batch Processing

```bash
# Process a directory non-recursively; one output per input
nexuscreator -n def.nxd -i /path/to/dir -o ./out/

# Recurse into subfolders
nexuscreator -n def.nxd -i /data/root -o ./out/ -r

# Restrict to files matching patterns
nexuscreator -n def.nxd -i /data/root -o ./out/ -r --glob "*.spec"
nexuscreator -n def.nxd -i /data/root -o ./out/ -r \
  --glob-spec "*.spec" --glob-dta "EIS_*.dta"

# Dry run with a concise summary
nexuscreator -n def.nxd -i /data/root -o ./out/ -r \
  --glob-spec "*.spec" --glob-dta "EIS_*.dta" \
  --dry-run --summary-only
```

For DTA/DAT inputs with `-r`, NeXusCreator groups files by parent folder and produces one combined output per folder. Disable this behaviour with `--no-group-dta-folders`.

## Placement Controls and Metadata

```bash
# Bias placement using base classes (always) + an application definition when requested
nexuscreator -g out.nxd -i data.spec \
  --nxdl-root /path/to/nxdl --app-def NXxas   # without --app-def only base classes are searched

# Enrich variables with metadata from CSV (name, description, units)
nexuscreator -g out.nxd -i /path/to/folder --metadata-csv femtospex.csv

# Use a JSON-LD descriptor + text file
nexuscreator -g out.nxd -i data/Se_Na2SeO4_rt_01.xdi \
  --jsonld-structure data/se_na2so4-testschemaorg-cdiv3.jsonLD

# Convert using the JSON-LD parser
nexuscreator -n out.nxd -i data/Se_Na2SeO4_rt_01.xdi \
  --jsonld-structure data/se_na2so4-testschemaorg-cdiv3.jsonLD

# Combine JSON-LD with SchemaPlacer to map variables into NXDL classes
nexuscreator -g out.nxd -i data/Se_Na2SeO4_rt_01.xdi \
  --jsonld-structure data/se_na2so4-testschemaorg-cdiv3.jsonLD \
  --nxdl-root /path/to/nxdl --app-def NXxas
```

### JSON-LD Parser

- Supply `--jsonld-structure FILE` alongside `-i FILE` to describe how to read fixed-width or delimited text files using CDI/Schema.org JSON-LD.
- When combined with `--nxdl-root`/`--app-def`, the parsed variables are run through SchemaPlacer before the fallback logic kicks in.
- If `--nxdl-root` is omitted, NeXusCreator uses `external_references/nexus/nexus_definitions` when present.
- Even without NXDL hints, `energy` is mapped to `/entry/instrument/monochromator/energy`, `i0` to `/entry/instrument/incoming_beam/data`, and `itrans` to `/entry/instrument/absorbed_beam/data`; any remaining arrays land in `/entry/instrument/logs/`.

Enable debug output or inspect the variable library:

```bash
# Print the NeXus definition line currently being processed
nexuscreator -n def.nxd -i data.spec -D

# Inspect the variable dictionary produced by the parser
nexuscreator -g out.nxd -i data.spec -d
```

### Inspecting variables: `-d` vs `--export-vars-csv`

| Flag | Output | Includes metadata |
|------|--------|-------------------|
| `-d, --dictionary` | Prints to terminal during run | No |
| `--export-vars-csv FILE` | CSV file (variable\_name, variable\_description, units) | Yes |

Use `--export-vars-csv` when you want a machine-readable catalogue of all variables produced by the parser, including units and descriptions from `__attrs__` or a `--metadata-csv` source:

```bash
nexuscreator -i data.spec --export-vars-csv vars.csv
nexuscreator -i /path/to/folder --export-vars-csv vars.csv -r --glob-dta "EIS_*.dta"
```

## Prompt Literals

Prompt literals allow `.nxd` templates to interactively ask for a value at conversion time. A dataset value that starts with `?` followed by a quoted string is a prompt literal:

```text
sample_name:NX_CHAR = ?"Sample name"
user:NX_CHAR = ?'User name'
```

When NeXusCreator encounters a prompt literal during conversion, it prints the prompt text and waits for user input. The entered value is **cached for the entire run**, so batch processing a folder of 50 files prompts only once per unique literal, not once per file.
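
The per-run caching of prompt literals can be pictured with a small sketch. This is not NeXusCreator's implementation: `resolve_prompt` and its `ask` parameter are hypothetical names, and the matching rule (a `?` followed by a single- or double-quoted string) is inferred from the two template lines shown in this section.

```python
import re

# Matches ?"text" or ?'text' (assumed form, based on the template examples).
PROMPT_RE = re.compile(r"""^\?(["'])(.*)\1$""")


def resolve_prompt(value, cache, ask=input):
    """Return the answer for a prompt literal, asking at most once per run."""
    match = PROMPT_RE.match(value.strip())
    if match is None:
        return value  # not a prompt literal: leave the value untouched
    prompt_text = match.group(2)
    if prompt_text not in cache:
        cache[prompt_text] = ask(f"{prompt_text}: ")
    return cache[prompt_text]


# Demo with a scripted answer instead of real keyboard input.
calls = []


def fake_ask(prompt):
    calls.append(prompt)
    return "Jane Smith"


cache = {}
# Simulate 50 files that all contain the same prompt literal.
answers = [resolve_prompt("?'User name'", cache, ask=fake_ask) for _ in range(50)]
print(answers[0], len(calls))
```

Keying the cache on the prompt text is what makes the question appear once per unique literal; two different literals would each trigger their own prompt.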

Prompt literals are especially useful for fields that vary per experiment but are not present in the data file (user name, sample description, proposal ID, etc.). See `nexus-description-syntax` for the full specification.

## Validate Outputs

Use `--validate` to run `punx` validation immediately after writing a `.nxs` file. This is skipped for `.nxd`/YAML generation. If `punx` is not installed, NeXusCreator looks for it under `external_references/punx`.

```bash
nexuscreator -n def.nxd -i data.spec -o out.nxs --validate
```

## Export Dataset Values to CSV

Export a `.nxs` (or any HDF5) file to CSV with datasets expanded row-by-row. Scalars repeat to match the longest column.

```bash
# Export every dataset (can be large)
nexuscreator -i data.nxs --export-values-csv values.csv

# Filter to a single HDF5 prefix and change the delimiter
nexuscreator -i data.nxs --export-values-csv values.csv \
  --export-values-prefix /entry/experiments/open_circuit_potential/ \
  --csv-delimiter ';'
```

## Links in `.nxd` and YAML

```text
# .nxd internal link
scopeP: --> /entry/instrument/detector/scope/scopeP

# YAML internal link
scopeP:
  link: /entry/instrument/detector/scope/scopeP

# .nxd external link
calibration: --> ../calibration/run_001.nxs | /entry/

# YAML external link
calibration:
  external:
    file: ../calibration/run_001.nxs
    path: /entry/
```

For a complete syntax reference, see `nexus-description-syntax`. During conversion, internal links become HDF5 `SoftLink`s and external links become `ExternalLink`s.
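
To relate the `.nxd` link syntax to the `@link`/`@extlink` descriptors used in the Python API section of this guide, here is a minimal sketch that splits a link line into a name and a descriptor. It is an illustration only: `parse_link_line` is a hypothetical helper, not the library's parser, and it assumes just the two forms shown in this section (`name: --> target` and `name: --> file | path`).

```python
def parse_link_line(line):
    """Split an .nxd link line into (name, descriptor dict)."""
    name, arrow, target = line.partition("-->")
    if not arrow:
        raise ValueError(f"not a link line: {line!r}")
    name = name.strip().rstrip(":").strip()
    target = target.strip()
    if "|" in target:
        # External link: "<file> | <path inside that file>"
        file_part, _, path_part = target.partition("|")
        descriptor = {"@extlink": {"file": file_part.strip(), "path": path_part.strip()}}
    else:
        # Internal link: a path inside the same file
        descriptor = {"@link": target}
    return name, descriptor


internal = parse_link_line("scopeP: --> /entry/instrument/detector/scope/scopeP")
external = parse_link_line("calibration: --> ../calibration/run_001.nxs | /entry/")
print(internal)
print(external)
```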

## Python API (Advanced)

```python
from nexuscreator.plugins import get_plugin_manager
from nexuscreator.libraries.NeXusHDF5 import NexusValueInjector, NexusHDF5Writer

pm = get_plugin_manager()
flags = {"dictionary": False, "template_for_all_scans": False}

# One-liner helpers: find the right plugin and run it
nexus_object = pm.generate_definition("data.spec", beamline=None, flags=flags)
library = pm.parse_to_library("data.spec", beamline=None, flags={})

NexusValueInjector(library).inject(nexus_object)
NexusHDF5Writer(nexus_object).write("out.nxs")
```

### Using Parsers Directly

Every parser can be imported and called from your own scripts. They all return the flat variable dictionary that the CLI would inject:

```python
from nexuscreator.parsers.diamond_ascii_parser import DiamondAsciiParser

parser = DiamondAsciiParser()
library = parser.parse("data/cdi-ddi/.../ascii/263814_PtSn_OCA_1.dat")

print(library["diamond_qexafs_energy"][:3])
print(library["general_sample_name"])
```

The `library` dict is what `NexusValueInjector` consumes. Keys such as `__attrs__` hold units and descriptions that will become dataset attributes automatically.

Typical structure:

```python
{
    "general_sample_name": "PtSn_OCA",
    "general_command": "qexafs_energy 11364.0 13000.0 3266 63.59 qexafs_counterTimer01",
    "diamond_qexafs_energy": [11364.22, 11364.66, ...],
    "diamond_lni0it": [-0.77336, -0.77270, ...],
    "nexus_entry1_instrument_qexafs_counterTimer01_time": [0.019014, 0.018883, ...],
    "__attrs__": {
        "diamond_qexafs_energy": {"@units": "eV"},
        "nexus_entry1_instrument_qexafs_counterTimer01_time": {"@units": "s"},
    },
}
```

### Building NeXus Objects Programmatically

To generate a definition object without touching the CLI, use the plugin generators or call the helpers directly:

```python
from nexuscreator.plugins import get_plugin_manager

pm = get_plugin_manager()
nexus_object = pm.generate_definition(
    "data/cdi-ddi/.../ascii/263814_PtSn_OCA_1.dat",
    beamline=None,
    flags={"app_def": "NXxas"},
)
```

If you already have an `.nxd` file, load it with `NexusDefinitionReader` (see the lifecycle example under "Creating and Modifying `nexus_object` Safely"). You can also build the object manually; it is just a nested dict mimicking the NeXus hierarchy:

```python
nexus_object = {
    "@default": "entry",
    "entry": {
        "@NX_class": "NXentry",
        "@default": "data",
        "instrument": {
            "@NX_class": "NXinstrument",
            "source": {
                "@NX_class": "NXsource",
                "name": {"@dtype": "NX_CHAR", "@value": '"synchrotron"'},
            },
        },
        "data": {
            "@NX_class": "NXdata",
            "@signal": "counts",
            "@axes": "energy",
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
            "counts": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_it"},
        },
    },
}
```

This hand-written object can be fed straight into `NexusHDF5Writer` after `NexusValueInjector` replaces the placeholders (`diamond_*` keys) with actual arrays from your parser.
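
Before injection it can help to check that every placeholder in a hand-written object has a matching key in the parser library. The sketch below assumes, based only on the examples in this guide, that a placeholder is an `@value` string that neither starts with a quote character (a literal) nor with `?` (a prompt literal). `iter_placeholders` and `find_missing_placeholders` are hypothetical helpers, not part of NeXusCreator.

```python
def iter_placeholders(node, path=""):
    """Yield (hdf5_path, placeholder_name) for each placeholder-like @value."""
    for key, child in node.items():
        if key.startswith("@") or not isinstance(child, dict):
            continue  # skip attributes such as @NX_class, @signal, @axes
        child_path = f"{path}/{key}"
        value = child.get("@value")
        if isinstance(value, str) and value:
            is_literal = value[0] in ('"', "'")
            is_prompt = value.startswith("?")
            if not is_literal and not is_prompt:
                yield child_path, value
        yield from iter_placeholders(child, child_path)


def find_missing_placeholders(nexus_object, library):
    """List placeholders that have no matching key in the parser library."""
    return [(p, name) for p, name in iter_placeholders(nexus_object) if name not in library]


nexus_object = {
    "entry": {
        "@NX_class": "NXentry",
        "instrument": {
            "@NX_class": "NXinstrument",
            "name": {"@dtype": "NX_CHAR", "@value": '"My instrument"'},
        },
        "data": {
            "@NX_class": "NXdata",
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
            "counts": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_it"},
        },
    },
}
library = {"diamond_qexafs_energy": [11364.22, 11364.66]}

missing = find_missing_placeholders(nexus_object, library)
print(missing)
```

A non-empty result points at datasets whose `@value` would not be replaced, which is usually a sign of a typo in the template or of a parser that did not produce the expected key.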

For plotting-friendly outputs, expose an `NXdata` view that links to the real datasets:

```python
nexus_object = {
    "entry": {
        "@NX_class": "NXentry",
        "measurement": {  # real data lives here
            "@NX_class": "NXcollection",
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
            "counts": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_it"},
        },
        "data": {  # plotting-friendly NXdata group
            "@NX_class": "NXdata",
            "@signal": "counts",
            "@axes": ["energy"],
            "energy": {"@link": "/entry/measurement/energy"},
            "counts": {"@link": "/entry/measurement/counts"},
        },
    }
}
```

Here `@signal` and `@axes` define the plot, while `@link` ensures the `NXdata` group points at the datasets stored elsewhere in the file.

### Creating and Modifying `nexus_object` Safely

In practice, you usually modify `nexus_object` at one of these stages:

1. **Before injection** (recommended for structure edits): add/remove groups, add links, set `@NX_class`, `@signal`, `@axes`, `@units`, and placeholder `@value` keys.
2. **After injection** (for final tweaks): adjust literal metadata fields only (for example `@long_name`, attributes), not placeholder keys expected by the injector.

Typical lifecycle:

```python
from nexuscreator.plugins import get_plugin_manager
from nexuscreator.libraries.NeXusDefinition import NexusDefinitionReader
from nexuscreator.libraries.NeXusHDF5 import NexusValueInjector, NexusHDF5Writer

# parse the input into the flat variable library (same helper as in the first API example)
library = get_plugin_manager().parse_to_library("data.spec", beamline=None, flags={})

obj = NexusDefinitionReader().read("template.nxd")  # create/load

# modify structure before injection
entry = obj.setdefault("entry", {"@NX_class": "NXentry"})
entry.setdefault("instrument", {"@NX_class": "NXinstrument"})
entry["instrument"]["name"] = {"@dtype": "NX_CHAR", "@value": '"My instrument"'}

# inject parser library values into placeholders
NexusValueInjector(library).inject(obj)

# optional post-injection metadata tweaks
entry.setdefault("data", {"@NX_class": "NXdata"})
entry["data"]["@long_name"] = "Processed dataset"

NexusHDF5Writer(obj).write("out.nxs")
```

Recommended rules:

- Keep dataset descriptors as dicts with both `@dtype` and `@value`.
- Use `@link` / `@extlink` for links instead of embedding resolved paths as plain strings.
- For `NX_CHAR`, use literal text in `@value`; for numeric arrays, use placeholders that exist in the parsed `library`.
- When batch-processing folders, prompt literals (`?"..."`) are resolved once per run and reused for all files in that run.
- If you mutate placeholder names, update the parser library keys (or template placeholders) to match, otherwise injection will miss values.

### Describing Links in `nexus_object`

Use `@link` for internal soft links and `@extlink` for external links:

```python
nexus_object = {
    "entry": {
        "@NX_class": "NXentry",
        "data": {
            "@link": "/entry/measurement/counts"  # SoftLink inside this file
        },
        "calibration": {
            "@extlink": {  # ExternalLink to another file
                "file": "../calibration/run_001.nxs",
                "path": "/entry/"
            }
        },
    }
}
```

`NexusHDF5Writer` turns `@link` into an HDF5 `SoftLink` and `@extlink` into an `ExternalLink`.

### Writing `.nxs` Files from a Script

You can reuse the high-level class that powers the CLI when you want the whole conversion pipeline in Python code:

```python
from nexuscreator.creator import NeXusCreator

creator = NeXusCreator()
creator.execute_conversion({
    "input_path": "data/spec/sample.spec",
    "nexus_definition_file": "defs/sample.nxd",
    "output_path": "out/sample.nxs",
})
```

This accepts the same flags as the CLI (`beamline_name`, `auto_generate_nxd`, etc.). For finer control, keep using `get_plugin_manager()` + `NexusValueInjector` + `NexusHDF5Writer` directly as shown above.

## CLI Reference (Grouped)

### Core

- `-h, --help`: show help and exit.
- `-v, --version`: print the package version.
- `--license`, `--notice`: print licensing information.
- `--list-beamlines`: list accepted values for `-b/--beamline_name`.

### Input / Output

- `-i, --input PATH`: source data (file or directory).
- `-o, --output_path PATH`: target `.nxs` file or output directory.

### Conversion (use an existing `.nxd`)

- `-n, --nexus_definition FILE`
- `-f, --file_per_scan`: for SPEC, one `.nxs` per scan plus master.
- `-I, --icat_proposal_number NUM`: append ICAT subfolder to outputs.
- `--auto-generate-nxd`: with directory inputs, generate per-file `.nxd` automatically.

### Generation (build a new definition)

- `-g, --generate_nexus_definition FILE`
- `-t, --template`: emit a single-scan template.
- `--single-file`, `--multi-file`: control directory generation mode.
- `--yaml`: write YAML definitions instead of `.nxd`.
- `--hdf5-option MODE`: `links` (default for `-g`) or `extract` (default for `-n`).

### Scanning & Batch Flags

- `-r, --recursive`
- `--glob PATTERN`: limit matches for all supported types.
- `--glob-spec`, `--glob-dta`: type-specific patterns.
- `--dry-run`, `--summary-only`
- `--no-group-dta-folders`: process each DTA/DAT file individually.
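
As a rough mental model for the scanning flags, the sketch below filters file names with shell-style patterns and groups DTA files by parent folder. It assumes the patterns are matched against file names in the style of Python's `fnmatch`; the actual matching rules are not stated in this guide, and `select_and_group` is a hypothetical helper, not taken from the NeXusCreator source.

```python
from collections import defaultdict
from fnmatch import fnmatch
from pathlib import PurePosixPath


def select_and_group(paths, glob_spec="*.spec", glob_dta="*.dta", group_dta=True):
    """Return input batches: one per SPEC file, one per DTA folder (or per DTA file)."""
    batches = []
    dta_by_folder = defaultdict(list)
    for raw in sorted(paths):
        path = PurePosixPath(raw)
        if fnmatch(path.name, glob_spec):
            batches.append([raw])
        elif fnmatch(path.name, glob_dta):
            if group_dta:
                dta_by_folder[str(path.parent)].append(raw)
            else:
                batches.append([raw])  # mirrors --no-group-dta-folders
    # Grouped DTA folders are appended after the individually handled files.
    batches.extend(dta_by_folder[folder] for folder in sorted(dta_by_folder))
    return batches


paths = [
    "/data/root/run1/EIS_a.dta",
    "/data/root/run1/EIS_b.dta",
    "/data/root/run2/EIS_c.dta",
    "/data/root/run2/notes.txt",
    "/data/root/scan.spec",
]
grouped = select_and_group(paths, glob_dta="EIS_*.dta")
ungrouped = select_and_group(paths, glob_dta="EIS_*.dta", group_dta=False)
print(len(grouped), len(ungrouped))
```

With grouping enabled the two `run1` files form a single batch, which corresponds to one combined output per folder; with grouping disabled each DTA file becomes its own batch.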

### Metadata & Placement

- `-b, --beamline_name NAME`: beamline-specific context (for example `ikft`, `batteries`).
- `--metadata-csv FILE`: supply descriptions/units for variables.
- `--export-vars-csv FILE`: export parsed variables to CSV.
- `--nxdl-root PATH`, `--app-def NAME`: NXDL placement hints. Base classes are always indexed; `--app-def` adds the selected application definition to the search. `--nxdl-root` defaults to `external_references/nexus/nexus_definitions` when present.
- `--jsonld-structure FILE`: CDI/Schema.org JSON-LD document describing how to read the input file; pairs nicely with `--nxdl-root`.
- `-d, --dictionary`: print the parsed variable dictionary.
- `-D, --debug`: show the current `.nxd` line processed during conversion.

### Validation & Export

- `--validate`: validate the generated `.nxs` with `punx` when available.
- `--export-values-csv FILE`: export dataset values from a `.nxs`/HDF5 file to CSV.
- `--export-values-prefix PATH`: restrict CSV export to datasets under a prefix (must start with `/`).
- `--csv-delimiter CHAR`: override CSV delimiter (default `,`).
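
The row-by-row layout of the value export (scalars repeated to match the longest column) can be sketched with the standard `csv` module. The snippet works on a plain dict rather than an HDF5 file, and `expand_to_rows` is a hypothetical helper. How NeXusCreator pads arrays that are shorter than the longest column is not stated in this guide, so the sketch leaves those cells empty as an assumption.

```python
import csv
import io


def expand_to_rows(datasets, prefix="/"):
    """Build a header and rows from {hdf5_path: scalar_or_list}, filtered by prefix."""
    if not prefix.startswith("/"):
        raise ValueError("prefix must start with '/'")
    selected = {k: v for k, v in datasets.items() if k.startswith(prefix)}
    lengths = (len(v) if isinstance(v, list) else 1 for v in selected.values())
    n_rows = max(lengths, default=0)
    rows = []
    for i in range(n_rows):
        row = []
        for value in selected.values():
            if not isinstance(value, list):
                row.append(value)      # scalar: repeat on every row
            elif i < len(value):
                row.append(value[i])
            else:
                row.append("")         # assumption: pad short arrays with blanks
        rows.append(row)
    return list(selected), rows


datasets = {
    "/entry/sample/name": "PtSn_OCA",
    "/entry/data/energy": [11364.22, 11364.66, 11365.1],
    "/entry/data/counts": [10, 12, 11],
}
header, rows = expand_to_rows(datasets, prefix="/entry/")

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
print(buffer.getvalue())
```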