Usage Guide

NeXusCreator can generate NeXus definitions (.nxd or YAML) from a variety of inputs and convert them into NeXus HDF5 files (.nxs). This guide shows the common CLI flows and illustrates what happens under the hood.

Installation Recap

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install .

This provides both the nexuscreator and nxc entry points. You can also invoke the module directly using python3 -m nexuscreator.

Choose Options by Relevance

Use this quick map before running commands:

  • Always relevant: -i/--input, -o/--output_path, -b/--beamline_name

  • Converting to .nxs (-n): -n/--nexus_definition, -f/--file_per_scan, -I/--icat_proposal_number, --auto-generate-nxd, --pair-dta-raw

  • Generating definitions (-g): -g/--generate_nexus_definition, -t/--template, --single-file, --multi-file, --yaml, --hdf5-option

  • Directory/batch work: -r/--recursive, --glob, --glob-spec, --glob-dta, --dry-run, --summary-only, --no-group-dta-folders

  • Metadata/schema placement: --metadata-csv, --jsonld-structure, --nxdl-root, --app-def, --export-vars-csv, -d/--dictionary, -D/--debug

  • Validation/value export: --validate, --export-values-csv, --export-values-prefix, --csv-delimiter

Generate a NeXus Definition

# From a SPEC file
nexuscreator -g out.nxd -i data.spec

# Produce a single-scan template for later multi-scan conversion
nexuscreator -g template.nxd -i data.spec -t

# Batteries workflow (DTA/DAT folder)
nexuscreator -g out.nxd -i /path/to/folder -b batteries

Add --yaml to build YAML templates instead of .nxd. When generating from directories you can choose how many outputs to emit:

# Single combined .nxd for the entire folder
nexuscreator -g out.nxd -i /path/to/folder --single-file

# One .nxd per supported input file
nexuscreator -g out_prefix_ -i /path/to/folder --multi-file

# YAML variants
nexuscreator -g out_dir/ -i /path/to/folder --single-file --yaml
nexuscreator -g out_dir/ -i /path/to/folder --multi-file --yaml

When targeting HDF5 or NeXus inputs, pick how placeholders should be generated:

# Links mode (default for generation): keep references to datasets
nexuscreator -g out.nxd -i data.nxs --hdf5-option links

# Extract mode: replace values with placeholders and emit a variable library
nexuscreator -g out.nxd -i data.nxs --hdf5-option extract

Convert to NeXus HDF5

# Single output from SPEC
nexuscreator -n def.nxd -i data.spec -o out.nxs

# One file per scan plus a master with external links
nexuscreator -n def.nxd -i data.spec -o out.nxs -f

# Batteries or single DTA/DAT
nexuscreator -n def.nxd -i /path/to/folder -b batteries -o out.nxs
nexuscreator -n def.nxd -i data.dta -o out.nxs

# Single DTA + matching RAW sibling as one output
nexuscreator -n def.nxd -i EIS_CH_#1_#1.dta -o out.nxs --pair-dta-raw

# Diamond B18 XAFS (ASCII or NeXus input)
nexuscreator -g diamond.nxd -i "data/cdi-ddi/.../ascii/263814_PtSn_OCA_1.dat"
nexuscreator -n diamond.nxd -i "data/cdi-ddi/.../nexus/263814_PtSn_OCA_1.nxs" -o out.nxs

# MPES workflow
nexuscreator -g mpes.nxd -i /path/to/mpes_data -b mpes
nexuscreator -n mpes.nxd -i /path/to/mpes_data -b mpes -o out.nxs

Generation mode also supports pairing for a single .dta input:

nexuscreator -g pair.nxd -i EIS_CH_#1_#1.dta --pair-dta-raw

To inject ICAT proposal folders or auto-generate definitions during conversion:

# Append ICAT proposal number to the output path
# Outputs are placed under <output_path>/proposal_12345/
nexuscreator -n def.nxd -i data.spec -o ./outputs -I 12345

# Directory conversion with auto-generated definitions per file
nexuscreator -n def.nxd -i /path/to/folder -o ./out/ --auto-generate-nxd
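
The effect of -I on the output location can be pictured with a small path helper. This is only a sketch based on the comment in the example above (outputs land under <output_path>/proposal_<number>/); the function name icat_output_dir is hypothetical and is not part of NeXusCreator.

```python
from pathlib import Path
from typing import Optional


def icat_output_dir(output_path: str, proposal_number: Optional[int]) -> Path:
    """Return the directory outputs would be written to (illustrative only)."""
    base = Path(output_path)
    if proposal_number is None:
        return base
    # Assumption: the subfolder is named "proposal_<number>", as in the CLI comment.
    return base / f"proposal_{proposal_number}"


print(icat_output_dir("./outputs", 12345))
```

How -I behaves when -o names a single .nxs file rather than a directory is not covered by this sketch.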

Batteries Electrochemical Analysis

Add --batteries-analysis to any batteries conversion to enrich the output with derived electrochemical datasets such as charge, state-of-charge (SoC), current/voltage (IV) curves, and EIS metrics (see libraries/eis_processing.py):

# Generate definition with EIS structure
nexuscreator -g eis.nxd -i /path/to/batteries_folder -b batteries

# Convert with full electrochemical analysis
nexuscreator -n eis.nxd -i /path/to/batteries_folder -b batteries -o out.nxs \
  --batteries-analysis

Without --batteries-analysis, the batteries parser produces the raw DTA/DAT variable library only. With it, additional datasets (charge, state-of-charge, IV curves) are computed and injected automatically.
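
To give a feel for what "charge" and "state-of-charge" mean here, the sketch below uses plain coulomb counting (trapezoidal integration of current over time). It is a generic illustration and not the code in libraries/eis_processing.py; the function names, the sign convention (positive current charges the cell) and the nominal capacity parameter are all assumptions.

```python
from typing import List, Sequence


def cumulative_charge_ah(time_s: Sequence[float], current_a: Sequence[float]) -> List[float]:
    """Integrate current over time with the trapezoidal rule; result in Ah."""
    if len(time_s) != len(current_a):
        raise ValueError("time and current must have the same length")
    charge: List[float] = [0.0] if len(time_s) else []
    for k in range(1, len(time_s)):
        dt = time_s[k] - time_s[k - 1]
        mean_current = 0.5 * (current_a[k] + current_a[k - 1])
        charge.append(charge[-1] + mean_current * dt / 3600.0)
    return charge


def state_of_charge(charge_ah: Sequence[float], capacity_ah: float,
                    initial_soc: float = 0.0) -> List[float]:
    """SoC as a fraction of the nominal capacity, clipped to [0, 1]."""
    return [min(1.0, max(0.0, initial_soc + q / capacity_ah)) for q in charge_ah]


time_s = [0.0, 1800.0, 3600.0]   # one hour, sampled every 30 minutes
current_a = [1.0, 1.0, 1.0]      # constant 1 A charging current
charge = cumulative_charge_ah(time_s, current_a)
soc = state_of_charge(charge, capacity_ah=2.0)
print(charge, soc)
```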

DTA/RAW pair templates

When generating a paired DTA/RAW template, optionally supply the user name to avoid an interactive prompt:

nexuscreator -g pair.nxd -i EIS_CH_#1_#1.dta --pair-dta-raw --user-name "Jane Smith"

If --user-name is omitted and the template requires a user field, the CLI prompts once and caches the answer for the remainder of the run.

MPES Workflow

The MPES plugin handles multidimensional photoemission spectroscopy (MPES) HDF5 data. Activate it with -b mpes:

# Generate a NeXus definition from MPES data
nexuscreator -g mpes.nxd -i /path/to/mpes_data -b mpes

# Convert to NeXus HDF5
nexuscreator -n mpes.nxd -i /path/to/mpes_data -b mpes -o out.nxs

The plugin uses libraries/mpes_utils.py to locate the HDF5 file within the input path and parsers/mpes_parser.py to extract the variable library. The generator emits a definition via generators/mpes_to_nexus.py.

Directory Scans and Batch Processing

# Process a directory non-recursively; one output per input
nexuscreator -n def.nxd -i /path/to/dir -o ./out/

# Recurse into subfolders
nexuscreator -n def.nxd -i /data/root -o ./out/ -r

# Restrict to files matching patterns
nexuscreator -n def.nxd -i /data/root -o ./out/ -r --glob "*.spec"
nexuscreator -n def.nxd -i /data/root -o ./out/ -r \
  --glob-spec "*.spec" --glob-dta "EIS_*.dta"

# Dry run with a concise summary
nexuscreator -n def.nxd -i /data/root -o ./out/ -r \
  --glob-spec "*.spec" --glob-dta "EIS_*.dta" \
  --dry-run --summary-only

For DTA/DAT inputs with -r, NeXusCreator groups files by parent folder and produces one combined output per folder. Disable this behaviour with --no-group-dta-folders.
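
The pattern filtering and per-folder grouping described above can be sketched in a few lines. This is an illustration only: the helper name group_dta_by_folder is hypothetical, and it assumes that patterns are matched against the file name rather than the full path, which may differ from what NeXusCreator does.

```python
from collections import defaultdict
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def group_dta_by_folder(paths: Iterable[str], pattern: str = "*.dta") -> Dict[str, List[str]]:
    """Group files whose name matches the pattern by their parent folder."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        # Assumption: the glob applies to the file name only.
        if fnmatchcase(path.name, pattern):
            groups[str(path.parent)].append(path.name)
    return {folder: sorted(names) for folder, names in sorted(groups.items())}


files = [
    "/data/root/cell_a/EIS_1.dta",
    "/data/root/cell_a/EIS_2.dta",
    "/data/root/cell_b/EIS_1.dta",
    "/data/root/cell_b/notes.txt",
]
groups = group_dta_by_folder(files, "EIS_*.dta")
for folder, names in groups.items():
    print(folder, "->", names)  # one combined output per folder
```

With --no-group-dta-folders each matching file would instead be handled on its own.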

Placement Controls and Metadata

# Bias placement using base classes (always) + an application definition when requested
nexuscreator -g out.nxd -i data.spec \
  --nxdl-root /path/to/nxdl --app-def NXxas  # without --app-def only base classes are searched

# Enrich variables with metadata from CSV (name, description, units)
nexuscreator -g out.nxd -i /path/to/folder --metadata-csv femtospex.csv

# Use a JSON-LD descriptor + text file
nexuscreator -g out.nxd -i data/Se_Na2SeO4_rt_01.xdi \
  --jsonld-structure data/se_na2so4-testschemaorg-cdiv3.jsonLD

# Convert using the JSON-LD parser
nexuscreator -n out.nxd -i data/Se_Na2SeO4_rt_01.xdi \
  --jsonld-structure data/se_na2so4-testschemaorg-cdiv3.jsonLD

# Combine JSON-LD with SchemaPlacer to map variables into NXDL classes
nexuscreator -g out.nxd -i data/Se_Na2SeO4_rt_01.xdi \
  --jsonld-structure data/se_na2so4-testschemaorg-cdiv3.jsonLD \
  --nxdl-root /path/to/nxdl --app-def NXxas

JSON-LD Parser

  • Supply --jsonld-structure FILE alongside -i FILE to describe how to read fixed-width or delimited text files using CDI/Schema.org JSON-LD.

  • When combined with --nxdl-root/--app-def, the parsed variables are run through SchemaPlacer before the fallback logic kicks in.

  • If --nxdl-root is omitted, NeXusCreator uses external_references/nexus/nexus_definitions when present.

  • Even without NXDL hints, energy is mapped to /entry/instrument/monochromator/energy, i0 to /entry/instrument/incoming_beam/data, and itrans to /entry/instrument/absorbed_beam/data; any remaining arrays land in /entry/instrument/logs/<name>.
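
The fallback placement in the last bullet amounts to a small lookup table. The sketch below restates it in Python; the names FALLBACK_PATHS and fallback_path are hypothetical, and exact, case-sensitive name matching is an assumption.

```python
# Fallback targets copied from the bullet above; anything else goes to logs.
FALLBACK_PATHS = {
    "energy": "/entry/instrument/monochromator/energy",
    "i0": "/entry/instrument/incoming_beam/data",
    "itrans": "/entry/instrument/absorbed_beam/data",
}


def fallback_path(name: str) -> str:
    """Return the HDF5 path a variable would land at without NXDL hints."""
    return FALLBACK_PATHS.get(name, f"/entry/instrument/logs/{name}")


for variable in ("energy", "i0", "itrans", "ifluor"):
    print(variable, "->", fallback_path(variable))
```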

Enable debug output or inspect the variable library:

# Print the NeXus definition line currently being processed
nexuscreator -n def.nxd -i data.spec -D

# Inspect the variable dictionary produced by the parser
nexuscreator -g out.nxd -i data.spec -d

Inspecting variables: -d vs --export-vars-csv

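| Flag                   | Output                                                 | Includes metadata |
|------------------------|--------------------------------------------------------|-------------------|
| -d, --dictionary       | Prints to terminal during run                          | No                |
| --export-vars-csv FILE | CSV file (variable_name, variable_description, units)  | Yes               |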

Use --export-vars-csv when you want a machine-readable catalogue of all variables produced by the parser, including units and descriptions from __attrs__ or a --metadata-csv source:

nexuscreator -i data.spec --export-vars-csv vars.csv
nexuscreator -i /path/to/folder --export-vars-csv vars.csv -r --glob-dta "EIS_*.dta"
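
For orientation, the shape of such a catalogue can be reproduced from a parser library with the standard csv module. This is a sketch, not the exporter NeXusCreator ships: the function name vars_to_csv is hypothetical, "@units" is the attribute key used elsewhere in this guide, and "@description" is an assumed key name.

```python
import csv
import io
from typing import Any, Dict


def vars_to_csv(library: Dict[str, Any]) -> str:
    """Render a parser library as CSV text with the three columns named above."""
    attrs = library.get("__attrs__", {})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["variable_name", "variable_description", "units"])
    for name in sorted(key for key in library if key != "__attrs__"):
        meta = attrs.get(name, {})
        # "@description" is an assumption; only "@units" appears in this guide.
        writer.writerow([name, meta.get("@description", ""), meta.get("@units", "")])
    return buffer.getvalue()


example_library = {
    "general_sample_name": "PtSn_OCA",
    "diamond_qexafs_energy": [11364.22, 11364.66],
    "__attrs__": {"diamond_qexafs_energy": {"@units": "eV"}},
}
print(vars_to_csv(example_library))
```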

Prompt Literals

Prompt literals allow .nxd templates to interactively ask for a value at conversion time. A dataset value that starts with ? followed by a quoted string is a prompt literal:

sample_name:NX_CHAR = ?"Sample name"
user:NX_CHAR        = ?'User name'

When NeXusCreator encounters a prompt literal during conversion, it prints the prompt text and waits for user input. The entered value is cached for the entire run, so batch-processing a folder of 50 files prompts once per unique literal rather than once per file.

Prompt literals are especially useful for fields that vary per experiment but are not present in the data file (user name, sample description, proposal ID, etc.).
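
The once-per-run caching can be modelled with a dictionary keyed by the prompt text. The sketch below is an illustration under assumptions, not NeXusCreator's implementation: the class name PromptCache, the regular expression, and the injectable ask function (used here so the example runs without a terminal) are all hypothetical.

```python
import re
from typing import Callable, Dict, List

# Assumption: a prompt literal is "?" followed by a single- or double-quoted string.
PROMPT_RE = re.compile(r"""^\?(["'])(.*)\1$""")


class PromptCache:
    """Ask for each unique prompt text once and reuse the answer afterwards."""

    def __init__(self, ask: Callable[[str], str] = input) -> None:
        self._ask = ask
        self._answers: Dict[str, str] = {}

    def resolve(self, value: str) -> str:
        match = PROMPT_RE.match(value.strip())
        if match is None:
            return value  # not a prompt literal, leave untouched
        text = match.group(2)
        if text not in self._answers:
            self._answers[text] = self._ask(f"{text}: ")
        return self._answers[text]


asked: List[str] = []


def fake_input(prompt: str) -> str:
    """Stand-in for input() that records how often it was called."""
    asked.append(prompt)
    return "Jane Smith"


cache = PromptCache(ask=fake_input)
answers = [cache.resolve("?'User name'") for _ in range(50)]  # 50 files, one prompt
print(len(asked), answers[0])
```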

See nexus-description-syntax for the full specification.

Validate Outputs

Use --validate to run punx validation immediately after writing a .nxs file. This is skipped for .nxd/YAML generation. If punx is not installed, NeXusCreator looks for it under external_references/punx.

nexuscreator -n def.nxd -i data.spec -o out.nxs --validate

Export Dataset Values to CSV

Export a .nxs (or any HDF5) file to CSV with datasets expanded row-by-row. Scalars repeat to match the longest column.

# Export every dataset (can be large)
nexuscreator -i data.nxs --export-values-csv values.csv

# Filter to a single HDF5 prefix and change the delimiter
nexuscreator -i data.nxs --export-values-csv values.csv \
  --export-values-prefix /entry/experiments/open_circuit_potential/ \
  --csv-delimiter ';'
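
The row-by-row expansion, with scalars repeated to the length of the longest column, can be sketched with the standard csv module. This illustrates the layout only: the function name datasets_to_csv is hypothetical, the input is a plain dict rather than an HDF5 file, and padding shorter arrays with empty cells is an assumption.

```python
import csv
import io
from typing import Any, Dict, List


def datasets_to_csv(datasets: Dict[str, Any], delimiter: str = ",") -> str:
    """Expand datasets into columns; scalars repeat on every row."""
    columns: Dict[str, List[Any]] = {}
    scalars = set()
    for path, value in datasets.items():
        if isinstance(value, (list, tuple)):
            columns[path] = list(value)
        else:
            columns[path] = [value]
            scalars.add(path)
    n_rows = max((len(col) for col in columns.values()), default=0)

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(list(columns))
    for row in range(n_rows):
        line = []
        for path, col in columns.items():
            if path in scalars:
                line.append(col[0])      # scalar: repeat the single value
            elif row < len(col):
                line.append(col[row])
            else:
                line.append("")          # shorter array: pad (assumed behaviour)
        writer.writerow(line)
    return buffer.getvalue()


example = {
    "/entry/sample/name": "cell_a",
    "/entry/data/voltage": [3.1, 3.2, 3.3],
}
print(datasets_to_csv(example, delimiter=";"))
```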

Python API (Advanced)

from nexuscreator.plugins import get_plugin_manager
from nexuscreator.libraries.NeXusHDF5 import NexusValueInjector, NexusHDF5Writer

pm = get_plugin_manager()
flags = {"dictionary": False, "template_for_all_scans": False}

# One-liner helpers: find the right plugin and run it
nexus_object = pm.generate_definition("data.spec", beamline=None, flags=flags)
library = pm.parse_to_library("data.spec", beamline=None, flags={})

NexusValueInjector(library).inject(nexus_object)
NexusHDF5Writer(nexus_object).write("out.nxs")

Using Parsers Directly

Every parser can be imported and called from your own scripts. They all return the flat variable dictionary that the CLI would inject:

from nexuscreator.parsers.diamond_ascii_parser import DiamondAsciiParser

parser = DiamondAsciiParser()
library = parser.parse("data/cdi-ddi/.../ascii/263814_PtSn_OCA_1.dat")

print(library["diamond_qexafs_energy"][:3])
print(library["general_sample_name"])

The library dict is what NexusValueInjector consumes. Keys such as __attrs__ hold units and descriptions that will become dataset attributes automatically.

Typical structure:

{
    "general_sample_name": "PtSn_OCA",
    "general_command": "qexafs_energy 11364.0 13000.0 3266 63.59 qexafs_counterTimer01",
    "diamond_qexafs_energy": [11364.22, 11364.66, ...],
    "diamond_lni0it": [-0.77336, -0.77270, ...],
    "nexus_entry1_instrument_qexafs_counterTimer01_time": [0.019014, 0.018883, ...],
    "__attrs__": {
        "diamond_qexafs_energy": {"@units": "eV"},
        "nexus_entry1_instrument_qexafs_counterTimer01_time": {"@units": "s"},
    },
}

Building NeXus Objects Programmatically

To generate a definition object without touching the CLI, use the plugin generators or call the helpers directly:

from nexuscreator.plugins import get_plugin_manager

pm = get_plugin_manager()
nexus_object = pm.generate_definition("data/cdi-ddi/.../ascii/263814_PtSn_OCA_1.dat",
                                      beamline=None,
                                      flags={"app_def": "NXxas"})

If you already have an .nxd file, load it with NexusDefinitionReader (shown in the lifecycle example under Creating and Modifying nexus_object Safely). You can also build the object manually; it’s just a nested dict mimicking the NeXus hierarchy:

nexus_object = {
    "@default": "entry",
    "entry": {
        "@NX_class": "NXentry",
        "@default": "data",
        "instrument": {
            "@NX_class": "NXinstrument",
            "source": {
                "@NX_class": "NXsource",
                "name": {"@dtype": "NX_CHAR", "@value": '"synchrotron"'},
            },
        },
        "data": {
            "@NX_class": "NXdata",
            "@signal": "counts",
            "@axes": "energy",
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
            "counts": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_it"},
        },
    },
}

This hand-written object can be fed straight into NexusHDF5Writer after NexusValueInjector replaces the placeholders (diamond_* keys) with actual arrays from your parser.
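
Conceptually, injection is a recursive walk over the nested dict that swaps placeholder names for library values. The sketch below shows that idea only and is not NexusValueInjector: the function name inject_values is hypothetical, and it assumes that a @value wrapped in double quotes is literal text while any other string is a placeholder key.

```python
from typing import Any, Dict


def inject_values(node: Dict[str, Any], library: Dict[str, Any]) -> None:
    """Replace placeholder @value strings in place with values from the library."""
    for child in node.values():
        if isinstance(child, dict):
            inject_values(child, library)
    value = node.get("@value")
    if not isinstance(value, str):
        return
    if len(value) >= 2 and value[0] == value[-1] == '"':
        node["@value"] = value[1:-1]      # quoted text is kept as a literal
    elif value in library:
        node["@value"] = library[value]   # placeholder resolved from the parser
    # Unknown placeholders are left as they are in this sketch.


example_object = {
    "entry": {
        "@NX_class": "NXentry",
        "source_name": {"@dtype": "NX_CHAR", "@value": '"synchrotron"'},
        "data": {
            "@NX_class": "NXdata",
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
        },
    },
}
example_library = {"diamond_qexafs_energy": [11364.22, 11364.66]}
inject_values(example_object, example_library)
print(example_object["entry"]["data"]["energy"]["@value"])
```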

For plotting-friendly outputs, expose an NXdata view that links to the real datasets:

nexus_object = {
    "entry": {
        "@NX_class": "NXentry",
        "measurement": {  # real data lives here
            "@NX_class": "NXcollection",
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
            "counts": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_it"},
        },
        "data": {  # plotting-friendly NXdata group
            "@NX_class": "NXdata",
            "@signal": "counts",
            "@axes": ["energy"],
            "energy": {"@link": "/entry/measurement/energy"},
            "counts": {"@link": "/entry/measurement/counts"},
        },
    }
}

Here @signal and @axes define the plot, while @link ensures the NXdata group points at the datasets stored elsewhere in the file.
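
Because @link targets are plain strings, a typo only shows up when the file is written or opened. A quick pre-flight check on the nested dict can catch that earlier. The helpers below are hypothetical and assume that link targets are absolute paths whose components match the dict keys.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_links(node: Dict[str, Any], path: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (location, target) for every @link found in the nested dict."""
    for key, child in node.items():
        if isinstance(child, dict):
            here = f"{path}/{key}"
            if "@link" in child:
                yield here, child["@link"]
            yield from iter_links(child, here)


def path_exists(root: Dict[str, Any], target: str) -> bool:
    """True if an absolute path such as /entry/measurement/energy is present."""
    node: Any = root
    for part in target.strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def dangling_links(root: Dict[str, Any]) -> List[Tuple[str, str]]:
    """List links whose target is missing from the object."""
    return [(loc, tgt) for loc, tgt in iter_links(root) if not path_exists(root, tgt)]


example_object = {
    "entry": {
        "measurement": {
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
        },
        "data": {
            "energy": {"@link": "/entry/measurement/energy"},
            "counts": {"@link": "/entry/measurement/counts"},  # target is missing
        },
    },
}
print(dangling_links(example_object))
```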

Creating and Modifying nexus_object Safely

In practice, you usually modify nexus_object at one of these stages:

  1. Before injection (recommended for structure edits): add/remove groups, add links, set @NX_class, @signal, @axes, @units, and placeholder @value keys.

  2. After injection (for final tweaks): adjust literal metadata fields only (for example @long_name, attributes), not placeholder keys expected by the injector.

Typical lifecycle:

from nexuscreator.libraries.NeXusDefinition import NexusDefinitionReader
from nexuscreator.libraries.NeXusHDF5 import NexusValueInjector, NexusHDF5Writer
from nexuscreator.plugins import get_plugin_manager

obj = NexusDefinitionReader().read("template.nxd")  # create/load

# parse the input file into the variable library used for injection
library = get_plugin_manager().parse_to_library("data.spec", beamline=None, flags={})

# modify structure before injection
entry = obj.setdefault("entry", {"@NX_class": "NXentry"})
entry.setdefault("instrument", {"@NX_class": "NXinstrument"})
entry["instrument"]["name"] = {"@dtype": "NX_CHAR", "@value": '"My instrument"'}

# inject parser library values into placeholders
NexusValueInjector(library).inject(obj)

# optional post-injection metadata tweaks
entry.setdefault("data", {"@NX_class": "NXdata"})
entry["data"]["@long_name"] = "Processed dataset"

NexusHDF5Writer(obj).write("out.nxs")

Recommended rules:

  • Keep dataset descriptors as dicts with both @dtype and @value.

  • Use @link / @extlink for links instead of embedding resolved paths as plain strings.

  • For NX_CHAR, use literal text in @value; for numeric arrays, use placeholders that exist in the parsed library.

  • When batch-processing folders, prompt literals (?"...") are resolved once per run and reused for all files in that run.

  • If you mutate placeholder names, update the parser library keys (or template placeholders) to match, otherwise injection will miss values.
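
Two of these rules (descriptors carry both @dtype and @value, and placeholders exist in the parsed library) lend themselves to a small check before injection. The sketch below is hypothetical and not part of NeXusCreator; it assumes that unquoted @value strings are placeholders and that strings starting with a quote or ? are literals or prompt literals.

```python
from typing import Any, Dict, List


def lint(node: Dict[str, Any], library: Dict[str, Any], path: str = "") -> List[str]:
    """Collect rule violations for dataset descriptors in a nested nexus_object."""
    problems: List[str] = []
    for key, child in node.items():
        if not isinstance(child, dict):
            continue
        here = f"{path}/{key}"
        if ("@dtype" in child) != ("@value" in child):
            problems.append(f"{here}: needs both @dtype and @value")
        value = child.get("@value")
        # Assumption: unquoted strings are placeholders; quoted or ?-prefixed are not.
        if isinstance(value, str) and not value.startswith(('"', "'", "?")):
            if value not in library:
                problems.append(f"{here}: placeholder '{value}' not in library")
        problems.extend(lint(child, library, here))
    return problems


example_object = {
    "entry": {
        "@NX_class": "NXentry",
        "data": {
            "@NX_class": "NXdata",
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_qexafs_energy"},
            "counts": {"@dtype": "NX_FLOAT64[]", "@value": "diamond_it"},
            "title": {"@value": '"scan"'},
        },
    },
}
example_library = {"diamond_qexafs_energy": [11364.22, 11364.66]}
for problem in lint(example_object, example_library):
    print(problem)
```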

Writing .nxs Files from a Script

You can reuse the high-level class that powers the CLI when you want the whole conversion pipeline in Python code:

from nexuscreator.creator import NeXusCreator

creator = NeXusCreator()
creator.execute_conversion({
    "input_path": "data/spec/sample.spec",
    "nexus_definition_file": "defs/sample.nxd",
    "output_path": "out/sample.nxs",
})

The dictionary passed to execute_conversion accepts the same options as the CLI flags (beamline_name, auto_generate_nxd, etc.). For finer control, keep using get_plugin_manager() + NexusValueInjector + NexusHDF5Writer directly as shown above.

CLI Reference (Grouped)

Core

  • -h, --help — show help and exit.

  • -v, --version — print the package version.

  • --license, --notice — print licensing information.

  • --list-beamlines — list accepted values for -b/--beamline_name.

Input / Output

  • -i, --input PATH — source data (file or directory).

  • -o, --output_path PATH — target .nxs file or output directory.

Conversion (use an existing .nxd)

  • -n, --nexus_definition FILE

  • -f, --file_per_scan — for SPEC, one .nxs per scan plus master.

  • -I, --icat_proposal_number NUM — append ICAT subfolder to outputs.

  • --auto-generate-nxd — with directory inputs, generate per-file .nxd automatically.

Generation (build a new definition)

  • -g, --generate_nexus_definition FILE

  • -t, --template — emit a single-scan template.

  • --single-file, --multi-file — control directory generation mode.

  • --yaml — write YAML definitions instead of .nxd.

  • --hdf5-option MODE — links (default for -g) or extract (default for -n).

Scanning & Batch Flags

  • -r, --recursive

  • --glob PATTERN — limit matches for all supported types.

  • --glob-spec, --glob-dta — type-specific patterns.

  • --dry-run, --summary-only

  • --no-group-dta-folders — process each DTA/DAT file individually.

Metadata & Placement

  • -b, --beamline_name NAME — beamline-specific context (for example ikft, batteries).

  • --metadata-csv FILE — supply descriptions/units for variables.

  • --export-vars-csv FILE — export parsed variables to CSV.

  • --nxdl-root PATH, --app-def NAME — NXDL placement hints. Base classes are always indexed; --app-def adds the selected application definition to the search. Defaults to external_references/nexus/nexus_definitions when present.

  • --jsonld-structure FILE — CDI/Schema.org JSON-LD document describing how to read the input file; pairs nicely with --nxdl-root.

  • -d, --dictionary — print the parsed variable dictionary.

  • -D, --debug — show the current .nxd line processed during conversion.

Validation & Export

  • --validate — validate the generated .nxs with punx when available.

  • --export-values-csv FILE — export dataset values from a .nxs/HDF5 file to CSV.

  • --export-values-prefix PATH — restrict CSV export to datasets under a prefix (must start with /).

  • --csv-delimiter CHAR — override CSV delimiter (default ,).