# NeXus Description Syntax

This page defines the NeXus description syntax used by NeXusCreator for `.nxd` files, with YAML equivalents where relevant.

## File Basics

- Blank lines are ignored.
- Lines starting with `#` are comments.
- Indentation is tab-based in `.nxd` (the reader uses leading tabs to determine nesting).
- The top-level object is typically a single `entry:` group.

## Groups

Create a group with:

```text
group_name:
```

You can also omit the trailing `:` and the parser will still treat it as a group name.

Typical NeXus groups include an `@NX_class` attribute:

```text
entry:
	@NX_class = NXentry
	instrument:
		@NX_class = NXinstrument
```

## Attributes

Any line starting with `@` is an attribute attached to the current group or dataset:

```text
@default = "entry"
@NX_class = NXentry
@units = "eV"
```

Attribute syntax:

```text
@attribute_name = value
```

## Datasets

Datasets use:

```text
name:TYPE = value
```

Examples:

```text
energy:NX_FLOAT64[] = scan1_energy
title:NX_CHAR = "Example run"
scan_number:NX_INT32 = 1
axis:NX_FLOAT64[] = [1.0, 2.0, 3.0]
```

The descriptor keys represented internally are:

- `@dtype` from `TYPE`
- `@value` from `value`

Dataset attributes can be added by indenting below the dataset:

```text
energy:NX_FLOAT64[] = scan1_energy
	@units = "eV"
	@long_name = "Monochromator energy"
```

## Value Handling

NeXusCreator accepts both placeholders and literals in `@value`.

- Scalar literals are parsed: numbers (`1`, `3.14`) as well as `True` and `None`.
- List/dict literals are parsed (`[1, 2]`, `{"a": 1}`).
- Quoted strings are preserved as strings.
- Unquoted strings are treated as placeholder keys (for injection from parser output).

Examples:

```text
temperature:NX_FLOAT64 = 298.15
labels:NX_CHAR[] = ["a", "b", "c"]
sample_name:NX_CHAR = "Ni foil"
monitor:NX_FLOAT64[] = scan1_i0
```

## Prompt Literals

A prompt literal causes NeXusCreator to ask the user for a value at conversion time.
Use it for fields that are not present in the data file, such as user name, sample description, or proposal identifier.

Syntax: a dataset value starting with `?` followed by a quoted string.

```text
sample_name:NX_CHAR = ?"Sample name"
user:NX_CHAR = ?'User name'
experiment:NX_CHAR = ?"Experiment description"
```

When the converter encounters a prompt literal it prints the prompt text, waits for the user to type a value, and stores that value in a per-run cache. If the same prompt text appears in multiple files during a batch run, it is resolved **once** and the cached answer is reused for every subsequent file, so processing a folder of 50 files produces only as many prompts as there are distinct prompt texts.

Both single and double quotes are accepted. The prompt text is the only part shown to the user; the dataset name in the `.nxd` is used as the HDF5 field name as usual.

## Internal Links

Internal links point to another path in the same output file:

```text
signal: --> /entry/data/counts
```

This becomes an HDF5 soft link.

YAML equivalent:

```yaml
signal:
  link: /entry/data/counts
```

## External Links

External links point to another file and a path inside it:

```text
calibration: --> ../calibration/run_001.nxs | /entry/
```

This becomes an HDF5 external link.

YAML equivalent:

```yaml
calibration:
  external:
    file: ../calibration/run_001.nxs
    path: /entry/
```

## Placeholder Expansion

NeXusCreator supports placeholder lookup in values, attributes, and links.

- Plain placeholder key: `scan1_energy`
- Braced placeholder: `${scan1_energy}`
- Partial string expansion: `run_${scan_number}.nxs`

Example:

```text
output_file:NX_CHAR = "run_${scan_number}.nxs"
energy:NX_FLOAT64[] = ${scan1_energy}
calibration: --> ../ref/${calibration_file} | /entry/
```

For non-`NX_CHAR` datasets, resolved placeholders are injected as native values/arrays.
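
The three placeholder forms can be illustrated with a short Python sketch. This is not NeXusCreator's implementation: `expand_placeholders` and the sample `library` are hypothetical names, the function works on values that have already been read from the definition, and leaving unknown keys untouched is an assumption.

```python
import re

# Matches ${key} where key is a simple identifier (assumed placeholder syntax).
_BRACED = re.compile(r"\$\{(\w+)\}")


def expand_placeholders(value, library):
    """Hypothetical helper: resolve plain, braced, and partial placeholders."""
    if not isinstance(value, str):
        return value  # literals such as 298.15 or [1, 2] pass through unchanged
    # Plain key: the whole value is a library key, so inject the native object.
    if value in library:
        return library[value]
    # Braced key covering the whole value: also inject the native object.
    whole = _BRACED.fullmatch(value)
    if whole and whole.group(1) in library:
        return library[whole.group(1)]
    # Partial expansion: substitute each ${key} inside a longer string.
    # Unknown keys are left as written (an assumption, not documented behavior).
    return _BRACED.sub(
        lambda m: str(library.get(m.group(1), m.group(0))), value
    )


library = {"scan_number": 7, "scan1_energy": [700.0, 701.0]}
print(expand_placeholders("scan1_energy", library))
print(expand_placeholders("${scan1_energy}", library))
print(expand_placeholders("run_${scan_number}.nxs", library))
```

Returning the library object itself for whole-value placeholders is what keeps arrays as arrays, while partial expansion always yields a string.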

Practical example (from `testing_area/06_placeholder_expansion.nxd`):

```text
entry:
	@NX_class = NXentry
	title:NX_CHAR = "run_${general_date}.nxs"
	sample:
		@NX_class = NXsample
		name:NX_CHAR = ${general_comment}
	data:
		@NX_class = NXdata
		@signal = it
		@axes = energy
		energy:NX_FLOAT64[] = ${scan1_energy}
		it:NX_FLOAT64[] = scan1_apd1
```

With input `data/spec/femtospex/2022-01-Schuessler-01_S1.spec`, this resolves to values like:

- `title -> run_2022-01-31T19:06:00.nxs`
- `sample/name -> fourc User = specuser`
- numeric arrays for `energy` and `it`.

## Scan Templates

Groups can be used as templates and expanded per scan if library keys include `scan_...`.

Template markers:

- Group attribute `@scan_template = True`, or
- Group name contains `{num}` or `{scan}`.

Template values can use `scan{num}_...` (or `scan{scan}_...`):

```text
entry:
	@NX_class = NXentry
	scan_{num}:
		@scan_template = True
		instrument:
			@NX_class = NXinstrument
			energy:NX_FLOAT64[] = scan{num}_energy
```

If the library has `scan1_energy` and `scan2_energy`, this expands to concrete groups (for example `scan_01`, `scan_02`) before value injection.

Practical multi-scan pattern:

```text
entry:
	@NX_class = NXentry
	measurements:
		@NX_class = NXcollection
		scan_{num}:
			@scan_template = True
			@NX_class = NXcollection
			data:
				@NX_class = NXdata
				@signal = it
				@axes = energy
				energy:NX_FLOAT64[] = scan{num}_energy
				it:NX_FLOAT64[] = scan{num}_apd1
```

Use with a multi-scan SPEC file:

```bash
python3 NeXusCreator.py \
  -n testing_area/07_scan_template.nxd \
  -i data/spec/femtospex/2022-01-Schuessler-01.spec \
  -o testing_area/out_multi_scan.nxs
```

For one file per scan (plus master links), add `-f`.
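
To make the expansion step concrete, here is a minimal Python sketch of how a template group could be copied once per scan. It is an illustration under assumptions, not the NeXusCreator code: `scan_numbers`, `_substitute`, and `expand_scan_templates` are hypothetical helpers, scan numbers are inferred from library keys shaped like `scanN_...`, and the zero-padded group names follow the `scan_01` example above.

```python
import copy
import re


def scan_numbers(library):
    """Collect the distinct N from library keys shaped like 'scanN_...'."""
    found = set()
    for key in library:
        match = re.match(r"scan(\d+)_", key)
        if match:
            found.add(int(match.group(1)))
    return sorted(found)


def _substitute(node, num):
    """Replace {num}/{scan} in every string value of a nested dict."""
    if isinstance(node, dict):
        return {k: _substitute(v, num) for k, v in node.items()}
    if isinstance(node, str):
        return node.replace("{num}", str(num)).replace("{scan}", str(num))
    return node


def expand_scan_templates(group, library):
    """Hypothetical helper: return a copy of `group` with templates expanded."""
    out = {}
    for name, child in group.items():
        is_group = isinstance(child, dict)
        is_template = is_group and (
            child.get("@scan_template") is True
            or "{num}" in name
            or "{scan}" in name
        )
        if is_template:
            for num in scan_numbers(library):
                concrete = _substitute(copy.deepcopy(child), num)
                concrete.pop("@scan_template", None)
                # Zero-padded group names (scan_01) are an assumption.
                padded = f"{num:02d}"
                new_name = name.replace("{num}", padded).replace("{scan}", padded)
                out[new_name] = expand_scan_templates(concrete, library)
        elif is_group:
            out[name] = expand_scan_templates(child, library)
        else:
            out[name] = child
    return out


template = {
    "entry": {
        "@NX_class": "NXentry",
        "scan_{num}": {
            "@scan_template": True,
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "scan{num}_energy"},
        },
    }
}
library = {"scan1_energy": [1.0], "scan2_energy": [2.0]}
expanded = expand_scan_templates(template, library)
print(sorted(expanded["entry"]))
```

After expansion each concrete group carries ordinary placeholder keys such as `scan2_energy`, so value injection can proceed as for any non-template group. A template marked only by `@scan_template` needs `{num}` or `{scan}` in its name for the copies to get distinct names in this sketch.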

## YAML Shape

YAML uses explicit fields rather than `@` prefixes:

- `attributes` map for attributes
- `dtype` and `value` for datasets
- `link` for internal links
- `external.file` and `external.path` for external links

Example:

```yaml
entry:
  attributes:
    NX_class: NXentry
  instrument:
    attributes:
      NX_class: NXinstrument
    energy:
      dtype: NX_FLOAT64[]
      value: scan1_energy
      attributes:
        units: eV
```

## Minimal Complete `.nxd` Example

```text
entry:
	@NX_class = NXentry
	@default = data
	data:
		@NX_class = NXdata
		@signal = counts
		@axes = energy
		energy:NX_FLOAT64[] = scan1_energy
			@units = "eV"
		counts:NX_FLOAT64[] = scan1_counts
	title:NX_CHAR = "Simple NeXusCreator example"
```

## Creating a nexus_object Programmatically

You can create `nexus_object` structures directly in Python code, which is particularly useful for dynamic generation or when integrating with other Python tools.

### Basic Structure

A `nexus_object` is a nested dictionary that follows specific conventions:

```python
nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        # Groups and datasets go here
    }
}
```

### Adding Groups

Groups are represented as nested dictionaries with an `@NX_class` attribute:

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'instrument': {
            '@NX_class': 'NXinstrument',
            # Instrument components
        },
        'sample': {
            '@NX_class': 'NXsample',
            # Sample information
        }
    }
}
```

### Adding Datasets

Datasets are dictionaries with specific keys:

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'temperature': {
            '@dtype': 'NX_FLOAT64',
            '@value': 298.15,
            '@units': 'K',
            '@description': 'Sample temperature'
        },
        'pressure': {
            '@dtype': 'NX_FLOAT64',
            '@value': 1.0,
            '@units': 'atm'
        }
    }
}
```

### Using Placeholders

Placeholders allow for dynamic value injection during conversion:

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'energy': {
            '@dtype': 'NX_FLOAT64[]',
            '@value': 'scan1_energy',  # Placeholder
            '@units': 'eV'
        },
        'title': {
            '@dtype': 'NX_CHAR',
            '@value': 'Experiment ${scan_number}'  # Expression with placeholder
        }
    }
}
```

### Adding Links

#### Internal Link Example

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'data': {
            '@NX_class': 'NXdata',
            'original': {
                '@dtype': 'NX_FLOAT64[]',
                '@value': [1.0, 2.0, 3.0]
            },
            'linked': {
                '@link': '/entry/data/original'  # key must be '@link'
            }
        }
    }
}
```

#### External Link Example

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'calibration': {
            '@extlink': {  # key must be '@extlink'
                'file': 'calibration.nxs',
                'path': '/entry/calibration/data'
            }
        }
    }
}
```

### Complete Example

Here's a complete example of creating a `nexus_object` programmatically:

```python
nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        '@default': 'data',
        'sample': {
            '@NX_class': 'NXsample',
            'name': {
                '@dtype': 'NX_CHAR',
                '@value': 'Ni foil',
                '@description': 'Sample name'
            },
            'temperature': {
                '@dtype': 'NX_FLOAT64',
                '@value': 298.15,
                '@units': 'K',
                '@description': 'Sample temperature'
            }
        },
        'instrument': {
            '@NX_class': 'NXinstrument',
            'detector': {
                '@NX_class': 'NXdetector',
                'distance': {
                    '@dtype': 'NX_FLOAT64',
                    '@value': 1.5,
                    '@units': 'm',
                    '@description': 'Detector distance'
                }
            }
        },
        'data': {
            '@NX_class': 'NXdata',
            '@signal': 'counts',
            '@axes': 'energy',
            'energy': {
                '@dtype': 'NX_FLOAT64[]',
                '@value': 'scan1_energy',  # Placeholder
                '@units': 'eV',
                '@long_name': 'Photon energy'
            },
            'counts': {
                '@dtype': 'NX_FLOAT64[]',
                '@value': 'scan1_counts',  # Placeholder
                '@units': 'counts',
                '@long_name': 'Detector counts'
            }
        },
        'title': {
            '@dtype': 'NX_CHAR',
            '@value': 'Ni foil measurement',
            '@description': 'Experiment title'
        }
    }
}
```

### Using the nexus_object

Once created, inject a variable library and write the HDF5 file:

```python
from nexuscreator.libraries.NeXusHDF5 import NexusValueInjector, NexusHDF5Writer

# library is the flat dict produced by any parser
NexusValueInjector(library).inject(nexus_object)
NexusHDF5Writer(nexus_object).write("output.nxs")
```

Or use the high-level `create_nexus` helper when you have a saved `.nxd` file:

```python
from nexuscreator import create_nexus

create_nexus(
    nexus_definition_file='template.nxd',
    input_path='data.spec',
    output_path='output.nxs',
    beamline_name='ikft',
)
```

Note: `create_nexus` drives the full CLI conversion pipeline from a `.nxd` file; it does not accept a `nexus_object` dict directly. Use `NexusHDF5Writer` when you have a dict in memory.

### Modifying a nexus_object

You can modify a `nexus_object` by working with the dictionary directly:

```python
# Add a new dataset
nexus_object['entry']['sample']['pressure'] = {
    '@dtype': 'NX_FLOAT64',
    '@value': 1.0,
    '@units': 'atm',
    '@description': 'Sample pressure'
}

# Modify an existing dataset
nexus_object['entry']['sample']['temperature']['@value'] = 300.0

# Remove a dataset
del nexus_object['entry']['sample']['temperature']
```

### Converting Between Formats

You can convert between `.nxd` files and `nexus_object` dictionaries:

```python
from nexuscreator.libraries.NeXusDefinition import NexusDefinitionReader, NexusDefinitionWriter

# Read .nxd → nexus_object dict
nexus_object = NexusDefinitionReader().read('example.nxd')

# Write nexus_object dict → .nxd (constructor takes the object; write() takes the path)
NexusDefinitionWriter(nexus_object).write('output.nxd')
```

### Best Practices for Programmatic Creation

1. **Structure**: Follow the NeXus standard hierarchy (NXentry, NXsample, NXinstrument, etc.)
2. **Attributes**: Always include units and descriptions for datasets
3. **Placeholders**: Use descriptive placeholder names that match your data keys
4. **Validation**: Test your generated structure with sample data
5. **Documentation**: Add comments to explain complex structures
6. **Modularity**: Create helper functions for common patterns

Programmatic creation is particularly useful when:

- Generating NeXus structures dynamically based on experimental conditions
- Integrating with data analysis pipelines
- Creating templates for specific experimental setups
- Automating the creation of complex NeXus structures

## Detailed `.nxd` File Format

The `.nxd` file format is a custom text-based syntax designed for concise representation of NeXus file structures. It uses a hierarchical structure with indentation to define groups and datasets.

### Syntax Rules

1. **Indentation**: Uses tabs (not spaces) for indentation. Each level of indentation represents a deeper level in the NeXus hierarchy.
2. **Comments**: Lines starting with `#` are treated as comments and ignored during parsing.
3. **Groups**: Defined by a name followed by a colon (`:`). Groups can contain other groups, datasets, and attributes.
4. **Datasets**: Defined by `name:TYPE = value` where:
   - `name` is the dataset name
   - `TYPE` is the NeXus data type (e.g., `NX_FLOAT64`, `NX_INT32`, `NX_CHAR`)
   - `value` can be a literal value, placeholder, or array
5. **Attributes**: Defined by `@attribute_name = value`. Attributes can be attached to groups or datasets.
6. **Links**: Defined using the arrow syntax `name: --> target_path` for internal links or `name: --> file_path | internal_path` for external links.

### Data Types

The `.nxd` format supports all standard NeXus data types:

- **Numeric types**: `NX_INT8`, `NX_INT16`, `NX_INT32`, `NX_INT64`, `NX_UINT8`, `NX_UINT16`, `NX_UINT32`, `NX_UINT64`, `NX_FLOAT32`, `NX_FLOAT64`
- **String type**: `NX_CHAR`
- **Boolean type**: `NX_BOOL`
- **Complex types**: `NX_COMPLEX64`, `NX_COMPLEX128`

Arrays are indicated by appending `[]` to the type (e.g., `NX_FLOAT64[]`).

### Value Formats

Values in `.nxd` files can be:

1. **Literal values**:
   - Numbers: `42`, `3.14159`
   - Strings: `"Hello World"` (quoted)
   - Booleans: `True`, `False`
   - Arrays: `[1, 2, 3]`, `["a", "b", "c"]`
   - Dictionaries: `{"key": "value"}`
2. **Placeholders**: Unquoted strings that will be replaced with values from the input data during conversion. Placeholders can also use brace notation: `${placeholder_name}`.
3. **Expressions**: Simple string expressions that combine literals and placeholders: `"run_${scan_number}.nxs"`.

### Special Constructs

#### Internal Link Syntax

Internal links create references to other datasets within the same NeXus file:

```text
linked_dataset: --> /entry/original_dataset
```

#### External Link Syntax

External links create references to datasets in other NeXus/HDF5 files:

```text
external_dataset: --> path/to/file.nxs | /entry/data
```

#### Scan Template Syntax

Scan templates allow for creating multiple similar structures based on scan data:

```text
scan_{num}:
	@scan_template = True
	energy:NX_FLOAT64[] = scan{num}_energy
	counts:NX_FLOAT64[] = scan{num}_counts
```

The `{num}` placeholder in the group name and dataset values will be replaced with scan numbers during processing.
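
Returning to the two link forms in this section, the following Python sketch shows how a single link line could be split into the `@link` and `@extlink` descriptors used by `nexus_object`. `parse_link_line` is a hypothetical helper written for illustration; the actual reader may handle whitespace and errors differently.

```python
def parse_link_line(line):
    """Hypothetical helper: turn one '.nxd' link line into (name, descriptor)."""
    name, arrow, target = line.strip().partition("-->")
    if not arrow:
        raise ValueError(f"not a link line: {line!r}")
    # Drop the trailing ':' that separates the name from the arrow.
    name = name.strip().rstrip(":").strip()
    target = target.strip()
    if "|" in target:
        # External link: 'file | path inside that file'
        file_part, _, path_part = target.partition("|")
        descriptor = {
            "@extlink": {"file": file_part.strip(), "path": path_part.strip()}
        }
        return name, descriptor
    # Internal link: a path in the same output file
    return name, {"@link": target}


print(parse_link_line("linked_dataset: --> /entry/original_dataset"))
print(parse_link_line("external_dataset: --> path/to/file.nxs | /entry/data"))
```

The presence of `|` is the only thing that distinguishes the two forms in this sketch, which mirrors the syntax shown above.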

### File Structure Example

Here's a complete example showing the structure of a `.nxd` file:

```text
# This is a comment
@default = entry
entry:
	@NX_class = NXentry
	@default = data
	# Sample information
	sample:
		@NX_class = NXsample
		name:NX_CHAR = "Sample Name"
		temperature:NX_FLOAT64 = 298.15
			@units = "K"
	# Instrument configuration
	instrument:
		@NX_class = NXinstrument
		detector:
			@NX_class = NXdetector
			distance:NX_FLOAT64 = 1.5
				@units = "m"
	# Data collection
	data:
		@NX_class = NXdata
		@signal = counts
		@axes = energy
		energy:NX_FLOAT64[] = scan1_energy
			@units = "eV"
			@long_name = "Photon energy"
		counts:NX_FLOAT64[] = scan1_counts
			@units = "counts"
			@long_name = "Detector counts"
	# Internal soft link
	energy_link: --> /entry/data/energy
	# External link (file | /path)
	calibration: --> calibration.nxs | /entry/calibration/data
```

### Best Practices

1. **Indentation**: Always use tabs for indentation, not spaces. Mixing tabs and spaces can cause parsing errors.
2. **Comments**: Use comments liberally to document the purpose of different sections of your NeXus structure.
3. **Grouping**: Organize related datasets into appropriate NeXus groups (NXsample, NXinstrument, etc.).
4. **Attributes**: Include units and descriptions as attributes for all datasets to ensure proper metadata.
5. **Placeholders**: Use descriptive placeholder names that match the keys in your input data library.
6. **Validation**: After creating a `.nxd` file, test it with sample data to ensure it produces the expected NeXus structure.
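
The tab-only indentation rule is easy to check mechanically before running a conversion. The Python sketch below is one way to do it; `find_space_indented_lines` is a hypothetical helper, not part of NeXusCreator, and it assumes that comments and blank lines may be skipped because the format ignores them.

```python
def find_space_indented_lines(text):
    """Return 1-based line numbers whose leading whitespace contains a space."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are ignored by the format
        indent = line[: len(line) - len(stripped)]
        if " " in indent:
            bad.append(number)
    return bad


sample = (
    "entry:\n"
    "\t@NX_class = NXentry\n"
    "    data:\n"              # four spaces instead of a tab
    "\t \t@signal = counts\n"  # tab, space, tab
)
print(find_space_indented_lines(sample))  # prints [3, 4]
```

Running a check like this on a `.nxd` file before conversion catches mixed indentation early, since editors often insert spaces silently.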

### Conversion to YAML

The `.nxd` format can be converted to YAML format, which uses a more verbose but equally expressive syntax:

```yaml
entry:
  attributes:
    NX_class: NXentry
    default: data
  sample:
    attributes:
      NX_class: NXsample
    name:
      dtype: NX_CHAR
      value: "Sample Name"
    temperature:
      dtype: NX_FLOAT64
      value: 298.15
      attributes:
        units: "K"
  instrument:
    attributes:
      NX_class: NXinstrument
    detector:
      attributes:
        NX_class: NXdetector
      distance:
        dtype: NX_FLOAT64
        value: 1.5
        attributes:
          units: "m"
```

Both formats are fully supported by NeXusCreator and can be used interchangeably depending on your preference and use case.
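
The correspondence between the `@`-prefixed descriptor keys and the YAML field names can be sketched as a small recursive mapping. This is an illustration only: `to_yaml_shape` is a hypothetical helper, it assumes every non-`@` key holds a nested dict (a group or dataset), and it prints JSON rather than YAML to stay within the standard library.

```python
import json


def to_yaml_shape(node):
    """Hypothetical helper: map '@'-prefixed keys to the YAML field names."""
    shaped, attributes = {}, {}
    for key, value in node.items():
        if key == "@dtype":
            shaped["dtype"] = value
        elif key == "@value":
            shaped["value"] = value
        elif key == "@link":
            shaped["link"] = value
        elif key == "@extlink":
            shaped["external"] = dict(value)  # keeps 'file' and 'path'
        elif key.startswith("@"):
            attributes[key[1:]] = value  # any other '@' key is an attribute
        else:
            shaped[key] = to_yaml_shape(value)  # child group or dataset
    if attributes:
        shaped = {"attributes": attributes, **shaped}
    return shaped


nexus_object = {
    "entry": {
        "@NX_class": "NXentry",
        "energy": {
            "@dtype": "NX_FLOAT64[]",
            "@value": "scan1_energy",
            "@units": "eV",
        },
        "signal": {"@link": "/entry/data/counts"},
    }
}
shaped = to_yaml_shape(nexus_object)
print(json.dumps(shaped, indent=2))
```

Passing the result to a YAML serializer would then give a document in the shape shown above.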