# NeXus Description Syntax

This page defines the NeXus description syntax used by NeXusCreator for `.nxd` files, with YAML equivalents where relevant.

## File Basics

- Blank lines are ignored.
- Lines starting with `#` are comments.
- Indentation is tab-based in `.nxd` (the reader uses leading tabs to determine nesting).
- The top-level object is typically a single `entry:` group.
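
The rules above can be sketched in a few lines of Python. This is only an illustration of tab-based nesting, not NeXusCreator's actual reader; the function name `nesting_levels` is hypothetical.

```python
def nesting_levels(text):
    """Yield (level, content) for each meaningful line of .nxd-style text."""
    for raw in text.splitlines():
        stripped = raw.strip()
        # Blank lines and comment lines carry no structure.
        if not stripped or stripped.startswith("#"):
            continue
        # Nesting depth is the number of leading tab characters.
        level = len(raw) - len(raw.lstrip("\t"))
        yield level, stripped


sample = (
    "# demo\n"
    "entry:\n"
    "\t@NX_class = NXentry\n"
    "\n"
    "\tinstrument:\n"
    "\t\t@NX_class = NXinstrument\n"
)
levels = list(nesting_levels(sample))
```

Each tuple pairs a depth with the stripped line, which is enough information to rebuild the group hierarchy.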

## Groups

Create a group with:

```text
group_name:
```

The trailing `:` is optional; the parser still treats the line as a group name.

Typical NeXus groups include an `@NX_class` attribute:

```text
entry:
	@NX_class = NXentry
	instrument:
		@NX_class = NXinstrument
```

## Attributes

Any line starting with `@` is an attribute attached to the current group or dataset:

```text
@default = "entry"
@NX_class = NXentry
@units = "eV"
```

Attribute syntax:

```text
@attribute_name = value
```

## Datasets

Datasets use:

```text
name:TYPE = value
```

Examples:

```text
energy:NX_FLOAT64[] = scan1_energy
title:NX_CHAR = "Example run"
scan_number:NX_INT32 = 1
axis:NX_FLOAT64[] = [1.0, 2.0, 3.0]
```

Internally, each dataset is represented by two descriptor keys:

- `@dtype` from `TYPE`
- `@value` from `value`
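
As a rough sketch of how a `name:TYPE = value` line maps onto these two keys, the snippet below uses a regular expression. The pattern and the function name `parse_dataset_line` are assumptions made for illustration; the real reader may accept more forms.

```python
import re

# name:TYPE = value, where TYPE may end in [] for arrays.
DATASET_RE = re.compile(
    r"^(?P<name>[^:@\s]+):(?P<dtype>\w+(?:\[\])?)\s*=\s*(?P<value>.+)$"
)


def parse_dataset_line(line):
    """Return (name, descriptor) for a dataset line, or None if it does not match."""
    match = DATASET_RE.match(line.strip())
    if match is None:
        return None
    # The raw value text is kept as-is here; literal parsing is a separate step.
    descriptor = {"@dtype": match["dtype"], "@value": match["value"].strip()}
    return match["name"], descriptor


parsed = parse_dataset_line("energy:NX_FLOAT64[] = scan1_energy")
```

Group lines such as `entry:` and link lines such as `signal: --> /entry/data/counts` do not match the pattern and yield `None`.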

Dataset attributes can be added by indenting below the dataset:

```text
energy:NX_FLOAT64[] = scan1_energy
	@units = "eV"
	@long_name = "Monochromator energy"
```

## Value Handling

NeXusCreator accepts both placeholders and literals in `@value`.

- Numeric, boolean, and `None` literals are parsed (`1`, `3.14`, `True`, `None`).
- List/dict literals are parsed (`[1, 2]`, `{"a": 1}`).
- Quoted strings are preserved as strings.
- Unquoted strings are treated as placeholder keys (for injection from parser output).
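
The literal forms listed above are all valid Python literals, so one plausible way to model these rules is with `ast.literal_eval`. Whether NeXusCreator uses this function internally is an assumption; `classify_value` is a hypothetical name.

```python
import ast


def classify_value(text):
    """Return ('literal', value) or ('placeholder', key) for a raw value string."""
    text = text.strip()
    try:
        # Handles numbers, True/False/None, lists, dicts, and quoted strings.
        return "literal", ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Anything that is not a Python literal is treated as a placeholder key.
        return "placeholder", text


results = [
    classify_value("298.15"),
    classify_value('"Ni foil"'),
    classify_value("scan1_i0"),
]
```

Under this model, the quotes are what separate the string `"Ni foil"` from a placeholder key named `scan1_i0`.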

Examples:

```text
temperature:NX_FLOAT64 = 298.15
labels:NX_CHAR[] = ["a", "b", "c"]
sample_name:NX_CHAR = "Ni foil"
monitor:NX_FLOAT64[] = scan1_i0
```

## Prompt Literals

A prompt literal causes NeXusCreator to ask the user for a value at conversion time. Use it
for fields that are not present in the data file, such as the user name, sample description,
or proposal identifier.

Syntax: a dataset value starting with `?` followed by a quoted string.

```text
sample_name:NX_CHAR = ?"Sample name"
user:NX_CHAR        = ?'User name'
experiment:NX_CHAR  = ?"Experiment description"
```

When the converter encounters a prompt literal, it prints the prompt text, waits for the user to
type a value, and stores that value in a per-run cache. If the same prompt text appears in
multiple files during a batch run, it is resolved **once** and the cached answer is reused for
every subsequent file. Processing a folder of 50 files therefore produces only as many prompts
as there are distinct prompt texts.

Both single and double quotes are accepted. The prompt text is the only part shown to the user;
the dataset name in the `.nxd` is used as the HDF5 field name as usual.
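
The caching behaviour described above can be sketched as follows. This is a minimal model, not the converter's own code: `resolve_prompt`, the regular expression, and the `ask` parameter are all hypothetical, and `ask` exists only so the example can run without keyboard input.

```python
import re

# A leading ? followed by a single- or double-quoted prompt text.
PROMPT_RE = re.compile(r"""^\?(["'])(?P<text>.*)\1$""")


def resolve_prompt(value, cache, ask=input):
    """Return the user's answer for a prompt literal, or the value unchanged."""
    match = PROMPT_RE.match(value.strip())
    if match is None:
        return value
    text = match["text"]
    if text not in cache:
        # Ask only the first time this prompt text is seen in the run.
        cache[text] = ask(f"{text}: ")
    return cache[text]


cache = {}
answers = iter(["Ni foil"])
fake_ask = lambda prompt: next(answers)  # stands in for input()
first = resolve_prompt('?"Sample name"', cache, ask=fake_ask)
second = resolve_prompt("?'Sample name'", cache, ask=fake_ask)  # served from the cache
```

The second call never reaches `fake_ask`, because the cache is keyed by the prompt text rather than by dataset name or quote style.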

## Internal Links

Internal links point to another path in the same output file:

```text
signal: --> /entry/data/counts
```

This becomes an HDF5 soft link.

YAML equivalent:

```yaml
signal:
  link: /entry/data/counts
```

## External Links

External links point to another file and a path inside it:

```text
calibration: --> ../calibration/run_001.nxs | /entry/
```

This becomes an HDF5 external link.

YAML equivalent:

```yaml
calibration:
  external:
    file: ../calibration/run_001.nxs
    path: /entry/
```
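
To make the two link forms concrete, here is a small sketch that splits a link line into a name and a descriptor. The `@link` and `@extlink` keys follow the programmatic examples later on this page; the function `parse_link_line` itself is hypothetical and ignores edge cases such as a `|` inside a file name.

```python
def parse_link_line(line):
    """Return (name, descriptor) for a link line, or None if the line is not a link."""
    name, arrow, target = line.strip().partition("-->")
    if not arrow or not name.rstrip().endswith(":"):
        return None
    name = name.rstrip()[:-1].strip()
    target = target.strip()
    if "|" in target:
        # External link: file path, then the path inside that file.
        file_path, _, inner_path = target.partition("|")
        return name, {"@extlink": {"file": file_path.strip(), "path": inner_path.strip()}}
    # Internal link: a path inside the same output file.
    return name, {"@link": target}


internal = parse_link_line("signal: --> /entry/data/counts")
external = parse_link_line("calibration: --> ../calibration/run_001.nxs | /entry/")
```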

## Placeholder Expansion

NeXusCreator supports placeholder lookup in values, attributes, and links.

- Plain placeholder key: `scan1_energy`
- Braced placeholder: `${scan1_energy}`
- Partial string expansion: `run_${scan_number}.nxs`

Example:

```text
output_file:NX_CHAR = "run_${scan_number}.nxs"
energy:NX_FLOAT64[] = ${scan1_energy}
calibration: --> ../ref/${calibration_file} | /entry/
```

For non-`NX_CHAR` datasets, resolved placeholders are injected as native values/arrays.
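
The three placeholder forms can be modelled with a short function. This is a sketch under assumptions: the name `expand` is hypothetical, and leaving unknown `${key}` references untouched is a choice made here, not documented NeXusCreator behaviour.

```python
import re

BRACED_RE = re.compile(r"\$\{(\w+)\}")


def expand(value, library):
    """Resolve a plain key, a whole-value ${key}, or ${key} parts inside a string."""
    if not isinstance(value, str):
        return value
    # Plain key: the whole value is a library key.
    if value in library:
        return library[value]
    # Whole-value braced placeholder: return the native object (e.g. a list).
    whole = BRACED_RE.fullmatch(value)
    if whole and whole[1] in library:
        return library[whole[1]]
    # Partial expansion: substitute each known ${key}; leave unknown ones untouched.
    return BRACED_RE.sub(lambda m: str(library.get(m[1], m[0])), value)


library = {"scan_number": 7, "scan1_energy": [1.0, 2.0]}
file_name = expand("run_${scan_number}.nxs", library)
energy = expand("${scan1_energy}", library)
```

A whole-value placeholder keeps its native type (here a list), while a placeholder embedded in a longer string is converted to text.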

Practical example (from `testing_area/06_placeholder_expansion.nxd`):

```text
entry:
	@NX_class = NXentry
	title:NX_CHAR = "run_${general_date}.nxs"
	sample:
		@NX_class = NXsample
		name:NX_CHAR = ${general_comment}
	data:
		@NX_class = NXdata
		@signal = it
		@axes = energy
		energy:NX_FLOAT64[] = ${scan1_energy}
		it:NX_FLOAT64[] = scan1_apd1
```

With input `data/spec/femtospex/2022-01-Schuessler-01_S1.spec`, this resolves to values like:

- `title -> run_2022-01-31T19:06:00.nxs`
- `sample/name -> fourc  User = specuser`
- numeric arrays for `energy` and `it`

## Scan Templates

A group can serve as a template that is expanded once per scan when the library contains keys of the form `scan<N>_...`.

Template markers:

- Group attribute `@scan_template = True`, or
- Group name contains `{num}` or `{scan}`.

Template values can use `scan{num}_...` (or `scan{scan}_...`):

```text
entry:
	@NX_class = NXentry
	scan_{num}:
		@scan_template = True
		instrument:
			@NX_class = NXinstrument
			energy:NX_FLOAT64[] = scan{num}_energy
```

If the library has `scan1_energy` and `scan2_energy`, this expands to concrete groups (for example `scan_01`, `scan_02`) before value injection.
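
The expansion step can be pictured with the sketch below, which works on the dictionary form of a template. All function names are hypothetical, and the two-digit zero padding of group names is an assumption based on the `scan_01` example; the converter's actual naming may differ.

```python
import re


def scan_numbers(library):
    """Collect the distinct N from library keys of the form scan<N>_..."""
    found = {int(m[1]) for key in library if (m := re.match(r"scan(\d+)_", key))}
    return sorted(found)


def substitute(node, num):
    """Replace {num}/{scan} in every string value of a nested template."""
    if isinstance(node, dict):
        return {key: substitute(child, num) for key, child in node.items()}
    if isinstance(node, str):
        return node.replace("{num}", str(num)).replace("{scan}", str(num))
    return node


def expand_template(name, template, library):
    """Return one concrete group per scan number found in the library."""
    groups = {}
    for num in scan_numbers(library):
        # Zero-padded group names follow the scan_01 example; the width is assumed.
        concrete = name.replace("{num}", f"{num:02d}").replace("{scan}", f"{num:02d}")
        groups[concrete] = substitute(template, num)
    return groups


template = {
    "@scan_template": True,
    "energy": {"@dtype": "NX_FLOAT64[]", "@value": "scan{num}_energy"},
}
library = {"scan1_energy": [1.0], "scan2_energy": [2.0], "general_date": "2022-01-31"}
expanded = expand_template("scan_{num}", template, library)
```

After this step each concrete group holds ordinary placeholder keys such as `scan2_energy`, which value injection can then resolve.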

Practical multi-scan pattern:

```text
entry:
	@NX_class = NXentry
	measurements:
		@NX_class = NXcollection
		scan_{num}:
			@scan_template = True
			@NX_class = NXcollection
			data:
				@NX_class = NXdata
				@signal = it
				@axes = energy
				energy:NX_FLOAT64[] = scan{num}_energy
				it:NX_FLOAT64[] = scan{num}_apd1
```

Use with a multi-scan SPEC file:

```bash
python3 NeXusCreator.py \
  -n testing_area/07_scan_template.nxd \
  -i data/spec/femtospex/2022-01-Schuessler-01.spec \
  -o testing_area/out_multi_scan.nxs
```

For one file per scan (plus master links), add `-f`.

## YAML Shape

YAML uses explicit fields rather than `@` prefixes:

- `attributes` map for attributes
- `dtype` and `value` for datasets
- `link` for internal links
- `external.file` and `external.path` for external links

Example:

```yaml
entry:
  attributes:
    NX_class: NXentry
  instrument:
    attributes:
      NX_class: NXinstrument
    energy:
      dtype: NX_FLOAT64[]
      value: scan1_energy
      attributes:
        units: eV
```
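
The correspondence between the YAML shape and the `@`-prefixed dictionary form can be sketched as a plain dict transformation. The input is assumed to be the mapping a YAML loader would return; `from_yaml_shape` is a hypothetical helper, and it cannot distinguish a child group that happens to be named `value` or `link`.

```python
def from_yaml_shape(node):
    """Convert a YAML-shaped mapping into the '@'-prefixed dict form."""
    result = {}
    for key, child in node.items():
        if key == "attributes":
            # attributes: {units: eV}  ->  '@units': 'eV'
            result.update({f"@{name}": value for name, value in child.items()})
        elif key in ("dtype", "value", "link"):
            result[f"@{key}"] = child
        elif key == "external":
            result["@extlink"] = dict(child)
        else:
            # Any other key is a child group or dataset.
            result[key] = from_yaml_shape(child) if isinstance(child, dict) else child
    return result


yaml_shape = {
    "entry": {
        "attributes": {"NX_class": "NXentry"},
        "instrument": {
            "attributes": {"NX_class": "NXinstrument"},
            "energy": {
                "dtype": "NX_FLOAT64[]",
                "value": "scan1_energy",
                "attributes": {"units": "eV"},
            },
        },
    }
}
converted = from_yaml_shape(yaml_shape)
```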

## Minimal Complete `.nxd` Example

```text
entry:
	@NX_class = NXentry
	@default = data
	data:
		@NX_class = NXdata
		@signal = counts
		@axes = energy
		energy:NX_FLOAT64[] = scan1_energy
			@units = "eV"
		counts:NX_FLOAT64[] = scan1_counts
		title:NX_CHAR = "Simple NeXusCreator example"
```

## Creating a nexus_object Programmatically

You can create `nexus_object` structures directly in Python code, which is particularly useful for dynamic generation or when integrating with other Python tools.

### Basic Structure

A `nexus_object` is a nested dictionary that follows specific conventions:

```python
nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        # Groups and datasets go here
    }
}
```

### Adding Groups

Groups are represented as nested dictionaries with an `@NX_class` attribute:

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'instrument': {
            '@NX_class': 'NXinstrument',
            # Instrument components
        },
        'sample': {
            '@NX_class': 'NXsample',
            # Sample information
        }
    }
}
```

### Adding Datasets

Datasets are dictionaries with specific keys:

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'temperature': {
            '@dtype': 'NX_FLOAT64',
            '@value': 298.15,
            '@units': 'K',
            '@description': 'Sample temperature'
        },
        'pressure': {
            '@dtype': 'NX_FLOAT64',
            '@value': 1.0,
            '@units': 'atm'
        }
    }
}
```

### Using Placeholders

Placeholders allow for dynamic value injection during conversion:

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'energy': {
            '@dtype': 'NX_FLOAT64[]',
            '@value': 'scan1_energy',  # Placeholder
            '@units': 'eV'
        },
        'title': {
            '@dtype': 'NX_CHAR',
            '@value': 'Experiment ${scan_number}'  # Expression with placeholder
        }
    }
}
```

### Adding Links

#### Internal Link Example

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'data': {
            '@NX_class': 'NXdata',
            'original': {
                '@dtype': 'NX_FLOAT64[]',
                '@value': [1.0, 2.0, 3.0]
            },
            'linked': {
                '@link': '/entry/data/original'   # key must be '@link'
            }
        }
    }
}
```

#### External Link Example

```python
nexus_object = {
    'entry': {
        '@NX_class': 'NXentry',
        'calibration': {
            '@extlink': {              # key must be '@extlink'
                'file': 'calibration.nxs',
                'path': '/entry/calibration/data'
            }
        }
    }
}
```

### Complete Example

Here's a complete example of creating a `nexus_object` programmatically:

```python
nexus_object = {
    '@default': 'entry',
    'entry': {
        '@NX_class': 'NXentry',
        '@default': 'data',
        
        'sample': {
            '@NX_class': 'NXsample',
            'name': {
                '@dtype': 'NX_CHAR',
                '@value': 'Ni foil',
                '@description': 'Sample name'
            },
            'temperature': {
                '@dtype': 'NX_FLOAT64',
                '@value': 298.15,
                '@units': 'K',
                '@description': 'Sample temperature'
            }
        },
        
        'instrument': {
            '@NX_class': 'NXinstrument',
            'detector': {
                '@NX_class': 'NXdetector',
                'distance': {
                    '@dtype': 'NX_FLOAT64',
                    '@value': 1.5,
                    '@units': 'm',
                    '@description': 'Detector distance'
                }
            }
        },
        
        'data': {
            '@NX_class': 'NXdata',
            '@signal': 'counts',
            '@axes': 'energy',
            
            'energy': {
                '@dtype': 'NX_FLOAT64[]',
                '@value': 'scan1_energy',  # Placeholder
                '@units': 'eV',
                '@long_name': 'Photon energy'
            },
            
            'counts': {
                '@dtype': 'NX_FLOAT64[]',
                '@value': 'scan1_counts',  # Placeholder
                '@units': 'counts',
                '@long_name': 'Detector counts'
            }
        },
        
        'title': {
            '@dtype': 'NX_CHAR',
            '@value': 'Ni foil measurement',
            '@description': 'Experiment title'
        }
    }
}
```

### Using the nexus_object

Once created, inject a variable library and write the HDF5 file:

```python
from nexuscreator.libraries.NeXusHDF5 import NexusValueInjector, NexusHDF5Writer

# library is the flat dict produced by any parser
NexusValueInjector(library).inject(nexus_object)
NexusHDF5Writer(nexus_object).write("output.nxs")
```

Or use the high-level `create_nexus` helper when you have a saved `.nxd` file:

```python
from nexuscreator import create_nexus

create_nexus(
    nexus_definition_file='template.nxd',
    input_path='data.spec',
    output_path='output.nxs',
    beamline_name='ikft',
)
```

Note: `create_nexus` drives the full CLI conversion pipeline from a `.nxd` file; it does not
accept a `nexus_object` dict directly. Use `NexusHDF5Writer` when you have a dict in memory.

### Modifying a nexus_object

Because a `nexus_object` is a plain dictionary, you can modify it with ordinary dict operations:

```python
# Add a new dataset
nexus_object['entry']['sample']['pressure'] = {
    '@dtype': 'NX_FLOAT64',
    '@value': 1.0,
    '@units': 'atm',
    '@description': 'Sample pressure'
}

# Modify an existing dataset
nexus_object['entry']['sample']['temperature']['@value'] = 300.0

# Remove a dataset
del nexus_object['entry']['sample']['temperature']
```

### Converting Between Formats

You can convert between `.nxd` files and `nexus_object` dictionaries:

```python
from nexuscreator.libraries.NeXusDefinition import NexusDefinitionReader, NexusDefinitionWriter

# Read .nxd → nexus_object dict
nexus_object = NexusDefinitionReader().read('example.nxd')

# Write nexus_object dict → .nxd  (constructor takes the object; write() takes the path)
NexusDefinitionWriter(nexus_object).write('output.nxd')
```

### Best Practices for Programmatic Creation

1. **Structure**: Follow the NeXus standard hierarchy (NXentry, NXsample, NXinstrument, etc.)
2. **Attributes**: Always include units and descriptions for datasets
3. **Placeholders**: Use descriptive placeholder names that match your data keys
4. **Validation**: Test your generated structure with sample data
5. **Documentation**: Add comments to explain complex structures
6. **Modularity**: Create helper functions for common patterns

Programmatic creation is particularly useful when:
- Generating NeXus structures dynamically based on experimental conditions
- Integrating with data analysis pipelines
- Creating templates for specific experimental setups
- Automating the creation of complex NeXus structures
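
As one way to follow the modularity advice above, two small builder functions can remove most of the repetition from hand-written dictionaries. The helpers `group` and `dataset` are hypothetical and not part of NeXusCreator.

```python
def group(nx_class, **children):
    """Build a group dict with its @NX_class plus child groups and datasets."""
    return {"@NX_class": nx_class, **children}


def dataset(dtype, value, **attributes):
    """Build a dataset dict; keyword arguments become '@'-prefixed attributes."""
    node = {"@dtype": dtype, "@value": value}
    node.update({f"@{name}": attr for name, attr in attributes.items()})
    return node


nexus_object = {
    "@default": "entry",
    "entry": group(
        "NXentry",
        sample=group(
            "NXsample",
            temperature=dataset("NX_FLOAT64", 298.15, units="K"),
        ),
        data=group(
            "NXdata",
            energy=dataset("NX_FLOAT64[]", "scan1_energy", units="eV"),
        ),
    ),
}
```

The result is the same nested-dict structure shown in the earlier examples, so it can be passed on to the injector and writer unchanged.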

## Detailed `.nxd` File Format

The `.nxd` file format is a custom text-based syntax designed for concise representation of NeXus file structures. It uses a hierarchical structure with indentation to define groups and datasets.

### Syntax Rules

1. **Indentation**: Uses tabs (not spaces) for indentation. Each level of indentation represents a deeper level in the NeXus hierarchy.

2. **Comments**: Lines starting with `#` are treated as comments and ignored during parsing.

3. **Groups**: Defined by a name followed by a colon (`:`). Groups can contain other groups, datasets, and attributes.

4. **Datasets**: Defined by `name:TYPE = value` where:
   - `name` is the dataset name
   - `TYPE` is the NeXus data type (e.g., `NX_FLOAT64`, `NX_INT32`, `NX_CHAR`)
   - `value` can be a literal value, placeholder, or array

5. **Attributes**: Defined by `@attribute_name = value`. Attributes can be attached to groups or datasets.

6. **Links**: Defined using the arrow syntax `name: --> target_path` for internal links or `name: --> file_path | internal_path` for external links.
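
The six rules can be summarised as a simple line classifier. This is a deliberately naive sketch with a hypothetical name, `classify_line`; for instance, it would misclassify a dataset whose quoted value contains `-->`.

```python
def classify_line(line):
    """Name the kind of .nxd line, checking the more specific forms first."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if stripped.startswith("#"):
        return "comment"
    if stripped.startswith("@"):
        return "attribute"
    if "-->" in stripped:
        return "link"
    if "=" in stripped:
        return "dataset"
    return "group"


kinds = [
    classify_line(text)
    for text in (
        "# Sample information",
        "sample:",
        "\t@NX_class = NXsample",
        "\ttemperature:NX_FLOAT64 = 298.15",
        "\tenergy_link: --> /entry/data/energy",
    )
]
```

The order of the checks matters: attributes and links both contain characters that the later, more general tests would also accept.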

### Data Types

The `.nxd` format supports the following NeXus data type names:

- **Numeric types**: `NX_INT8`, `NX_INT16`, `NX_INT32`, `NX_INT64`, `NX_UINT8`, `NX_UINT16`, `NX_UINT32`, `NX_UINT64`, `NX_FLOAT32`, `NX_FLOAT64`
- **String type**: `NX_CHAR`
- **Boolean type**: `NX_BOOL`
- **Complex types**: `NX_COMPLEX64`, `NX_COMPLEX128`

Arrays are indicated by appending `[]` to the type (e.g., `NX_FLOAT64[]`).
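
A type string can be checked against the list above before it is used. The sketch below is illustrative only: `KNOWN_TYPES` simply copies the names from this section, and `check_type` is a hypothetical helper.

```python
KNOWN_TYPES = {
    "NX_INT8", "NX_INT16", "NX_INT32", "NX_INT64",
    "NX_UINT8", "NX_UINT16", "NX_UINT32", "NX_UINT64",
    "NX_FLOAT32", "NX_FLOAT64",
    "NX_CHAR", "NX_BOOL",
    "NX_COMPLEX64", "NX_COMPLEX128",
}


def check_type(type_text):
    """Split 'NX_FLOAT64[]' into ('NX_FLOAT64', True) and reject unknown names."""
    is_array = type_text.endswith("[]")
    base = type_text[:-2] if is_array else type_text
    if base not in KNOWN_TYPES:
        raise ValueError(f"unknown NeXus type name: {base!r}")
    return base, is_array


array_type = check_type("NX_FLOAT64[]")
scalar_type = check_type("NX_CHAR")
```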

### Value Formats

Values in `.nxd` files can be:

1. **Literal values**:
   - Numbers: `42`, `3.14159`
   - Strings: `"Hello World"` (quoted)
   - Booleans: `True`, `False`
   - Arrays: `[1, 2, 3]`, `["a", "b", "c"]`
   - Dictionaries: `{"key": "value"}`

2. **Placeholders**: Unquoted strings that will be replaced with values from the input data during conversion. Placeholders can use brace notation: `${placeholder_name}`.

3. **Expressions**: Simple string expressions that combine literals and placeholders: `"run_${scan_number}.nxs"`.

### Special Constructs

#### Internal Link Syntax

Internal links create references to other datasets within the same NeXus file:

```text
linked_dataset: --> /entry/original_dataset
```

#### External Link Syntax

External links create references to datasets in other NeXus/HDF5 files:

```text
external_dataset: --> path/to/file.nxs | /entry/data
```

#### Scan Template Syntax

Scan templates allow for creating multiple similar structures based on scan data:

```text
scan_{num}:
	@scan_template = True
	energy:NX_FLOAT64[] = scan{num}_energy
	counts:NX_FLOAT64[] = scan{num}_counts
```

The `{num}` placeholder in the group name and dataset values will be replaced with scan numbers during processing.

### File Structure Example

Here's a complete example showing the structure of a `.nxd` file:

```text
# This is a comment
@default = entry

entry:
	@NX_class = NXentry
	@default = data

	# Sample information
	sample:
		@NX_class = NXsample
		name:NX_CHAR = "Sample Name"
		temperature:NX_FLOAT64 = 298.15
			@units = "K"

	# Instrument configuration
	instrument:
		@NX_class = NXinstrument
		detector:
			@NX_class = NXdetector
			distance:NX_FLOAT64 = 1.5
				@units = "m"

	# Data collection
	data:
		@NX_class = NXdata
		@signal = counts
		@axes = energy

		energy:NX_FLOAT64[] = scan1_energy
			@units = "eV"
			@long_name = "Photon energy"

		counts:NX_FLOAT64[] = scan1_counts
			@units = "counts"
			@long_name = "Detector counts"

	# Internal soft link
	energy_link: --> /entry/data/energy

	# External link (file | /path)
	calibration: --> calibration.nxs | /entry/calibration/data
```

### Best Practices

1. **Indentation**: Always use tabs for indentation, not spaces. Mixing tabs and spaces can cause parsing errors.

2. **Comments**: Use comments liberally to document the purpose of different sections of your NeXus structure.

3. **Grouping**: Organize related datasets into appropriate NeXus groups (NXsample, NXinstrument, etc.).

4. **Attributes**: Include units and descriptions as attributes for all datasets to ensure proper metadata.

5. **Placeholders**: Use descriptive placeholder names that match the keys in your input data library.

6. **Validation**: After creating a `.nxd` file, test it with sample data to ensure it produces the expected NeXus structure.
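
One cheap check in the spirit of the placeholder and validation points above is to look for placeholders that the input library cannot satisfy. The sketch below operates on the dictionary form of a definition and only inspects non-`NX_CHAR` datasets, where a string `@value` can only be a placeholder. The function name `missing_placeholders` is hypothetical.

```python
def missing_placeholders(node, library, path=""):
    """Yield (path, key) for non-NX_CHAR datasets whose placeholder is not in library."""
    value = node.get("@value")
    dtype = str(node.get("@dtype", ""))
    if isinstance(value, str) and not dtype.startswith("NX_CHAR"):
        # Accept both the plain form (key) and the braced form (${key}).
        braced = value.startswith("${") and value.endswith("}")
        key = value[2:-1] if braced else value
        if key not in library:
            yield path, key
    for name, child in node.items():
        if isinstance(child, dict) and not name.startswith("@"):
            yield from missing_placeholders(child, library, f"{path}/{name}")


definition = {
    "entry": {
        "@NX_class": "NXentry",
        "data": {
            "energy": {"@dtype": "NX_FLOAT64[]", "@value": "scan1_energy"},
            "counts": {"@dtype": "NX_FLOAT64[]", "@value": "${scan1_counts}"},
            "title": {"@dtype": "NX_CHAR", "@value": "Simple example"},
        },
    }
}
library = {"scan1_energy": [1.0, 2.0]}
problems = list(missing_placeholders(definition, library))
```

Here only `counts` is reported, because its key `scan1_counts` is absent from the library.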

### Conversion to YAML

The `.nxd` format can be converted to YAML format, which uses a more verbose but equally expressive syntax:

```yaml
entry:
  attributes:
    NX_class: NXentry
    default: data
  sample:
    attributes:
      NX_class: NXsample
    name:
      dtype: NX_CHAR
      value: "Sample Name"
    temperature:
      dtype: NX_FLOAT64
      value: 298.15
      attributes:
        units: "K"
  instrument:
    attributes:
      NX_class: NXinstrument
    detector:
      attributes:
        NX_class: NXdetector
      distance:
        dtype: NX_FLOAT64
        value: 1.5
        attributes:
          units: "m"
```

Both formats are fully supported by NeXusCreator and can be used interchangeably depending on your preference and use case.