API Reference

All public classes are importable directly from the top-level package:

```python
from samplesheet_parser import (
    SampleSheetFactory,
    SampleSheetV1,
    SampleSheetV2,
    SampleSheetConverter,
    SampleSheetValidator,
    SampleSheetDiff,
    SampleSheetWriter,
    SampleSheetMerger,
    SampleSheetSplitter,
    SampleSheetFilter,
    normalize_index_lengths,
)
```

SampleSheetFactory

| Method / attribute | Returns | Description |
| --- | --- | --- |
| `create_parser(path, *, clean, experiment_id, parse)` | `SampleSheetV1 \| SampleSheetV2` | Auto-detect format and return the appropriate parser |
| `get_umi_length()` | `int` | UMI length from the current parser |
| `.version` | `SampleSheetVersion \| None` | Detected format version |

SampleSheetV1 / SampleSheetV2 (shared interface)

| Method / attribute | Returns | Description |
| --- | --- | --- |
| `parse(do_clean=True)` | `None` | Parse all sections |
| `samples()` | `list[dict]` | One record per unique sample |
| `index_type()` | `str` | `"dual"`, `"single"`, or `"none"` |
| `.adapters` | `list[str]` | Adapter sequences |
| `.experiment_name` | `str \| None` | Run/experiment name |

V2-only

| Method | Returns | Description |
| --- | --- | --- |
| `get_umi_length()` | `int` | UMI length from `OverrideCycles` |
| `get_read_structure()` | `ReadStructure` | Parsed read structure dataclass |
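
`OverrideCycles` is a semicolon-separated string with one segment per read, each built from letter/count pairs (`Y` read, `I` index, `U` UMI, `N` skipped cycles). As a rough illustration of the information `get_umi_length()` draws on, the sketch below counts the `U` cycles in each segment. The helper `umi_cycles` is hypothetical and not part of `samplesheet_parser`; how the library reduces several UMI segments to a single `int` is not stated here.

```python
import re

# One token is a cycle type (Y, I, N or U) followed by a cycle count.
_TOKEN = re.compile(r"([YINU])(\d+)")


def umi_cycles(override_cycles: str) -> list[int]:
    """Return the number of UMI (U) cycles in each read segment.

    Hypothetical helper for illustration only.
    """
    counts = []
    for segment in override_cycles.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing semicolon
        counts.append(
            sum(int(n) for kind, n in _TOKEN.findall(segment) if kind == "U")
        )
    return counts


print(umi_cycles("U8Y143;I8;I8;U8Y143"))  # [8, 0, 0, 8]
```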

SampleSheetConverter

```python
SampleSheetConverter(path, *, workflow: Workflow | str | None = None)
```

| Method / attribute | Returns | Description |
| --- | --- | --- |
| `to_v2(output_path)` | `Path` | Convert IEM V1 → BCLConvert V2 |
| `to_v1(output_path)` | `Path` | Convert BCLConvert V2 → IEM V1 (lossy) |
| `.source_version` | `SampleSheetVersion \| None` | Auto-detected format of the input |
| `.workflow_override` | `Workflow \| None` | Resolved workflow override, if any |

The `workflow` parameter accepts `"a"`, `"b"`, or a `Workflow` enum value and overrides auto-detection of the i5 orientation workflow from the instrument header. When it is left as `None` (the default), the workflow is auto-detected. See Conversion → Index 2 orientation.
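
A minimal sketch of the kind of coercion this parameter implies, assuming a two-member string enum whose values are `"a"` and `"b"`. `WorkflowSketch` and `coerce_workflow` are hypothetical stand-ins written for illustration; the real `Workflow` and `parse_workflow` live in `samplesheet_parser.instruments` and may treat case, whitespace, or invalid input differently.

```python
from __future__ import annotations

from enum import Enum


class WorkflowSketch(str, Enum):
    """Hypothetical stand-in for the library's Workflow enum."""

    A = "a"
    B = "b"


def coerce_workflow(value: WorkflowSketch | str | None) -> WorkflowSketch | None:
    """Normalise a user-supplied workflow override.

    None means "no override, fall back to auto-detection".
    Assumption: unknown strings raise ValueError in this sketch.
    """
    if value is None:
        return None
    if isinstance(value, WorkflowSketch):
        return value
    return WorkflowSketch(value.strip().lower())


print(coerce_workflow("B"))   # WorkflowSketch.B
print(coerce_workflow(None))  # None
```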

samplesheet_parser.instruments

i5 orientation workflow classification helpers.

```python
from samplesheet_parser.instruments import (
    Workflow,
    detect_workflow,
    parse_workflow,
    reverse_complement,
    WORKFLOW_A_INSTRUMENTS,
    WORKFLOW_B_INSTRUMENTS,
    AMBIGUOUS_INSTRUMENTS,
)
```

| Name | Kind | Description |
| --- | --- | --- |
| `Workflow` | `StrEnum` | `Workflow.A` (i5 forward) / `Workflow.B` (i5 reverse-complemented on chip) |
| `detect_workflow(name)` | `Workflow \| None` | Classify an instrument name; `None` for unknown or ambiguous (e.g. `NovaSeq 6000`) |
| `parse_workflow(value)` | `Workflow \| None` | Coerce a CLI string (`"a"` / `"b"`) to `Workflow` |
| `reverse_complement(seq)` | `str` | Reverse-complement a DNA sequence (preserves `N` and letter case) |
| `WORKFLOW_A_INSTRUMENTS` | `frozenset[str]` | Normalised names of workflow-A instruments |
| `WORKFLOW_B_INSTRUMENTS` | `frozenset[str]` | Normalised names of workflow-B instruments |
| `AMBIGUOUS_INSTRUMENTS` | `frozenset[str]` | Instruments whose workflow depends on chemistry and require an explicit override |
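
To make the `reverse_complement` contract concrete, here is a standalone sketch of a reverse complement that keeps `N` and letter case intact. `reverse_complement_sketch` is a hypothetical re-implementation for illustration, not the library function; its handling of characters outside `ACGTN` (left unchanged) is an assumption.

```python
# Translation table covering both cases; N maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_sketch(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement_sketch("ACGTN"))  # NACGT
print(reverse_complement_sketch("aaCCg"))  # cGGtt
```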

SampleSheetValidator

| Method | Returns | Description |
| --- | --- | --- |
| `validate(sheet, *, min_hamming_distance=3)` | `ValidationResult` | Run all checks; returns structured result |
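
The `min_hamming_distance` threshold concerns how similar two index sequences may be. The sketch below shows one way a pairwise Hamming check could look. The names `hamming` and `close_pairs` are hypothetical, and the choices made here (flagging pairs strictly below the threshold, rejecting unequal lengths, ignoring lanes and `index2`) are assumptions rather than the validator's documented behaviour.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def close_pairs(indexes: dict, min_distance: int = 3) -> list:
    """Return (id_a, id_b, distance) for every pair closer than min_distance."""
    hits = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(indexes.items(), 2):
        distance = hamming(seq_a, seq_b)
        if distance < min_distance:
            hits.append((id_a, id_b, distance))
    return hits


indexes = {"S1": "ACGTACGT", "S2": "ACGTACGA", "S3": "TTTTCCCC"}
print(close_pairs(indexes))  # [('S1', 'S2', 1)]
```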

ValidationResult

| Attribute / method | Type | Description |
| --- | --- | --- |
| `is_valid` | `bool` | `False` if any errors present |
| `errors` | `list[ValidationIssue]` | Structured error records |
| `warnings` | `list[ValidationIssue]` | Structured warning records |
| `summary()` | `str` | One-line human-readable summary |

ValidationIssue

| Attribute | Type | Description |
| --- | --- | --- |
| `code` | `str` | e.g. `"DUPLICATE_INDEX"` |
| `message` | `str` | Human-readable description |
| `context` | `dict` | Relevant sample IDs, lane, etc. |

SampleSheetDiff

| Method | Returns | Description |
| --- | --- | --- |
| `compare()` | `DiffResult` | Full comparison across header, reads, settings, and samples |

DiffResult

| Attribute / method | Type | Description |
| --- | --- | --- |
| `has_changes` | `bool` | `True` if any difference detected |
| `summary()` | `str` | Human-readable one-paragraph summary |
| `header_changes` | `list[HeaderChange]` | Header, reads, and settings diffs |
| `samples_added` | `list[dict]` | Records present in new sheet only |
| `samples_removed` | `list[dict]` | Records present in old sheet only |
| `sample_changes` | `list[SampleChange]` | Per-sample field-level diffs |
| `source_version` | `SampleSheetVersion` | Format of the old sheet |
| `target_version` | `SampleSheetVersion` | Format of the new sheet |

SampleSheetWriter

| Method / attribute | Returns | Description |
| --- | --- | --- |
| `SampleSheetWriter(version=)` | n/a | Instantiate for `SampleSheetVersion.V1` or `.V2` |
| `from_sheet(sheet, version=)` | `SampleSheetWriter` | Load a parsed sheet for editing; optionally change format |
| `set_header(*, run_name, platform, ...)` | `self` | Set header fields (fluent) |
| `set_reads(*, read1, read2, index1, index2)` | `self` | Set read cycle counts (fluent) |
| `set_adapter(adapter_read1, adapter_read2)` | `self` | Set adapter sequences (fluent) |
| `set_override_cycles(override)` | `self` | Set `OverrideCycles`; V2 only (fluent) |
| `set_software_version(version)` | `self` | Set `SoftwareVersion`; V2 only (fluent) |
| `set_setting(key, value)` | `self` | Set an arbitrary settings key/value (fluent) |
| `add_sample(sample_id, *, index, ...)` | `self` | Append a sample row (fluent) |
| `remove_sample(sample_id, *, lane=)` | `self` | Remove sample(s) by ID, optionally scoped to a lane (fluent) |
| `update_sample(sample_id, *, lane=, **fields)` | `self` | Update fields on an existing sample in-place (fluent) |
| `clear_samples()` | `self` | Remove all samples while preserving header/reads/settings (fluent) |
| `write(path, *, validate=True)` | `Path` | Serialise to disk; validates first by default |
| `to_string()` | `str` | Serialise to string without writing to disk |
| `.sample_count` | `int` | Number of samples currently in the writer |
| `.sample_ids` | `list[str]` | Sample IDs currently in the writer |

SampleSheetMerger

| Method / attribute | Returns | Description |
| --- | --- | --- |
| `SampleSheetMerger(target_version=, min_hamming_distance=3)` | n/a | Instantiate with target format and optional Hamming threshold |
| `add(path)` | `self` | Register an input sheet path (fluent) |
| `merge(output_path, *, validate=True, abort_on_conflicts=True)` | `MergeResult` | Run the merge and write output |

MergeResult

| Attribute / method | Type | Description |
| --- | --- | --- |
| `has_conflicts` | `bool` | `True` if any conflict recorded |
| `sample_count` | `int` | Samples in the merged output |
| `output_path` | `Path \| None` | Path written; `None` if write was aborted |
| `source_versions` | `dict[str, str]` | Per-input-file detected version |
| `conflicts` | `list[MergeConflict]` | Structured conflict records |
| `warnings` | `list[MergeConflict]` | Structured warning records |
| `summary()` | `str` | One-line human-readable summary |

SampleSheetSplitter

| Method / attribute | Returns | Description |
| --- | --- | --- |
| `SampleSheetSplitter(path, *, by="project", target_version=None, unassigned_label="unassigned")` | n/a | Instantiate with input path and grouping strategy |
| `split(output_dir, *, prefix="", suffix="_SampleSheet.csv", validate=True)` | `SplitResult` | Parse input and write one file per group |
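
The grouping step behind `by="project"` and `unassigned_label` can be pictured roughly as follows. This is a sketch under stated assumptions, not the splitter's actual code: `group_samples` is a hypothetical helper, and the column name `Sample_Project` is assumed from the usual Illumina sample sheet layout.

```python
from collections import defaultdict


def group_samples(samples, key="Sample_Project", unassigned_label="unassigned"):
    """Bucket sample records by a grouping column.

    Records with a missing or blank value land under unassigned_label.
    """
    groups = defaultdict(list)
    for sample in samples:
        label = (sample.get(key) or "").strip() or unassigned_label
        groups[label].append(sample)
    return dict(groups)


samples = [
    {"Sample_ID": "S1", "Sample_Project": "ProjA"},
    {"Sample_ID": "S2", "Sample_Project": "ProjB"},
    {"Sample_ID": "S3", "Sample_Project": ""},
    {"Sample_ID": "S4"},
]
grouped = group_samples(samples)
print({label: len(records) for label, records in grouped.items()})
# {'ProjA': 1, 'ProjB': 1, 'unassigned': 2}
```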

SplitResult

| Attribute / method | Type | Description |
| --- | --- | --- |
| `output_files` | `dict[str, Path]` | Group key → path of the written file |
| `sample_counts` | `dict[str, int]` | Group key → number of samples written |
| `warnings` | `list[str]` | Non-fatal issues (incomplete records, unassigned samples) |
| `source_version` | `str` | `"V1"` or `"V2"` |
| `summary()` | `str` | One-line human-readable summary |

SampleSheetFilter

| Method / attribute | Returns | Description |
| --- | --- | --- |
| `SampleSheetFilter(path, *, target_version=None)` | n/a | Instantiate with input path |
| `filter(output_path, *, project=None, lane=None, sample_id=None, validate=True)` | `FilterResult` | Write filtered copy to `output_path`; at least one criterion required |

`sample_id` supports glob patterns (e.g. `"CTRL_*"`) via `fnmatch.fnmatchcase`, so matching is always case-sensitive.
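
The case-sensitivity is a property of the standard-library function itself, which the short example below demonstrates. The sample IDs are made up for illustration.

```python
from fnmatch import fnmatchcase

sample_ids = ["CTRL_01", "CTRL_02", "ctrl_03", "TUMOR_01"]

# fnmatchcase never normalises case, regardless of operating system.
matched = [s for s in sample_ids if fnmatchcase(s, "CTRL_*")]
print(matched)  # ['CTRL_01', 'CTRL_02']
```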

FilterResult

| Attribute / method | Type | Description |
| --- | --- | --- |
| `matched_count` | `int` | Samples that passed all filter criteria |
| `total_count` | `int` | Total samples in the input sheet |
| `output_path` | `Path \| None` | Path written; `None` when no samples matched |
| `source_version` | `str` | `"V1"` or `"V2"` |
| `summary()` | `str` | One-line human-readable summary |

normalize_index_lengths

```python
normalize_index_lengths(
    samples: list[dict],
    strategy: str,                  # "trim" or "pad"
    index1_key: str | None = None,  # auto-detected if None
    index2_key: str | None = None,  # auto-detected if None
) -> list[dict]
```

Normalizes index sequence lengths across a list of sample dicts. See Index Utilities for details.
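
As a rough picture of what the two strategies could mean, the sketch below normalises a plain list of sequences. Everything in it is an assumption made for illustration: `normalize_lengths_sketch` is a hypothetical helper, "trim" is taken to mean truncating to the shortest sequence, and "pad" is taken to mean right-padding to the longest with `N`. The library's actual behaviour is described in Index Utilities.

```python
def normalize_lengths_sketch(seqs, strategy, pad_char="N"):
    """Bring all sequences to a common length.

    Assumed semantics (not confirmed by the library docs):
    - "trim": cut every sequence down to the shortest length
    - "pad":  right-pad every sequence to the longest length
    """
    if not seqs:
        return []
    if strategy == "trim":
        target = min(len(s) for s in seqs)
        return [s[:target] for s in seqs]
    if strategy == "pad":
        target = max(len(s) for s in seqs)
        return [s.ljust(target, pad_char) for s in seqs]
    raise ValueError(f"unknown strategy: {strategy!r}")


print(normalize_lengths_sketch(["ACGTACGT", "ACGTAC"], "trim"))
# ['ACGTAC', 'ACGTAC']
print(normalize_lengths_sketch(["ACGTACGT", "ACGTAC"], "pad"))
# ['ACGTACGT', 'ACGTACNN']
```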

Enums

```python
from samplesheet_parser.enums import SampleSheetVersion, InstrumentPlatform, UMILocation

SampleSheetVersion.V1   # IEM / bcl2fastq
SampleSheetVersion.V2   # BCLConvert
```