Index Utilities¶

normalize_index_lengths¶

Normalizes index sequence lengths across a list of sample dicts before merging sheets with mixed-length indexes.

from samplesheet_parser import normalize_index_lengths

samples = sheet.samples()

# Trim all indexes to the shortest length
normalized = normalize_index_lengths(samples, strategy="trim")

# Pad shorter indexes to the longest length using "N" wildcards
normalized = normalize_index_lengths(samples, strategy="pad")

Strategies¶

Strategy	Behaviour
`"trim"`	Trims all indexes to the shortest sequence present
`"pad"`	Pads shorter indexes to the longest length using `"N"` wildcard characters

BCLConvert compatibility

"N" padding is supported by BCLConvert ≥ 3.9 and bcl2fastq ≥ 2.20.

Dual-index normalization¶

Both I7 (Index / index) and I5 (Index2 / index2) are normalized independently:

normalized = normalize_index_lengths(samples, strategy="pad")

Field name auto-detection¶

The utility auto-detects V1-style (index / index2) and V2-style (Index / Index2) field names. Use explicit overrides if your samples use custom field names:

normalized = normalize_index_lengths(
    samples,
    strategy="trim",
    index1_key="custom_index",
    index2_key="custom_index2",
)

Typical workflow before merging¶

from samplesheet_parser import SampleSheetMerger, normalize_index_lengths
from samplesheet_parser.enums import SampleSheetVersion

# Normalize each sheet's samples before merging
merger = SampleSheetMerger(target_version=SampleSheetVersion.V2)
merger.add("ProjectA.csv").add("ProjectB.csv")
result = merger.merge("combined.csv")