[0.4.0] - 2025-08-24
Added
- `seqspec build` subcommand. Generates a draft spec from a natural-language description using the existing `init`/`insert`/`modify`/`check` tools. Marked experimental.
- New validation checks for `seqspec check`:
  - `check_region_against_subregion_length`
  - `check_region_against_subregion_sequence`
  - `check_read_length_against_library`
- `seqspec index`:
  - `--no-overlap` to de-duplicate regions covered by overlapping reads.
  - Target `-t kb-single` for single-end kallisto-bus indexing.
  - Primer ID may be any region (not only a leaf).
- `seqspec print -f png` now layers sequencing reads on top of the library diagram.
- `load_spec` supports gzipped YAML (`.yaml.gz`).
- `seqspec check` can skip specific checks; can ignore onlist when needed.
- “Loose loading” mode for `seqspec check` and `seqspec format` with captured validation errors.
- New region type: `sgrna_target`.
- Added private attribute to `Assay`, `_spec_path` (str). Stores the fully resolved path to the loaded spec. Used in `seqspec check`.
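As a rough illustration of the gzipped-YAML support noted in the list above, the sketch below reads a spec file as text whether or not it is gzip-compressed. `open_spec_text` is a hypothetical helper and is not seqspec's `load_spec`; YAML parsing is left out so that the example needs only the standard library.

```python
import gzip
from pathlib import Path


def open_spec_text(path):
    """Return the text of a spec file, decompressing it if the name ends in .gz.

    Hypothetical helper for illustration only; not part of seqspec.
    """
    path = Path(path)
    if path.suffix == ".gz":
        # gzip.open in text mode decompresses and decodes in one step.
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            return handle.read()
    return path.read_text(encoding="utf-8")
```

The returned text could then be handed to whichever YAML parser is in use.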
Fixed
- `Read.get_file_by_id` returned the wrong file in some cases.
- `seqspec file -f paired` lists all files.
- `seqspec index`:
  - Correct file-name indexing with `-s file`.
  - Correct behavior when targeting region indices.
  - Deterministic deduplication of FASTQ filenames.
  - Lists file URLs and accepts HTTPS onlists.
- `seqspec onlist`:
  - Local caching and remote download (handles gzip and HTTP 302).
  - Support for kallisto “multi” file format.
  - Fix for bug tracked in #68.
- Plotting fixes for single-modality figures and read overlays.
- Test suite rewritten and passing on modern Python; mocks added for remote I/O.
Changed
- `seqspec init` now initializes an empty seqspec file.
- seqspec creation simplified:
  - Create with `init`, then add `regions`/`reads` via `insert`, then `modify`.
  - `insert`/`modify` take lists of `*Input` objects. Consistent `-s` selector.
  - `-r` is deprecated in favor of `-i` across `index`, `onlist`, and `find`.
- Internal models now derive from Pydantic `BaseModel`. YAML tags (`!Assay`, `!Read`, `!Region`, `!Onlist`, `!File`) are no longer required and are stripped on load.
- `Region.regions` defaults to `[]` (not optional). Several `repr` and printing paths were cleaned up.
- Packaging and dev:
  - Moved to `pyproject.toml`; works well with `uv`. Versioning via `setuptools_scm`.
  - Formatter switched to `ruff`. Test runner switched to `pytest`.
- `seqspec index` internals made consistent; primer mapping works at any level of the library.
Removed
- `spec_fn` parameter from `seqspec check`.
- `seqspec convert` CLI (unimplemented).
- Legacy implicit `parent_id` in specs (was unused).
- `tox` (replaced by `pytest`).
Breaking changes
- `seqspec onlist` now uses `urltype` (not `location`) to decide local vs remote. Older specs must update onlist metadata.
- `init` no longer accepts library/read structures. You must `insert` regions and reads after `init`.
- Input flag change: prefer `-i` (ID) over `-r` across `index`/`onlist`/`find`. `-r` is deprecated and will be removed in a future release.
- Onlist/schema updates since 0.3.x (Files in Reads, `file_id` fields, URL checks) may require running `seqspec upgrade` or manual edits.
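For older specs edited by hand, the onlist metadata change listed above amounts to renaming one key. The sketch below assumes a spec that has already been parsed into nested Python dicts and lists, with each onlist stored as a mapping under an `onlist` key. `rename_onlist_location` is a hypothetical helper and not part of seqspec, and it only renames the key: the allowed values may also differ between versions and would need to be checked separately.

```python
def rename_onlist_location(node):
    """Recursively rename `location` to `urltype` inside onlist mappings.

    Hypothetical helper; assumes onlists are dicts stored under an "onlist" key.
    Modifies the structure in place and returns it.
    """
    if isinstance(node, dict):
        onlist = node.get("onlist")
        if isinstance(onlist, dict) and "location" in onlist:
            # Keep an existing urltype if one is already present.
            onlist.setdefault("urltype", onlist["location"])
            del onlist["location"]
        # Descend into nested regions, reads, and any other containers.
        for value in node.values():
            rename_onlist_location(value)
    elif isinstance(node, list):
        for item in node:
            rename_onlist_location(item)
    return node
```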
[0.3.1] - 2024-10-14
Added
- Added an option to report the full path of the URL of files if the URL type is local.
Fixed
- Fixed a bug in `seqspec file` where the `-f paired` flag did not list all of the files correctly.
[0.3.0] - 2024-10-10
Added
- Added typing hints to many `Assay` and `Region` functions
- Added `-k key` to `seqspec info` to display one of `meta` for metadata, `sequence_spec` for sequence spec, or `library_spec` for library spec
- Added `-f format` to `seqspec info` to enable multiple formats for displaying info
- Added support for zero-length regions in the specification
- Added ability to modify reads using `seqspec modify`
- `seqspec init` now accepts read information
- Added templates and updated documentation to myst, website
- Added new examples (e.g., dogmaseq-dig)
- Added File object to Read
- Added SeqKit, SeqProtocol, LibKit, and LibProtocol attributes
- Added option to specify multiple SeqKit, SeqProtocol, LibKit, LibProtocol for each modality
- Added name validation against known list of SeqKit, SeqProtocol, LibKit, LibProtocol
- Added new attributes to Onlist object (Onlist now contains a File object):
  - filetype
  - filesize
  - url
- Added “file_id” to the File object
- Added `seqspec file` command to return the list of files in the spec as paired across reads or interleaved
- Added `seqspec upgrade` hidden feature to upgrade v0.2.x seqspec files to v0.3.0.
Changed
- Updated `seqspec index` to read strands for chromap
- Updated `seqspec split` to correctly split sequence_spec
- Modified `seqspec` string to use min_len instead of max_len to accommodate nanopore reads
- Updated documentation on how to propose new vocabulary
- Modified the internals of `seqspec onlist` to manage saving the joined onlist to the `-o` location when specified (otherwise saves to the path where the spec lives).
- Renamed “location” to “urltype” in Onlist object
- Replaced `-r` with `-i` id type for all functions (deprecated `-r`)
- Multiple seqspec commands now require the addition of a `-s selector [read, region, file]`
- `seqspec index` can now take in reads, regions, or files and index them
- `seqspec modify` can now add files to the template
- Updated splitseq template
Fixed
- Fixed `seqspec onlist` functionality
- Fixed error enumeration in `seqspec check`
Removed
TODO:
- Remove `lib_struct`
- Remove `parent_id`
Breaking Changes
- File elements are now required in Read objects and Onlist objects in version 0.3.0
[0.2.0] - 2024-04-17
Changed
- `seqspec index` uses primer and max length of supplied `Read`
- `assay_spec` renamed `library_spec`
- Reorganize specification document
- Move contribution guidelines from `SPECIFICATION.md` to `CONTRIBUTION.md`
- Move example `Region`s from `SPECIFICATION.md` to `seqspec/docs/regions`
- `seqspec index` defaults to indexing reads, `--region` indexes regions
- Change descriptors of attributes `assay_id`, `doi`
- `Assay` attribute `assay` changed to `assay_id`
- `Read` attribute `read_name` changed to `name`
- `Assay` attribute `publication_date` changed to `date`
- `Assay` attribute `sequencer` changed to `sequence_protocol`
- `Assay` function `get_modality` changed to `get_libspec`
- `Region` function `update_attr` uses the `max_len` to generate `random` and `onlist` sequence lengths instead of `min_len`
- `get_region_by_type` changed to `get_region_by_region_type` to disambiguate between `region_type` and `sequence_type`
- `seqspec onlist` (by default) searches for onlists in the `Region`s intersected by the `Read` passed to `-r`.
- Support older versions of matplotlib by handling the `spines[["top", "bottom"...]]` structure
- Increase the number of Xs in the random region to match `max_len` for validation
- Update `seqspec print` command to use the replacement `assay_id` attribute instead of `assay`
- Implement downloading onlists via URLs and transparently decompress gzip files
- Change `read_list` function to take the `onlist` object for handling local and remote files
- Added `onlist` argument to specify combined barcode list file format (kallisto’s multi-file format and default cartesian product format)
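The default cartesian product format mentioned in the last item can be pictured as every combination of one barcode from each list. The sketch below shows that general idea only and is not seqspec's own joining code: the function name `product_onlist` and the `sep` parameter are hypothetical, and joining barcodes without a separator is an assumption.

```python
from itertools import product


def product_onlist(barcode_lists, sep=""):
    """Yield every combination of one barcode per list, joined by `sep`.

    Hypothetical illustration of a cartesian-product join; not seqspec's API.
    """
    for combo in product(*barcode_lists):
        yield sep.join(combo)
```

For example, `list(product_onlist([["AA", "CC"], ["G", "T"]]))` gives `["AAG", "AAT", "CCG", "CCT"]`. The output grows multiplicatively with each added list, which is why a generator is used rather than building the full list up front.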
Added
- Added `sequence_spec` in the `Assay` object
- Added `Read` object in the `sequence_spec` object
- Added `sequence_spec` to the seqspec json schema
- Added `Read` object to specification document
- Added `Read` generator to website GUI
- Added pattern matching to `date` in `Assay` (expected date format: DAY MONTH YEAR, where day is one or two numbers, month is the full named month starting with a capital letter, and year is the full year)
- Added `library_kit` to `Assay` object (kit that adds seq adapters)
- Added `library_protocol` to `Assay` object (library that generates insert)
- Added `sequence_kit` to `Assay` object
- Added website to view example `seqspec` objects
- Added `get_seqspec` to assay, which returns the sequence structure for a given modality
- Added multiple checks to `seqspec check`
  - check read modalities exist in assay modalities
  - check primer ids from seqspec are unique and exist as region ids in libspec
  - check that the primer id exists as an atomic region (currently a strong assumption that may be relaxed in the future)
  - check properties of multiple sequence types
    - `fixed` and `regions` not null incompatible
    - `joined` and `regions` null incompatible
    - `random` and `regions` not null incompatible
    - `random` must have `sequence` of all X’s
    - `onlist` and `onlist` property null incompatible
  - check that the min len is less than or equal to the max len
  - check that the length of the sequence is between min and max len
  - Note a strong assumption in `seqspec print` is that the sequence has a length equal to the `max_len` for visualization purposes
- Added `RegionCoordinate` object that maps `Region` min/max lengths to 0-indexed positions
- `seqspec onlist` searches for onlists in a `Region` based on `--region` flag
- Added type annotations for `join_onlists` to clarify it needs a list of `Onlist` objects
- Added minimal tests for `RegionCoordinate`, `project_regions_to_coordinates`, `run_onlist_region`, `run_onlist_read`, and seqspec print functions
- Added list of options to CLI for `-f FORMAT` within `seqspec onlist` and `seqspec print`
- Added `-s SEQTYPE` to `seqspec print` to disambiguate printing `sequence`, `library`, or `libseq` objects. TODO wrap `seqspec info` into `seqspec print -f info`.
- Added `-s SPECOBJECT` to `seqspec onlist`. Specify specific object `read`, `region`, or `region-type` for finding the `onlist`.
- Added fetching ability for seqspec onlist from remote with IGVF credentials (credit to detrout)
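The sequence-type checks listed under `seqspec check` above can be read as a handful of simple predicates. The sketch below illustrates them under stated assumptions and is not the code in `seqspec check`: regions are plain dicts keyed by the attribute names used in this changelog, an empty `regions` list is treated the same as null, and the function name and error strings are invented for the example.

```python
def check_region(region):
    """Return a list of error strings for one region dict (illustrative only)."""
    errors = []
    seq_type = region.get("sequence_type")
    subregions = region.get("regions")
    sequence = region.get("sequence") or ""
    min_len = region.get("min_len", 0)
    max_len = region.get("max_len", 0)

    # fixed and random regions are atomic, so they must not contain subregions.
    if seq_type in ("fixed", "random") and subregions:
        errors.append(f"{seq_type} region must not have subregions")
    # joined regions exist only to group subregions.
    if seq_type == "joined" and not subregions:
        errors.append("joined region must have subregions")
    # random regions are written as a run of X placeholders.
    if seq_type == "random" and set(sequence) - {"X"}:
        errors.append("random region sequence must be all X")
    # onlist regions need an onlist to draw sequences from.
    if seq_type == "onlist" and region.get("onlist") is None:
        errors.append("onlist region must define an onlist")
    # Length bounds: first the bounds themselves, then the sequence against them.
    if min_len > max_len:
        errors.append("min_len must be <= max_len")
    elif not min_len <= len(sequence) <= max_len:
        errors.append("sequence length must lie between min_len and max_len")
    return errors
```

Collecting every error rather than stopping at the first keeps the report complete for a region that breaks more than one rule.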
Removed
TODO:
- Remove `lib_struct`
- Remove `parent_id`
Fixed
- Sequencing overlapping pairs now supported
- `seqspec check` correctly handles sequence lengths longer than the stated min/max range
- Fixed test for `project_regions_to_coordinates`
- Get the test of `seqspec check` working again by updating the schema for the refactored example specification YAML files and mocking fastq and barcode files
- Only return the onlist filename if it’s a local file, downloading remote lists when needed