Poster in Workshop: 2nd Generative AI for Biology Workshop

eccDNAMamba: A Pre-Trained Model for Ultra-Long eccDNA Sequence Analysis

Zhenke Liu · Jien Li · Ziqi Zhang

Keywords: [ Ultra-Long Sequence ] [ State-Space Model ] [ Pretrained Model ] [ Extrachromosomal circular DNA ]


Abstract:

Extrachromosomal circular DNA (eccDNA) plays key regulatory roles and contributes to oncogene overexpression in cancer through high-copy amplification and long-range interactions. Despite advances in modeling, no pre-trained model currently supports full-length circular eccDNA for downstream analysis. Existing genomic models are either limited to single-nucleotide resolution or hindered by the inefficiency of quadratic attention. Here, we introduce eccDNAMamba, the first bidirectional state-space encoder tailored for circular DNA sequences. It combines forward and reverse passes for full-context representation learning with linear-time complexity, and preserves circular structure through a novel augmentation strategy. Tested on two real-world datasets, eccDNAMamba achieves strong classification performance and scales to sequences up to 200 kbp, offering a robust and efficient framework for modeling circular genomes. Our code is available at https://github.com/zzq1zh/GenAI-Lab.
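
The abstract does not spell out the augmentation strategy used to preserve circular structure, but a common way to let a linear encoder see the origin-crossing junction of a circular sequence is to wrap a prefix of the sequence onto its end. The sketch below illustrates that generic idea only; the function name and the `overlap` parameter are hypothetical, not taken from the paper.

```python
def circular_augment(seq: str, overlap: int = 512) -> str:
    """Append the first `overlap` bases of a circular sequence to its end.

    In a circular molecule such as eccDNA, position 0 is adjacent to
    position len(seq) - 1. Wrapping a prefix lets a linear-scan model
    observe context that spans this junction.
    """
    overlap = min(overlap, len(seq))
    return seq + seq[:overlap]

# Toy example: a 10-base circular sequence with a 4-base wrap.
print(circular_augment("ACGTACGTAC", overlap=4))  # ACGTACGTACACGT
```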
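The bidirectional, linear-time design the abstract describes is typically realized by running one causal state-space model left-to-right, another over the reversed sequence, and merging the two streams so every position sees both left and right context. The sketch below shows that generic pattern under stated assumptions: `BiSSMBlock`, the identity stand-ins for real Mamba/SSM layers, and the choice of summation to merge directions are all illustrative, not eccDNAMamba's actual architecture.

```python
import torch
import torch.nn as nn

class BiSSMBlock(nn.Module):
    """Generic bidirectional wrapper around two causal sequence modules."""

    def __init__(self, ssm_fwd: nn.Module, ssm_bwd: nn.Module):
        super().__init__()
        self.ssm_fwd = ssm_fwd  # scans positions 0 .. L-1
        self.ssm_bwd = ssm_bwd  # scans positions L-1 .. 0 (via flipping)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, dim). Each directional pass is linear in length,
        # so the combined block stays O(L) rather than attention's O(L^2).
        fwd = self.ssm_fwd(x)
        bwd = self.ssm_bwd(x.flip(dims=[1])).flip(dims=[1])
        return fwd + bwd  # merged full-context representation per position

# Toy usage with identity stand-ins in place of real SSM layers:
block = BiSSMBlock(nn.Identity(), nn.Identity())
x = torch.randn(2, 16, 8)  # (batch=2, length=16, dim=8)
print(block(x).shape)      # torch.Size([2, 16, 8])
```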
