GenSLMs are a LLM, but genome sequence Take genome sequence Throw transformers at it create “semantic embedding” autoregression happens This is trained as a foundational model to organize the genomic sequence Turns out, the embedding space above can be used to discover relations. See proteins can be encoded as hierarchies