NeoBERT

Adhilsha Ansad
Published at 08:00 AM

Recent advances in large language models have pushed the boundaries of in-context learning and reasoning, yet bidirectional encoders like BERT have seen comparatively little innovation. In this talk, I introduce NeoBERT, a next-generation encoder that redefines bidirectional modeling through modern architectural improvements, up-to-date data strategies, and optimized pre-training. I will highlight key enhancements, including an optimal depth-to-width ratio, an extended 4,096-token context length, and efficiency-driven modifications that enable NeoBERT to achieve state-of-the-art results with just 250M parameters.
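To make this concrete, below is a minimal sketch of how an encoder like NeoBERT might be loaded and run with the Hugging Face `transformers` library. The checkpoint identifier `chandar-lab/NeoBERT` and the need for `trust_remote_code` are assumptions not stated in this abstract; check the official release for the actual usage.

```python
# Minimal sketch: encoding text with NeoBERT via Hugging Face transformers.
# Assumptions (not confirmed by the abstract): the checkpoint is published
# as "chandar-lab/NeoBERT" and ships custom modeling code, hence
# trust_remote_code=True.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("chandar-lab/NeoBERT", trust_remote_code=True)
model = AutoModel.from_pretrained("chandar-lab/NeoBERT", trust_remote_code=True)

# Tokenize an input; the extended context window admits sequences
# of up to 4,096 tokens rather than BERT's original 512.
inputs = tokenizer(
    "NeoBERT is a next-generation bidirectional encoder.",
    return_tensors="pt",
)

# Forward pass: bidirectional contextual embeddings for each token,
# shaped (batch, seq_len, hidden_size).
outputs = model(**inputs)
embeddings = outputs.last_hidden_state
```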

Additional resources: