Research & Writing

Blog

Dispatches from the edge of chaos — on nonlinear dynamics, AI, emergence, and the mathematics of complex systems.

State Space Model

Hyena: Architecture & How It Works

Hyena is a sub-quadratic attention replacement that uses long convolutions and element-wise gating to achieve Transformer-quality performance with significantly reduced computational cost, particularly for long sequences.

2 min read
State Space Model

xLSTM: Architecture & How It Works

xLSTM (Extended Long Short-Term Memory) modernizes the classic LSTM architecture with exponential gating and novel memory structures, challenging Transformers and SSMs on language modeling while retaining the linear-time inference of recurrent networks.

2 min read
State Space Model

RWKV: Architecture & How It Works

RWKV (Receptance Weighted Key Value) is a novel architecture that combines the efficient parallelizable training of Transformers with the efficient O(1) inference of RNNs, achieving performance competitive with similarly sized Transformers.

2 min read
State Space Model

Mamba, S4 & State Space Models: Architecture & How They Work

State Space Models (SSMs) including S4 and Mamba offer an alternative to Transformers for sequence modeling, achieving linear-time complexity during training and constant-time per-step inference, while remaining competitive on language modeling and long-sequence tasks.

2 min read