Blog

Dispatches from the edge of chaos — on nonlinear dynamics, AI, emergence, and the mathematics of complex systems.

RNN

Stacked LSTM & GRU: Architecture & How They Work

Stacked LSTMs and GRUs are deep recurrent neural network architectures that process sequential data by maintaining hidden states across time steps, with gating mechanisms that control the flow of information through the network.

2 min read
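The gating idea in the teaser above can be sketched in a few lines. This is a minimal scalar LSTM step with a stacking loop on top; the weights and the `lstm_step`/`stacked_step` names are illustrative assumptions for exposition, not a trained model or any particular library's API (a GRU differs by merging the gates into reset/update gates and dropping the separate cell state):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    # One scalar LSTM step: gates decide what to forget, write, and expose.
    # w maps each gate to (input weight, hidden weight, bias) -- illustrative only.
    f = sigmoid(w["f"][0] * x + w["f"][1] * h + w["f"][2])    # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h + w["i"][2])    # input gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h + w["o"][2])    # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h + w["g"][2])  # candidate cell
    c_new = f * c + i * g          # cell state carries long-range memory
    h_new = o * math.tanh(c_new)   # hidden state is the per-step output
    return h_new, c_new

def stacked_step(x, states, weights):
    # "Stacked" means layer k's hidden state is layer k+1's input
    # at the same time step.
    new_states = []
    inp = x
    for (h, c), w in zip(states, weights):
        h, c = lstm_step(inp, h, c, w)
        new_states.append((h, c))
        inp = h  # this layer's output feeds the next layer
    return inp, new_states
```

Because every gate is squashed by a sigmoid, each one acts as a soft switch in [0, 1] on its part of the cell state.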
State Space Model

xLSTM: Architecture & How It Works

xLSTM (Extended Long Short-Term Memory) modernizes the classic LSTM architecture with exponential gating and novel memory structures, challenging Transformers and SSMs on language modeling while retaining the linear scaling in sequence length of recurrent models.

2 min read
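Exponential gating is the key departure from the classic sigmoid-gated cell above. A simplified scalar sketch of an sLSTM-style step, assuming a normalizer state and omitting the stabilizer used in practice (names and weights are illustrative, not the paper's code):

```python
import math

def slstm_step(x, c, n, w):
    # Exponential gates can exceed 1, so the cell needs a normalizer state n
    # to keep the output bounded. w maps each gate to (weight, bias).
    i = math.exp(w["i"][0] * x + w["i"][1])       # exponential input gate
    f = math.exp(w["f"][0] * x + w["f"][1])       # exponential forget gate
    z = math.tanh(w["z"][0] * x + w["z"][1])      # candidate value
    o = 1.0 / (1.0 + math.exp(-(w["o"][0] * x + w["o"][1])))  # output gate
    c_new = f * c + i * z   # cell state update
    n_new = f * n + i       # normalizer accumulates the same gate mass
    h = o * (c_new / n_new) # normalized readout stays in [-1, 1]
    return h, c_new, n_new
```

Since the gates are unbounded above, real implementations also carry a log-domain stabilizer state to avoid overflow; the normalizer alone shows why exponential gates let the cell sharply revise stored content without saturating the way sigmoid gates do.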