Discussion about this post

Madhav Malhotra

What I really struggled to understand in the original Mamba paper was the math (hard to figure out the dimensions of the variables involved). It's interesting to see the history of SSMs, though I'm still on the lookout for a simple explanation of the layer-by-layer math involved. It'd be great to see an article on that!
