Ah. In that case... Yeah. Is it using GPU, and does the whole model fit in your ...

ekianjo · 2026-02-10T09:12:25 1770714745

This is a CPU implementation only.

yjftsjthsd-h · 2026-02-10T16:02:46 1770739366

Oh, that's interesting. The README talks about GPU acceleration on Apple Silicon, and I didn't see anything explicit for other platforms, so I assumed it needed a GPU everywhere. But it uses BLAS acceleration, and a web search seems to agree that BLAS is just a CPU-optimized math library. That's great; it should really increase the number of places where it's useful :)

ekianjo · 2026-02-11T07:05:17 1770793517

It should be possible to develop a cuBLAS backend so that the same BLAS operations are accelerated on NVIDIA GPUs.
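
To make that a bit more concrete, here is a rough sketch of what "swapping the BLAS backend" means structurally. It is not taken from the project: the names `gemm_reference`, `BACKENDS`, and `gemm` are made up for illustration, and it is written in plain Python only so it runs anywhere. The one thing it relies on is the standard GEMM contract, `C = alpha * A * B + beta * C`, which both CPU BLAS libraries and cuBLAS implement.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]
GemmFn = Callable[[float, Matrix, Matrix, float, Matrix], Matrix]


def gemm_reference(alpha: float, a: Matrix, b: Matrix,
                   beta: float, c: Matrix) -> Matrix:
    """Plain-CPU GEMM: returns alpha * (a @ b) + beta * c."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[i][j]
         for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical backend table. A real project would register a CPU BLAS
# wrapper here, and a cuBLAS-based function under another key (e.g. "cuda").
BACKENDS: Dict[str, GemmFn] = {"cpu": gemm_reference}


def gemm(alpha: float, a: Matrix, b: Matrix, beta: float, c: Matrix,
         backend: str = "cpu") -> Matrix:
    """Dispatch one GEMM call to whichever backend was selected."""
    return BACKENDS[backend](alpha, a, b, beta, c)
```

The calling code only ever sees `gemm(...)`, so adding an NVIDIA path would mostly mean writing one more function with the same signature, plus copying the matrices to and from GPU memory, which is usually where the real work is.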