CamperBob2
on Feb 11, 2025
on: Building a personal, private AI computer on a budg...
What I'd like to know is how well those dual-Epyc machines run the 1.58 bit dynamic quant model. It really does seem to be almost as good as the full Q8.
DrNosferatu
on Feb 12, 2025
I tried that: ~1.5 to 3 tokens/sec.
CamperBob2
on Feb 14, 2025
Ouch, thanks. That's about what I get now on a single-CPU box with 128 GB + a 4090. Was hoping for a major speedup.
DrNosferatu
on Feb 14, 2025
Peak performance is achieved at ~21 cores. The bottleneck, without any special configs, is RAM-to-CPU bandwidth.
Let me know if you find some config that really leverages more cores!
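To make the bandwidth point concrete, here is a rough back-of-the-envelope sketch of why adding cores stops helping. It assumes a simple roofline-style model: each generated token requires streaming a fixed amount of weight data from RAM, so throughput is the smaller of a compute limit (which scales with cores) and a bandwidth limit (which does not). The function names and every number below are hypothetical placeholders, not measurements from the machines discussed in this thread.

```python
def bandwidth_bound_tokens_per_sec(mem_bandwidth_gb_s: float,
                                   gb_read_per_token: float) -> float:
    """Upper bound on tokens/sec if RAM bandwidth is the only limit."""
    return mem_bandwidth_gb_s / gb_read_per_token


def estimated_tokens_per_sec(cores: int,
                             per_core_tokens_per_sec: float,
                             mem_bandwidth_gb_s: float,
                             gb_read_per_token: float) -> float:
    """Throughput is capped by whichever limit is hit first."""
    compute_bound = cores * per_core_tokens_per_sec
    memory_bound = bandwidth_bound_tokens_per_sec(mem_bandwidth_gb_s,
                                                  gb_read_per_token)
    return min(compute_bound, memory_bound)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the shape of the curve.
    for cores in (8, 16, 32, 64):
        rate = estimated_tokens_per_sec(
            cores=cores,
            per_core_tokens_per_sec=0.1,  # assumed per-core compute rate
            mem_bandwidth_gb_s=100.0,     # assumed usable RAM bandwidth
            gb_read_per_token=40.0,       # assumed weights read per token
        )
        print(f"{cores:3d} cores -> {rate:.2f} tokens/sec")
```

With these made-up inputs the estimate rises with core count until the memory-bound ceiling is reached and then stays flat, which is the same shape as a peak at a fixed core count. Real behavior also depends on NUMA layout, memory channel population, and how the runtime pins threads, none of which this sketch models.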