
Is there a library to distill bigger models into BitNet?


I could be wrong, but my understanding is that bitnet models have to be trained that way.


They don't have to be trained that way! The training data for 1-bit LLMs is the same as for any other LLM. A common way to generate this data is called 'model distillation', where you take completions from a teacher model and use them to train the student model (what you're describing)!
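For what it's worth, the classic form of distillation trains the student on the teacher's *softened* output distribution rather than just its completions. A minimal sketch of that loss in plain Python (function names are illustrative, not from any library):

```python
import math

def softmax(logits, t=1.0):
    # temperature-softened softmax over a list of logits
    exps = [math.exp(x / t) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, t=2.0):
    # KL(teacher || student) between temperature-softened distributions,
    # scaled by t^2 so gradient magnitudes stay comparable across temperatures
    # (Hinton-style "soft label" distillation)
    p = softmax(teacher_logits, t)
    q = softmax(student_logits, t)
    return t * t * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

If the student matches the teacher exactly, the loss is zero; any mismatch makes it positive.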


Maybe I wasn't clear; I think you've misunderstood me. I understand that all sorts of LLMs can be trained on a common corpus of data. But my understanding is that the choice to create a BitNet LLM must be made at training time, because it requires modifications to the training algorithm itself. In other words, an existing FP16 model cannot simply be quantized down to BitNet after the fact.
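Right — the BitNet papers describe quantization-aware training: the forward pass uses ternary weights while gradients flow to latent full-precision weights via a straight-through estimator, so the model learns to work *with* the quantization. A rough sketch of the BitNet b1.58-style absmean weight quantizer (function name and structure are my own illustration, not a real library API):

```python
def absmean_quantize(weights, eps=1e-8):
    # BitNet b1.58-style ternary quantization: scale each weight by the
    # mean absolute weight, then round to the nearest of {-1, 0, +1}.
    # During training this runs in the forward pass only; the backward
    # pass updates the latent full-precision weights (straight-through
    # estimator), which is why it can't just be bolted onto a finished
    # FP16 checkpoint without further training.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale
```

Applying this once to pretrained FP16 weights (post-training quantization) throws away too much precision; the training loop has to see the quantized forward pass so the latent weights can compensate.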


Ah yes, definitely misunderstood you, my bad



