Hacker News

> the Mac OS build of the GGML plugin uses the Metal API to run the inference workload on M1/M2/M3’s built-in neural processing engines

I don't think that's accurate (someone please correct me...)

GGML's use of the Metal API means it runs on the M1/M2/M3 GPU, not the Neural Engine.

Which is all good, but for the sake of being pedantic...
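
For what it's worth, this is easy to see in practice with llama.cpp itself. A rough sketch of building with the Metal backend and offloading layers to the GPU (flag and binary names are from memory and may vary across llama.cpp versions; model.gguf is a placeholder path):

```shell
# Build llama.cpp with the Metal backend enabled (runs on the GPU, not the ANE).
cmake -B build -DGGML_METAL=ON
cmake --build build --config Release

# -ngl offloads that many model layers to the GPU via Metal; while it runs,
# Activity Monitor shows GPU load, with no Neural Engine involvement.
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```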



Not pedantic at all! https://github.com/ggerganov/llama.cpp/discussions/336 is a (somewhat rambling) discussion of whether it would even be worthwhile to use the Neural Engine specifically, beyond the GPU.



