Posts tagged 'On-Device'

In the previous session, we used ollama to run lightweight LLaMA3.2 models (1b, 3b, and so on) directly on a local machine. Groq, a US startup founded with the aim of challenging NVIDIA's dominance of the GPU market, uses its in-house LPU (Language Processing Unit) to serve large models that are hard to run locally, such as LLaMA-70B and Gemma-2-9B, through both a web interface and an API. Today we will try it out.