Meta launches lighter Llama models for low-power devices

Meta has released lightweight versions of its Llama 3.2 models, developed for low-power devices.

Meta aims to broaden the reach of its open-source language models Llama 3.2 1B and Llama 3.2 3B with these lightweight versions, designed specifically for low-power devices. The models can run on energy-efficient hardware and still deliver solid performance.

Quantized models

Meta’s AI team confirmed that the models are designed for “short context applications up to 8K”, due to the limited memory on mobile devices. Quantizing a language model reduces its size by lowering the precision of its weights.
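The idea can be illustrated with a minimal sketch of symmetric int8 quantization (a generic illustration, not Meta's actual scheme): each float weight is mapped to an integer in [-127, 127], so it needs one byte of storage instead of four.

```python
def quantize_int8(weights):
    # Symmetric per-tensor int8 quantization: scale so the largest
    # absolute weight maps to 127, then round to the integer grid.
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights; rounding error is at most
    # scale / 2 per weight.
    return [v * scale for v in q]

weights = [0.8, -1.27, 0.04, 0.5]
q, scale = quantize_int8(weights)   # q = [80, -127, 4, 50]
restored = dequantize(q, scale)
```

The stored integers take a quarter of the space of 32-bit floats, at the cost of a small, bounded rounding error per weight.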

The developers used a couple of different methods, including quantization-aware training with LoRA adaptors (QLoRA). QLoRA helps preserve performance in low-precision environments. For use cases that prioritize portability at the expense of some performance, SpinQuant can be used instead: it improves compression, making it easier to transfer the model to different devices.
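The core trick in quantization-aware training can be sketched as a "fake quantize" step (a minimal illustration, not Meta's implementation): during training, the forward pass rounds weights to the int8 grid and immediately dequantizes them, so the model learns to tolerate the rounding error it will face after real conversion.

```python
def fake_quantize(weights, scale):
    # Quantize then immediately dequantize: the forward pass sees the
    # rounding error, while optimization keeps full-precision weights.
    out = []
    for w in weights:
        q = max(-127, min(127, round(w / scale)))
        out.append(q * scale)
    return out

# Training against these perturbed weights makes the model robust to
# the precision loss of the eventual int8 deployment.
noisy = fake_quantize([0.503, -0.249, 1.0], scale=0.01)
```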

In collaboration with Qualcomm and MediaTek, Meta has optimized the models for Arm-based system-on-chip hardware. Thanks to optimization with Kleidi AI kernels, the models can run on mobile CPUs. This enables more privacy-friendly AI applications, as all operations are performed locally on the device.

The quantized Llama 3.2 1B and Llama 3.2 3B models can be downloaded starting today from Llama.com and Hugging Face. Earlier this week, Meta also launched models for video editing.


Winton Frazier

