Why Lightweight Models May Be the Future of AI

Google has introduced Gemma, a family of lightweight open models that analysts say could herald the arrival of a sleeker form of artificial intelligence (AI).

The company launched two Gemma versions, Gemma 2B and Gemma 7B. These large language models (LLMs) come in instruction-tuned variants and can run on laptops, desktops, or in Google Cloud. Google reports that the Gemma models are efficient for their size, beating larger models such as Meta's Llama 2 on benchmarks of reasoning, math, and coding.

“Smaller models are more portable and able to be deployed for a wider scope of use cases, such as remote operations or devices with limited local storage,” Sam Mugel, the CTO of Multiverse Computing, said in an interview. “Reducing the overall size of these models also reduces the energy required to operate them.”

Smaller Can Be Better

The appeal of lightweight models like Gemma extends beyond their low computational requirements and ability to run on edge devices, Nikolaos Vasiloglou, VP of research ML at RelationalAI, said in an interview, pointing to a recent paper.

“With well-curated training data, they can achieve competitive performance,” he said. “They offer both explainability and interpretability that large models cannot.”

Aside from being easier to manage, lightweight models like Gemma can be nimbler and more specific in their features and functions, Jason Turner, the CEO of Entanglement, said in an interview. Lightweight models are typically derived from larger LLMs and contain far fewer parameters.

“It can also provide an ability to develop models that are subject-matter experts on the topic or be topic-specific,” he added. “These lightweight models typically are based upon a low-rank adaptation technology (LoRA), and the advantages include the ability to fine-tune a model quickly. Larger models are very difficult to fine-tune and require significant resources.”
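The fine-tuning advantage Turner describes comes from the structure of LoRA itself: rather than updating every weight in a large matrix, it trains two small low-rank factors added on top of the frozen weights. The following is a minimal sketch of that idea in NumPy, with illustrative dimensions chosen for readability (real LLM layers are far larger, and the factor names `A` and `B` follow the LoRA paper's convention):

```python
import numpy as np

# Minimal LoRA sketch: the pretrained weight W is frozen, and only two
# thin factors A and B are trained. Dimensions here are illustrative.
rng = np.random.default_rng(0)

d, rank = 1024, 8

W = rng.standard_normal((d, d))            # frozen pretrained weight
A = rng.standard_normal((rank, d)) * 0.01  # trainable down-projection
B = np.zeros((d, rank))                    # trainable up-projection, zero-init

def forward(x):
    # Adapted layer: y = W x + B (A x); only A and B receive gradient updates.
    return W @ x + B @ (A @ x)

full_params = W.size            # parameters a full fine-tune would touch
lora_params = A.size + B.size   # parameters LoRA actually trains
print(f"trainable params: {lora_params:,} vs. full fine-tune: {full_params:,}")
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the pretrained one; here LoRA trains roughly 64x fewer parameters than a full fine-tune of the same layer, which is why adaptation is fast and cheap.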

Instead of creating new, smaller LLMs, some companies are building tools that compress existing models to reduce their size and operating requirements while maintaining the same quality of results, Mugel said. His company makes CompactifAI, software that uses tensor networks to reduce a model's parameter count, shrinking its memory and storage footprint.

“Retraining has the potential to be faster as well since users can add new data to an existing model that has already been compressed to generate an updated model that retains the benefits of compression while preserving the quality of results,” he added. 

Addressing Privacy Concerns

As the debate over AI and privacy intensifies, Google is also pitching Gemma as part of its commitment to ethical standards. The model, pretrained on data filtered to avoid exposing personal and sensitive information, has been refined through reinforcement learning from human feedback (RLHF). That process, aimed at ensuring the AI's decisions align with ethical guidelines, is complemented by human and adversarial testing to mitigate potential risks of harm.

Beyond the development of Gemma, Google has introduced the Responsible Generative AI Toolkit. The toolkit is intended as a step toward safer AI development, providing developers with tools for safety classification and debugging, along with best practices for building LLMs.

Mugel highlighted significant issues with AI that Gemma attempts to address. Privacy concerns top the list, since these systems can potentially disclose sensitive, copyrighted, or hazardous information. The environmental impact is another major drawback: the intricate designs of these systems demand substantial computational resources and storage, leading to a large ecological footprint.

Gemma will compete with other open models, such as Meta's Llama and Falcon, offerings from well-funded AI startups such as Mistral, and OpenAI's GPT-4. Gemma is Google's answer to supporting the open community of developers and researchers working on the responsible development and deployment of AI, Karthik Sj, vice president of product management and marketing at Aisera, said in an interview.

“Given Gemma shares technical and infrastructure components with Gemini, it is equipped to achieve best-in-class performance for its size compared to other open models,” he added. “Gemma surpasses significantly larger models on key benchmarks while adhering to rigorous standards for safe and responsible outputs.”