Red Hat Developing Ramalama To "Make AI Boring" By Offering Great AI Simplicity & Ease Of Use

Written by Michael Larabel in Red Hat on 15 August 2024 at 12:55 PM EDT.
Red Hat engineers have been developing Ramalama, a new open-source project that hopes to "make AI boring": an inferencing tool striving for simplicity so users can quickly and easily deploy AI workloads without much fuss.

Ramalama leverages OCI containers and makes it easy to run AI inferencing workloads across GPU vendors, seamlessly falling back to CPU-based inferencing if no GPU support is present. It interfaces with Podman and Llama.cpp to do the heavy lifting while fetching models from the likes of Hugging Face and the Ollama Registry. The goal is to have native GPU support working across Intel, NVIDIA, Arm, and Apple hardware, while CPU support includes AMD, Intel, RISC-V, and Arm.
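For a sense of the intended workflow, here is a minimal sketch of the command-line usage. It assumes the early subcommand names and model-source prefixes from the project's documentation, which may change as the project evolves; the "tinyllama" model name is just a placeholder:

    # List locally-available models
    ramalama list

    # Pull a model, e.g. from the Ollama registry
    ramalama pull ollama://tinyllama

    # Run an interactive chat with the model; a GPU is used when
    # supported, otherwise inferencing falls back to the CPU
    ramalama run tinyllama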

[Ramalama slide]

Ramalama was recently presented at Fedora's Flock conference as the "boring AI companion" and further described as:
In a field dominated by cutting-edge innovations and intricate solutions, the project ramalama stands out with a refreshingly simple mission: "to make AI boring." This talk delves into the philosophy and functionality of ramalama, a tool designed to simplify AI and machine learning for a broader audience. By embracing "boring" as a virtue, ramalama focuses on creating reliable, user-friendly, and accessible tools that just work without the fanfare.

We'll explore the core features of ramalama, from its straightforward installation process to its intuitive commands for managing and deploying AI models. Whether you're listing, pulling, running, or serving models, ramalama ensures the experience is hassle-free and enjoyable, catering to everyone from AI enthusiasts to casual tech users.

As an early-stage project, ramalama is constantly evolving, with a strong emphasis on community involvement and feedback. Join us as we uncover how ramalama is making advanced AI accessible to all, stripping away the complexity and hype to deliver a tool that's both powerful and practical. Let's embrace the journey towards making AI "boring" in the best way possible, and discover the joy in simplicity and reliability.
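On the serving side mentioned in that abstract, ramalama serve fronts a Llama.cpp-based HTTP server. As a hedged sketch, assuming the llama.cpp server's OpenAI-compatible chat endpoint and its default port of 8080 (neither detail is confirmed by the talk itself):

    # Serve a model over HTTP
    ramalama serve tinyllama

    # Query it from another terminal (endpoint and port assumed)
    curl http://localhost:8080/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{"messages": [{"role": "user", "content": "Hello"}]}'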

The early-stage code for Ramalama is hosted on GitHub. It's certainly a worthwhile effort to make it easier to run and deploy different AI models across different hardware and software platforms. Mozilla's Llamafile is another worthy effort for making it easy to run GPU- or CPU-accelerated AI models from a single file, though it doesn't take the container route.

Those wishing to learn more about Ramalama can find the Flock 2024 presentation embedded below, presented by Red Hat's Eric Curtin and Dan Walsh.
