Intel's OpenVINO 2025.0 Brings Support For DeepSeek Models, Better AI Performance

Written by Michael Larabel in Intel on 6 February 2025 at 08:34 AM EST.
Intel's software engineers working on the OpenVINO AI toolkit today released OpenVINO 2025.0, which brings support for the much-talked-about DeepSeek models along with other large language models (LLMs), performance improvements for some of the existing model support, and other changes.

New model support in Intel's OpenVINO 2025.0 open-source AI toolkit includes Qwen 2.5, DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-1.5B, FLUX.1 Schnell, and FLUX.1 Dev.
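
For those wanting to try one of the newly-supported DeepSeek distills, the models are consumed through OpenVINO's GenAI LLMPipeline once converted to the OpenVINO IR format. Here is a minimal sketch, assuming the model has already been exported with optimum-cli; the local directory name and generation settings are illustrative:

    # One-time conversion of the Hugging Face checkpoint to OpenVINO IR, e.g.:
    #   optimum-cli export openvino --model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B \
    #       --weight-format int4 DeepSeek-R1-Distill-Qwen-1.5B-ov
    import openvino_genai as ov_genai

    # Load the converted model; "CPU" can be swapped for "GPU" or "NPU".
    pipe = ov_genai.LLMPipeline("DeepSeek-R1-Distill-Qwen-1.5B-ov", "CPU")

    # Generate a short completion.
    print(pipe.generate("Briefly explain what a distilled model is.",
                        max_new_tokens=128))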

OpenVINO 2025.0 also delivers better Whisper model performance on CPUs, integrated GPUs, and discrete GPUs via OpenVINO's GenAI API. Plus there is initial support for torch.compile on Intel NPUs, allowing models to run through the native PyTorch API.
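
The torch.compile path goes through OpenVINO's existing PyTorch backend, which is selected with backend="openvino" plus a device option. Below is a minimal sketch of what targeting the NPU could look like; the "NPU" device string follows the backend's usual device-selection pattern and should be treated as an assumption:

    import torch
    import torchvision.models as models
    import openvino.torch  # registers the "openvino" torch.compile backend

    model = models.resnet50(weights="DEFAULT").eval()

    # Route compilation through OpenVINO; the device option selects the
    # target, with NPU support being the new addition in 2025.0.
    compiled = torch.compile(model, backend="openvino",
                             options={"device": "NPU"})

    with torch.no_grad():
        out = compiled(torch.randn(1, 3, 224, 224))
    print(out.shape)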

OpenVINO 2025.0 also brings improved second-token latency for LLMs, KV cache compression now enabled for INT8 on CPUs, support for Core Ultra 200H "Arrow Lake H" processors, OpenVINO backend support for the Triton Inference Server, and native Windows Server support for the OpenVINO Model Server.
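
On the KV cache front, the compression applies to the per-token key/value tensors that LLM inference accumulates, so holding them in INT8 roughly halves that memory versus FP16. As a rough sketch, the cache precision can also be pinned explicitly through a runtime property when compiling a model; the KV_CACHE_PRECISION hint and the IR path shown here are assumptions, so consult the release notes for the exact defaults:

    import openvino as ov

    core = ov.Core()
    # An already-converted LLM in OpenVINO IR format (illustrative path).
    model = core.read_model("DeepSeek-R1-Distill-Qwen-1.5B-ov/openvino_model.xml")

    # Compile for CPU with the key/value cache held in 8-bit integers,
    # matching the INT8 KV cache compression now enabled on CPUs.
    compiled = core.compile_model(model, "CPU",
                                  {"KV_CACHE_PRECISION": ov.Type.u8})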

Downloads and more details on the just-released OpenVINO 2025.0 are available via GitHub. I'll have new OpenVINO benchmarks and OpenVINO GenAI benchmarks soon on Phoronix.