Oracle Working On Multi-Threaded VFIO Page Pinning For ~10x Faster QEMU Initialization

Written by Michael Larabel in Virtualization on 6 January 2022 at 05:30 AM EST.
For those assigning VFIO devices to guest virtual machines, the initialization/start-up process may soon be much faster thanks to a set of patches posted by Oracle.

Oracle engineers have been working on multi-threaded VFIO page pinning to speed up the initialization process, and the impact can be quite noticeable for large guest VMs. The patch series providing this multi-threaded VFIO page pinning is currently posted as a "request for comments" (RFC), and the patch cover letter explains the motivation and benefits:
Assigning a VFIO device to a guest requires pinning each and every page of the guest's memory, which gets expensive for large guests even if the memory has already been faulted in and cleared with something like qemu prealloc.

Some recent optimizations have brought the cost down, but it's still a significant bottleneck for guest initialization time. Parallelize with padata to take proper advantage of memory bandwidth, yielding up to 12x speedups for VFIO page pinning and 10x speedups for overall qemu guest initialization. Detailed performance results are in patch 8.

Phase one of multithreaded jobs made deferred struct page init use all the CPUs on x86. That's a special case because it happens during boot when the machine is waiting on page init to finish and there are generally no resource controls to violate.

Page pinning, on the other hand, can be done by a user task (the "main thread" in a job), so helper threads should honor the main thread's resource controls that are relevant for pinning (CPU, memory) and give priority to other tasks on the system. This RFC has some but not all of the pieces to do that.
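To illustrate the general idea described in the cover letter, here is a minimal user-space sketch in Python of splitting a large page range into contiguous chunks and handing each chunk to a helper thread. This is not the kernel's padata code or the patch series' implementation: the function names (split_range, pin_chunk, pin_range_parallel), the max_threads and min_chunk parameters, and the stand-in "pinning" work (which only counts pages) are all hypothetical and chosen purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, end, max_threads, min_chunk):
    """Split the page range [start, end) into contiguous chunks.

    Hypothetical helper: uses at most max_threads chunks and avoids
    creating more chunks than there are min_chunk-sized pieces of work.
    """
    total = end - start
    if total <= 0:
        return []
    # Never use more workers than there is work for; always use at least one.
    nworkers = max(1, min(max_threads, total // min_chunk))
    base, extra = divmod(total, nworkers)
    chunks = []
    pos = start
    for i in range(nworkers):
        # Spread the remainder over the first `extra` chunks.
        size = base + (1 if i < extra else 0)
        chunks.append((pos, pos + size))
        pos += size
    return chunks


def pin_chunk(chunk):
    """Stand-in for pinning one chunk: only returns its page count."""
    lo, hi = chunk
    return hi - lo


def pin_range_parallel(start, end, max_threads=4, min_chunk=1024):
    """Process [start, end) with helper threads; return pages handled."""
    chunks = split_range(start, end, max_threads, min_chunk)
    if not chunks:
        return 0
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return sum(pool.map(pin_chunk, chunks))
```

A small range falls back to a single chunk, so the threading overhead is only paid when there is enough work to divide. The resource-control concerns raised in the cover letter (helper threads honoring the main thread's CPU and memory limits) are not modeled here.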

A 12x speed-up for VFIO page pinning thanks to multi-threading is quite a difference, especially as it translates into a ~10x speed-up for overall QEMU guest initialization. The patch message with the performance results has more of the AMD and Intel server test details.


Large servers with lots of RAM obviously stand to benefit the most from this VFIO multi-threaded page pinning.


Oracle has been carrying some of these patches in their downstream kernel builds for Oracle Enterprise Linux for about three years. See this patch series for the initial 16 RFC patches.
