Intel Releases x86-simd-sort 6.0 For Speedy AVX2/AVX-512 Sorting, PyTorch Now Using It
The x86-simd-sort project from Intel has been an interesting open-source effort for much faster number sorting using AVX-512. It began with lightning-fast AVX-512 sorting, and AVX2 code paths were later added to broaden its appeal to CPUs without AVX-512. Projects like NumPy have been making use of this library. Today x86-simd-sort 6.0 was released, a few days after PyTorch began using the library too.
The x86-simd-sort 6.0 release adds support for qselect and partial sort for key-value data types, OpenMP-accelerated key-value sort, AVX2 support for the key-value sort / partial sort / objsort methods, Intel LLVM compiler support, and descending-order support for all the sort routines. Better performance is also expected when working on data with few unique values.
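For readers less familiar with the operations named above, here is a minimal plain-Python sketch of what selection, partial sort, and key-value sort compute, including descending order. It is not the x86-simd-sort API: the function names (kth_smallest, partial_sort_reference, keyvalue_sort_reference) are hypothetical stand-ins that use the Python standard library, and they only illustrate the results, not the SIMD implementation or its in-place behavior.

```python
import heapq


def kth_smallest(values, k):
    """Selection: return the element that would sit at index k (0-based)
    if the values were fully sorted. A real quickselect avoids the full sort."""
    return sorted(values)[k]


def partial_sort_reference(values, k):
    """Partial sort: return only the k smallest elements, in ascending order."""
    return heapq.nsmallest(k, values)


def keyvalue_sort_reference(keys, vals, descending=False):
    """Key-value sort: order the keys and carry each value along with its key."""
    pairs = sorted(zip(keys, vals), key=lambda pair: pair[0], reverse=descending)
    sorted_keys = [key for key, _ in pairs]
    sorted_vals = [val for _, val in pairs]
    return sorted_keys, sorted_vals


data = [5, 1, 4, 2, 3]
third_smallest = kth_smallest(data, 2)
two_smallest = partial_sort_reference(data, 2)

keys = [30, 10, 20]
vals = ["c", "a", "b"]
ascending = keyvalue_sort_reference(keys, vals)
descending = keyvalue_sort_reference(keys, vals, descending=True)
```

Selection and partial sort matter because they can skip most of the work of a full sort when only the top few elements are needed, which is the kind of operation this release extends to key-value data.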
Downloads and more details on all of the x86-simd-sort 6.0 changes are available via GitHub.
A merge request opened back in June proposed having PyTorch make use of x86-simd-sort for faster sorting on x86/x86_64, citing a 10x speed-up for large arrays. As of last week that code was merged into PyTorch for much faster sorting on systems with AVX2 or AVX-512. The torch.sort and torch.argsort functions stand to benefit, with gains of up to 10x.
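As a small illustration of the two functions mentioned, the sketch below calls torch.sort and torch.argsort on a tiny tensor. Whether a given PyTorch build actually dispatches to x86-simd-sort depends on the PyTorch version and the CPU, which this snippet does not check; the calls themselves look the same either way.

```python
import torch

x = torch.tensor([3.0, 1.0, 2.0, 5.0, 4.0])

# torch.sort returns the sorted values together with the indices
# of those values in the original tensor.
values, indices = torch.sort(x)

# torch.argsort returns only the indices; descending=True reverses the order.
order = torch.argsort(x, descending=True)

print(values.tolist())
print(indices.tolist())
print(order.tolist())
```

The cited speed-up was for large arrays, so a tensor this small would not show any difference.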