Intel Contributes AVX-512 Optimizations To Numpy, Yields Massive Speedups

  • Intel Contributes AVX-512 Optimizations To Numpy, Yields Massive Speedups

    Phoronix: Intel Contributes AVX-512 Optimizations To Numpy, Yields Massive Speedups

    Intel has contributed AVX-512 optimizations to upstream Numpy. For those using Numpy, the leading Python library for numerical computing, newer Intel CPUs with AVX-512 capabilities can enjoy major speed-ups in the range of 14~32x faster...


  • #2
    Why would it provide such a massive speedup over using AVX2? Or did Numpy not use any SIMD instructions at all?



    • #3
      Originally posted by ctlansdown
      Why would it provide such a massive speedup over using AVX2? Or did Numpy not use any SIMD instructions at all?
      It's about using SVML, which provides very fast and vectorized (including avx-512 it appears) versions of math functions like sin(), cos(), log(), gamma() etc. etc.

      Unfortunately SVML isn't open source, so you're not gonna see this performance with the out of the box numpy on your Linux distro.
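For anyone wondering which calls are being talked about here: a minimal sketch, assuming a stock NumPy install, of the kind of element-wise math function in question (`np.log` in this case). The helper names `scalar_log` and `vectorized_log` are made up for illustration. The comparison against a pure-Python loop only shows array-at-a-time versus element-at-a-time evaluation. It does not measure SVML versus non-SVML, and the timings depend entirely on the machine and the NumPy build.

```python
import math
import timeit

import numpy as np

# Illustrative input: 100,000 positive values so log() is defined everywhere.
x = np.linspace(0.1, 10.0, 100_000)


def scalar_log(values):
    """Evaluate log() one element at a time with the math module."""
    return [math.log(v) for v in values]


def vectorized_log(values):
    """Evaluate log() over the whole array with a single ufunc call."""
    return np.log(values)


# Both paths should agree to floating-point tolerance.
assert np.allclose(scalar_log(x), vectorized_log(x))

# Time each path a few times; absolute numbers vary by machine and build.
t_scalar = timeit.timeit(lambda: scalar_log(x), number=5)
t_vector = timeit.timeit(lambda: vectorized_log(x), number=5)
print(f"scalar loop: {t_scalar:.4f} s, np.log: {t_vector:.4f} s")
```

Whatever SIMD kernel sits underneath `np.log`, `np.sin`, `np.cos` and friends is what decides how fast the second path runs, which is why a change to those kernels shows up without any change to user code.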



      • #4
        Originally posted by jabl

        Unfortunately SVML isn't open source, so you're not gonna see this performance with the out of the box numpy on your Linux distro.
        > This open-source AVX-512 code originates from the Intel Short Vector Math Library (SVML) that they open-sourced the code from
        Read again? This is fully in upstream numpy.



        • #5
          Originally posted by ctlansdown
          Why would it provide such a massive speedup over using AVX2? Or did Numpy not use any SIMD instructions at all?
          AVX-512 is not just twice as wide as AVX2. It also introduces some really cool bitmask operations, and it lets you fold control-flow decisions into the AVX-512 operations themselves (to some degree). This means you have to leave the FPU less often, which can provide significant speedups.
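To make the masking idea a bit more concrete, here is a rough NumPy-level analogy, not actual AVX-512 code. It uses the `where=` and `out=` arguments that NumPy ufuncs accept, so the "if" is expressed as a per-element mask rather than a branch in a loop. The variable names and sample values are made up, and whether NumPy maps this onto hardware mask registers is an assumption I am not making here.

```python
import numpy as np

# Eight sample values; some are not valid inputs for log().
x = np.array([-2.0, 0.5, 1.0, 4.0, -0.1, 9.0, 16.0, 0.0])

# One boolean per element, playing the role of a per-lane mask.
mask = x > 0

# Branchy version: the decision is taken element by element.
branchy = np.zeros_like(x)
for i, value in enumerate(x):
    if value > 0:
        branchy[i] = np.log(value)

# Masked version: one call, the mask selects which elements are written.
# Elements where the mask is False keep the value already stored in `out`.
masked = np.zeros_like(x)
np.log(x, out=masked, where=mask)

assert np.allclose(branchy, masked)
print(masked)
```

The point is only the shape of the computation: the condition travels with the operation as data instead of breaking the loop into branches.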



          • #6
            Originally posted by jabl

            It's about using SVML, which provides very fast and vectorized (including avx-512 it appears) versions of math functions like sin(), cos(), log(), gamma() etc. etc.

            Unfortunately SVML isn't open source, so you're not gonna see this performance with the out of the box numpy on your Linux distro.
            It looks like the code this is using is BSD licensed, available here: https://github.com/numpy/svml

            Anyway, I believe ctlansdown is right and numpy must not have been using vectorized instructions at all on certain operations in order to get this kind of speedup.
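If you want to see what your own NumPy detected on your CPU, something like the sketch below may work. Big caveat: `__cpu_features__` is an internal, undocumented attribute, and its module path differs between NumPy versions, so treat both module paths and the attribute itself as assumptions that may not hold on your install. The helper name `numpy_cpu_features` is made up. The code falls back to `None` when nothing is exposed.

```python
import importlib


def numpy_cpu_features():
    """Return NumPy's detected CPU feature map as a dict, or None if unavailable.

    Relies on an internal attribute that may be missing or moved.
    """
    candidates = (
        "numpy._core._multiarray_umath",  # assumed location in newer releases
        "numpy.core._multiarray_umath",   # assumed location in older releases
    )
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        features = getattr(module, "__cpu_features__", None)
        if features is not None:
            return dict(features)
    return None


features = numpy_cpu_features()
if features is None:
    print("This NumPy build does not expose a CPU feature map.")
else:
    # Keep only the AVX-512 related entries that were detected as present.
    avx512 = sorted(
        name for name, present in features.items()
        if present and name.startswith("AVX512")
    )
    print("AVX-512 features detected:", avx512 or "none")
```

This only tells you what was detected at runtime. It does not tell you which functions actually have an AVX-512 code path in your build.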



            • #7
              Unfortunately, those who buy hardware explicitly for computation tasks choose Intel.
              It would be nice if AMD contributed to compute libraries such as numpy, at least.
              Last edited by RedEyed; 12 October 2021, 03:38 PM.



              • #8
                If Zen 4 has AVX-512 as rumored, AMD should send Intel a "thank you" cake. Considering Alder Lake removed AVX-512, this is a double-win for AMD.



                • #9
                  Originally posted by RedEyed
                  Unfortunately, those who buy hardware explicitly for computation tasks choose Intel.
                  Don't you mean Nvidia?



                  • #10
                    Originally posted by numacross

                    Don't you mean Nvidia?
                    I was talking about CPUs.
