
Thread: Improving The Linux Kernel's Memory Performance

  1. #31
    Join Date
    Feb 2011
    Location
    France
    Posts
    185

    Quote Originally Posted by Shining Arcanine View Post
    The compiler won't generate its own SSE3 assembly unless it is told to do it by the build system, so strictly speaking, he would need to recompile his kernel to get SSE3 instructions into areas where the kernel developers did not do this manually.
    Yes, of course, but recompiling your kernel with "sse3" enabled has little chance of producing an optimized SSE3 path. The best way is to code the SSE3 path by hand, which is the subject of this article. And once the kernel has such hand-written SSE3 paths, you don't have to compile it with "sse3" options to benefit from the optimization.
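
    To illustrate the idea of a hand-written SSE3 path living in a binary that is otherwise built without "sse3" options, here is a minimal userspace sketch. It is not kernel code: it assumes GCC or Clang (the `target` attribute, `__builtin_cpu_init` and `__builtin_cpu_supports` are compiler extensions), and the names `sum_generic`, `sum_sse3` and `pick_sum` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <pmmintrin.h>          /* SSE3 intrinsics such as _mm_hadd_ps */
#define ON_X86 1
#endif

/* Hypothetical helper type: any routine that sums n floats. */
typedef float (*sum_fn)(const float *v, size_t n);

/* Portable fallback, built with whatever flags the rest of the file uses. */
static float sum_generic(const float *v, size_t n)
{
    assert(v != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

#ifdef ON_X86
/* Only this function is compiled with SSE3 enabled; no global -msse3. */
__attribute__((target("sse3")))
static float sum_sse3(const float *v, size_t n)
{
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;

    for (; i + 4 <= n; i += 4)              /* four floats per iteration */
        acc = _mm_add_ps(acc, _mm_loadu_ps(v + i));

    acc = _mm_hadd_ps(acc, acc);            /* SSE3 horizontal add, twice, */
    acc = _mm_hadd_ps(acc, acc);            /* folds the four lanes into one */

    float total = _mm_cvtss_f32(acc);
    for (; i < n; i++)                      /* leftover elements */
        total += v[i];
    return total;
}
#endif

/* Pick the SSE3 path only if the running CPU reports SSE3. */
static sum_fn pick_sum(void)
{
#ifdef ON_X86
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse3"))
        return sum_sse3;
#endif
    return sum_generic;
}
```

    A caller would run `pick_sum()` once at startup and keep the returned pointer. The kernel has its own CPU feature checks and often uses inline assembly rather than intrinsics, so treat this only as a picture of the principle.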
    Last edited by whitecat; 08-18-2011 at 09:38 AM.

  2. #32
    Join Date
    Oct 2008
    Posts
    3,038

    Quote Originally Posted by Shining Arcanine View Post
    How is SSE4a not a SSE4 derivative if half of its instructions match SSE4 instructions in opcode, name and functionality?

    SSE was made after 3DNow, while SSE4a was made after Intel published its SSE4 extensions. The instructions provided by SSE and 3DNow do not intersect.

    I feel like these points on SSE4a not being a SSE4 derivative are derived from the following rather than any actual technical reason:

    http://arstechnica.com/science/news/...self-image.ars
    Only 4 of the 54 instructions are in common. That's about 7%.

    If you're going to claim it's a derivative, that # should be AT LEAST 50%, if not higher. IMHO

    And this has nothing to do with me liking Intel - I actually root for AMD which is why SSE4a is so disappointing.

  3. #33
    Join Date
    Apr 2008
    Location
    Saskatchewan, Canada
    Posts
    460

    Quote Originally Posted by whitecat View Post
    No, AMD64 implies SSE2.
    Thanks, that must be what I was thinking of.

  4. #34
    Join Date
    Oct 2010
    Posts
    306

    Yes, it will be able to detect it at runtime. The x86 CPUID instruction is used for this. You can also see my previous post in this thread on how the kernel determines the best implementation for raid6.
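
    For anyone who wants to see what runtime detection plus picking the best implementation looks like, here is a rough userspace sketch. It is loosely modelled on the idea behind the raid6 selection (check which candidates the CPU can run, time them, keep the fastest), not on the actual kernel code. It assumes GCC or Clang for `<cpuid.h>` and the `target` attribute, and `has_sse2`, `xor_impl`, `impls` and `choose_xor` are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>              /* __get_cpuid wrapper around CPUID */
#include <emmintrin.h>          /* SSE2 intrinsics */
#define ON_X86 1
#endif

/* Returns 1 if CPUID leaf 1 reports SSE2 (EDX bit 26), else 0. */
static int has_sse2(void)
{
#ifdef ON_X86
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return (edx >> 26) & 1;
#endif
    return 0;
}

static int always(void) { return 1; }

/* Plain C candidate: dst[i] ^= src[i], one byte at a time. */
static void xor_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

#ifdef ON_X86
/* SSE2 candidate: 16 bytes per iteration, then a byte-wise tail. */
__attribute__((target("sse2")))
static void xor_sse2(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }
    for (; i < n; i++)
        dst[i] ^= src[i];
}
#endif

/* Hypothetical candidate table: a name, a "can this CPU run it?" check,
 * and the routine itself. */
struct xor_impl {
    const char *name;
    int (*valid)(void);
    void (*run)(uint8_t *dst, const uint8_t *src, size_t n);
};

static const struct xor_impl impls[] = {
    { "bytes", always, xor_bytes },
#ifdef ON_X86
    { "sse2", has_sse2, xor_sse2 },
#endif
};

/* Time every usable candidate on scratch buffers and keep the fastest. */
static const struct xor_impl *choose_xor(void)
{
    static uint8_t a[4096], b[4096];
    const struct xor_impl *best = NULL;
    clock_t best_time = 0;

    for (size_t k = 0; k < sizeof impls / sizeof impls[0]; k++) {
        if (!impls[k].valid())
            continue;                       /* CPU cannot run this one */
        clock_t start = clock();
        for (int r = 0; r < 2000; r++)
            impls[k].run(a, b, sizeof a);
        clock_t took = clock() - start;
        if (best == NULL || took < best_time) {
            best = &impls[k];
            best_time = took;
        }
    }
    assert(best != NULL);                   /* "bytes" is always valid */
    return best;
}
```

    Which entry wins depends on the machine and on timer resolution, so the sketch makes no promise about the result. The point is only that the choice is made on the running CPU instead of at compile time.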
