Thread: Intel SNA Acceleration Performance On Ironlake

  1. #1
    Join Date
    Jan 2007
    Posts
    15,627

    Phoronix: Intel SNA Acceleration Performance On Ironlake

    Wondering how Intel's SNA acceleration architecture is performing for Ironlake hardware? Here are some benchmarks...

    http://www.phoronix.com/vr.php?view=MTAyOTE

  2. #2
    Join Date
    May 2010
    Posts
    684

    Cool results, looks like overall performance is already much better. Can't wait until this gets enabled by default.

  3. #3
    Join Date
    Jan 2007
    Posts
    459

    Quote Originally Posted by phoronix
    Phoronix: Intel SNA Acceleration Performance On Ironlake

    Wondering how Intel's SNA acceleration architecture is performing for Ironlake hardware? Here are some benchmarks...

    http://www.phoronix.com/vr.php?view=MTAyOTE
    "A VMA cache appears unavoidable thanks to compiz and an excruciatingly slow GTT pagefault, though it does look like it will be ineffectual during everyday usage. Compiz (and presumably other compositing managers) appears to be undoing all the pagefault minimisation as demonstrated on gen5 with large XPutImage. It also appears the CPU to memory bandwidth ratio plays a crucial role in determining whether going straight to GTT or through the CPU cache is a win - so no trivial heuristic."

    I'm wondering what Chris means and implies here. Is he saying that the compositing managers are eating all the CPU cycle gains because they are simply not benchmarked and refactored often enough to minimise their overall impact?
    Last edited by popper; 12-15-2011 at 04:52 PM.

  4. #4

    Quote Originally Posted by popper
    "A VMA cache appears unavoidable thanks to compiz and an excruciatingly slow GTT pagefault, though it does look like it will be ineffectual during everyday usage. Compiz (and presumably other compositing managers) appears to be undoing all the pagefault minimisation as demonstrated on gen5 with large XPutImage. It also appears the CPU to memory bandwidth ratio plays a crucial role in determining whether going straight to GTT or through the CPU cache is a win - so no trivial heuristic."

    I'm wondering what Chris means and implies here. Is he saying that the compositing managers are eating all the CPU cycle gains because they are simply not benchmarked and refactored often enough to minimise their overall impact?
    No, it is a limitation in how the rendering is split between X and the DRI compositor. In order for all rendering performed by X to be seen by the compositor, the ddx must flush its queues before broadcasting the damage to clients. The ddx only knows when X is about to reply to a client, not whether that reply is a damage report, so we need to assume the worst and flush the rendering before every reply to any client. This means that when a compositor is in use, or more generally when we have exported GEM buffers to other DRI applications (e.g. games), the ddx can only batch small amounts of rendering, so throughput suffers and CPU overhead increases.

    In this particular instance, PutImage is buffered onto a system copy of the pixmap and normally flushed in time for vblank. With a DRI compositor, however, we end up flushing the pixmap after each call to PutImage, causing many more small uploads rather than one big one. Prior to the commit, the GPU buffer would be mmapped on each upload. The commit introduces a caching scheme so that those mappings (which themselves are a precious resource and have costs associated with keeping them open) are preserved between uploads.
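
As a rough illustration of the two effects described above (flushing before every reply, and reusing mappings between uploads), here is a toy Python model. It is not the driver's code: Pixmap, put_image, map_gpu_buffer and run are hypothetical names, and the counters merely stand in for real uploads and mmap calls.

```python
# Toy model only. Pixmap, put_image, map_gpu_buffer and run are hypothetical
# names for illustration; they do not correspond to the real ddx code.


class Pixmap:
    def __init__(self, cache_mappings):
        self.cache_mappings = cache_mappings
        self.pending = []      # PutImage data buffered in the system copy
        self.mapping = None    # stands in for an mmap of the GPU buffer
        self.mmap_calls = 0    # how many times we had to create a mapping
        self.uploads = 0       # how many separate uploads were issued

    def map_gpu_buffer(self):
        # With caching on, reuse the existing mapping if there is one.
        if self.cache_mappings and self.mapping is not None:
            return self.mapping
        self.mmap_calls += 1
        self.mapping = object()
        return self.mapping

    def put_image(self, data):
        # Writes only touch the system copy until the next flush.
        self.pending.append(data)

    def flush(self):
        if not self.pending:
            return
        self.map_gpu_buffer()
        self.uploads += 1
        self.pending.clear()
        if not self.cache_mappings:
            # Without the cache, the mapping is dropped after each upload.
            self.mapping = None


def run(num_put_images, compositor_active, cache_mappings):
    """Return (uploads, mmap_calls) for a burst of PutImage calls."""
    pixmap = Pixmap(cache_mappings)
    for i in range(num_put_images):
        pixmap.put_image(i)
        if compositor_active:
            # Any reply might carry damage, so flush every time.
            pixmap.flush()
    pixmap.flush()  # the normal flush "in time for vblank"
    return pixmap.uploads, pixmap.mmap_calls


if __name__ == "__main__":
    print(run(100, compositor_active=False, cache_mappings=False))
    print(run(100, compositor_active=True, cache_mappings=False))
    print(run(100, compositor_active=True, cache_mappings=True))
```

Under these assumptions, 100 PutImage calls give one upload and one mapping without a compositor, 100 of each with a compositor and no cache, and 100 uploads but a single mapping once the cache is enabled. The cache does not reduce the number of uploads; it only removes the repeated mapping cost.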

    This is also one of the major changes inherent in the design of Wayland: the clients push the damage to the compositor without any unnecessary round-trips, and updates are always atomic, fast and sent only when required.

  5. #5

    You can also compare your system's 2D performance to these Intel Ironlake numbers by simply running phoronix-test-suite benchmark 1112104-AR-INTELIRON40 from your system.
    Now that is a handy little feature, thanks Michael!

  6. #6

    Quote Originally Posted by ickle
    Now that is a handy little feature, thanks Michael!
    For any result uploaded to OpenBenchmarking.org, you can pass its result ID to the Phoronix Test Suite and it will automatically fetch the results, etc.
