Thread: Improving OpenCL On CPUs, Building Linux

  1. #11
    Join Date
    Nov 2008
    Location
    Germany
    Posts
    5,411

    Quote Originally Posted by popper View Post
    i said PLEASE
    LOL, YOU started the ridiculous Bridgman stuff in this thread; you get what you deserve!

    Linus Torvalds: "I like offending people, because I think people, who get offended, should be offended."
    Last edited by Qaridarium; 06-17-2012 at 07:30 PM.

  2. #12
    Join Date
    Oct 2008
    Posts
    3,036

    Quote Originally Posted by popper View Post
    i said PLEASE
    Responding to trolls never works, it just feeds them.

  3. #13
    Join Date
    Oct 2008
    Posts
    3,036

    Quote Originally Posted by Qaridarium View Post
    You are right, FPGA is the future, and I think future CPUs will have a vector SIMD unit and an FPGA part, just to make sure software can run at hellfire speed.
    I may not be up to speed on this story, but why in the world would AMD want an FPGA on their boards?

    Wouldn't they just hardcode whatever functionality you are talking about? The point of an FPGA is that it can be reprogrammed on the fly. That is its defining characteristic, and it's what makes it so expensive. I can't imagine AMD would want to pay for that feature; they'd just put a chip on the board that would run OpenCL without being fully programmable.

  4. #14
    Join Date
    Nov 2008
    Location
    Germany
    Posts
    5,411

    Quote Originally Posted by smitty3268 View Post
    I may not be up to speed on this story, but why in the world would AMD want an FPGA on their boards?
    I read a while ago that they want to put an FPGA into their future products.
    It may not be visible to the end user, but you can get great speedups with that kind of technique!

    Quote Originally Posted by smitty3268 View Post
    Wouldn't they just hardcode whatever functionality you are talking about? The point of an FPGA is that it can be reprogrammed on the fly. That is its defining characteristic, and it's what makes it so expensive. I can't imagine AMD would want to pay for that feature; they'd just put a chip on the board that would run OpenCL without being fully programmable.
    Future CPUs will have so much transistor space that you can place anything you want on the CPU!
    After 16 cores and many vector SIMD units on one die, this is the next step.

    Imagine it: even if it's not visible to the end user, they can hardcode many speedups.

  5. #15
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    4,993

    Do the slides show Intel's implementation beating AMD's, a completely opposite result to Michael's?

    Units aren't exactly clear nor is the hw, but it appears that way.

  6. #16
    Join Date
    Jan 2007
    Posts
    459

    Quote Originally Posted by curaga View Post
    Do the slides show Intel's implementation beating AMD's, a completely opposite result to Michael's?

    Units aren't exactly clear nor is the hw, but it appears that way.
    Reading the PDF, it does seem that way, but then
    Michael only needs a headline to grab the ad revenue.
    http://llvm.org/devmtg/2012-04-12/Sl...Karrenberg.pdf

    see: Evaluation II: WFVOpenCL vs. Intel/AMD (milliseconds)

    The meat of it: they say that their WFVOpenCL implementation beats the tested Intel OpenCL SDK v1.1 and AMD APP SDK v2.5 by an average of 2.5x (Intel) and 40x (AMD).
    Lower times being better, of course.

    and then there's the Conclusion

    "OpenCL benefits from both multi-threading and WFV on CPUs"

    WFV being "whole function vectorisation" or, if you prefer, whole-function SIMD optimization. Of course, they could have known that if they had just asked the x264 devs and looked at its assembly and C functions: Intel SIMD has beaten AMD clock for clock for a very long time now.

    In fact, the WFVOpenCL guys would probably get a lot more speed from their CPU optimizations if they took the x264 code examples and the general framework, adapted them to their needs, and paid special attention to the supplied
    checkasm, as the tool can be used to perform function-level benchmarks.
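
    For anyone unsure what "whole function vectorisation" means in practice, here is a rough sketch of the general idea only, not the WFVOpenCL code from the slides. The kernel body, the width W = 4, and all function names are made up for illustration. A scalar kernel written for one work-item is turned into a version that handles W work-items at once, and the branch becomes a mask plus a select. Plain Python lists stand in for SIMD registers here.

```python
W = 4  # assumed SIMD width (hypothetical; e.g. 4 floats in an SSE register)


def kernel_scalar(x):
    """Scalar kernel for a single work-item, with a data-dependent branch."""
    if x < 0.0:
        return -x * 2.0
    return x + 1.0


def kernel_wfv(xs):
    """The same kernel for W work-items at once.

    Both sides of the branch are computed for every lane, and a
    per-lane mask selects which result each lane keeps.
    """
    assert len(xs) == W
    mask = [x < 0.0 for x in xs]          # per-lane branch condition
    then_vals = [-x * 2.0 for x in xs]    # "if" side, all lanes
    else_vals = [x + 1.0 for x in xs]     # "else" side, all lanes
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]


def run_scalar(data):
    """One kernel call per work-item."""
    return [kernel_scalar(x) for x in data]


def run_wfv(data):
    """One kernel call per group of W work-items.

    Assumes len(data) is a multiple of W to keep the sketch short.
    """
    out = []
    for i in range(0, len(data), W):
        out.extend(kernel_wfv(data[i:i + W]))
    return out


data = [-1.0, 2.0, -3.5, 0.0, 4.0, -0.5, 1.0, -2.0]
assert run_wfv(data) == run_scalar(data)
```

    On real hardware the three list comprehensions would each be a single SIMD instruction, which is where the speedup comes from; the cost is that both sides of the branch are always executed.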
    Last edited by popper; 06-18-2012 at 03:26 PM.
