
Thread: Intel Has A Single-Chip Cloud Computer

  1. #11
    Join Date
    Oct 2009
    Posts
    2,064

    Default

    hmm.. what's stronger: 48x 8086 or 1x 80386?

    This really doesn't mean much.

  2. #12

    Default

    Quote Originally Posted by Louise View Post
Yes, and Sony have finally realised that. Try searching for IBM, Cell, cancel

    So my guess is that the PS4 will have a Larrabee solution, as Sony likes to try out new technologies, and XBox 720 will have an AMD solution, since nVidia is very likely not even interested in working with MS again after the XBox 1, where MS trashed millions of SouthBridges, giving nVidia a loss in one quarter.
    Microsoft and AMD together. Yet another reason not to buy products from either of them.

  3. #13
    Join Date
    May 2008
    Posts
    598

    Default

    AMD Core Counts and Bulldozer: Preparing for an APU World
    http://www.anandtech.com/cpuchipsets...oc.aspx?i=3683

    Who are these guys? And where do they get those wonderful toys from?

    AMD did add that eventually, in a matter of 3 - 5 years, most floating point workloads would be moved off of the CPU and onto the GPU. At that point you could even argue against including any sort of FP logic on the "CPU" at all. It's clear that AMD's design direction with Bulldozer is to prepare for that future.
I wonder what that would mean for programmers.
    Last edited by Louise; 12-03-2009 at 05:46 PM.

  4. #14
    Join Date
    Oct 2007
    Location
    Toronto-ish
    Posts
    7,386

    Default

    I'm not sure moving *all* of the floating point logic off the CPU would ever make sense, since all kinds of programs use floating point variables and you still want those programs to run without falling back to SW floating point emulation.

    I believe the discussion is more about the SIMD instruction extensions, which work on explicitly vectorized code.
    Last edited by bridgman; 12-03-2009 at 06:16 PM.

  5. #15
    Join Date
    May 2008
    Posts
    598

    Default

    Quote Originally Posted by bridgman View Post
    I'm not sure moving *all* of the floating point logic off the CPU would ever make sense, since all kinds of programs use floating point variables and you still want those programs to run without falling back to SW floating point emulation.
    Hopefully for Linux programmers it will just be a case of
    Code:
    sed -i 's/%f/%f_gpu/g' *


Are there other things that GPUs are unquestionably better at than CPUs?

I suppose that performing FP calculations is very parallelizable?

    Quote Originally Posted by bridgman View Post
    I believe the discussion is more about the SIMD instruction extensions, which work on explicitly vectorized code.
    So something like the Cell architecture?

  6. #16
    Join Date
    Oct 2008
    Posts
    3,038

    Default

Are there other things that GPUs are unquestionably better at than CPUs?
    It's typically highly parallelizable low-precision floating point code, but over the last few years a lot of progress has been made to improve the GPU in other areas and I'm sure that will continue.

I suppose that performing FP calculations is very parallelizable?
    Not intrinsically, no. Multiplying 2.5 * 3.5 is no more parallel than multiplying 2 * 3. However, many of the applications that heavily use fp hardware are - anything that benefits from SSE support, for example, is probably at least somewhat parallelizable.

Also, GPU hardware is built from the ground up around floating point operations, while most of the code that goes through a cpu is integer based. Floating point hardware is a lot more complicated (and therefore expensive) than integer hardware, so the amount of fp resources that can be justified on a cpu is fairly limited.

I believe they're talking about having a single gpu on chip to handle multiple cpu cores (threads) at a time, so that will let the hardware stretch its legs a little, and of course the compilers will probably be tuned to try to schedule as many fp operations at a time as they can.

    So something like the Cell architecture?
    It is sort of like the Cell architecture in that they're talking about having different kinds of specialized hardware on the same chip. Cell had a weak general purpose core with a bunch of highly specialized SPUs, while AMD is talking about multiple x86 cores along with a gpu that would handle most of the fp load.

    I'm just guessing here, but I think the idea would be for the hardware to automatically do all the offloading here, which is different than Cell. With Cell, you had to be very careful about what you were programming where, and I would assume that this stuff from AMD would just take normal code and have the cpu fetch/decode logic forward fp calls through to another part of the chip automatically for you.
    Last edited by smitty3268; 12-03-2009 at 08:48 PM.

  7. #17
    Join Date
    Oct 2007
    Location
    Toronto-ish
    Posts
    7,386

    Default

    Sorry, when I mentioned SIMD instruction extensions I was talking about the SIMD extensions which are already built into x86 processors today. SIMD extensions are how most of the serious floating point work is done on x86 today, but unless you're writing math libraries or game engines you probably don't see them.

    Originally x86 processors only handled integer work, and a separate x87 coprocessor handled floating point operations (or you trapped down to a software emulation library). The coprocessor had its own set of registers and other state information, but it pulled instructions out of the same stream as the rest of your program. The floating point coprocessor moved onto the same die as the CPU around the 486 days, but the separate instructions and registers remained (and are still there today AFAIK).

Enter SIMD. The Intel MMX extensions added integer SIMD functions -- the ability to process multiple sets of data with a single instruction. AMD's 3DNow! added the first floating point SIMD extensions, targeted at 3D game geometry but useful in other areas as well. Intel's SSE extensions (Streaming SIMD aka Screaming Sindy) added more floating point capabilities along with a third set of registers.

In general compilers don't directly use the SIMD extensions - you get at them through math libraries or calls into hand-tweaked assembler routines. This is starting to change but we're still at the early stages - OpenCL is one attempt to make the SIMD extensions on CPUs generally accessible.

    While all this was happening, GPUs were gradually evolving into floating point SIMD engines as well, but with a higher degree of streaming (trading off cache coherence for throughput) and parallelism than CPUs (HD5870 has 20 SIMD engines each executing 80 floating point ops per clock, ie 1600 operations per clock or 3200 FLOPs/clock for MAD).

This obviously reminds one of Seymour Cray's comment about the emerging conflict between highly parallel microprocessor-based systems and optimized supercomputers:

    If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?
    It took a lot of years, but the chickens are finally starting to win.

    The GPU vs CPU discussion is primarily about whether the work currently handled by the CPU's SIMD extensions (SSEx) can be handled as well or better by a GPU-type architecture instead. So far it looks pretty good - the biggest application of SIMD floating point was 3D geometry and that moved to the GPU quite a few years ago. Math libraries come next, and many of them have been ported to GPUs. APIs like OpenCL and DirectCompute are designed to pick up most of the remaining workload and isolate it from the hardware specifics.
    Last edited by bridgman; 12-03-2009 at 10:23 PM.

  8. #18
    Join Date
    May 2007
    Location
    Third Rock from the Sun
    Posts
    6,582

    Default

    FYI, Larrabee is dead. (As I said was going to happen so long ago).

    http://news.cnet.com/8301-13924_3-10409715-64.html

  9. #19
    Join Date
    Apr 2008
    Location
    Saskatchewan, Canada
    Posts
    460

    Default

    Quote Originally Posted by deanjo View Post
    FYI, Larrabee is dead. (As I said was going to happen so long ago).
From what I've read it's not officially dead; they're just not releasing the first version because it's not competitive, which is hardly unusual with new architectures in the consumer graphics market. Of course they may decide never to build a second version with the lessons they've learned from this one, but they haven't killed Itanium yet, so Larrabee may still be with us a decade or two from now.

    Personally I've never thought that Larrabee made much sense -- why put x86 instructions into a massively parallel architecture if you could use a new instruction set and eliminate all the complex instruction decoding? -- but I wouldn't write it off yet.

  10. #20
    Join Date
    May 2007
    Location
    Third Rock from the Sun
    Posts
    6,582

    Default

    Quote Originally Posted by movieman View Post
From what I've read it's not officially dead; they're just not releasing the first version because it's not competitive, which is hardly unusual with new architectures in the consumer graphics market. Of course they may decide never to build a second version with the lessons they've learned from this one, but they haven't killed Itanium yet, so Larrabee may still be with us a decade or two from now.
Really, by a decade or two from now there would be no point, as by then CPUs should be massively parallel, with possibly hundreds (maybe thousands) of cores, making an x86-based graphics card a relatively moot product.
