* PS3: significantly faster than real-time, i.e. a few minutes of multichannel DST data decode in less than those few minutes; of course that's with its SPE-optimized decoder.
* high-end x86 CPU with the reference decoder:
a few minutes of multichannel DST data take several hours to decode, yes, several hours. So the PS3 is hundreds if not thousands of times faster.
So, unless you claim there are ARMs with more computational power than a high-end x86 CPU, ARM performance with the reference decoder would be worse than x86. In fact, I could test it for sure, since I have several Samsung and Texas Instruments ARMs (probably among the fastest, at least of what those two companies make), but I see it as pointless. Once again: optimizing the reference decoder for x86 CPUs and multithreading it gives a big performance bump, and a high-end x86 CPU gets closer to the PS3 - but then there's the price: such an x86 system costs several times what a PS3 does. So the bottom line is that ARM will be 3rd for sure, and only a real test will show by how big a margin it trails 2nd place - I believe it would be very big.
const, the fact that you keep saying "high-end x86 CPU" instead of an actual model number makes me doubt your claims. Furthermore, your comparison of ARM and Cell is pretty off-base, because you're comparing the Cell running the code it's best at against an ARM CPU that is optimized for power efficiency; not a fair comparison at all. Show overall benchmarks across media, content creation, 3D rendering, power efficiency, etc. and you'll have a better idea.
I have a PS3 and it has the worst freakin I/O speeds I've ever seen, even with an SSD. It is tremendously deficient at mundane tasks like reading data from a hard drive. Overall it's a shitty processor in my eyes, but for gaming - assuming you write a game that can stay in RAM most of the time - it's absolutely amazing. God of War 3 is better looking than any PC game I've ever seen, including ones running on quad SLI, etc. Crysis 2, Unreal Engine 3, Crysis 1, Metro 2033, etc. look like shit in comparison, and those are running on hardware that's AT LEAST 10x faster GPU-wise. Sony Santa Monica is the studio behind it, and they're an amazing example of what can be done with super-optimized C code.
The issue is that reference code is usually very badly optimized, as it serves to show how an algorithm works. I have first-hand experience with the reference code for AES; it's used in one of the EEMBC benchmarks and was giving very bad performance on ARM CPUs due to the use of integer divisions and modulos that were not needed. Once I took care of the inefficiencies, the code was 20 times faster.
I don't know the reference DST decoder, but I guess it wasn't written with speed in mind. So using it as the basis for judging whether a CPU can decode DST looks like a very bad idea. Reference code should never be used as-is, unless one has an agenda :)
This seems to show DTS decoding taking less than 30% of an old 500 MHz ARM CPU.
Overall, I'm expecting ARM computers to really come into their own when they hit 8+ A15 cores with at least quad-channel memory controllers (since individual memory controllers tend to be stuck at a 32-bit width). ARM is a great architecture; I can't wait until it can actually challenge x86 performance-wise. I hear that MIPS is an even more efficient architecture (but I don't know for sure). The Chinese are putting a lot of weight behind MIPS and Alpha for their "new" CPU architectures, so I would expect a lot of progress in those fields. The Chinese will probably be able to give US microprocessor design companies a run for their money in about 10 years (once they get fab issues worked out and confidence in their products, mainly). I welcome our new Chinese overlords. :cool:
... that everybody should have forgotten by now. Then again, it is Phoronix.
The PPE is a single core with two hardware threads. And the PPE just isn't very fast, but it wasn't designed to be; it's only there to route data to/from the SPUs and to handle I/O. The SPUs are so fast, and have such useful multi-core features, that there's little point in doing any heavy lifting on the PPE.
PS3 Linux has 6 SPEs available for free use by applications, but code has to be written specially for them (OpenCL should work rather well with appropriate code, although Sony killed Linux on the PS3 before that was around). The hypervisor only adds some overhead to I/O, and the disk is slow to start with - but it doesn't affect computational performance.
A benchmark of Linux is one thing, but it isn't doing much to compare the capabilities of the hardware. The 6 SPUs available to Linux can easily run code an order of magnitude faster than the PPE (each SPU is faster than the PPU to start with, even with scalar code), and with some effort, more like two orders. This is because they have isolated local memory (effectively an entirely dedicated, private cache), a ton of registers, SIMD, and some fast multi-core synchronisation hardware. They also make up most of the chip, so not using them makes for a silly comparison.
One can get some pretty nice performance out of ARM + NEON (though not by using a C compiler either!), but it's just not in the same league. And when the whole system picture is taken into account (RAM speed, the EIB, the atomic unit, etc.), it's another few leagues away.
But it's all pretty academic, since the PS3 was a dead platform for this the day Sony intentionally killed Linux on it to save money.
Yes, the PS3's Cell probably is faster if properly used. I'm not saying ARM is faster; I agree the original benchmarks are stupid. The problem is that Cell development cost is extremely high.

Quote:
Let me put it short again - when the ODROID-X can give the same gaming and multimedia real-life experience as the PS3, then I will believe that benchmark and those nonsense PS3 performance numbers posted there. I myself have a small 4-node PS3 cluster at the university and would be very happy to replace it with ODROID-X boards, but ARM won't reach the computational power the PS3 offers via its SPE cores any time soon. Also, you accept and talk about how important optimizations are, yet at the same time you treat a benchmark made with generic, unoptimized code on the PS3 as a good benchmark showing ARM is faster. That's why, as the mirror image of that way of benchmarking, I suggested testing "Direct Stream Transfer" decoding on the ODROID-X with the reference code - or even with optimized code if you wish. I'm sure it can't reach PS3 performance, and every PS3 has basically the same CPU - in case you're now asking which PS3 CPU I used to measure the performance.
OTOH, your original claim of DST decoding taking days on the ODROID-X is stupid too, and obviously not based on first-hand experience; that is what I was pointing out.
If you want, we can all use the reference decoder, not change anything in it, and see how it performs. Too bad it won't use the SPEs: its performance will be extremely bad and you won't have proven anything.
I have no idea how Sony ended up spending $3 billion on developing one console.
Does anyone know why Sony chose QNX as the base platform for the PS3 instead of Linux? I hear it's a really good realtime OS, but I'm sure Linux is too (plus most of their platform work would be done for them for free).