Pure speculation, but with some sort of multi-CPU register coherency mechanism, it should be possible to pipeline execution across multiple processors in a manner similar to a single CPU.
Having more than one CPU core executing a single instruction stream would add a new dimension to speculative execution. Good branch prediction is one thing, but having multiple cores simply go ahead and speculatively execute both outcomes of a branch could boost IPC by avoiding the misprediction penalty altogether.
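
To put rough numbers on the idea above, here is a toy back-of-the-envelope model in Python. Everything in it is an assumption for illustration: the function names, the 15-cycle misprediction penalty, and the 0.5-cycle cross-core sync overhead are made up, not measurements of any real hardware. It also ignores that nested branches multiply the number of paths, and that the second core could otherwise be running something useful.

```python
def predicted_cost(accuracy: float, penalty: int) -> float:
    """Expected cycles per branch when one core follows the predicted path.

    Assumes 1 cycle for the branch itself plus `penalty` cycles whenever
    the predictor is wrong (hypothetical numbers).
    """
    return 1.0 + (1.0 - accuracy) * penalty


def eager_cost(sync_overhead: float) -> float:
    """Cycles per branch when two cores each run one outcome.

    The losing path is discarded, so no misprediction penalty is paid,
    but a hypothetical register-coherency sync cost is added every time.
    """
    return 1.0 + sync_overhead


if __name__ == "__main__":
    # Compare both strategies for a few assumed predictor accuracies.
    for accuracy in (0.80, 0.95, 0.99):
        p = predicted_cost(accuracy, penalty=15)
        e = eager_cost(sync_overhead=0.5)
        print(f"accuracy={accuracy:.2f} predicted={p:.2f} eager={e:.2f}")
```

Under these assumed numbers, running both outcomes only wins while the predictor is wrong often enough that the average misprediction cost exceeds the sync overhead. With a very accurate predictor the gap closes or reverses, so whether this actually boosts IPC would hinge on how cheap the coherency mechanism can be made.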