So, you guys do all realise that this is what basically every X app does already today, right? Unless you're running Motif/Xaw/Xt apps which actually use core X rendering, of course. GTK+, Qt, EFL and friends all composite one massive image for the window on the client side, and then upload that image to the server. The only X rendering operations you'll see in common usage today are PutImage and Composite, basically.
Even better: since X performs absolutely no compression whatsoever, it's about the most inefficient way of doing a network-transparent window system you could get without going out of your way to insert random junk into the data stream. So if Wayland's solution is a screen-scrape plus compression, which I expect it will be, then it'll be a hell of a lot _more_ efficient than X.
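To put rough numbers on the "no compression" point, here's a quick sketch (the window size and the synthetic flat-grey content are my own illustration, not a measurement of any real toolkit): a full-window PutImage of a 1920x1080, 32-bit client-composited surface is about 8 MB on the wire every time, while typical UI content with large flat regions deflates enormously.

```python
import zlib

# Hypothetical full-window upload: 1920x1080 at 32 bits per pixel,
# roughly what a PutImage of a client-composited window carries.
width, height, bpp = 1920, 1080, 4
raw_size = width * height * bpp  # bytes on the wire, uncompressed

# Synthetic "typical UI" content: large flat-coloured regions.
# Real windows are busier, so real ratios will be less dramatic.
row = bytes([0xEE, 0xEE, 0xEE, 0xFF]) * width  # one grey scanline
window = row * height

compressed_size = len(zlib.compress(window, level=6))

print(f"raw PutImage payload: {raw_size} bytes")
print(f"zlib-compressed:      {compressed_size} bytes")
```

Flat synthetic content like this compresses by orders of magnitude; the point isn't the exact ratio, just that core X ships the raw bytes every time.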
Anyway, carry on ...
I remember reading long-ago discussions of LBX and dxpc, and by now I've blurred the two together, so forgive me. ISTR that they split the X data stream into four categories and treated each category differently, with some combination of compression, caching, state mapping, and outright short-circuiting. They described the X protocol as very verbose, with frequent conversations about current state. By keeping a state map they were able to send only hints over the wire, which pretty much eliminated that section of the traffic. I believe fonts and the like were both compressed and cached, while bitmaps and such were just compressed. It was quite a while ago, so the memories are vague.
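Purely to illustrate the state-mapping trick described above (this is my own toy sketch, not dxpc's actual code or its real categories): a proxy that mirrors the state it has already sent can forward only the fields that changed, which is roughly how those chatty "current state" conversations get squeezed out of the stream.

```python
import json

class StateMappingProxy:
    """Toy model of dxpc-style state mapping: keep a mirror of the
    state the peer already knows and forward only the deltas."""

    def __init__(self):
        self.sent_state = {}  # what the far end already knows
        self.bytes_sent = 0   # crude tally of wire traffic

    def send(self, state: dict) -> dict:
        # Forward only the attributes that differ from the mirror.
        delta = {k: v for k, v in state.items()
                 if self.sent_state.get(k) != v}
        self.sent_state.update(delta)
        self.bytes_sent += len(json.dumps(delta).encode())
        return delta

proxy = StateMappingProxy()

# The first request has to ship the full graphics-context state...
full = {"foreground": 0xFFFFFF, "background": 0x000000,
        "line_width": 1, "font": "fixed"}
proxy.send(full)

# ...but a later request that only changes the foreground colour
# costs a fraction of that on the wire.
delta = proxy.send({**full, "foreground": 0xFF0000})
print(delta)  # only the changed field crosses the "wire"
```

The real proxies were far more involved, of course, but this is the shape of the idea: the savings come from the receiver holding a copy of the state, not from any cleverness in the compressor.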
But the net I took away was that some parts of the X protocol could simply have been designed more efficiently.
Second point/question... SSH offers compression when you tunnel X through it. Back in the day, that compression didn't seem anywhere near as effective as dxpc. But if, as you say, most of what flows over X now is bitmaps and the like, perhaps SSH compression with modern toolkits is a better bargain than it used to be.
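For anyone wanting to try it: SSH's compression is just `ssh -C -X host` on the command line, or the equivalent in ~/.ssh/config (the host name below is a placeholder):

```
Host workstation
    ForwardX11 yes
    Compression yes
```

It's plain zlib over the whole stream, so it gets the easy wins on bitmap data without any of dxpc's protocol-aware tricks.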