The two APIs seem to be fairly different. TTM has been around for a couple of years. Some of it (the memory management part) seems to be quite well understood, but TTM also includes a mechanism to synchronize buffer release with the GPU finishing its use of the buffer (the "fence" mechanism), and that seems to be causing grief. I'm not sure whether there is an inherent API problem or whether synchronization is just so different from one GPU vendor to the next that any API would cause pain. As I understand it you can't use the memory management without the fences, but I'm only guessing that from the general discussion.
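To make the fence idea a bit more concrete, here is a toy model in Python. It is only a sketch of the general concept (don't free a buffer until the GPU has passed the last command that used it) and has nothing to do with the real TTM code; `ToyGpu`, `ToyBufferManager` and all of their methods are names I made up, and I'm assuming the GPU retires commands strictly in submission order.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGpu:
    """Hypothetical GPU that retires commands in submission order."""
    submitted: int = 0  # sequence number of the last command submitted
    completed: int = 0  # sequence number of the last command finished

    def submit(self) -> int:
        # The returned sequence number plays the role of a fence.
        self.submitted += 1
        return self.submitted

    def retire(self, count: int = 1) -> None:
        # Pretend the GPU finished `count` more commands.
        self.completed = min(self.submitted, self.completed + count)

    def fence_signaled(self, fence: int) -> bool:
        return self.completed >= fence


@dataclass
class ToyBufferManager:
    """Hypothetical manager that defers frees until the fence has passed."""
    gpu: ToyGpu
    last_fence: dict = field(default_factory=dict)     # buffer name -> fence
    pending_release: list = field(default_factory=list)
    freed: list = field(default_factory=list)

    def use_on_gpu(self, name: str) -> None:
        # Remember the fence of the most recent command touching the buffer.
        self.last_fence[name] = self.gpu.submit()

    def release(self, name: str) -> None:
        # The caller is done with the buffer, but the GPU may not be.
        self.pending_release.append(name)
        self.reap()

    def reap(self) -> None:
        # Free every pending buffer whose fence has been signaled.
        still_busy = []
        for name in self.pending_release:
            if self.gpu.fence_signaled(self.last_fence.get(name, 0)):
                self.freed.append(name)
            else:
                still_busy.append(name)
        self.pending_release = still_busy
```

In this model, releasing a buffer right after `use_on_gpu()` leaves it on the pending list, and it only moves to `freed` once `retire()` has advanced past its fence and `reap()` runs again. How the "has the GPU finished?" check is actually done is the part that differs from one vendor to the next.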
GEM takes a different approach from TTM in a number of ways. The main differences are (a) it adds a new way of accessing VRAM buffers, using a filesystem-type pread/pwrite mechanism, and (b) all VRAM buffers are backed by an equal-size buffer in system memory, which simplifies suspend/resume (at the cost of consuming more system memory). That second point seems bad at first glance, but Linux allocates physical memory lazily (i.e. you can allocate virtual memory, but if you never use it no physical memory gets harmed), so in principle it's not as bad as it sounds. Discussion has now moved on to the nuances of suspend/resume, partly to see whether this will be sufficient, and partly because a couple of potential problems with the current driver stack showed up during the discussion.
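For anyone not familiar with the pread/pwrite style of access mentioned above, here is a rough Python illustration of the access pattern only. It uses the ordinary POSIX calls on a temporary file as a stand-in for a buffer; it is not GEM's interface, and `ToyBufferObject` is a hypothetical name of mine.

```python
import os
import tempfile


class ToyBufferObject:
    """Hypothetical buffer accessed only through offset-based reads/writes."""

    def __init__(self, size: int):
        self.size = size
        # A temporary file stands in for the buffer's backing storage.
        self._fd, self._path = tempfile.mkstemp()
        os.ftruncate(self._fd, size)  # unwritten ranges read back as zeros

    def _check(self, offset: int, length: int) -> None:
        if offset < 0 or length < 0 or offset + length > self.size:
            raise ValueError("access outside the buffer")

    def pwrite(self, offset: int, data: bytes) -> int:
        # Copy `data` into the buffer at `offset`; no mapping is involved.
        self._check(offset, len(data))
        return os.pwrite(self._fd, data, offset)

    def pread(self, offset: int, length: int) -> bytes:
        # Copy `length` bytes out of the buffer starting at `offset`.
        self._check(offset, length)
        return os.pread(self._fd, length, offset)

    def close(self) -> None:
        os.close(self._fd)
        os.unlink(self._path)
```

The point of this style is that the caller never holds a pointer into the buffer: every access is an explicit copy at a given offset, so whoever manages the buffer is free to move it around between calls.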
GEM also adds a mechanism, called domains, to formalize the change of "ownership" of a buffer between CPU and GPU; it takes care of things like cache flushing. The initial thought was that GEM provided a significant performance boost over TTM, but I believe the current thinking is that the initial tests were not using the same version of the test program and that other factors were causing most of the performance delta. That said, I expect the initial GEM implementation has been tuned since then, so it is probably faster again.
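Here is how I picture the domain idea, as a very simplified Python sketch. It only tracks a single owner and a "dirty" flag and flushes when ownership changes hands after a write; the real mechanism is certainly more involved, and `Domain`, `ToyDomainBuffer` and `set_domain` are hypothetical names used for illustration.

```python
from enum import Enum


class Domain(Enum):
    CPU = "cpu"
    GPU = "gpu"


class ToyDomainBuffer:
    """Hypothetical buffer that records which side currently owns it."""

    def __init__(self):
        self.owner = Domain.CPU
        self.dirty = False   # True if the current owner has written to it
        self.flushes = []    # log of (old_owner, new_owner) flushes

    def set_domain(self, new_owner: Domain, write: bool = False) -> None:
        if new_owner is not self.owner:
            if self.dirty:
                # The old owner's writes must be made visible before the
                # new owner touches the buffer, so a flush happens here.
                self.flushes.append((self.owner, new_owner))
                self.dirty = False
            self.owner = new_owner
        if write:
            self.dirty = True
```

With this model, a CPU write followed by a hand-off to the GPU records exactly one flush, while handing the buffer back after the GPU has only read from it records none. The appeal is that the driver, not the application, decides when a flush is actually needed.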
I think GEM was actually proposed in part because it was *easier* to code for. TTM is more complex (primarily because of the fence mechanism) but has been around longer, so initial implementations exist for most GPUs. The problem is that those implementations are often a year or two old, and so match neither the current driver stack nor the current state of Gallium.




