Found by code inspection:
The intel_miptree_alloc_hiz() function calls intel_miptree_create(), passing it the exact same format, first_level, last_level, width, height, depth, and num_samples that were used to create the parent miptree. This means that exactly the same amount of memory will be set aside for the HiZ buffer as for the main depth buffer.
For Z24 buffers, this is too large by roughly a factor of 8. For Z16 buffers, it's too large by roughly a factor of 4. This wastes physical memory as well as available aperture space, since the kernel ensures that all of the pages of the HiZ buffer are resident and mapped into the GPU's address space before running an execbuffer, even if not all of the pages are actually needed by the GPU.
For Gen7, we should use the formulas in the bspec to set the HiZ buffer size. For Gen6, we may need to do some research to figure out the correct formulas, because on Gen6 we are still using X/Y offsets to manually select which part of the miptree to render to, and our notion of the miptree layout may not match that used by the hardware.
Gen7 should get fixed first, because there the required fixes will be less intrusive than in Gen6.
For reference, the discussion of the correct size for the HiZ buffer is located on page [3D-Media-GPGPU Engine > 3D Pipeline Stages > Pixel > Depth and Stencil > Hierarchical Depth Buffer]. To understand the discussion, [Memory Views > Common Surface Formats > Surface Layout] is a prerequisite.
On Gen7, the root cause of this waste of memory is that Mesa re-uses ``struct intel_mipmap_tree`` and to hold the HiZ buffer's layout (even though the HiZ buffer isn't a true miptree) and inappropriately re-uses the vanilla miptree constructors to determine the HiZ buffer's size. On Gen6, the issue is more complex, so let's focus on solving Gen7 first.
There is an additional undesired consequence of re-using ``struct intel_mipmap_tree`` to hold the HiZ buffer, though this consequence causes no bugs. The struct contains many fields that are completely irrelevant and meaningless to HiZ, such as ``format``, which we set to the same format as parent depth buffer; and the layout fields, which are never used on Gen7. Because we construct the HiZ miptree with the vanilla miptree constructors, we set many of these fields to arbitrary values, and other fields lie unused.
I propose the below approach to fixing the memory waste. Paul and Jordan, what do you think of this plan?
By the way, Ben is expressing interest in picking up a low-priority, straightforward Mesa task close to the metal, so I suggested this. I believe the Gen7 fixes fit his criteria, though the Gen6 fixes may not.
Proposed Fixes for Gen7
1. Create a new type, ``struct intel_hiz_buffer``, that initially is identical to ``struct intel_mipmap_tree``. Change the type of ``intel_mipmap_tree::hiz_mt`` to this new type. Initially, intel_hiz_buffer_create() should replicate the current behavior of intel_miptree_alloc_hiz().
2. Fix intel_hiz_buffer to allocate a correctly sized HiZ buffer on Gen7.
3. Delete all unused or irrelevant fields in intel_hiz_buffer.
4. Since we now have a separate type for the HiZ buffer, do some related clean-ups in intel_mipmap_tree.c ...
4a. The layout data in ``struct intel_mipmap_slice::x/y_offset`` is used only for Gen6, so let's not initialize that data on Gen7.
4b. Move intel_mipmap_tree::hiz_map into intel intel_hiz_buffer.
5. Validate on Ivybridge and Haswell, especially Haswell because it's more sensitive to HiZ hangs than Ivybridge.
My "i965/gen7+ HiZ cleanup + gen8 hiz improvements"
series may help with this, but I did not test the hiz
buffer sizes. (I am trying to use the formulas in the
docs to size the hiz buffers though.)
Since you're updating this one...
for posterity, here was the last spin I did on a quick search:
I tested the hiz buffer size allocated for this piglit command:
bin/depthstencil-render-miplevels 1024 d=z24
When using patches 1-5 of my bwd-aux-hiz branch in
For gen7 and gen8:
This seems to match Paul's comment that it is roughly 8
times larger than necessary.
Fixed upstream in Mesa.