Add Hierarchical Z (HiZ) support for r600g driver.
I'm attaching my work-in-progress patchs to this bug report to keep track of comments.
This patch adds 3 new env vars :
- R600_EARLYZ : controls whether EarlyZ functionnality is used (DB_SHADER_CONTROL register)
- R600_HIZ : HiZ enabled/disabled
- R600_HIZ_READ_HACK : this one is a quick hack for debugging HiZ data buffer. If set, data will be read from HiZ bo instead of depth buffer bo when calling glReadPixels( ... GL_DEPTH_COMPONENT ...)
General notes on HiZ on r600 :
(all testing done using : ATI Technologies Inc RV770 [Radeon HD 4850])
* HiZ buffer is made of DWORD entries. Each DWORD represents one tile (4x4 or 8x8 pixels depending DB_HTILE_SURFACE register fields)
* HiZ buffer does not need to be manually cleared by the driver
* DB_RENDER_CONTROL:RESUMMARIZE_ENABLE bit is not necessary to get it working - but if enabled it slows things down
* application performance are roughly the same using R600_EARLYZ or R600_HIZ
Know problems :
* Hierarchical Stencil has not been tested
* HiZ buffer data layout is still unclear. As an example : using 640x640 window and 8x8 tiles. HiZ buffer should contain (640/8)*(640/8) = 6400 entries. When reading it back using the above hack, it contains 6400 entries but spread on the 7680 first dwords of the buffer. Therefore HiZ bo if largely oversized for now (ie = depth buffer size)
Created attachment 46081 [details] [review]
Hiz bo deletion patch
Created attachment 46082 [details] [review]
HiZ initial patch
Created attachment 46083 [details] [review]
kernel side patch to handle TILE_SURFACE commands
(patch built against linux 126.96.36.199)
This one is minimum, and is lacking hiz bo size checks.
Thanks for starting on this. I've provided an overview of of how HiZ and HiS work on 6xx+ hw and some relevant drm patches here:
ping me on irc (agd5f) if you have any questions.
Created attachment 46281 [details] [review]
Here's an updated patch.
Main differences are :
- num tile pipes getter
- algo for determining HTILE_SETTINGS value
- htile buffer size is more precisely calculated
- actually htile buffer is allocated to : DB_size_in_pixels * 4 / 4*4. This could be reduced in some case (when using 8x8 tiles for instance). I was thinking of moving the htile_settings calculation to r600_texture.c (when HiZ bo is created), and storing the htile_settings value alongside the HiZ bo pointer, for future usage (mainly when setting HTILE_SETTINGS register) ?
- I sometimes get this kernel error when enabling HiZ :
[ 459.432139] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:248 radeon_fence_wait+0x2a0/0x33f [radeon]()
[ 459.432146] Hardware name: MS-7551
[ 459.432152] GPU lockup (waiting for 0x0001216B last fence id 0x00012167)
[ 459.432694] [<ffffffffa02796c1>] ? drm_ioctl+0x273/0x347 [drm]
I'm not sure where does this come from, and how much this could be related to HiZ ?
The htile buffer needs to be bound for the life of the Z/S buffer(s) to be effective. I just put together a set of patches to allocate the htile buffer in the ddx along with the Z and stencil buffers. It seems to work ok, but I'm not seeing much change in performance.
Created attachment 47419 [details]
kernel + x + app logs
I've applied the linked patches and started testing.
Depending on which app I run, kern.log is filling with errors telling that "htile buffer too small".
I've added some logs to radeon_dri2.c and r600g's r600_state.c to help understanding the problem.
I'm using Radeon HD 4850 with dual-screen setup (total res: 3360x1050).
It seems that each component (dri2, kernel, mesa) uses a different value for width/height/pitch, maybe that's the source of the problem ?
The attached file, contains the log for 2 applications : glxgears (working fine), and gnome-shell (shows some errors).
Created attachment 47420 [details]
openarena rendering errors
I think that the little squares along the diagonal display htile buffer data.
I've marked previously attached patches as obsolete because the ones hosted here : http://people.freedesktop.org/~agd5f/htile/ are the most up to date.
All the patches are needed (kernel + mesa ones) to enable HiZ.
They are now mainly in need of testing to see how far from completion we are, so feedbacks are more than welcome :-)
It might be a good idea to refresh those patches first, to ease testing.
These two no longer apply cleanly to current git master:
The second part of this one (the changes to radeon_drm.h) seems to be already in 2.6.39?
This one also seems to already be in 2.6.39?
New rebased patches:
Also, fixed a bug in the evergreen htile_offset noticed by Vadim Girlin.
Created attachment 49931 [details]
On my HD5670 I get a floating point exception (backtrace attached). Is this a bug or have I failed to apply one of the patches?
(In reply to comment #12)
> Created an attachment (id=49931) [details]
> On my HD5670 I get a floating point exception (backtrace attached). Is this a
> bug or have I failed to apply one of the patches?
It seems kernel patch (0001-drm-radeon-kms-add-htile-support-to-the-cs-checker.patch) is missing.
(In reply to comment #13)
> It seems kernel patch
> (0001-drm-radeon-kms-add-htile-support-to-the-cs-checker.patch) is missing.
Hmm, I rebuilt the kernel a second time and have made sure the patch is applied and I'm still getting the same error.
I'm applying the patches to linux 3.0 but maybe that's too old already?
(In reply to comment #15)
> I'm applying the patches to linux 3.0 but maybe that's too old already?
Your backtrace show this: "num_tile_pipes = 0" which is wrong.
This value is asked by r600g to kernel using 'RADEON_INFO_NUM_TILE_PIPES' request.
Linux 3.0 kernel have all in place to handle this request (in radeon_kms.c), so num_tile_pipes definitely shouldn't be null.
Did you check if your kernel log (or Xorg ?) contain something interesting ?
Created attachment 49966 [details] [review]
mesa patch to print some diagnostic info
(In reply to comment #14)
> Hmm, I rebuilt the kernel a second time and have made sure the patch is applied
> and I'm still getting the same error.
Please post the output of "glxinfo | grep radeon_get" with this mesa patch.
For reference, you need to patch the ddx:
(In reply to comment #18)
> For reference, you need to patch the ddx:
> the drm:
> and mesa:
I just tried these on my rv790 with d-r-t kernel ( plus "fix version comment due to merge timing" so the drm patch would apply cleanly).
I don't get any exceptions or errors, but I do get some minor corruption on some mesa demos and openarena. Nexuiz and etqw looked OK.
The corruption varies between runs and isn't always present or as bad as the attached image. It's often just a few patches almost always running along the diagonal top left to bottom right.
Created attachment 49972 [details]
cubemap demo showing Hiz corruption.
Other demos seem to have only the diagonal affected not the bottom like this one.
(In reply to comment #17)
> Please post the output of "glxinfo | grep radeon_get" with this mesa patch.
Only this line is printed:
radeon_get_num_backends: 0 2
I *think* the problem is that linux 3.0 only uses version 2.10 of the kms wrapper, where as the call to radeon_get_backend_map(radeon) in r600_drm.c only happens when the minor version is >= 11.
Just changing this value to 10 both makes the second line appear (radeon_get_num_tile_pipes: 0 4) and gets working 3D again, but I guess I really need a later kernel :)
I followed Andy's advice and grabbed a d-r-t kernel, now everything seems to be running okay.
On my HD5670 I haven't noticed any difference in performance, OTOH neither do I get the corruption mentioned, cubemap renders correctly.
(In reply to comment #22)
> I followed Andy's advice and grabbed a d-r-t kernel, now everything seems to be
> running okay.
> On my HD5670 I haven't noticed any difference in performance, OTOH neither do I
> get the corruption mentioned, cubemap renders correctly.
I've tested a bit more and I don't always see it when just running and individual app.
I have a script that runs the mesa demos in alphabetical order - and so far running like this I have always seen it on some random selection of the 3D demos.
fbo_firecube is often OK, but if it does show it, then the corruption will be along the diagonal of the faces of the rotating cube and not across the window like the others.
Openarena is the only game I've seen it with. The corrupted squares/rectangles have varying content and also get the bloom effect applied to them.
(In reply to comment #19)
> I don't get any exceptions or errors, but I do get some minor corruption on
> some mesa demos and openarena. Nexuiz and etqw looked OK.
Tried the patches on my AGP rv670 and rather than corruption I get GPU timeouts/resets.
(In reply to comment #24)
More running with my rv790 and I've found that if I re-size gears then at certain sizes I get these.
radeon 0000:01:00.0: htile buffer too small (384x264 (4x4) -> 430848 have 430080)
radeon 0000:01:00.0: r600_packet3_check:1550 invalid cmd stream 523
some different sizes that fail -
radeon 0000:01:00.0: htile buffer too small (640x264 (4x4) -> 718080 have 716800)
radeon 0000:01:00.0: htile buffer too small (512x392 (4x4) -> 852992 have 851968)
radeon 0000:01:00.0: htile buffer too small (896x520 (4x4) -> 1980160 have 1978368)
radeon 0000:01:00.0: htile buffer too small (640x392 (4x4) -> 1066240 have 1064960)
radeon 0000:01:00.0: htile buffer too small (1152x520 (4x4) -> 2545920 have 2543616)
Created attachment 50174 [details]
[behem0th@ArchLinux ~]$ unigine-heaven
Engine::init(): can't create log file
Engine::init(): clear video settings for "Gallium 0.4 on AMD RV770 2.1 Mesa 7.12-devel (git-e09b706)"
AL lib: pulseaudio.c:612: Context did not connect: Connection refused
Set 640x480 windowed video mode
Received signal SIGFPE, floating point exception
[behem0th@ArchLinux ~]$ glxgears
Running synchronized to the vertical refresh. The framerate should be
approximately the same as the monitor refresh rate.
floating point exception
Created attachment 50175 [details]
(In reply to comment #26)
> kernel 3.0.1
> Received signal SIGFPE, floating point exception
You need a more recent kernel.
Created attachment 50192 [details]
Unigine Heven, World of Tanks (in wine) have rendering errors.
unigine-heaven 1280x1024 Scores: 171
unigine-heaven 1280x1024 hiz Scores: 174
I was starting trying out patches and and compiled kernel drm and ddx driver, but got number of patching failures while trying out mesa patches at present git master
below snapset from "0002-r600g-add-htile-support.patch" shows that some stuff has moved around as mentioned file does not even exists in the repository anymore...
can't find file to patch at input line 368
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
|diff --git a/src/gallium/winsys/r600/drm/r600d.h b/src/gallium/winsys/r600/drm/r600d.h
|index 8042481..f5d47ae 100644
please if possible update mesa patchset
Is there any news?
Created attachment 56784 [details] [review]
r600g patch on top of master
Created attachment 56785 [details] [review]
Kernel patch top of lastest linux + tiling
Kernel patch on top of last linus + tiling. All seems to work ok but likely stuff regressed.
Thanks. Any chance hitting git master (masked behind a flag of course)?
(In reply to comment #33)
> Created attachment 56785 [details] [review] [review]
> Kernel patch top of lastest linux + tiling
> Kernel patch on top of last linus + tiling. All seems to work ok but likely
> stuff regressed.
Didn't work very well for me, glxgears output garbled and logs full of:
[ 76.197338] [drm:radeon_cs_ib_chunk] *ERROR* Invalid command stream !
[ 76.197524] radeon 0000:01:00.0: evergreen_cs_track_validate_depth:664 htile surface too small 8192 for 9216 (48 48)
[ 76.197527] radeon 0000:01:00.0: evergreen_packet3_check:1972 invalid cmd stream 487
-- system architecture: 32-bit
-- Linux distribution: Debian unstable
-- GPU: REDWOOD
-- Model: XFX Radeon HD 5670 1GB
-- Display connector: DVI
-- xf86-video-ati: e20284409937d784847339b5d466a95012d85940
-- xserver: 188.8.131.521
-- mesa: 26de5273acf1ebe6730b5e72b55b3bcceba167c6 + hiz patch
-- drm: 230ec7d7bbf1e8a7e263d471b21afb08c28eba0c
-- kernel: 3.3.0-rc2 + streamout, tiling and hiz patches
(In reply to comment #35)
> Didn't work very well for me, glxgears output garbled and logs full of:
Actually scratch that, glxgears in a window doesnt work, but running it fullscreen and it has no problems.
Created attachment 56845 [details]
Work with EVE-Online.
I have a small problem (you can see on the screen shot).
It's some square of 'not rendered' stuff like if a miscalculation occur somewhere. They are of variable number (stay stable once the game is loaded) and are placed in diagonal on the screen. On the screen shot I see only 7 but some time it's like 35, all in the same diagonal plane.
I just checked back with the bug to see how the development is going and seems either I am doing something wrong or that patches updated by Jerome Glisse would need to be updated again for kernel 3.4 or 3.5-rc1 and mesa git master as they do not apply for me :(
I fear nobody is working on hi-z anymore...
The kernel patches are already upstream. All you need are the mesa patches.
Really? Thanks for that, that's an awesome news.
Is updated patch, this one actually improve performance
Updated with more fixes (down to ~100 regression with piglit)
To enable hyperz
(In reply to comment #45)
> Thanks Jerome
Getting +20% fps with etqw on my 4890.
The latest patch is a no go on Cayman (running kernel 3.5-rc1, with latest mesa, drm and DDX). When booting, 3D is unavailable and I have to use the 2D gnome-shell version. Reverting the v6 patch, 3D is back again.
I'll attach some files at the end of the day (dmesg, .xsession-errors, Xorg log) about the errors.
Confirmed on Radeon HD6950, the screen is completely garbled. Could this have to do with #45018?
(In reply to comment #48)
> Confirmed on Radeon HD6950, the screen is completely garbled. Could this have
> to do with #45018?
We could disable virtual address space (as proposed in bug 45018) and test the latest patch (v8 I think, it should be merged soon at some point according to last week discussion if it hasn't been yet).
I've seen some messages about Hi-Z being released, but not enabled and so on a couple of weeks ago. What is the status?
I saw this:
Author: Jerome Glisse <firstname.lastname@example.org>
Date: Wed Sep 5 15:18:24 2012 -0400
r600g: order atom emission v3
To avoid GPU lockup registers must be emited in a specific order
(no kidding ...). This patch rework atom emission so order in which
atom are emited in respect to each other is always the same. We
don't have any informations on what is the correct order so order
will need to be infered from fglrx command stream.
v2: add comment warning that atom order should not be taken lightly
v3: rebase on top of alphatest atom fix
Signed-off-by: Jerome Glisse <email@example.com>
Does this mean the htile patch may also be committed without causing lockups?
HyperZ support was merged a while ago. Closing.