Bug 53490

Summary: [bisected] bump map corruption from kernel 3.5
Product: Mesa
Reporter: Joeri Capens <joeri>
Component: Drivers/Gallium/r600
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED INVALID
QA Contact:
Severity: normal
Priority: medium
Version: 7.11
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
Attachments: Portal 2 screenshot - good
Portal 2 screenshot - bad
bisect info
num_shader_engines
Xorg.0.log
evergreen_fix_tile_config_and_remove_unused_variable.patch
Portal 2 screenshot - Linux GIT
evergreen_fix_tile_config_and_remove_unused_variable_v2.patch

Description Joeri Capens 2012-08-14 12:22:22 UTC
Created attachment 65540 [details]
Portal 2 screenshot - good

After updating my Linux kernel to 3.5.0 I'm seeing bump map corruption in Portal 2. See attachments.

I bisected it down to commit 416a2bd274566a6f607a271f524b2dc0b84d9106 - drm/radeon: fixup tiling group size and backendmap on r6xx-r9xx (v4)

My GPU is a Radeon HD 5670 (Evergreen, Redwood) and I'm using Gallium (for R600) on Gentoo Linux AMD64.

If I disable bump mapping in Portal 2 (command "mat_bumpmap 0") the corruption goes away, but obviously the walls and objects appear flatter.

Mesa, libdrm, xf86-video-ati or Wine versions don't seem to matter. I tested some combinations of Mesa 7.11.2 and 8.0.4, libdrm 2.4.27, 2.4.33 and 2.4.38, xf86-video-ati 6.14.4 and 6.14.6, Wine 1.5.9 and 1.5.10.
Comment 1 Joeri Capens 2012-08-14 12:23:05 UTC
Created attachment 65541 [details]
Portal 2 screenshot - bad
Comment 2 Joeri Capens 2012-08-14 12:24:48 UTC
Created attachment 65542 [details]
bisect info
Comment 3 Joeri Capens 2012-08-14 12:31:22 UTC
Forgot to mention: while trying (but failing) to narrow down the problem with patch 416a2bd274566a6f607a271f524b2dc0b84d9106 I found that variable num_shader_engines in evergreen.c appears to be unused?
Comment 4 Michel Dänzer 2012-08-14 14:04:13 UTC
Please attach the Xorg.0.log file corresponding to the problem.

Have you tried current Mesa Git master?
Comment 5 Joeri Capens 2012-08-15 00:18:56 UTC
Adding Xorg.0.log. I tried the latest Mesa Git now but it didn't make a difference.

However, I found the specific change in 416a2bd274566a6f607a271f524b2dc0b84d9106 which causes the bump map corruption I'm seeing. When I revert the change in the tile_config calculation, the problem goes away.

I've created a patch which fixes the tile_config calculation and which removes the unused num_shader_engines variable.
Comment 6 Joeri Capens 2012-08-15 00:19:34 UTC
Created attachment 65572 [details]
num_shader_engines
Comment 7 Joeri Capens 2012-08-15 00:20:55 UTC
Created attachment 65573 [details]
Xorg.0.log
Comment 8 Joeri Capens 2012-08-15 00:21:53 UTC
Created attachment 65574 [details] [review]
evergreen_fix_tile_config_and_remove_unused_variable.patch
Comment 9 Alex Deucher 2012-08-15 13:41:52 UTC
(In reply to comment #8)
> Created attachment 65574 [details] [review]
> evergreen_fix_tile_config_and_remove_unused_variable.patch

That patch is wrong. Group size has no relation to burst size. I think what's happening is that the increased group size is causing larger alignment. Does this patch help?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=c8d15edc17d836686d1f071e564800e1a2724fa6
Comment 10 Joeri Capens 2012-08-15 21:10:52 UTC
I had actually already tried that patch, since I tested the latest Linux Git two days ago.

Unfortunately it doesn't fix the problem. The screen corruption does look a bit different though, see attachment "16_bank.png".

Adding a printk for the tile_config value right after its calculation gives me the following values:

Linux 3.4: 0x1112 (good)
Linux 3.5: 0x1012 (corruption)
Linux GIT: 0x0022 (slightly less corruption)

Does this information help?
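
For reference, the three values above can be split into 4-bit fields. The sketch below assumes the layout given in the radeon driver's comment on its custom tiling dword (bits 3:0 num_pipes, 7:4 num_banks, 11:8 group_size, 15:12 row_size). That layout is not stated in this report, and the struct and function names are made up for illustration. The fields are raw encodings, not counts.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed layout of the tiling dword: four 4-bit fields.
 * Field names follow the driver comment as recalled; treat them as an assumption. */
struct tile_config_fields {
    unsigned num_pipes;   /* bits 3:0   */
    unsigned num_banks;   /* bits 7:4   */
    unsigned group_size;  /* bits 11:8  */
    unsigned row_size;    /* bits 15:12 */
};

/* Split a tile_config value into its raw 4-bit fields (hypothetical helper). */
struct tile_config_fields decode_tile_config(uint32_t value)
{
    struct tile_config_fields f;

    f.num_pipes  = value & 0xf;
    f.num_banks  = (value >> 4) & 0xf;
    f.group_size = (value >> 8) & 0xf;
    f.row_size   = (value >> 12) & 0xf;
    return f;
}

/* Print one labelled value, e.g. print_tile_config("Linux 3.4", 0x1112). */
void print_tile_config(const char *label, uint32_t value)
{
    struct tile_config_fields f = decode_tile_config(value);

    printf("%s: 0x%04x pipes=%u banks=%u group=%u row=%u\n",
           label, (unsigned)value,
           f.num_pipes, f.num_banks, f.group_size, f.row_size);
}
```

Under that assumption, 0x1112 and 0x1012 differ only in bits 11:8 (1 versus 0), while 0x0022 also changes the bank field to 2 and the row-size field to 0, leaving bits 11:8 at 0.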
Comment 11 Joeri Capens 2012-08-15 21:11:54 UTC
Created attachment 65617 [details]
Portal 2 screenshot - Linux GIT
Comment 12 Joeri Capens 2012-09-01 00:56:43 UTC
Bit 8 of tile_config (CHANSIZE?) needs to be 1 to make the bumpmap corruption disappear on my system.

Before 416a2bd274566a6f607a271f524b2dc0b84d9106 this used to be calculated from the BURSTLENGTH value. My patch reverted to that and while it works it does indeed look weird.

Patch 416a2bd274566a6f607a271f524b2dc0b84d9106 replaced the calculation with:

rdev->config.evergreen.tile_config |= 0 << 8;

which also looks weird because this operation does nothing. So I guess it must simply be:

rdev->config.evergreen.tile_config |= 1 << 8;

I don't have a clue what all of this actually means or does to the hardware. It only seems more logical to me, and it fixes the problems I was having.

The value of tile_config is now 0x0122 on my system.
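
The effect of the two statements can be checked in isolation. This is a minimal sketch with a hypothetical helper name, and it assumes the patch was applied to the tree that previously reported 0x0022:

```c
#include <stdint.h>

/* OR a small field value into tile_config starting at bit 8.
 * Hypothetical helper; it mirrors the shape of the driver statement
 * "tile_config |= value << 8" quoted above. */
uint32_t or_into_bit8(uint32_t tile_config, uint32_t value)
{
    return tile_config | (value << 8);
}
```

With a starting value of 0x0022, or_into_bit8(0x0022, 0) returns 0x0022 unchanged, because OR-ing in zero sets no bits, while or_into_bit8(0x0022, 1) returns 0x0122, which matches the value reported here.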

Attaching a new patch.
Comment 13 Joeri Capens 2012-09-01 00:57:49 UTC
Created attachment 66427 [details] [review]
evergreen_fix_tile_config_and_remove_unused_variable_v2.patch
Comment 14 Joeri Capens 2012-09-07 12:10:45 UTC
I recompiled Mesa (the "8.1_rc1_pre20120724" Git snapshot provided by Gentoo, to be exact) and this time the bump map corruption no longer occurs, with or without my kernel patch.

I had tried some recent Mesa versions before, but I must have made mistakes while copying the Mesa libraries from a 32-bit chroot to the /usr/lib32/ directory on my 64-bit system.

So the bug had already been fixed somewhere between Mesa 8.0.4 and the snapshot of 2012-07-24.
