Bug 15439

Summary: [G35] Horizontal corruption with Tiling TRUE
Product: Mesa
Reporter: Kevin DeKorte <kdekorte>
Component: Drivers/DRI/i965
Assignee: Zou Nan hai <nanhai.zou>
Status: VERIFIED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: lerui.zhu, michael.fu
Version: unspecified
Keywords: NEEDINFO
Hardware: Other
OS: All
Whiteboard:
i915 platform: i915 features:
Attachments: Digicam of screen issue, notice the bars to the left and right of the google earth pixmap.
Xorg.0.log with 2.3 branch of the driver
output of glxinfo with tiling off
Xorg.0.log when using git master of xserver

Description Kevin DeKorte 2008-04-10 11:30:19 UTC
When using git master versions of drm, mesa and xf86-video-intel (4/10/08) and xserver X.Org X Server 1.4.99.901 (1.5.0 RC 1)

Build ID: xorg-x11-server 1.4.99.901-18.20080401.fc9

Any time a pixmap is used as a texture (e.g. in mplayer -vo gl mode or in googleearth), there are black bars to the left and right of the image. Moving the window around erases old bars and creates new ones. After stopping the app or pausing it you can erase them with other apps.

When tiling is disabled this problem is not present.
Comment 1 Kevin DeKorte 2008-04-10 11:33:03 UTC
Created attachment 15812 [details]
Digicam of screen issue, notice the bars to the left and right of the google earth pixmap.

Black bars are only present when tiling is enabled
Comment 2 Gordon Jin 2008-04-11 01:46:10 UTC
Kevin, are you able to git-bisect to find the culprit commit?
Comment 3 Wang Zhenyu 2008-04-11 07:44:56 UTC
So you saw this with the intel driver git master branch, right? Could you try xf86-video-intel-2.3-branch instead? That can tell us whether this is caused by the recent i965 render improvement patches or Keith's transform change.
Comment 4 Kevin DeKorte 2008-04-11 08:29:28 UTC
Bug is present in xf86-video-intel-2.3-branch

Attaching Xorg.0.log as confirmation that I have the right driver
Comment 5 Kevin DeKorte 2008-04-11 08:29:58 UTC
Created attachment 15836 [details]
Xorg.0.log with 2.3 branch of the driver
Comment 6 Kevin DeKorte 2008-04-11 08:52:53 UTC
Tried driver 2.2.1 and it has the same issue with Tiling enabled

Tried to go back to driver 2.1 and it won't compile on my system due to errors in I810AccelInit as it is missing members in _I810Rec
Comment 7 Kevin DeKorte 2008-04-11 09:02:15 UTC
Also a note on the master branch with tiling enabled: if I leave glxgears running for about 30 seconds, the machine will hardlock. If I disable Tiling, it will run without locking.

As of Apr 11, 2008
drm - git
mesa - git
xf86-drv-intel - git

RPM of xorg from fedora rawhide
xorg-x11-server 1.4.99.901-18.20080401.fc9
Comment 8 Wang Zhenyu 2008-04-11 09:16:55 UTC
Kevin, I'll just ignore your glxgears problem in this track.

I've also run tests like yours with intel driver 2.3 branch and 2.2.1; with both, googleearth caused screen artifacts with xserver-1.5 and pixman master.

I also tried intel 2.3 branch with xserver-1.4 and mesa-7.0, and I _can't_ see the problem. So this seems to be an xserver or pixman regression; bisecting the xserver exa/ changes might show what's wrong.

Comment 9 Kevin DeKorte 2008-04-11 10:24:04 UTC
I did a couple of tests with my setup

I switched the AccelMethod to XAA and the lines did disappear, both with Tiling and without Tiling; however, the display in XAA mode was corrupted in other ways, so it was not usable for me.

I also tried to compile mesa 7.1 but I was unable to do so. So I am still running mesa master.

So it looks like the bug is in EXA mode + Tiling. At this point I have no idea how to bisect it anymore. I'm not sure I can bisect the change that broke it since the error occurs in all of the components I can compile. 

I am willing to test and try and recreate things but other than that, I think it will have to be debugged by someone with a little more experience in this area.

I will continue to test and report back what I find.
Comment 10 Wang Zhenyu 2008-04-13 23:25:16 UTC
Kevin, I tried with today's master branches of drm, mesa, xserver and xf86-video-intel (master and 2.3-branch). It seems that if I load the ttm-enabled drm kernel modules, googleearth works just fine (no screen corruption, and fonts are right in googleearth).

If I use the vanilla 2.6.24 kernel's drm modules, it shows the corruption. And as I tested before, mesa-7.0 also works fine.

Could you try that too? It looks like this is a mesa master problem in the classic mode it falls back to without ttm.
Comment 11 Kevin DeKorte 2008-04-14 05:49:56 UTC
I tested with todays code using git master of drm, mesa and xf86-drv-intel and I believe I am running in TTM mode (is there a way to verify) and I am still getting the issue. 

Again this is with tiling on, if I turn tiling off the problem goes away.
Comment 12 Wang Zhenyu 2008-04-14 06:52:55 UTC
(In reply to comment #11)
> I tested with todays code using git master of drm, mesa and xf86-drv-intel and
> I believe I am running in TTM mode (is there a way to verify) and I am still
> getting the issue. 

Do you load the drm modules from the drm source linux-core/? (make i915.o; insmod drm.ko; insmod i915.ko) dmesg shows which version of the i915 module you've loaded, and export LIBGL_DEBUG=verbose might give more info. I don't know of any other environment variables you might set, but the mesa master i965 dri driver will say if it fails to detect ttm and falls back to classic.

> 
> Again this is with tiling on, if I turn tiling off the problem goes away.
> 

Even with tiling disabled, I can still see that fonts in the googleearth window are broken (city names have missing glyphs, etc.), although mine is a T61 with GM965.

Comment 13 Kevin DeKorte 2008-04-14 07:09:26 UTC
Yes, I am running the full drm stack from git master. I do compile the main code and the linux_core code and install it. 

From mesa I do not get any messages that state that I am falling back to classic mode (I have seen that message before, but I don't get it now). So I am pretty sure I am running all the right parts.

On my chip (G35) I do not see any font issues in google earth with tiling _disabled_. I even looked at the Hong Kong airport and verified that non-latin characters were working ok.


$ export LIBGL_DEBUG=verbose
$ googleearth 
libGL: XF86DRIGetClientDriverName: 1.9.0 i965 (screen 0)
libGL: OpenDriver: trying /usr/lib/dri/i965_dri.so
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 8, (OK)
drmOpenByBusid: Searching for BusID pci:0000:00:02.0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 8, (OK)
drmOpenByBusid: drmOpenMinor returns 8
drmOpenByBusid: drmGetBusid reports pci:0000:00:02.0
libGL error: 
Can't open configuration file /etc/drirc: No such file or directory.
libGL error: 
Can't open configuration file /home/kdekorte/.drirc: No such file or directory.
wx=0.312690, wy=0.328990, rx=0.639990, ry=0.330000


I'll attach the output of glxinfo as well (Tiling off)
Comment 14 Kevin DeKorte 2008-04-14 07:10:23 UTC
Created attachment 15905 [details]
output of glxinfo with tiling off

Probably not real useful, but should be able to show you that I am using the right stuff to test.
Comment 15 Kevin DeKorte 2008-04-14 07:38:44 UTC
I think I found another clue with googleearth with tiling on. I found that if I zoomed in slowly from the globe to the continent and the texture didn't change, the screen corruption didn't change. However, when I found a spot where the textures changed due to LOD changes (i.e. a high LOD to a lower one and vice versa), that is when the corruption came up. So I believe it is a problem when changing textures on a surface. But that is just my guess at the problem.

Also, I tried getting xorg server from git master and the thing just exploded on me. So I had to revert back to my distro's Xserver.
Comment 16 Kevin DeKorte 2008-04-14 08:35:31 UTC
Just another data point, but I finally got xserver upgraded to git master and even with that and tiling on I get screen corruption as seen in the attached jpeg. So still a problem.

from git

drm
mesa
xserver
xf86_drv_intel

So I think I'm at the same level as you now.

Attaching Xorg.0.log for confirmation
Comment 17 Kevin DeKorte 2008-04-14 08:36:51 UTC
Created attachment 15910 [details]
Xorg.0.log when using git master of xserver
Comment 18 Kevin DeKorte 2008-04-14 08:44:39 UTC
Just for grins, I upgraded pixman to git master as well and that did not seem to change anything.

Comment 19 Kevin DeKorte 2008-04-14 08:51:32 UTC
Also, I tried with AccelMethod XAA and no offscreen pixmaps set, and noticed something. I don't get the black bars outside of the window like I do with EXA, but while the texture is being refreshed, I do see screen corruption like I do with EXA, just for a split second until the texture is fully loaded. What it looks like is the same corruption you get outside the window, but constrained to the OpenGL surface. So you see the black bars, but after the texture is fully loaded and the surface is refreshed the bars go away.
Comment 20 Kevin DeKorte 2008-04-14 09:45:24 UTC
I've been thinking about this problem and I'm wondering if, in tiling mode, there is a call that clears the memory for the texture being uploaded. Since black bars are being seen, I'm thinking that rather than memory being cleared for each tile or tile band, it is being cleared for the entire texture, and either the tiles and the memory are not quite the same, or the X offset for each row of the tile array is not reset prior to the clear.

The problem I see it looks like this

Say the screen is 30x30 and the texture is at 5,5 and is 10x10

--------------------------------
|
|
|
|     *XXXXXXXXX
|                XXXXXXXXXX
|                          XXXXX
|XXXXXXXXXXXXXXX

etc...


The black bars are the X's in the diagram, so it looks to me that either the X offset is not reset properly when the memory is cleared, or a block of memory is just cleared once for the entire size of the texture and it is clearing the wrong blocks. The tiles are being placed in the right spot, since when the texture is used the texture looks correct.

I think that the difference between XAA and EXA mode could be that XAA has better clipping of where bits are written and so the problem is not in either EXA or XAA but in the tile upload process.

Feel free to tell me I'm totally wrong on this. This is just a guess based on observation of the garbage on the screen.
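
The guess above can be illustrated with a toy model. This is only a sketch under stated assumptions, not driver code: it treats the 30x30 screen from the example as a flat array of cells, ignores the real tiled memory layout, and uses hypothetical function names. It compares a clear that resets the X offset on every row with a clear that wipes width*height cells in one contiguous run.

```python
# Toy model of the hypothesis in this comment (not actual driver code).
# The screen is a flat, row-major array of SCREEN_W * SCREEN_H cells.
SCREEN_W, SCREEN_H = 30, 30


def clear_per_row(x0, y0, w, h):
    """Clear a w x h rectangle, restarting at column x0 on every row."""
    cleared = set()
    for row in range(y0, y0 + h):
        start = row * SCREEN_W + x0
        cleared.update(range(start, start + w))
    return cleared


def clear_one_run(x0, y0, w, h):
    """Hypothetical buggy clear: wipe w * h contiguous cells from (x0, y0)."""
    start = y0 * SCREEN_W + x0
    return set(range(start, start + w * h))


def render(cleared):
    """Draw the screen, marking cleared cells with X and others with '.'."""
    return "\n".join(
        "".join("X" if y * SCREEN_W + x in cleared else "." for x in range(SCREEN_W))
        for y in range(SCREEN_H)
    )


if __name__ == "__main__":
    good = clear_per_row(5, 5, 10, 10)
    bad = clear_one_run(5, 5, 10, 10)
    print(render(bad))
    print("cells cleared outside the texture:", len(bad - good))
```

In this model both clears touch the same number of cells, but the single-run version spills across the full screen width on the rows that follow the starting point, which resembles the horizontal bars in the diagram. Whether the real upload path behaves this way is not established by this thread.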
Comment 21 Kevin DeKorte 2008-04-14 10:26:00 UTC
Did some more testing and switched to the batch buffer branch in DRI2 mode, and everything looks great there with Tiling mode on.

OpenGL code in some cases (etracer) gets slightly fewer FPS in batch buffer mode than in non-batchbuffer tiling mode, but the fact that it works without artifacts is a positive in my opinion.

Once batchbuffer is merged into master, we can retest and if things look good we can close this.
Comment 22 Kevin DeKorte 2008-04-14 10:59:35 UTC
ok one more note. Using intel-batchbuffer branch I switched from DRI2 mode to DRI mode just by changing the setting in the xorg.conf file. After that the black lines came back. So that appears to point the problem in the direction of the dri code. 
Comment 23 Wang Zhenyu 2008-04-14 19:26:15 UTC
I tried on one G35 here; master branches seem to work fine on it, like on GM965.

The batch-buffer driver issue should be on a separate track though.
Comment 24 Kevin DeKorte 2008-04-15 05:39:51 UTC
I have git/master of drm, mesa, xserver, and pixman. What else could I need master of?

Also odd that DRI1 shows the error and DRI2 makes it go away. Can I ask what hardware you are testing with the G35 chip?
Comment 25 Wang Zhenyu 2008-04-20 06:49:16 UTC
Sorry for delay, it's an ASUS board with G35 and HDMI output, also on my T61 with GM965. 

Comment 26 Wang Zhenyu 2008-04-20 18:21:46 UTC
Kevin, could you try current drm master? Keith has pushed several fixes in his googleearth testing.
Comment 27 Wang Zhenyu 2008-04-20 23:32:54 UTC
assign back to gordon to hand this to somebody else.
Comment 28 Kevin DeKorte 2008-04-21 06:00:23 UTC
Well I tried to compile the entire X stack. drm, mesa, xserver and the intel driver.

drm and mesa compiled ok, but the xserver won't compile right now. And the intel driver gives an awful mess for text right now. (it appears to have been broken for a couple of days, but not sure if it is related to xserver or not).

Anyway, with updated drm and mesa, I try to run anything 3d and the machine crashes pretty much right away.

Comment 29 Kevin DeKorte 2008-04-21 19:06:24 UTC
Ok, I got this to build with all the latest stuff...

drm, mesa, xserver, pixman all from git master

xf86-drv-intel is from the batchbuffer branch (git master has problems with text display and screen corruption)

With those items in place, running with EXA and Tiling and DRI (version 1) I am able to play mplayer -vo gl video and not get the horizontal stripes. 

I tried running with googleearth, and the machine hangs.

Also I experienced other hangs with the system using gtkperf and other display tests. 
Comment 30 Kevin DeKorte 2008-04-21 19:11:21 UTC
Using the batchbuffer branch with DRI2, googleearth runs ok, 15500 is still outstanding.
Comment 31 Colin.Joe 2008-04-22 00:26:27 UTC
Using git master versions of drm, mesa and xf86-video-intel (4/21/08) with tiling disabled, this problem still exists, but if I change xf86-video-intel to the 2.3 branch, this case runs fine.
Comment 32 Gordon Jin 2008-04-22 19:06:58 UTC
reassigning to Nanhai, as Zhenyu believes it's 3d issue.
Comment 33 Colin.Joe 2008-05-04 02:23:42 UTC
I updated the xserver to the master tip, and this bug can't be reproduced.
Comment 34 Michael Fu 2008-07-03 19:30:09 UTC
Is this bug gone?
Comment 35 liuhaien 2008-07-06 22:26:37 UTC
It's OK in the latest commit of master.
Comment 36 liuhaien 2008-07-06 22:27:50 UTC
verified.
