Bug 17507 - [GM45] intel driver 2.4.2 freezes xorg
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other All
Importance: medium critical
Assignee: Wang Zhenyu
QA Contact: Xorg Project Team
Reported: 2008-09-09 22:21 UTC by Evan Klitzke
Modified: 2008-09-21 20:06 UTC

Attachments
Xorg.0.log from crash (17.21 KB, text/plain)
2008-09-10 10:58 UTC, Evan Klitzke
Xorg log from hard lock (20.69 KB, text/plain)
2008-09-10 15:45 UTC, Jason D. Clinton
crash with ModeDebug turned on (13.78 KB, text/plain)
2008-09-10 22:02 UTC, Evan Klitzke
Xorg log after patch (40.37 KB, text/plain)
2008-09-11 09:06 UTC, Eitan Isaacson
glxinfo output (6.58 KB, text/plain)
2008-09-11 09:09 UTC, Eitan Isaacson
crash with ModeDebug and more output (18.55 KB, text/plain)
2008-09-11 14:20 UTC, Jason D. Clinton
Xorg.0.log crash when on battery (29.71 KB, text/plain)
2008-09-17 08:50 UTC, Taiyang Chen

Description Evan Klitzke 2008-09-09 22:21:53 UTC
lspci reports my video card as

00:02.0 VGA compatible controller: Intel Corporation Cantiga Integrated Graphics Controller (rev 07)
00:02.1 Display controller: Intel Corporation Cantiga Integrated Graphics Controller (rev 07)

When X11 starts the screen turns black and the computer becomes unresponsive (I need to hold down the power button and force the power to shut off). This is with the 2.4.2 driver I downloaded and built from intellinuxgraphics.org (although to be honest I'm not 100% confident that Xorg was using the one in /usr/local rather than the one shipping with the Ubuntu alpha). I also noticed the problem with a patched 2.4.1 intel driver shipping with the current Ubuntu alpha (which they claim is the same as 2.4.2 but w/o the new version string) *and* I noticed the problem with Fedora Rawhide which I think is also on the latest version of the driver. So I think it's safe to say that there is definitely a bug in the upstream code, i.e. not just in the one Ubuntu is shipping.

I tried to build the driver in git but didn't have a new enough libdrm, but I can spend more time and try that out too if someone thinks it will help.

I wasn't really able to get any diagnostics because the Xorg log in /var/log is always empty after I hard reset.

Please help me -- I'm stuck on the vesa driver for the time being!
Comment 1 Gordon Jin 2008-09-09 23:41:55 UTC
The Xorg log would definitely be helpful, and I don't understand why yours is empty...

When it freezes, are the keyboard and network still alive?
Comment 2 Evan Klitzke 2008-09-10 10:58:09 UTC
Created attachment 18811
Xorg.0.log from crash

Log from after a freeze.
Comment 3 Evan Klitzke 2008-09-10 11:06:01 UTC
(In reply to comment #1)
> When it freezes, are the keyboard and network still alive?

When it freezes the network becomes unresponsive (can't ping the laptop). Also, I can't force the kernel to restart using the Magic SysRq sequences (e.g. Alt-SysRq-b doesn't restart the system).
Comment 4 Jason D. Clinton 2008-09-10 15:45:52 UTC
Created attachment 18812
Xorg log from hard lock

I can "me too" this (with same software versions but on Debian Experimental) and also add that at least one other person is having the same problem (on Ubuntu Intrepid):

http://forums.lenovo.com/lnv/board/message?board.id=Special_Interest_Linux&thread.id=412

I have attached the log from one of the "hard lock" events. It appears to hard lock the machine when probing EDID, but I have no way of proving that because, as the reporter mentioned, the system is completely dead--not even a kernel panic.

I can add some details, also. I have tried 2.6.25, 2.6.26 and 2.6.27-rc5 and -rc6. I have tried disabling the acpi-video module to work around the potential for an ACPI bug in this laptop. I've tried disabling a number of power saving features suggested by lesswatts.org in case one of them is responsible.

It "hard locks" in this way 80% of the time. If I reboot the machine 3 to 10 times, it will eventually let me in to Xorg.

I don't know if this is related but there are many other issues with GM45. If I manage to get in to X, DPMS Off events kill the graphics (but the machine is still up), and xrandr reports that SDVO and LVDS are both active (with the wrong resolutions). I have nothing plugged in to the DVI port so this is additionally wrong (and possibly the source of any EDID probing issue?).

This is extremely frustrating. I bought this laptop after I heard Keith Package at IDF say how proud the Intel team was for getting the GM45 driver out a month before the hardware was available. At the moment, I have a laptop that takes 5-20 minutes to bring online.

Jason D. Clinton
Gnome Games Module Maintainer
Comment 5 Eitan Isaacson 2008-09-10 16:58:19 UTC
I can add my ditto to Jason's comment. I am "eeejay" on the Lenovo forum thread. This is happening on a Lenovo T400 with switchable graphics (disabled in BIOS). I am getting the same DPMS bug too, if it is related.
Comment 6 Wang Zhenyu 2008-09-10 18:04:56 UTC
We're also seeing an X startup lockup on one of our GM45 boards here. It looks related to the C4 or render standby setting in the BIOS. Please check whether your BIOS has those settings, and try disabling them as a workaround until we find what is really going wrong. That said, my Debian sid (2.6.26) with current X master bits works fine on it.

Please attach an X log with ModeDebug on.
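
For anyone following along: ModeDebug is a boolean option of the intel driver, set in the Device section of xorg.conf. A minimal sketch, assuming the intel driver is already selected there; the Identifier string is a hypothetical label and not taken from this report:

```
Section "Device"
    # Hypothetical label; keep whatever Identifier your config already uses.
    Identifier "Intel GM45"
    Driver     "intel"
    # Ask the driver to print extra modesetting detail to the server log.
    Option     "ModeDebug" "true"
EndSection
```

With the option on, the additional modesetting output should end up in the Xorg log, which is the log being requested here.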
Comment 7 Jason D. Clinton 2008-09-10 20:31:27 UTC
Through some subtle property of Murphy's Law, 20 attempts to get a crash with ModeDebug on have resulted in booting perfectly each time; I haven't changed anything except for that one xorg variable... it just suddenly won't crash with it on. I'll try it again tomorrow morning connected to my dock; perhaps that will coax it to do my bidding. Race condition, perhaps?

And sorry about the name, Keith Packard. Not sure how I typo'd that.
Comment 8 Evan Klitzke 2008-09-10 21:40:10 UTC
(In reply to comment #7)
> Through some subtle property of Murphy's Law, 20 attempts to get a crash with
> ModeDebug on have resulted in booting perfectly each time; I haven't changed
> anything except for that one xorg variable...

Also works for me with ModeDebug... sounds an awful lot like a nasty race condition if you ask me.
Comment 9 Evan Klitzke 2008-09-10 22:02:19 UTC
Created attachment 18814
crash with ModeDebug turned on
Comment 10 Evan Klitzke 2008-09-10 22:03:51 UTC
(In reply to comment #8)
> Also works for me with ModeDebug... sounds an awful lot like a nasty race
> condition if you ask me.

I take this back -- what actually happens for me with ModeDebug is the X server crashes (but doesn't lock the kernel) and Ubuntu automatically switches to the vesa driver (which is why I thought it was working). I just attached the Xorg log from a crash with ModeDebug turned on.
Comment 11 Wang Zhenyu 2008-09-10 22:31:31 UTC
Could you help me verify whether your BIOS has 'C4 state' or 'Render Standby' options available, and what they are set to?

Evan, what's your laptop model?
Comment 12 Evan Klitzke 2008-09-10 23:55:35 UTC
(In reply to comment #11)
> Could you help me verify whether your BIOS has 'C4 state' or 'Render Standby'
> options available, and what they are set to?
> 
> Evan, what's your laptop model?

This is on a Thinkpad T500.
Comment 13 Wang Zhenyu 2008-09-11 00:56:36 UTC
I've pushed a patch that might fix this to master and the 2.4 branch; please try it.
Comment 14 Eitan Isaacson 2008-09-11 09:06:40 UTC
Created attachment 18823
Xorg log after patch

I applied the patch.
The X server starts, but this doesn't really mean anything in itself, since last time I was also able to use X briefly after just recompiling the driver.

The oddest thing is that glxinfo segfaults. I'll attach the output of glxinfo (just before it crashes) in the next comment. I'll need to rebuild mesa-utils since the stack trace I have now has no debugging symbols.

Anyway, the Xorg log attached here might give a clue.
Comment 15 Eitan Isaacson 2008-09-11 09:09:52 UTC
Created attachment 18824
glxinfo output
Comment 16 Jason D. Clinton 2008-09-11 14:20:05 UTC
Created attachment 18830
crash with ModeDebug and more output

It took about 20 crashes but I got a crash log with some actually useful information. This log that is attached happened to have an ext3 commit at the moment right before the xserver locked the machine--so the log is very late in the initialization process.

I haven't tried the patch to the 2.4 branch yet.

Specifically which commit SHA1 is the one you're talking about?
Comment 17 Jason D. Clinton 2008-09-11 15:39:55 UTC
BTW, my laptop is a T400 and it does _not_ have the C4 or Standby options.
Comment 18 Wang Zhenyu 2008-09-11 18:13:17 UTC
commit 86f82c429f5d7067c52d3b783988917869e13d1d on 2.4 branch.

Jason, please try it. I can see from the log that render standby on Evan's T500 is on and is disabled by the driver.
Comment 19 Jason D. Clinton 2008-09-11 18:52:23 UTC
I have tried the new driver: so far, so good. No crashes. But I'll have to report after using it a little more extensively since it's not 100% guaranteed to crash every time.

As for Eitan's OpenGL problems: I believe that issue may be related to the 2.4 trunk now requiring libdrm 2.4. I'm using Intel driver 2.4 trunk with libdrm 2.3.1 + some git cherry picks (as found in Debian experimental) and everything OpenGL-wise is working perfectly. Perhaps Eitan is using libdrm 2.3.1 w/o the git cherry picks.
Comment 20 Eitan Isaacson 2008-09-11 22:06:44 UTC
(In reply to comment #19)
> As for Eitan's OpenGL problems: I believe that issue may be related to the 2.4
> trunk now requiring libdrm 2.4. I'm using Intel driver 2.4 trunk with libdrm
> 2.3.1 + some git cherry picks (as found in Debian experimental) and everything
> OpenGL-wise is working perfectly. Perhaps Eitan is using libdrm 2.3.1 w/o the
> git cherry picks.
> 

I didn't use 2.4 trunk, I just created a patch for "HEAD~1" and added it to the debian package collection. But that might not be a bad idea, maybe I should just use 2.4 trunk...
Comment 21 Eitan Isaacson 2008-09-11 23:08:38 UTC
OK, I used 2.4 trunk and libdrm from Debian experimental; I still get a segfault with glxinfo.
Comment 22 Evan Klitzke 2008-09-12 00:31:53 UTC
(In reply to comment #18)
> commit 86f82c429f5d7067c52d3b783988917869e13d1d on 2.4 branch.
> 
> Jason, please try it. I can see from the log that render standby on Evan's
> T500 is on and is disabled by the driver.

Wang, I will try out your changes tomorrow night... I spent a few minutes tonight (was able to build the driver with your commit), but gave up for now after running into issues with dlopen in X11 not being able to find the new libdrm_intel.so.1.

To the others in this thread -- did you install your module into /usr/local? What was the magic you needed to get it to work? I added /usr/local/lib/xorg/modules to my ModulePath and exported /usr/local/lib in LD_LIBRARY_PATH, but still no luck.
Comment 23 Jason D. Clinton 2008-09-12 08:16:01 UTC
I used ./autogen.sh --prefix=/usr

I tried /usr/local with no luck and didn't care to go poking around in xorg internals to modify the driver search path.
Comment 24 Evan Klitzke 2008-09-12 08:43:02 UTC
(In reply to comment #22) 
> Wang, I will try out your changes tomorrow night... I spent a few minutes
> tonight (was able to build the driver with your commit), but gave up for now
> after running into issues with dlopen in X11 not being able to find the new
> libdrm_intel.so.1.

When I booted up my laptop this morning my xorg.conf was still set to use the intel driver from last night, and now it's working (and I can see that it's loading the driver I built from git).

I can also run glxinfo and see that direct rendering is on. glxinfo isn't crashing for me, either.

As far as I can tell this bug is fixed :-)
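
Checking which copy of the driver the server picked up, as described above, can be done from the log. A rough sketch, assuming the server writes its usual "(II) Loading <path>" lines and that the log lives at /var/log/Xorg.0.log; the function name is made up for illustration:

```shell
# Print the path of every intel_drv.so the X server reported loading.
# Assumes "(II) Loading <path>" lines; the log path defaults to the usual location.
loaded_intel_driver() {
    log="${1:-/var/log/Xorg.0.log}"
    # Keep only the path part of matching lines; other lines print nothing.
    sed -n 's/.*(II) Loading \(.*intel_drv\.so\).*/\1/p' "$log"
}
```

A path under /usr/local would suggest the locally built module was loaded rather than the one shipped by the distribution.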
Comment 25 Gordon Jin 2008-09-13 00:02:12 UTC
Good. So I'm closing this bug.

If any of you find further problems (like Eitan's glxinfo segfault), please open a separate bug for tracking.
Comment 26 Taiyang Chen 2008-09-17 08:50:18 UTC
Created attachment 18955
Xorg.0.log crash when on battery
Comment 27 Taiyang Chen 2008-09-17 08:52:43 UTC
Hi,

I am using the intel driver from the latest git repo, and I am seeing a similar crash again.
This only happens when my laptop (ThinkPad X200) is on battery and goes away when it is on AC. Is this related to this bug, or did I mess up some other configuration?
The crash log is attached.
Thanks!


-Tai
Comment 28 Bryan O'Sullivan 2008-09-21 15:11:04 UTC
This bug definitely isn't fixed. I just built the head of the git tree, and my X200 suffers the same blank screen problem as with plain 2.4.2.
Comment 29 Jason D. Clinton 2008-09-21 15:59:47 UTC
I can say with 100% certainty that it did fix the problem for me on a T400 w/ GM45 graphics. Also resolved DPMS off issues.

How 'bout some details. What do you mean "blank screen problem"? What OS, kernel, Xorg, libdrm?
Comment 30 Wang Zhenyu 2008-09-21 18:07:33 UTC
Sorry, the missing point here is that the patch is on both git master and the 2.4 branch, but is not included in the 2.4.2 release. (I don't know whether we plan to release 2.4.3, as 2.5.0 is very close to being out.) So please try pulling the current xf86-video-intel-2.4-branch.
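
Whether a given checkout already contains the fix can be tested against the commit named in comment #18. A sketch only, assuming a local git clone of the driver and a git recent enough to support `git -C` and `merge-base --is-ancestor`; the function name is hypothetical:

```shell
# Commit on the 2.4 branch cited in comment #18.
FIX_COMMIT=86f82c429f5d7067c52d3b783988917869e13d1d

# Succeeds (exit 0) only if FIX_COMMIT is an ancestor of the checked-out HEAD
# in the repository at $1 (default: current directory).
has_fix() {
    git -C "${1:-.}" merge-base --is-ancestor "$FIX_COMMIT" HEAD
}
```

For example, `has_fix path/to/clone && echo "fix present"` after checking out xf86-video-intel-2.4-branch. The hash is the 2.4-branch commit; master may carry the same change under a different hash, so a negative result there is not conclusive.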
Comment 31 Gordon Jin 2008-09-21 18:56:07 UTC
(In reply to comment #28)
> This bug definitely isn't fixed. I just built the head of the git tree, and my
> X200 suffers the same blank screen problem as with plain 2.4.2.

We close a bug once the original reporter says his problem is resolved. Yours may be a different issue (though with a similar symptom). So please file a new bug instead of reopening this one, according to http://www.intellinuxgraphics.org/how_to_report_bug.html. Thanks.

Comment 32 Bryan O'Sullivan 2008-09-21 20:06:33 UTC
Reopened as http://bugs.freedesktop.org/show_bug.cgi?id=17507

