Bug 17540 - X can not start up with xf86-video-intel tip without GEM kernel
Status: VERIFIED FIXED
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: git
Hardware: Other Linux (All)
Importance: highest blocker
Assignee: Eric Anholt
QA Contact: Xorg Project Team
Duplicates: 17572
Depends on:
Blocks: intel-2.5 17426
Reported: 2008-09-11 18:25 UTC by liuhaien
Modified: 2008-10-12 18:52 UTC (History)

Attachments
xorg.0.log (26.02 KB, text/plain)
2008-09-11 18:25 UTC, liuhaien
xorg conf file (3.26 KB, text/plain)
2008-09-11 18:25 UTC, liuhaien
a patch to fix the crash when using non-GEM DRM module. (1.68 KB, patch)
2008-09-12 01:43 UTC, haihao
a patch to fix the crash when using non-GEM DRM module. (1.68 KB, patch)
2008-09-12 01:44 UTC, haihao

Description liuhaien 2008-09-11 18:25:23 UTC
Created attachment 18832 [details]
xorg.0.log

System Environment:
--------------------------
Host:		bl1
Arch:		i386
OSD:		Fedora Core release 6 (Zod)
Kernel:		2.6.27-rc5
Libdrm:		(master)973c634eaa54ee4085a72102c690bc643cb2d7a8
Mesa:		(master)35fd72756a05463568d94862f4fcd234903e1204
Xserver:		(master)31c62495f1de6e9ba41e1f6d7fa263eeb849129b
Xf86_video_intel:		(master)ec17c88a0ed7c9cf4ad68aa52a7a891946a1c0f4
Bug detailed description:
--------------------------
On all of our test machines (including i915 and i965), X cannot start up with the latest driver. The error output follows:
(EE) Failed to load module "dri2" (module requirement mismatch, 0)

Backtrace:
0: X(xf86SigHandler+0x79) [0x80c0d99]
1: [0xffffe400]
2: /opt/X11R7/lib/xorg/modules/drivers//intel_drv.so(I830DRIDoMappings+0x943) [0xb7e936c3]
3: /opt/X11R7/lib/xorg/modules/drivers//intel_drv.so [0xb7e62fd5]
4: X(AddScreen+0x195) [0x806bf65]
5: X(InitOutput+0x1ec) [0x80a64fc]
6: X(main+0x1fb) [0x806c67b]
7: /lib/libc.so.6(__libc_start_main+0xdc) [0x4a07ff2c]
8: X [0x806bcd1]

Fatal server error:
Caught signal 11.  Server aborting


Please consult the The X.Org Foundation support
         at http://wiki.x.org
 for help.
Please also check the log file at "/opt/X11R7/var/log/Xorg.0.log" for additional information.


[1]+  Aborted                 X


Reproduce steps:
----------------
1. X &
Comment 1 liuhaien 2008-09-11 18:25:49 UTC
Created attachment 18833 [details]
xorg conf file
Comment 2 liuhaien 2008-09-11 18:35:17 UTC
Our DRM module uses the commit below:
DrmR:		(drm-next)721f1651546231c4e53a1afe31e15e8234b076e0
It happens on our GM45, G965, G33 and 945G.
Comment 3 Gordon Jin 2008-09-11 19:20:08 UTC
I guess this has been fixed in the latest commit:

commit d8d95d8c71f2cd4bab277f44132ece7963714a5b
Author: Eric Anholt <eric@anholt.net>
Date:   Thu Sep 11 16:11:46 2008 -0700

    Fix build failures that should have been in the previous merge commit.
Comment 4 Gordon Jin 2008-09-11 19:46:12 UTC
Oh, please ignore my comment#3. It's on xf86-video-intel dri2 branch.
Comment 5 Gordon Jin 2008-09-11 19:52:21 UTC
Anyone see this regression too? Any idea?

The only reason I can imagine is Eric's libdrm change yesterday. Haien, can you confirm?
Comment 6 Wang Zhenyu 2008-09-11 20:19:28 UTC
I don't have this problem on my 915G, so please fix your build. Pull all the needed repos. This bug should be closed.
Comment 7 liuhaien 2008-09-12 00:42:02 UTC
I rebuilt all components using the latest commits, and this issue still exists on our G33, G43, GM45, Q965 and GM965.
I bisected and found that it works well with the commits below:
libdrm:master 738e36acbce24df0ccadb499c5cf62ccb74f56df
xf86-video-intel:master 738e36acbce24df0ccadb499c5cf62ccb74f56df
Comment 8 haihao 2008-09-12 01:39:05 UTC
I can reproduce it with a non-GEM DRM module.
Comment 9 haihao 2008-09-12 01:43:15 UTC
Created attachment 18837 [details] [review]
a patch to fix the crash when using non-GEM DRM module.
Comment 10 haihao 2008-09-12 01:44:21 UTC
Created attachment 18838 [details] [review]
a patch to fix the crash when using non-GEM DRM module.
Comment 11 Tobias Hain 2008-09-12 09:31:50 UTC
Actually I am confused by this bugreport:

First of all, you point to this error message in Xorg.0.log and hold it responsible for the failure:

(EE) Failed to load module "dri2" (module requirement mismatch, 0)

If you scroll up in the Xorg.0.log you'll see the reason why this fails:

(II) Loading /opt/X11R7/lib/xorg/modules/extensions//libdri2.so
(II) Module dri2: vendor="X.Org Foundation"
	compiled for 1.5.99.1, module version = 1.0.0
	ABI class: X.Org Server Extension, version 1.1
(EE) module ABI major version (1) doesn't match the server's version (2)
(II) UnloadModule: "dri2"
(II) Unloading /opt/X11R7/lib/xorg/modules/extensions//libdri2.so

It says libdri2 has extension ABI version 1.1, but your server has version 2.0, which you can see by scrolling further up:

(II) Module ABI versions:
	X.Org Server Extension : 2.0

This should be easy to fix, since it's caused by a wrong installation on your machine.

rm /opt/X11R7/lib/xorg/modules/extensions/libdri2.so
rm /opt/X11R7/lib/xorg/modules/extensions/libdri2.la

should be sufficient. And then you may rebuild libdri2.so with the correct Xserver Extension ABI. Something like:

cd my_x_server_sources
cd hw/xfree86/dri2
make install

However, from what I've seen so far, the xf86_video_intel master branch that you've been using doesn't depend on DRI2 at all. It would only depend on it if you switched to origin/dri2.

That means: as long as you are on the 2D driver's master branch, you can do without the DRI2 X server extension entirely. You wouldn't even have to recompile it. Just deleting libdri2.so should be sufficient.
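
To illustrate the ABI mismatch described above, here is a minimal sketch in C of a major/minor compatibility check. The function names are hypothetical and this is not the X server's loader code. It assumes the version is packed with the major number in the high 16 bits, and it assumes that a module whose minor version is newer than the server's is also rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack major.minor into one 32-bit value (assumed layout: major in the high 16 bits). */
static uint32_t abi_version(unsigned major, unsigned minor)
{
    return ((uint32_t)(major & 0xFFFFu) << 16) | (uint32_t)(minor & 0xFFFFu);
}

static unsigned abi_major(uint32_t v) { return (unsigned)(v >> 16); }
static unsigned abi_minor(uint32_t v) { return (unsigned)(v & 0xFFFFu); }

/*
 * Hypothetical compatibility rule: the major versions must be equal,
 * and the module's minor version must not be newer than the server's.
 */
static bool abi_compatible(uint32_t module_abi, uint32_t server_abi)
{
    if (abi_major(module_abi) != abi_major(server_abi))
        return false;
    return abi_minor(module_abi) <= abi_minor(server_abi);
}
```

Under this rule, a module built for extension ABI 1.1 is rejected by a server at 2.0 because the major numbers differ, which matches the (EE) line in the log.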

--

The second thing is that I believe you ran into
https://bugs.freedesktop.org/show_bug.cgi?id=17531

as well. Although your failure message is different.

One indication is your bisect result:

> I bisect and find when using below commit,it can work well:
> libdrm:master 738e36acbce24df0ccadb499c5cf62ccb74f56df
> xf86-video-intel:master 738e36acbce24df0ccadb499c5cf62ccb74f56df

We are talking about the exact same libdrm commit that causes the breakage. It's just that you made a copy-and-paste mistake when posting the SHA1 commit checksum of the 2D driver. It's the same as the libdrm one, which can't be right.

--

Last but not least I'd like to mention that I also tried running the 2d DRI2 driver branch against a DRI2 enabled Xorg Server - even in UXA mode. This fails with bug #17531 as well.
Comment 12 Gordon Jin 2008-09-12 21:14:21 UTC
(In reply to comment #11)
> Actually I am confused by this bugreport:

Tobias, thanks for your reply.

1) Yes, this bug is not related to DRI2.

2) This bug is different from bug#17531, as this kernel doesn't have GEM. Haihao's patch should fix this bug in the non-GEM kernel path.
Comment 13 Gordon Jin 2008-09-12 21:15:22 UTC
Eric, please review Haihao's patch with high priority.
Comment 14 Jie Luo 2008-09-13 13:18:35 UTC
This patch makes my X server work again with a non-GEM DRM module.
Comment 15 Eric Anholt 2008-09-15 13:39:27 UTC
Comment on attachment 18837 [details] [review]
a patch to fix the crash when using non-GEM DRM module.

looks like the patch was posted twice?
Comment 16 haihao 2008-09-15 18:18:17 UTC
Yes.  I made a mistake. I clicked the commit button twice in a short time.
Comment 17 liuhaien 2008-09-15 18:44:45 UTC
Haihao's patch works; we get the same result as comment #14.
Comment 18 Gordon Jin 2008-09-15 22:42:58 UTC
With bug#17543 fixed on the 2.5-branch, the 2.5-branch is now also blocked by this bug.
Comment 19 Wang Zhenyu 2008-09-15 23:36:31 UTC
@@ -3693,7 +3692,7 @@ I830EnterVT(int scrnIndex, int flags)
 	* operation which accessing that page, like irq install, etc.
 	*/
        if (pI830->starting && !pI830->memory_manager) {
-	   if (!I830DRISetHWS(pScrn)) {
+	   if (pI830->hw_status != NULL && !I830DRISetHWS(pScrn)) {
 		   xf86DrvMsg(pScrn->scrnIndex, X_ERROR,
 			   "Fail to setup hardware status page.\n");
 		   I830DRICloseScreen(pScrn->pScreen);

HWS_NEED_GFX(pI830) might be used for this, or put the check into I830DRISetHWS().
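
The second option, moving the check into the function itself, could look roughly like the sketch below. It uses hypothetical stand-in types and names (fake_i830, fake_set_hws), not the driver's real I830Rec or I830DRISetHWS(), and it assumes that a missing status page should count as success, as the patched condition above implies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private record. */
struct fake_i830 {
    void *hw_status;  /* NULL when no hardware status page was allocated */
    int   hws_calls;  /* counts how often the hardware would be programmed */
};

/*
 * Stand-in for the status page setup. With the NULL check inside,
 * callers no longer need to test hw_status themselves.
 */
static bool fake_set_hws(struct fake_i830 *p)
{
    if (p->hw_status == NULL)
        return true;  /* nothing to set up, so this is not an error */

    p->hws_calls++;   /* the real driver would issue its DRM call here */
    return true;
}
```

With the check inside, the caller could keep the original `if (!I830DRISetHWS(pScrn))` form unchanged.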
Comment 20 Gordon Jin 2008-09-16 01:55:53 UTC
*** Bug 17572 has been marked as a duplicate of this bug. ***
Comment 21 Jan Ekholm 2008-09-16 02:54:50 UTC
I'm the reporter of https://bugs.freedesktop.org/show_bug.cgi?id=17572 which was a duplicate of this bug. Testing the above patch (https://bugs.freedesktop.org/attachment.cgi?id=18838) makes the driver go into an endless loop when going with totally default options and crash when EXA and DRI are disabled. Stack traces etc as attachments at the duplicate bug report.

Personally I will unfortunately not be actively monitoring the G45 bugs anymore as I need to actually use my new workstation and will be putting in another graphics card. Doesn't seem to be any way to get the driver to stay alive for basic 2D work?

Comment 22 Eric Anholt 2008-09-16 11:58:42 UTC
I pushed the bit for putting the hw_status check back in, but the moving of the other two hunks wasn't explained in a commit message in the patch, and I didn't see why they were necessary.  Could you explain?
Comment 23 Jie Luo 2008-09-16 12:35:29 UTC
(In reply to comment #22)
> I pushed the bit for putting the hw_status check back in, but the moving of the
> other two hunks wasn't explained in a commit message in the patch, and I didn't
> see why they were necessary.  Could you explain?
> 

This was added in commit f367334c6392a717f6cd2f4ed02200be1c6d356a.

--- a/src/i830_dri.c
+++ b/src/i830_dri.c
@@ -834,6 +834,11 @@ I830DRIDoMappings(ScreenPtr pScreen)
       return FALSE;
    }
 
+   if (pI830->memory_manager == NULL)
+       intel_bufmgr_fake_set_last_dispatch(pI830->bufmgr,
+                                          (volatile unsigned int *)
+                                          &sarea->last_dispatch);
+
    /* init to zero to be safe */
    sarea->front_handle = 0;
    sarea->back_handle = 0;

And I830DRIDoMappings() is called before i830_init_bufmgr(), which means pI830->bufmgr is NULL at this point. This causes a NULL dereference inside intel_bufmgr_fake_set_last_dispatch().

void intel_bufmgr_fake_set_last_dispatch(dri_bufmgr *bufmgr,
                                         volatile unsigned int *last_dispatch)
{
   dri_bufmgr_fake *bufmgr_fake = (dri_bufmgr_fake *)bufmgr;

   bufmgr_fake->last_dispatch = last_dispatch;
}
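
One way to avoid this ordering problem is to defer wiring up last_dispatch until the buffer manager exists. The sketch below shows that idea with hypothetical stand-in types and functions (fake_screen, fake_do_mappings, fake_init_bufmgr). It is not the patch from the attachment and does not use the real libdrm API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the buffer manager and the shared area. */
struct fake_bufmgr { volatile unsigned int *last_dispatch; };
struct fake_sarea  { unsigned int last_dispatch; };

struct fake_screen {
    struct fake_bufmgr *bufmgr;  /* NULL until fake_init_bufmgr() runs */
    struct fake_sarea  *sarea;   /* NULL until fake_do_mappings() runs */
};

/* Mapping step: only records the shared area and never touches bufmgr. */
static void fake_do_mappings(struct fake_screen *s, struct fake_sarea *sarea)
{
    s->sarea = sarea;
}

/* Buffer manager setup: last_dispatch is wired up here, once bufmgr is valid. */
static void fake_init_bufmgr(struct fake_screen *s, struct fake_bufmgr *mgr)
{
    s->bufmgr = mgr;
    if (s->sarea != NULL)
        s->bufmgr->last_dispatch = &s->sarea->last_dispatch;
}
```

Because the mapping step no longer dereferences bufmgr, it is safe to call while bufmgr is still NULL.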
Comment 24 Eric Anholt 2008-09-16 13:23:57 UTC
Thanks for the explanation.  Applied the rest.
Comment 25 liuhaien 2008-09-17 00:50:47 UTC
This issue has been blocked by bug #17621.
Comment 26 liuhaien 2008-09-17 20:29:49 UTC
This issue also happens with xf86-video-intel-2.5-branch.
Comment 27 liuhaien 2008-10-12 18:52:38 UTC
verified

