| Field | Value |
|---|---|
| Summary | 2.10.0 causes kernel oops and system hangs |
| Product | xorg |
| Reporter | Łukasz Maśko <ed> |
| Component | Driver/intel |
| Assignee | Carl Worth <cworth> |
| Status | RESOLVED FIXED |
| QA Contact | Xorg Project Team <xorg-team> |
| Severity | critical |
| Priority | high |
| CC | arekm, octavsly |
| Version | unspecified |
| Keywords | regression |
| Hardware | x86 (IA32) |
| OS | Linux (All) |
| Whiteboard | |
| i915 platform | |
| i915 features | |

Description
Łukasz Maśko
2010-01-06 01:57:56 UTC
What's the previous working version?

I can use without problems intel driver 2.9.1 with Mesa 7.7, libdrm 2.4.17 and kernel 2.6.32.x. The problems appear when I change the intel driver.

My system:
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)

also become unstable with 2.10.0. No kernel crash but display is frozen

I had:
xf86-video-intel-2.10.0
libdrm-2.4.17
mesa-7.7
xorg-server-1.7.4
kernel-2.6.33_rc3

After a few minutes of using X, I got this in Xorg.0.log:
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.

I could not use the display anymore and had to reboot.

Then I tried:
xf86-video-intel-2.10.0
libdrm-2.4.17
mesa-7.7
xorg-server-1.7.4
kernel-2.6.32

The issues appeared again, so I dumped the registers with intel_gpu_dump; the dump is attached. The Xorg.0.log is attached as well.

The working solution right now is:
xf86-video-intel-2.9.1
libdrm-2.4.17
mesa-7.7
xorg-server-1.7.4
kernel-2.6.32

Created attachment 32610
intel_gpu_dump for driver 2.10 on 2.6.32 kernel

Created attachment 32611
Xorg log file showing the error message

(In reply to comment #2)
> I can use without problems intel driver 2.9.1 with Mesa 7.7, libdrm 2.4.17 and
> kernel 2.6.32.x. The problems appear when I change the intel driver.

Hi Łukasz,

Is this failure easy for you to replicate?

I haven't seen a trend of similar kernel crashes from other users with version 2.9.1, so there might be something unique about your system.

If so, could you perform a git-bisect between version 2.9.1 and 2.10 of the driver to identify what the commit is that introduced the problem?

-Carl

(In reply to comment #3)
> My system:
> 00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML
> Express Integrated Graphics Controller (rev 03)
>
> also become unstable with 2.10.0. No kernel crash but display is frozen

Hi Octavian,

I appreciate you sharing your report. However, with graphics driver bugs we really want one bug report per user symptom. Certainly one bug that results in a kernel crash is distinct from a bug that does not.

Could you please open a second bug report for your issue so that we can track it separately?

Thanks,

-Carl

(In reply to comment #6)
[...]
> Is this failure easy for you to replicate?
>
> I haven't seen a trend of similar kernel crashes from other users with version
> 2.9.1, so there might be something unique about your system.

Yes, it is (or at least was) easy to replicate. I gave 2.10.0 a chance and tried it several times, since I know that crashes are sometimes caused by other system elements, but every time, sooner or later, the result was the same (the system hung). Since then I have been using only 2.9.1, which is rock-steady (as long as I don't try to use a 2-screen configuration, which gives me a workspace wider than 2048).

> If so, could you perform a git-bisect between version 2.9.1 and 2.10 of the
> driver to identify what the commit is that introduced the problem?

I'm afraid I'll have two problems with it: first, right now I have no time to do it; too much work and not enough time. Second, I've never done such a bisection, so I need some instructions.

Lukasz

Just to confirm: I've just tried again, this time on 2.6.32.7. Half an hour later: crash. Went back to 2.9.1.

Eric realised that we were not accounting for pinned buffers when working out the number of fences required for the batch. As we don't actually know how many of these fences are lost to pinned buffers, we have to make a conservative guess of 2 instead. The reason this started having a pronounced effect with 2.10.0 is that the put_image acceleration introduced with that release uses a tiled blit (requiring a fence on i915 and prior), which causes fence starvation (previously we never even attempted to use fences).

I've pushed this patch to drm, which should work around this issue:

commit fdcde592c2c48e143251672cf2e82debb07606bd
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Feb 9 08:32:54 2010 +0000

    intel: Account for potential pinned buffers hogging fences

    As the kernel reports the total number of fences, we must guess how many
    fences are likely to be pinned. In the typical system these will be only
    used by the scanout buffers, of which there may be one per pipe, and any
    number of manually pinned fenced buffers. So take a conservative guess
    and reserve two fences for use by the system.

    Note this reduces the number of fences to 3 for i915 and prior.

    Reference:
    http://bugs.freedesktop.org/show_bug.cgi?id=25911
    The latest intel driver 2.10.0 causes kernel oops and system hangs

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
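
The accounting described in the commit message above can be sketched roughly as follows. This is an illustrative Python model only, not the libdrm code (which is C); the names `usable_fences` and `batch_fits` are hypothetical. The reported count of 5 used in the usage note is an inference from the commit message ("reduces the number of fences to 3" with two reserved), not a value stated in this report.

```python
# Illustrative model of the fence accounting described in the commit message.
# Not the actual libdrm implementation; all names here are hypothetical.

RESERVED_FENCES = 2  # conservative guess for fences pinned by the system


def usable_fences(reported_total: int, reserved: int = RESERVED_FENCES) -> int:
    """Fences userspace should plan on, after reserving some for pinned buffers.

    The kernel reports the total number of fences, so a fixed reserve is
    subtracted and the result is clamped at zero.
    """
    return max(reported_total - reserved, 0)


def batch_fits(fences_needed: int, reported_total: int) -> bool:
    """True if a batch needing `fences_needed` fences stays within the budget."""
    return fences_needed <= usable_fences(reported_total)
```

For example, under the assumption that the kernel reports 5 fences, `usable_fences(5)` gives 3, so a batch needing 4 fenced buffers would have to be split or flushed early rather than submitted as is.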