Forwarding this bug from Ubuntu reporter Ben Gamari:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/980017

[Problem]
Lockup several times a day preceded by corruption, started immediately after upgrading to precise on 1st April. Does not occur with Unity 2D. Seems to be more frequently triggered when using the Unity 3D application switcher.

We saw quite a few 0x02000004 bugs last cycle in Ubuntu Oneiric, but this cycle just a couple. (LP #978836 being the other report).

[Original Description]
Machine sporadically locks up while running compiz. Lockup is generally preceded by obvious display corruption. Eventually GPU locks up, resulting in a blank screen and unresponsive machine, even to sysrq.

Attached is an example of the sort of corruption exhibited. Note the color and grey boxes of the menu bar at the top of the screen.

It actually seems that the system will respond to sysrq if issued not too long after the screen turns blank. After a few seconds however, it will not respond.

The application switcher seems to be very good at reproducing the crash, which occurs quite often, usually within five minutes of logging in. The machine is completely stable under Unity 2D.

ProblemType: Crash
DistroRelease: Ubuntu 12.04
Package: xserver-xorg-video-intel 2:2.17.0-1ubuntu4
ProcVersionSignature: Ubuntu 3.2.0-23.36-generic 3.2.14
Uname: Linux 3.2.0-23-generic x86_64
.tmp.unity.support.test.0:
ApportVersion: 2.0.1-0ubuntu2
Architecture: amd64
Chipset: i965gm
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: compiz
Date: Thu Apr 12 11:54:19 2012
DistUpgraded: 2012-04-01 17:31:17,679 DEBUG enabling apt cron job
DistroCodename: precise
DistroVariant: ubuntu
DuplicateSignature: [i965gm] GPU lockup render.IPEHR: 0x02000004 Ubuntu 12.04
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
ExtraDebuggingInterest: Yes, whatever it takes to get this fixed in Ubuntu
GpuHangFrequency: Several times a day
GpuHangReproducibility: Seems to happen randomly
GpuHangStarted: Immediately after installing this version of Ubuntu
GraphicsCard:
 Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) [8086:2a02] (rev 0c) (prog-if 00 [VGA controller])
   Subsystem: Dell Device [1028:01fe]
   Subsystem: Dell Device [1028:01fe]
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
InterpreterPath: /usr/bin/python2.7
MachineType: Dell Inc. Latitude D830
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdline: /usr/bin/python /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-23-generic root=UUID=27880cc8-df42-4098-8e07-3c4fb9dba0a5 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 xserver-xorg 1:7.6+12ubuntu1
 libdrm2 2.4.32-1ubuntu1
 xserver-xorg-video-intel 2:2.17.0-1ubuntu4
SourcePackage: xserver-xorg-video-intel
Title: [i965gm] GPU lockup render.IPEHR: 0x02000004
UpgradeStatus: Upgraded to precise on 2012-04-01 (10 days ago)
UserGroups:
dmi.bios.date: 01/04/2010
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A15
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA15:bd01/04/2010:svnDellInc.:pnLatitudeD830:pvr:rvnDellInc.:rn:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude D830
dmi.sys.vendor: Dell Inc.
version.compiz: compiz 1:0.9.7.6-0ubuntu1~ppa3
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.32-1ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 8.0.2-0ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 8.0.2-0ubuntu3
version.xserver-xorg-core: xserver-xorg-core 2:1.11.4-0ubuntu10
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.0-0ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.99~git20111219.aacbd629-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.17.0-1ubuntu4
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20111201+b5534a1-1build2
Created attachment 60189 [details] BootDmesg.txt
Created attachment 60190 [details] CurrentDmesg.txt
Created attachment 60191 [details] i915_error_state.txt
Created attachment 60192 [details] XorgLog.txt
Created attachment 60193 [details] Screenshot from 2012-04-12 12:05:52.png
The error-state looks ordinary and more importantly self-consistent. There are not the tell-tales of recent bugs, so I currently have no explanation for the hang. Can you please attach a few more error-states to see if a pattern forms?
Created attachment 60321 [details]
Another crash dump

The panel was inactive although the machine was responsive over SSH.
In the second crash dump, mesa overwrote our batch performing a depth-clear. One more...
Created attachment 60322 [details] Yet another dump
(In reply to comment #9)
> Created attachment 60322 [details]
> Yet another dump

This wasn't a hang, so I'm not going to use its vote as to whether there is an underlying UXA bug here...
Created attachment 60323 [details]
i915_error_state from odd hang

This time I couldn't get the rest of the dump, because running cat on i915_batchbuffer_info triggers a kernel BUG. Nevertheless, here is i915_error_state.
Created attachment 60324 [details] crash-20120419-1230
That time, the stray depth clear hit a Mesa batch buffer. With 3 clear errors, let's presume that it is first and foremost the stray clear that's causing the hangs.
Created attachment 60325 [details] crash-20120419-1233
Created attachment 60327 [details] crash-20120419-1242
I can confirm that the problem appears to be gone with mesa master (dbf48e88)
The 8.0 branch (6fe42b6) exhibits the problem.
(In reply to comment #17)
> The 8.0 branch (6fe42b6) exhibits the problem.

To clarify, the 8.0 branch (currently 49ed43b6) exhibits the issue, as does 6fe42b6, the point where master diverged from 8.0.
8f5c172c does not exhibit the problem
952ca07 exhibits the problem.
7335cf1c exhibits the problem.
9be0f9 exhibits the issue.
dbadd39 does not exhibit the problem.
f00c97b does not exhibit the problem.
117a0e9 exhibits the problem.
308c6be exhibits the problem.
fbe8543 does not exhibit the problem.
e2dce7f does not exhibit the problem.
Here is the first working commit:

commit e2dce7f7ee3e7da9cbb0bb33307ecd79e824426d
Author: Eric Anholt <eric@anholt.net>
Date:   Fri Feb 10 12:54:25 2012 -0800

    intel: Fix rendering from textures after RenderTexture().

    There's a serious trap for drivers: RenderTexture() does not indicate that the texture is currently bound to the draw buffer, despite FinishRenderTexture() signaling that the texture is just now being unbound from the draw buffer. We were acting as if RenderTexture() *was* the start of rendering and that we could make texturing incoherent with the current contents of the renderbuffer.

    This caused intel oglconform sRGB Mipmap.1D_textures to fail, because we got a call to TexImage() and thus RenderTexture() on a texture bound to a framebuffer that wasn't the draw buffer, so we skipped validating the new image into the texture object used for rendering.

    We can't (easily) make RenderTexture() indicate the start of drawing, because both our driver and gallium are using it as the moment to set up the renderbuffer wrapper used for things like MapRenderbuffer(). Instead, postpone the setup of the workaround render target miptree until update_renderbuffer time, so that we no longer need to skip validation of miptrees used as render targets. As a bonus, this should make GL_NV_texture_barrier possible.

    (This also fixes a regression in the gen4 small-mipmap rendering since 3b38b33c1648b07e75dc4d8340758171e109c598, which switched set_draw_offset from image->mt to irb->mt but didn't move the irb->mt replacement up before set_draw_offset).

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44961

    NOTE: This is a candidate for the 8.0 branch.
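
For readers following along: the run of "exhibits / does not exhibit" results above amounts to a reverse bisection, i.e. a search for the first commit where the hang goes away rather than the first where it appears. The Python below is only a minimal sketch of that idea. It assumes a strictly linear history ordered oldest to newest and a fix that stays in place once it lands. The names `first_fixed`, `is_fixed`, and the commit labels `c0`..`c9` are hypothetical and are not taken from the Mesa tree; the real history here branches (master diverged from 8.0), which git's own bisect tooling handles and this sketch does not.

```python
from typing import Callable, Sequence


def first_fixed(commits: Sequence[str], is_fixed: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_fixed() is true.

    Assumptions (not verified by this function): commits are ordered
    oldest to newest, commits[0] still shows the problem, commits[-1]
    does not, and every commit after the fix also has the fix.
    """
    if len(commits) < 2:
        raise ValueError("need at least one broken and one fixed commit")
    lo, hi = 0, len(commits) - 1  # commits[lo] is broken, commits[hi] is fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid  # the fix landed at mid or earlier
        else:
            lo = mid  # the fix landed after mid
    return commits[hi]


# Hypothetical linear history in which the fix lands at "c6".
history = [f"c{i}" for i in range(10)]
result = first_fixed(history, lambda c: int(c[1:]) >= 6)
print(result)
```

In practice `is_fixed` would be a manual step (build Mesa at that commit, exercise the application switcher, report whether the GPU hung), so each probe is expensive and halving the range each time matters.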
Pushed the cherry pick.