Bug 61069

Summary: [NV17] Crash while displaying large image
Product: xorg Reporter: dave.mueller
Component: Driver/nouveau Assignee: Nouveau Project <nouveau>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: v_2e
Version: unspecified   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:
Attachments:
Description Flags
crash Xorg.0.log none
Image which causes the crash none

Description dave.mueller 2013-02-18 16:00:53 UTC
Created attachment 75050 [details]
crash Xorg.0.log

Xorg crash

If I try to display a large (3200x2700) PNG file using the "display" tool from the "ImageMagick" package, X crashes as shown in the attached Xorg log file.

If I use the "old" nv driver on the same system and environment, the image is displayed without problems. I therefore assume the problem is in the "nouveau" driver.
Comment 1 dave.mueller 2013-02-18 16:02:47 UTC
Created attachment 75051 [details]
Image which causes the crash
Comment 2 Ilia Mirkin 2013-08-21 05:32:14 UTC
Does this still happen? I just tried it with the latest kernel/ddx on a NV18, which is rather similar, and it all worked fine (albeit rather slowly). Please re-test.
Comment 3 dave.mueller 2013-08-21 06:37:41 UTC
What do you mean by "latest kernel"? If you mean 3.10, the answer is yes.
Is there something in the upcoming 3.11 which may fix this issue?
Comment 4 Ilia Mirkin 2013-08-21 06:52:14 UTC
I doubt there's anything in 3.11 to fix this, but the fact is that I'm currently running on top of the nouveau/linux-2.6 tree (http://cgit.freedesktop.org/nouveau/linux-2.6/), which is 3.11-rc6 + some more commits. I have a couple of patches applied on top of that, but I don't think any of them would matter for this case.

What other differences are there? I'm using 1920x1200, you're on 1600x1200. I did get the "pan" thing from imagemagick's display command. I assume you would get that too (if it didn't crash)? Do you use a compositor of some sort? I don't. My NV18 has 64M VRAM, reports a 128M GART... how much does your card have?

Can you get debug libraries in place so that you get symbol resolution on nouveau_drv.so in the stacktrace printed by X?

BTW, if you don't want to constantly kill your "regular" X session, you can start up a second X session to do your testing in. (Might seem obvious, but it took me a little while before I thought of that.)
Comment 5 dave.mueller 2013-08-21 07:51:12 UTC
Perhaps you have to try with a bigger image (e.g. 4096x4096 pixels) to compensate for your larger screen resolution.

I haven't played with the compositor so far, I just use the standard KDE settings.

I don't think that a second X session will do any good, as the system is quite unstable after the crash and needs to be rebooted.

My NV17 also has 64MiB VRAM as shown below:

nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x017200a5
nouveau  [  DEVICE][0000:01:00.0] Chipset: NV17 (NV17)
nouveau  [  DEVICE][0000:01:00.0] Family : NV10
nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
nouveau  [   VBIOS][0000:01:00.0] using image from PRAMIN
nouveau  [   VBIOS][0000:01:00.0] BMP version 5.15
nouveau  [   VBIOS][0000:01:00.0] version 04.17.00.45.00
nouveau  [  PTIMER][0000:01:00.0] unknown input clock freq
nouveau  [     PFB][0000:01:00.0] RAM type: DDR1
nouveau  [     PFB][0000:01:00.0] RAM size: 64 MiB
nouveau  [     PFB][0000:01:00.0]    ZCOMP: 0 tags
nouveau  [     DRM] VRAM: 63 MiB
nouveau  [     DRM] GART: 128 MiB
nouveau  [     DRM] BMP version 5.21
nouveau  [     DRM] DCB version 2.0
nouveau  [     DRM] DCB outp 00: 01000100 000088b8
nouveau  [     DRM] DCB outp 01: 02010111 00000003
nouveau  [     DRM] DCB outp 02: 02010211 00000003
nouveau  [     DRM] Merging DCB entries 1 and 2
nouveau  [     DRM] Loading NV17 power sequencing microcode
nouveau  [     DRM] Saving VGA fonts
nouveau  [     DRM] 1 available performance level(s)
nouveau  [     DRM] 0: memory 332MHz
nouveau  [     DRM] c: core 249MHz memory 333MHz
nouveau  [     DRM] MM: using M2MF for buffer copies
nouveau  [     DRM] Setting dpms mode 3 on TV encoder (output 1)
nouveau  [     DRM] allocated 1600x1200 fb: 0x9000, bo f5a0c600
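
The suggestion in this comment to try a bigger image can be put into rough numbers. The sketch below estimates how much memory an uncompressed image would take as a pixmap and compares it to the VRAM size in the log above. It assumes 4 bytes per pixel and no row padding; neither value comes from the driver, and `pixmap_mib` is a hypothetical helper name.

```python
MIB = 1024 * 1024

# "VRAM: 63 MiB" as reported by the kernel log in this bug report.
VRAM_MIB = 63


def pixmap_mib(width, height, bytes_per_pixel=4):
    """Approximate uncompressed pixmap size in MiB.

    Assumption: 32-bit pixels and no row padding or tiling overhead.
    """
    return width * height * bytes_per_pixel / MIB


if __name__ == "__main__":
    # Screen size, the reported image, and the suggested larger image.
    for w, h in [(1600, 1200), (3200, 2700), (4096, 4096)]:
        size = pixmap_mib(w, h)
        print(f"{w}x{h}: {size:.1f} MiB ({size / VRAM_MIB:.0%} of VRAM)")
```

Under these assumptions the 3200x2700 image needs about 33 MiB, roughly half of the card's VRAM, while a 4096x4096 image needs 64 MiB and would not fit in VRAM at all.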
Comment 6 Ilia Mirkin 2013-08-24 00:43:15 UTC
I still can't repro the issue. Another difference might be that I'm using xorg 1.13, whereas your original log is for 1.12. Is there anything in dmesg when the hangs occur?

I did notice that if I'm running with a compositor, e.g. xcompmgr, then things get REALLY slow when loading large images. Slow enough for watchdogs to fire, rcu to complain, etc. But the system does come into its own, esp if I kill xcompmgr. I don't know whether "standard KDE settings" include a compositor (wouldn't be surprised), but if they do, can you try turning that off?
Comment 7 dave.mueller 2013-12-07 08:17:05 UTC
In the meantime I updated my system (kernel 3.12, X Server 1.14.3, nouveau_drv.so module version = 1.0.9) and the crash with the 3200x2700 picture is gone.

But I see the following lines in the kernel logs:

nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [X[1287]] get 0xbeef0200 put 0x00022a9c state 0xc000018c (err: MEM_FAULT) push 0x00000000
nouveau E[ X[1287]] reloc wait_idle failed: -16
nouveau E[ X[1287]] reloc apply: -16

which seem to be accompanied by a short "hang" of the GUI.

Unfortunately these "hangs" seem to be triggered not only by my synthetic test case, but also by normal daily work.

The system seems to be able to recover from this state, but from time to time (mostly when the system is under load) this results in a complete GUI freeze and I need to reboot the whole system.

But I am not sure whether this is just a new symptom of the same problem or a totally unrelated new problem.
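
To see how often these errors coincide with the hangs, the DMA_PUSHER lines can be pulled out of the kernel log and split into fields. This is a minimal sketch: the regular expression is written against the single line quoted above, so other kernel versions may format the message differently, and `parse_dma_pusher` is a hypothetical helper name.

```python
import re

# Pattern derived from the one DMA_PUSHER line quoted in this report.
# Assumption: other kernels may print the fields in a different layout.
DMA_PUSHER_RE = re.compile(
    r"DMA_PUSHER - ch (?P<ch>\d+) "
    r"\[(?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\] "
    r"get (?P<get>0x[0-9a-fA-F]+) "
    r"put (?P<put>0x[0-9a-fA-F]+) "
    r"state (?P<state>0x[0-9a-fA-F]+) "
    r"\(err: (?P<err>\w+)\) "
    r"push (?P<push>0x[0-9a-fA-F]+)"
)

SAMPLE = (
    "nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 1 [X[1287]] "
    "get 0xbeef0200 put 0x00022a9c state 0xc000018c "
    "(err: MEM_FAULT) push 0x00000000"
)


def parse_dma_pusher(line):
    """Return the fields of a DMA_PUSHER line as a dict, or None if no match."""
    match = DMA_PUSHER_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Channel and PID are decimal; the register-style values are hexadecimal.
    for key in ("ch", "pid"):
        fields[key] = int(fields[key])
    for key in ("get", "put", "state", "push"):
        fields[key] = int(fields[key], 16)
    return fields


if __name__ == "__main__":
    print(parse_dma_pusher(SAMPLE))
```

Feeding each line of the kernel log through `parse_dma_pusher` and keeping the non-`None` results gives a list of events that can be compared against the times the GUI froze.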
Comment 8 Ilia Mirkin 2013-12-07 13:15:57 UTC
Are you using a version of libdrm before 2.4.48 compiled with gcc-4.8? If so, upgrade. Or compile using an older gcc.
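
The condition asked about here can be written down as a small check. This is only a sketch: the helper names are hypothetical, the version strings have to be supplied by hand, and the test treats "gcc-4.8" as meaning any 4.8.x release, which is an assumption.

```python
from typing import Tuple

# First libdrm release that comment 8 says is not affected.
FIXED_LIBDRM = (2, 4, 48)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.4.50' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def libdrm_may_be_affected(libdrm_version: str, gcc_version: str) -> bool:
    """True for libdrm older than 2.4.48 that was built with gcc 4.8.x.

    Assumption: only gcc 4.8.x counts, since that is the compiler named
    in the comment; older compilers are described as a workaround.
    """
    old_libdrm = parse_version(libdrm_version) < FIXED_LIBDRM
    built_with_gcc_48 = parse_version(gcc_version)[:2] == (4, 8)
    return old_libdrm and built_with_gcc_48


if __name__ == "__main__":
    print(libdrm_may_be_affected("2.4.46", "4.8.1"))
    print(libdrm_may_be_affected("2.4.50", "4.8.1"))
```

On most distributions the installed libdrm version can be read with `pkg-config --modversion libdrm`; which compiler built the package has to come from the distribution's build information.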
Comment 9 dave.mueller 2013-12-28 07:22:09 UTC
Thanks for the hint. I've upgraded to libdrm-2.4.50 and the error lines in the kernel log are gone.
Comment 10 Ilia Mirkin 2014-08-21 16:43:24 UTC
It appears that all the issues that were talked about here are resolved. Feel free to re-open if I missed something.
