Bug 69349 - [NV98] Random image corruptions
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2013-09-14 08:25 UTC by Paul Menzel
Modified: 2015-10-22 04:54 UTC

Attachments
Screenshot taken with `import` after opening a new tab in Iceweasel 17.0.8 (137.81 KB, image/png)
2013-09-14 08:25 UTC, Paul Menzel

Description Paul Menzel 2013-09-14 08:25:11 UTC
Created attachment 85813
Screenshot taken with `import` after opening a new tab in Iceweasel 17.0.8

Using Debian Wheezy with

    linux-image-3.2.0-4-686-pae 3.2.46-1+deb7u1
    xserver-xorg-video-nouveau 1:1.0.1-5
    libglu1-mesa:i386 8.0.5-4+deb7u2
    libdrm2                                2.4.40-1~deb7u2
    libudev0                               175-7.2
    xserver-xorg-core [xorg-video-abi-12]  2:1.12.4-6

sometimes there is an image corruption where some areas are garbled with other colors. See the attached screenshot taken with `import` from the ImageMagick suite. (What is that type of corruption called?)

The obfuscation was done with the following command.

    $ convert 20130909-Debian_Wheezy-nouveau-corruption.png -draw "rectangle 231,165 1703,953" 20130909-Debian_Wheezy-nouveau-corruption--obfuscated.png

This issue has happened five times, and I do not know how to trigger it. Three of these five times it happened after resuming from suspend to RAM. Additionally, once the corruption starts, the system freezes after some time. There are no error messages in `/var/log/syslog`, `/var/log/messages`, `/var/log/Xorg.0.log` or `~/.xsession-errors`.

Please tell me what else you need for debugging.
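
The log check described above can be repeated quickly after each occurrence. The following is only a minimal sketch: `scan_logs` is a hypothetical helper name, and the pattern `nouveau|drm` is an assumption about which messages are relevant, not something the Nouveau project prescribes.

```shell
#!/bin/sh
# Sketch: search the given log files for Nouveau/DRM related messages.
# scan_logs is a hypothetical helper, not part of any package.
scan_logs() {
    pattern='nouveau|drm'   # assumed pattern; adjust as needed
    for f in "$@"; do
        # Skip files that do not exist or are not readable.
        if [ -r "$f" ]; then
            # -H: prefix matches with the file name, -i: ignore case,
            # -E: extended regular expression. No match is not an error.
            grep -HiE "$pattern" "$f" || true
        fi
    done
}
```

It could be called with the four files named in the report, for example `scan_logs /var/log/syslog /var/log/messages /var/log/Xorg.0.log "$HOME/.xsession-errors"`. Empty output would match the observation that nothing is logged.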
Comment 1 Ilia Mirkin 2013-09-14 15:51:24 UTC
Please see if this still happens with the most recent software. If that's not an option for you, feel free to file a bug with debian and close this one. If it still happens with the most recent software, please take a look at http://nouveau.freedesktop.org/wiki/Bugs/ to see what information is needed.
Comment 2 Paul Menzel 2013-09-16 20:25:53 UTC
Today it happened again¹.

1. Some more information about the system:

        $ lspci -s 4:00
        04:00.0 VGA compatible controller: NVIDIA Corporation G98 [GeForce 8400 GS] (rev a1)

I have another system with an integrated Nvidia device, where I have not been able to reproduce this.

        $ lspci -s 2:00
        02:00.0 VGA compatible controller: NVIDIA Corporation C77 [GeForce 8200] (rev a2)

2. The same corruptions started some time after startup (no suspend in between). After some time, I closed Iceweasel/Firefox and the system froze. I just pressed the reset button and the corruption was right there on the next start. So it might have something to do with the card setup. Then it froze pretty quickly (after five minutes) without even starting Iceweasel/Firefox, with just awesome WM and uxrvt terminals running. After switching the power supply off and on and starting again, the corruption was gone and I could not reproduce it again today.

3. This is a hard freeze, meaning I am not able to ping the machine when it is frozen.

4. I checked again and the log files do not contain anything before the crash related to Nouveau or the graphics stack.

5. I am going to try Linux 3.10 from Debian Backports [1]. However, was the Linux Nouveau module not rewritten in Linux 3.7 [2], so that this is not going to point to the problem in Linux 3.2 (with DRM from 3.4.47 [3])?

[1] http://backports.debian.org/
[2] http://www.phoronix.com/scan.php?page=news_item&px=MTE1NDg
[3] http://ftp-master.metadata.debian.org/changelogs//main/l/linux/linux_3.2.46-1+deb7u1_changelog
    »drm, agp: Update to 3.4.47:«

¹ Here are the Nouveau messages from the Linux kernel ring buffer.

        $ dmesg | grep -i nouveau
        [    6.006338] nouveau 0000:04:00.0: setting latency timer to 64
        [    6.006359] [drm] nouveau 0000:04:00.0: Detected an NV50 generation card (0x298200a2)
        [    6.012273] [drm] nouveau 0000:04:00.0: Checking PRAMIN for VBIOS
        [    6.588173] [drm] nouveau 0000:04:00.0: ... appears to be valid
        [    6.588176] [drm] nouveau 0000:04:00.0: Using VBIOS from PRAMIN
        [    6.588180] [drm] nouveau 0000:04:00.0: BIT BIOS found
        [    6.588183] [drm] nouveau 0000:04:00.0: Bios version 62.98.71.00
        [    6.588186] [drm] nouveau 0000:04:00.0: TMDS table version 2.0
        [    6.589065] [drm] nouveau 0000:04:00.0: MXM: no VBIOS data, nothing to do
        [    6.589068] [drm] nouveau 0000:04:00.0: DCB version 4.0
        [    6.589071] [drm] nouveau 0000:04:00.0: DCB outp 00: 02000300 00000028
        [    6.589074] [drm] nouveau 0000:04:00.0: DCB outp 01: 01000302 00020030
        [    6.589076] [drm] nouveau 0000:04:00.0: DCB outp 02: 04011310 00000028
        [    6.589078] [drm] nouveau 0000:04:00.0: DCB conn 00: 00001030
        [    6.589080] [drm] nouveau 0000:04:00.0: DCB conn 01: 00000200
        [    6.589085] [drm] nouveau 0000:04:00.0: Parsing VBIOS init table 0 at offset 0xD691
        [    6.614808] [drm] nouveau 0000:04:00.0: Parsing VBIOS init table 1 at offset 0xDA96
        [    6.621444] [drm] nouveau 0000:04:00.0: Parsing VBIOS init table 2 at offset 0xE392
        [    6.621477] [drm] nouveau 0000:04:00.0: Parsing VBIOS init table 3 at offset 0xE460
        [    6.622693] [drm] nouveau 0000:04:00.0: Parsing VBIOS init table 4 at offset 0xE716
        [    6.622695] [drm] nouveau 0000:04:00.0: Parsing VBIOS init table at offset 0xE77B
        [    6.642560] [drm] nouveau 0000:04:00.0: 0xE77B: Condition still not met after 20ms, skipping following opcodes
        [    6.644880] [drm] nouveau 0000:04:00.0: Detected 512MiB VRAM (DDR2)
        [    6.645915] [drm] nouveau 0000:04:00.0: 512 MiB GART (aperture)
        [    6.988385] [drm] nouveau 0000:04:00.0: 1 available performance level(s)
        [    6.988389] [drm] nouveau 0000:04:00.0: 3: core 567MHz shader 1400MHz memory 400MHz fanspeed 100%
        [    6.988392] [drm] nouveau 0000:04:00.0: c: core 500MHz shader 1300MHz memory 399MHz
        [    7.119405] [drm] nouveau 0000:04:00.0: allocated 1920x1080 fb: 0x320000, bo f745aa00
        [    7.119474] fbcon: nouveaufb (fb0) is primary device
        [    7.165821] fb0: nouveaufb frame buffer device
        [    7.166156] [drm] Initialized nouveau 1.0.0 20120316 for 0000:04:00.0 on minor 0
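
Because the freeze is a hard one (item 3) and nothing is logged locally, the time of the freeze could be recorded from a second machine. This is a rough sketch only: `watch_host` and the `PING_CMD` override are hypothetical names, and the `-c 1 -W 2` options assume the iputils `ping` found on most Linux systems.

```shell
#!/bin/sh
# Sketch: run on a second machine; report when the target stops answering pings.
# watch_host and PING_CMD are hypothetical names used for illustration.
PING_CMD=${PING_CMD:-ping -c 1 -W 2}   # one probe, 2 s timeout (iputils ping)

watch_host() {
    host=$1
    interval=${2:-5}                   # seconds between probes
    # Keep probing as long as the host answers.
    while $PING_CMD "$host" >/dev/null 2>&1; do
        sleep "$interval"
    done
    # The first failed probe ends the loop; print a UTC timestamp.
    echo "$(date -u '+%Y-%m-%d %H:%M:%S UTC') $host stopped responding"
}
```

For example, `watch_host gm-debian 5` would probe every five seconds; the printed timestamp can then be compared with the last entries in the log files of the frozen machine. A single lost packet also ends the loop, so this is only a coarse indicator.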
Comment 3 Ilia Mirkin 2013-09-17 02:10:00 UTC
(In reply to comment #2)
> 5. I am going to try Linux 3.10 from Debian Backports [1]. However, was the
> Linux Nouveau module not rewritten in Linux 3.7 [2], so that this is not
> going to point to the problem in Linux 3.2 (with DRM from 3.4.47 [3])?

Also make sure that you have an up-to-date xf86-video-nouveau, and mesa if you're using some sort of OpenGL-based compositor. I don't think there's any interest in maintaining backports of nouveau fixes (by the nouveau dev community), so if you'd like to see it fixed in Debian, you'll need to talk to Debian people.
Comment 4 Tobias Klausmann 2015-02-01 00:04:52 UTC
Is this still a problem with newer software (kernel/xf86-video-nouveau)?
Comment 5 Paul Menzel 2015-03-02 07:44:09 UTC
(In reply to Tobias Klausmann from comment #4)
> Is this still a problem with newer software (kernel/xf86-video-nouveau)?

Yes, I am still able to reproduce this with Debian Jessie/testing, which currently has Linux kernel 3.19.x and `xserver-xorg-video-nouveau` 1:1.0.11-1.

Suddenly the corruptions start on the left side of the monitor and then I can see how slow the window of uxterm is rendered line by line (takes maybe two seconds). If I do not shut the system down, it’ll eventually freeze.
Comment 6 Ilia Mirkin 2015-03-02 07:47:43 UTC
(In reply to Paul Menzel from comment #5)
> (In reply to Tobias Klausmann from comment #4)
> > Is this still a problem with newer software (kernel/xf86-video-nouveau)?
> 
> Yes, I am still able to reproduce this with Debian Jessie/testing, which
> currently has Linux kernel 3.19.x and `xserver-xorg-video-nouveau`
> 1:1.0.11-1.
> 
> Suddenly the corruptions start on the left side of the monitor and then I
> can see how slow the window of uxterm is rendered line by line (takes maybe
> two seconds). If I do not shut the system down, it’ll eventually freeze.

Are there any messages in dmesg when this happens? I assume this is still with the G98?

What environment do you use? Gnome? KDE? Redirecting compositor which uses GL?
Comment 7 Paul Menzel 2015-09-04 06:24:45 UTC
(In reply to Ilia Mirkin from comment #6)
> (In reply to Paul Menzel from comment #5)
> > (In reply to Tobias Klausmann from comment #4)
> > > Is this still a problem with newer software (kernel/xf86-video-nouveau)?
> > 
> > Yes, I am still able to reproduce this with Debian Jessie/testing, which
> > currently has Linux kernel 3.19.x and `xserver-xorg-video-nouveau`
> > 1:1.0.11-1.
> > 
> > Suddenly the corruptions start on the left side of the monitor and then I
> > can see how slow the window of uxterm is rendered line by line (takes maybe
> > two seconds). If I do not shut the system down, it’ll eventually freeze.

This is now with Debian 8.1 (Jessie/stable) and Linux 4.1.1.

> Are there any messages in dmesg when this happens?

Normally there are no messages. A few times, the following message has been seen.

    [  663.478750] do_IRQ: 1.160 No irq handler for vector (irq -1)

> I assume this is still with the G98?

Indeed, it’s still with the same card.

```
$ lspci -v -s 04:00.0
04:00.0 VGA compatible controller: NVIDIA Corporation G98 [GeForce 8400 GS Rev. 2] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 2060
	Flags: bus master, fast devsel, latency 0, IRQ 29
	Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
	I/O ports at ec00 [size=128]
	[virtual] Expansion ROM at fc000000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: nouveau
```

> What environment do you use? Gnome? KDE? Redirecting compositor which uses
> GL?

The awesome WM 3.4.15 [1] (no compositing as far as I know) is started with `exec startx`.

[1] http://awesome.naquadah.org
Comment 8 Paul Menzel 2015-09-23 06:57:05 UTC
I installed Linux 4.2-1~exp1 (linux-image-4.2.0-trunk-686-pae) and, today, the issue happened after using the system for 20 minutes without any suspend/resume cycle in between.

    $ uname -a
    Linux gm-debian 4.2.0-trunk-686-pae #1 SMP Debian 4.2-1~exp1 (2015-08-31) i686 GNU/Linux

No messages were logged.
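
Since the machine freezes hard, messages printed shortly before the freeze may never reach the log files. One possible way to check this is to snapshot the kernel ring buffer to disk at short intervals. The sketch below is an assumption-laden illustration: `snapshot_kmsg` and the `DMESG_CMD` override are hypothetical names, and reading `dmesg` may require root on systems that restrict it.

```shell
#!/bin/sh
# Sketch: write the kernel ring buffer to a file and flush it to disk,
# so the last snapshot survives a hard freeze.
# snapshot_kmsg and DMESG_CMD are hypothetical names used for illustration.
DMESG_CMD=${DMESG_CMD:-dmesg}

snapshot_kmsg() {
    out=$1
    # Write to a temporary file first so an interrupted write
    # never truncates the previous snapshot.
    $DMESG_CMD > "$out.tmp" || return 1
    sync                      # flush the new data to disk
    mv "$out.tmp" "$out"
}

# Possible use, repeated every 10 seconds until it fails:
#   while snapshot_kmsg "$HOME/kmsg-last.txt"; do sleep 10; done
```

After a freeze and reboot, the snapshot file would hold the ring buffer as it was at most one interval before the freeze.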

