Bug 35935

Summary: since 2.6.38.1, screen becomes garbled after some time (RV770)
Product: DRI
Reporter: Mjules <mjulien.m>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED INVALID
QA Contact:
Severity: major    
Priority: medium    
Version: XOrg git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Attachments:
Description       Flags
garbled screen    none
Xorg log          none
dmesg             none

Description Mjules 2011-04-03 11:57:04 UTC
Created attachment 45194
garbled screen

Hi, 

Since I started using kernel 2.6.38.1, the Xorg screen becomes garbled after some time (see screenshot). It also occurs on 2.6.38.2 but seems to be less frequent (after 3-4 hours on 2.6.38.1; 8-10 hours on 2.6.38.2).
I didn't get this bug with 2.6.38.

The TTYs stay fine, and after I close and restart X everything is fine again. No need to reboot.

I can't find anything specific that triggers this bug; it simply occurs. One possible hint: it seems to happen when I try to display a minimized window.

There is nothing obvious in the kernel or X logs (but I'm not using any debug option).

kernel 2.6.38.2  (kernel-linus-2.6.38.2-1mdv-1-1 ; vanilla kernel packaged by my distro)
xorg 1.9.3
ati 6.14.1
mesa 7.9
libdrm 2.4.24
Comment 1 Mjules 2011-04-03 12:01:19 UTC
Created attachment 45195
Xorg log

Forgot to say I'm using a Radeon 4870 (RV770), with KMS and the low power profile.
Comment 2 Mjules 2011-04-03 12:05:34 UTC
Created attachment 45196
dmesg
Comment 3 Alex Deucher 2011-04-03 16:59:38 UTC
(In reply to comment #0)
> Since I started using kernel 2.6.38.1, the Xorg screen becomes garbled after
> some time (see screenshot). It also occurs on 2.6.38.2 but seems to be less
> frequent (after 3-4 hours on 2.6.38.1; 8-10 hours on 2.6.38.2).
> I didn't get this bug with 2.6.38.

Did you update anything besides the kernel between 2.6.38 and 2.6.38.1 (mesa or xf86-video-ati)?  If the same userspace components (mesa, xf86-video-ati) work on 2.6.38 and the screen becomes garbled on 2.6.38.1, can you bisect between 2.6.38 and 2.6.38.1 and see which commit caused the problem?  If you updated the userspace components as well, this is probably a duplicate of bug 33929.
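
For reference, the bisection requested above amounts to a binary search over the commits between the last known-good and the first known-bad kernel. Below is a minimal sketch of that idea, assuming a linear history and a test that gives a reliable good/bad verdict. The commit labels and the `is_bad` callback are hypothetical placeholders, not real kernel commits, and this is not meant to describe how `git bisect` is implemented internally.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    Assumes commits are ordered oldest to newest, the last one is known
    to be bad, and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]


# Hypothetical history: c0..c7 stand in for the commits after 2.6.38.
history = [f"c{i}" for i in range(8)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 5)
print(culprit)
```

Each step halves the remaining range, so n commits need roughly log2(n) build-and-test cycles. The result is only as trustworthy as each individual good/bad verdict.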
Comment 4 Mjules 2011-04-04 03:23:59 UTC
Thanks for your quick answer.

With the same userspace and configuration, 2.6.38 works fine.

I will try to bisect.

thx
Comment 5 Mjules 2011-04-26 14:20:30 UTC
The result of the bisection is:
ba3ffdc68b4c80fe4cc42bdae6040eca1067ebb2 is the first bad commit
commit ba3ffdc68b4c80fe4cc42bdae6040eca1067ebb2
Author: Stanislav Kinsbursky <skinsbursky@parallels.com>
Date:   Thu Mar 17 18:54:23 2011 +0300

    RPC: killing RPC tasks races fixed
    
    commit 8e26de238fd794c8ea56a5c98bf67c40cfeb051d upstream.
    
    RPC task RPC_TASK_QUEUED bit is set must be checked before trying to wake up
    task rpc_killall_tasks() because task->tk_waitqueue can not be set (equal to
    NULL).
    Also, as Trond Myklebust mentioned, such approach (instead of checking
    tk_waitqueue to NULL) allows us to "optimise away the call to
    rpc_wake_up_queued_task() altogether for those
    tasks that aren't queued".
    Here is an example of dereferencing of tk_waitqueue equal to NULL:
    
    CPU 0               	CPU 1				CPU 2
    --------------------	---------------------	--------------------------
    nfs4_run_open_task
    rpc_run_task
    rpc_execute
    rpc_set_active
    rpc_make_runnable
    (waiting)
    			rpc_async_schedule
    			nfs4_open_prepare
    			nfs_wait_on_sequence
    						nfs_umount_begin
    						rpc_killall_tasks
    						rpc_wake_up_task
    						rpc_wake_up_queued_task
    						spin_lock(tk_waitqueue == NULL)
    						BUG()
    			rpc_sleep_on
    			spin_lock(&q->lock)
    			__rpc_sleep_on
    			task->tk_waitqueue = q
    
    Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

:040000 040000 91b88f0611c9ff1cf2691d9d65ec13c652a554ac e23b6d459a8bef93de6ea4821754e2f51e3ad32f M	net



This result seems completely bogus to me (I don't use NFS).

BTW, the bug is still present with 2.6.38.3 and seems more frequent with it.

Another detail: it occurs only once per session.
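
One possible explanation for an implausible bisection result is the intermittent nature of the bug: a kernel that is actually bad can be marked good if the garbling does not show up during the test window. Below is a rough sketch of that risk, assuming, purely for illustration, that the time until the screen garbles follows an exponential distribution with a mean in the 3-4 hour range reported for 2.6.38.1. The model, the 3.5 hour mean, and the function names are assumptions, not measurements.

```python
import math


def p_false_good(test_hours: float, mean_hours_to_failure: float) -> float:
    """Chance that a bad kernel shows no garbling within test_hours.

    Assumes an exponential time-to-failure model (an illustration only).
    """
    return math.exp(-test_hours / mean_hours_to_failure)


def p_any_false_good(steps: int, test_hours: float, mean_hours_to_failure: float) -> float:
    """Chance that at least one of `steps` bad kernels is mislabelled good."""
    p = p_false_good(test_hours, mean_hours_to_failure)
    return 1.0 - (1.0 - p) ** steps


# Compare a few hypothetical test durations against an assumed 3.5 h mean.
for hours in (1, 4, 12):
    print(hours, round(p_false_good(hours, 3.5), 3))
```

Under this assumption, a test run much shorter than the typical time to failure is likely to produce a false "good", and a single wrong verdict is enough to steer the bisection to an unrelated commit.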
Comment 6 Mjules 2011-05-30 08:34:19 UTC
Hi,

I can't reproduce this reliably (sometimes it occurs twice in 5 minutes, sometimes not at all for weeks), and I suspect a hardware problem.

I think it's better to close this bug; no need to pollute the bug list.
