Bug 72943

Summary:

[NV98] hangs in nvbios_init on probe (worked in 3.2)

Product:

xorg

Reporter:

Darcy Brás da Silva <dardevelin>

Component:

Driver/nouveau

Assignee:

Nouveau Project <nouveau>

Status:

RESOLVED FIXED

QA Contact:

Xorg Project Team <xorg-team>

Severity:

normal

Priority:

medium

CC:

ionut.radu

Version:

unspecified

Hardware:

x86-64 (AMD64)

OS:

Linux (All)

Whiteboard:

Attachments:

Description	Flags
vbios.rom file	none
full syslog attachment	none
syslog with debug=trace flag	none
syslog with debug=trace flag, more wait time to hit the disk	none
make jump execution conditional	none
vmcore log excerpt	none

Description Darcy Brás da Silva 2013-12-21 03:28:15 UTC

Created attachment 91069 [details]
vbios.rom file

Hi, [warning newb in town]:
As in the summary, I can't boot any kernel version above 3.2 on a Debian sid install without using nouveau.modeset=0.

After ruling out a microcode problem by installing the microcode, and later doing a fresh install of the distro, I took some pictures to try to get some help. They can be found here:
http://cidadecool.com/z-tunes/debian/problem/Foto0810.jpg
http://cidadecool.com/z-tunes/debian/problem/Foto0817.jpg

After removing quiet from the GRUB line, I was able to read some nouveau messages, which led me to the IRC channel on freenode, where 'imirkin' helped me gather the appropriate data to report this bug.

ON:Linux version 3.11-2-amd64 (debian-kernel@lists.debian.org) (gcc version 4.8.2 (Debian 4.8.2-7) ) #1 SMP Debian 3.11.10-1 (2013-12-04)

running as root in a virtual terminal 'ctrl+alt+f2' 
rmmod nouveau
modprobe nouveau 

did not generate any kind of messages, so I followed up the test with
rmmod nouveau
modprobe nouveau modeset=1
The machine hung again. The syslog can be found here
http://cidadecool.com/z-tunes/debian/problems/syslog.1-saved
or viewed at the end of this report.

imirkin mentioned it would be useful to provide vbios.rom, so I booted into the working kernel (3.2), but did not find /sys/kernel/debug/dri/0/vbios.rom
so I compiled https://github.com/envytools/envytools
and ran sudo ./nvagetbios > ~/gforce9300gsm-vbios.rom, which is attached and can also be found
at http://cidadecool.com/z-tunes/debian/problem/gforce9300gsm-vbios.rom

I hope this helps you help me, and thank you for your time.

I hope I did not mess up the important/relevant section. Here is a partial copy of the syslog; please use the link if you need the full one.

Syslog Partial copy: 
Dec 21 00:43:59 ycradnileved kernel: [ 3818.384770] nouveau  [  DEVICE][0000:01:00.0] BOOT0  : 0x298480a2
Dec 21 00:43:59 ycradnileved kernel: [ 3818.384873] nouveau  [  DEVICE][0000:01:00.0] Chipset: G98 (NV98)
Dec 21 00:43:59 ycradnileved kernel: [ 3818.384969] nouveau  [  DEVICE][0000:01:00.0] Family : NV50
Dec 21 00:43:59 ycradnileved kernel: [ 3818.387139] nouveau  [   VBIOS][0000:01:00.0] checking PRAMIN for image...
Dec 21 00:43:59 ycradnileved kernel: [ 3818.478081] nouveau  [   VBIOS][0000:01:00.0] ... appears to be valid
Dec 21 00:43:59 ycradnileved kernel: [ 3818.478178] nouveau  [   VBIOS][0000:01:00.0] using image from PRAMIN
Dec 21 00:43:59 ycradnileved kernel: [ 3818.478385] nouveau  [   VBIOS][0000:01:00.0] BIT signature found
Dec 21 00:43:59 ycradnileved kernel: [ 3818.478479] nouveau  [   VBIOS][0000:01:00.0] version 62.98.39.00.05
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] INFO: rcu_sched self-detected stall on CPU { 0}  (t=5250 jiffies g=14778 c=14777 q=144)
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] sending NMI to all CPUs:
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] NMI backtrace for cpu 0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] CPU: 0 PID: 5556 Comm: modprobe Not tainted 3.11-2-amd64 #1 Debian 3.11.10-1
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] Hardware name: LG Electronics R510/QL8, BIOS QL8L3B42 07/30/2008
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] task: ffff8800855e8080 ti: ffff8800bb2a6000 task.ti: ffff8800bb2a6000
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] RIP: 0010:[<ffffffff812600f9>]  [<ffffffff812600f9>] __const_udelay+0x9/0x30
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] RSP: 0018:ffff8800bf803e48  EFLAGS: 00000046
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] RAX: 0000000000000000 RBX: 0000000000002710 RCX: 0000000000000008
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] RDX: 00000000008a0444 RSI: 0000000000000200 RDI: 0000000000418958
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] RBP: ffffffff818405c0 R08: 0000000000000000 R09: 000000000000043d
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] R10: 0000000000000000 R11: ffff8800bf803bb6 R12: ffffffff818405c0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] R13: ffff8800bb2a6000 R14: 0000000000000090 R15: ffff8800bf80ed80
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] FS:  00007f29516aa700(0000) GS:ffff8800bf800000(0000) knlGS:0000000000000000
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] CR2: 0000000000f922e0 CR3: 00000000bb633000 CR4: 00000000000007f0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] Stack:
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  ffffffff8103ed6a ffffffff8189bea0 ffffffff810dbe01 0000000000000000
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  0000000000000001 ffffffff810a3100 ffff8800b90f1c50 ffff8800855e8080
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  0000000000000000 0000000000000000 ffff8800bf80e220 ffff8800bf803f68
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] Call Trace:
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  <IRQ> 
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8103ed6a>] ? arch_trigger_all_cpu_backtrace+0x5a/0x80
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810dbe01>] ? rcu_check_callbacks+0x321/0x610
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810a3100>] ? do_timer+0x200/0x640
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810aa4e0>] ? tick_sched_handle.isra.15+0x60/0x60
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810654ab>] ? update_process_times+0x3b/0x70
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810aa49b>] ? tick_sched_handle.isra.15+0x1b/0x60
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810aa517>] ? tick_sched_timer+0x37/0x60
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8107b194>] ? __run_hrtimer+0x74/0x1b0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8107b939>] ? hrtimer_interrupt+0xe9/0x220
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8103d386>] ? smp_apic_timer_interrupt+0x36/0x50
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8148205d>] ? apic_timer_interrupt+0x6d/0x80
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  <EOI> 
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa041071a>] ? init_nv_reg+0x6a/0xe0 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0410709>] ? init_nv_reg+0x59/0xe0 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0415045>] ? nvbios_exec+0x35/0xd0 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0415508>] ? nvbios_init+0x98/0x150 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa041a875>] ? nv50_devinit_init+0x45/0x190 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0409c72>] ? nouveau_object_inc+0xb2/0x1a0 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa042d478>] ? nouveau_devobj_ctor+0x1d8/0x6a0 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0409618>] ? nouveau_object_ctor+0x28/0xe0 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0409ee0>] ? nouveau_object_new+0x180/0x240 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa045e41e>] ? nouveau_drm_load+0x15e/0x600 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0252e4a>] ? drm_get_minor+0x1ca/0x260 [drm]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa0255035>] ? drm_get_pci_dev+0x155/0x270 [drm]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa045eab2>] ? nouveau_drm_probe+0x1f2/0x280 [nouveau]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8127dfe4>] ? local_pci_probe+0x34/0x60
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8127f202>] ? pci_device_probe+0x112/0x120
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8132f348>] ? driver_probe_device+0x68/0x220
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8132f5bb>] ? __driver_attach+0x7b/0x80
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8132f540>] ? __device_attach+0x40/0x40
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8132d633>] ? bus_for_each_dev+0x53/0x90
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8132ead8>] ? bus_add_driver+0x1d8/0x270
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8132fb6a>] ? driver_register+0x6a/0x140
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffffa07a7000>] ? 0xffffffffa07a6fff
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff8100210a>] ? do_one_initcall+0x10a/0x160
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810b56c2>] ? load_module+0x1bf2/0x24a0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810b2340>] ? m_show+0x1c0/0x1c0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff810b6001>] ? SyS_init_module+0x91/0xc0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006]  [<ffffffff81481469>] ? system_call_fastpath+0x16/0x1b
Dec 21 00:44:20 ycradnileved kernel: [ 3839.484006] Code: 00 00 48 ff c8 75 fb 48 ff c8 c3 0f 1f 80 00 00 00 00 48 8b 05 51 87 60 00 ff e0 0f 1f 80 00 00 00 00 65 48 8b 14 25 20 3b 01 00 <48> 8d 0c 12 48 c1 e2 06 48 8d 04 bd 00 00 00 00 48 29 ca f7 e2 
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] NMI backtrace for cpu 1
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11-2-amd64 #1 Debian 3.11.10-1
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] Hardware name: LG Electronics R510/QL8, BIOS QL8L3B42 07/30/2008
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] task: ffff8800beeea780 ti: ffff8800bef0a000 task.ti: ffff8800bef0a000
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] RIP: 0010:[<ffffffff8103a56e>]  [<ffffffff8103a56e>] mwait_idle_with_hints+0x5e/0x70
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] RSP: 0018:ffff8800bef0be30  EFLAGS: 00000046
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] RAX: 0000000000000020 RBX: ffff8800bee7cc00 RCX: 0000000000000001
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000020
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] RBP: ffff8800bee7ccd4 R08: ffff8800bef0bfd8 R09: 0000000000000018
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] R10: 0000000000000c01 R11: 0000000000000007 R12: ffffffffa0225090
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] R13: 0000000000000003 R14: ffff8800b91f5200 R15: ffffffffa02251b0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] FS:  0000000000000000(0000) GS:ffff8800bf880000(0000) knlGS:0000000000000000
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] CR2: 00007fe04b6180c0 CR3: 000000000180c000 CR4: 00000000000007e0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] Stack:
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  ffffffffa0221e40 0000000000000019 ffff8800b91f5200 ffff8800bef0bea0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  ffffffffa0225090 0000037df316a58e 0000000000000003 ffffffffa02251b0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  ffffffff81360c7b 0000000000000003 0000000000000003 0000000000000000
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] Call Trace:
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  [<ffffffffa0221e40>] ? acpi_idle_enter_bm+0x1a7/0x1fb [processor]
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  [<ffffffff81360c7b>] ? cpuidle_enter_state+0x3b/0xc0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  [<ffffffff81360dbb>] ? cpuidle_idle_call+0xbb/0x1f0
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  [<ffffffff8101aac5>] ? arch_cpu_idle+0x5/0x30
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  [<ffffffff810a096e>] ? cpu_startup_entry+0xde/0x280
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055]  [<ffffffff8103b892>] ? start_secondary+0x1d2/0x230
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] Code: 89 d1 48 2d c8 1f 00 00 0f 01 c8 0f ae f0 65 48 8b 04 25 30 c8 00 00 48 8b 80 38 e0 ff ff a8 08 75 09 48 89 f8 48 89 f1 0f 01 c9 <f3> c3 41 0f ae b8 38 e0 ff ff eb bd 66 0f 1f 44 00 00 65 8b 14 
Dec 21 00:44:20 ycradnileved kernel: [ 3839.488055] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 3.455 msecs
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:5556]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] Modules linked in: nouveau(+) parport_pc ppdev lp parport rfcomm bnep uinput binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc fuse joydev uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev coretemp media kvm_intel kvm iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi lpc_ich snd_hda_codec_realtek psmouse microcode snd_hda_intel snd_hda_codec snd_hwdep pcspkr mxm_wmi mfd_core i2c_i801 snd_pcm battery serio_raw evdev wmi acpi_cpufreq video iwlwifi ac mperf snd_page_alloc button btusb ttm drm_kms_helper drm i2c_algo_bit i2c_core snd_seq processor snd_seq_device cfg80211 bluetooth snd_timer snd soundcore rfkill ext4 crc16 mbcache jbd2 sg ums_realtek usb_storage sr_mod sd_mod cdrom crc_t10dif ahci libahci libata scsi_mod thermal thermal_sys r8169 mii ehci_pci uhci_hcd ehci_hcd usbcore usb_common [last unloaded: nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] CPU: 0 PID: 5556 Comm: modprobe Not tainted 3.11-2-amd64 #1 Debian 3.11.10-1
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] Hardware name: LG Electronics R510/QL8, BIOS QL8L3B42 07/30/2008
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] task: ffff8800855e8080 ti: ffff8800bb2a6000 task.ti: ffff8800bb2a6000
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] RIP: 0010:[<ffffffffa041044d>]  [<ffffffffa041044d>] init_mask+0x2d/0x290 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] RSP: 0018:ffff8800bb2a7870  EFLAGS: 00000246
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] RAX: ffff8800857a7c80 RBX: 000000000000d9d5 RCX: 0000000008000000
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] RDX: 0000000000000000 RSI: ffff880037014c00 RDI: ffff8800857a7c80
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] RBP: ffff8800bb2a78c8 R08: 000000000000d9f0 R09: 0000000000000020
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] R10: ffff8800857a7c80 R11: 00000000000000f0 R12: ffff8800857a7c80
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] R13: ffff8800bb2a78c8 R14: 02c900120149d9d5 R15: 00000000e6f5528a
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] FS:  00007f29516aa700(0000) GS:ffff8800bf800000(0000) knlGS:0000000000000000
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] CR2: 0000000000f922e0 CR3: 00000000bb633000 CR4: 00000000000007f0
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] Stack:
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  ffff8800ffffffff ffff8800bb2a78c8 0000000000000000 ffff8800857a7c80
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  ffff8800bb2a78c8 ffff88009a56c0c0 ffffffffa0415045 ffff8800857a7c80
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  ffff8800bb2a78c8 ffffffffa0415508 0000000000000002 ffff88009a56c0c0
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] Call Trace:
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa0415045>] ? nvbios_exec+0x35/0xd0 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa0415508>] ? nvbios_init+0x98/0x150 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa041a875>] ? nv50_devinit_init+0x45/0x190 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa0409c72>] ? nouveau_object_inc+0xb2/0x1a0 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa042d478>] ? nouveau_devobj_ctor+0x1d8/0x6a0 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa0409618>] ? nouveau_object_ctor+0x28/0xe0 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa0409ee0>] ? nouveau_object_new+0x180/0x240 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa045e41e>] ? nouveau_drm_load+0x15e/0x600 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa0252e4a>] ? drm_get_minor+0x1ca/0x260 [drm]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa0255035>] ? drm_get_pci_dev+0x155/0x270 [drm]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa045eab2>] ? nouveau_drm_probe+0x1f2/0x280 [nouveau]
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8127dfe4>] ? local_pci_probe+0x34/0x60
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8127f202>] ? pci_device_probe+0x112/0x120
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8132f348>] ? driver_probe_device+0x68/0x220
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8132f5bb>] ? __driver_attach+0x7b/0x80
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8132f540>] ? __device_attach+0x40/0x40
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8132d633>] ? bus_for_each_dev+0x53/0x90
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8132ead8>] ? bus_add_driver+0x1d8/0x270
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8132fb6a>] ? driver_register+0x6a/0x140
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffffa07a7000>] ? 0xffffffffa07a6fff
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff8100210a>] ? do_one_initcall+0x10a/0x160
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff810b56c2>] ? load_module+0x1bf2/0x24a0
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff810b2340>] ? m_show+0x1c0/0x1c0
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff810b6001>] ? SyS_init_module+0x91/0xc0
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006]  [<ffffffff81481469>] ? system_call_fastpath+0x16/0x1b
Dec 21 00:44:45 ycradnileved kernel: [ 3864.072006] Code: 41 55 41 89 cd 41 54 41 89 d4 55 48 89 fd 53 89 f3 83 e3 fc 48 83 ec 08 48 8b 7f 08 48 8b 47 10 48 85 c0 48 0f 44 c7 48 8b 70 08 <48> 85 f6 48 0f 44 f0 0f b6 45 24 83 be e0 00 00 00 4f 76 14 85 
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:5556]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] Modules linked in: nouveau(+) parport_pc ppdev lp parport rfcomm bnep uinput binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc fuse joydev uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev coretemp media kvm_intel kvm iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi lpc_ich snd_hda_codec_realtek psmouse microcode snd_hda_intel snd_hda_codec snd_hwdep pcspkr mxm_wmi mfd_core i2c_i801 snd_pcm battery serio_raw evdev wmi acpi_cpufreq video iwlwifi ac mperf snd_page_alloc button btusb ttm drm_kms_helper drm i2c_algo_bit i2c_core snd_seq processor snd_seq_device cfg80211 bluetooth snd_timer snd soundcore rfkill ext4 crc16 mbcache jbd2 sg ums_realtek usb_storage sr_mod sd_mod cdrom crc_t10dif ahci libahci libata scsi_mod thermal thermal_sys r8169 mii ehci_pci uhci_hcd ehci_hcd usbcore usb_common [last unloaded: nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] CPU: 0 PID: 5556 Comm: modprobe Not tainted 3.11-2-amd64 #1 Debian 3.11.10-1
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] Hardware name: LG Electronics R510/QL8, BIOS QL8L3B42 07/30/2008
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] task: ffff8800855e8080 ti: ffff8800bb2a6000 task.ti: ffff8800bb2a6000
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] RIP: 0010:[<ffffffffa040a7ce>]  [<ffffffffa040a7ce>] nv_printk_+0x13e/0x1c0 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] RSP: 0018:ffff8800bb2a7838  EFLAGS: 00000282
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] RAX: 0000000000000000 RBX: ffffffffa049cd02 RCX: ffffffffa049c8d1
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] RDX: 0000000000000005 RSI: ffffffffa049caf4 RDI: ffff8800857a7c80
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] RBP: ffff8800bb2a7858 R08: 000000000000d9e3 R09: 0000000000000020
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] R10: ffff8800857a7c80 R11: 00000000000000f0 R12: ffff8800bb2a7806
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] R13: ffff8800857a7c80 R14: ffffffffa040cd4b R15: 0000000000000018
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] FS:  00007f29516aa700(0000) GS:ffff8800bf800000(0000) knlGS:0000000000000000
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] CR2: 0000000000f922e0 CR3: 00000000bb633000 CR4: 00000000000007f0
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] Stack:
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  ffff8800bb2a78c8 000000000000e820 00000000ffffffff ffff88009a56c0c0
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  0000000080000000 ffffffffa0410750 ffff88000000e820 ffff8800ffffffff
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  ffff880080000000 ffff8800bb2a78c8 0000000000000000 ffff8800857a7c80
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] Call Trace:
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0410750>] ? init_nv_reg+0xa0/0xe0 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0415045>] ? nvbios_exec+0x35/0xd0 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0415508>] ? nvbios_init+0x98/0x150 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa041a875>] ? nv50_devinit_init+0x45/0x190 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0409c72>] ? nouveau_object_inc+0xb2/0x1a0 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa042d478>] ? nouveau_devobj_ctor+0x1d8/0x6a0 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0409618>] ? nouveau_object_ctor+0x28/0xe0 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0409ee0>] ? nouveau_object_new+0x180/0x240 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa045e41e>] ? nouveau_drm_load+0x15e/0x600 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0252e4a>] ? drm_get_minor+0x1ca/0x260 [drm]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa0255035>] ? drm_get_pci_dev+0x155/0x270 [drm]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa045eab2>] ? nouveau_drm_probe+0x1f2/0x280 [nouveau]
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8127dfe4>] ? local_pci_probe+0x34/0x60
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8127f202>] ? pci_device_probe+0x112/0x120
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8132f348>] ? driver_probe_device+0x68/0x220
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8132f5bb>] ? __driver_attach+0x7b/0x80
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8132f540>] ? __device_attach+0x40/0x40
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8132d633>] ? bus_for_each_dev+0x53/0x90
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8132ead8>] ? bus_add_driver+0x1d8/0x270
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8132fb6a>] ? driver_register+0x6a/0x140
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffffa07a7000>] ? 0xffffffffa07a6fff
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff8100210a>] ? do_one_initcall+0x10a/0x160
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff810b56c2>] ? load_module+0x1bf2/0x24a0
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff810b2340>] ? m_show+0x1c0/0x1c0
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff810b6001>] ? SyS_init_module+0x91/0xc0
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006]  [<ffffffff81481469>] ? system_call_fastpath+0x16/0x1b
Dec 21 00:45:13 ycradnileved kernel: [ 3892.072006] Code: 8d 84 24 78 01 00 00 48 89 44 24 28 e8 dc 4d c9 e0 48 8b 84 24 70 01 00 00 65 48 33 04 25 28 00 00 00 75 7f 48 81 c4 a8 01 00 00 <5b> 41 5c 41 5d 41 5e 5d c3 66 0f 1f 84 00 00 00 00 00 44 3b a7

Comment 1 Darcy Brás da Silva 2013-12-21 03:29:10 UTC

Created attachment 91070 [details]
full syslog attachment

Comment 2 Ilia Mirkin 2013-12-21 03:53:15 UTC

It's obviously getting stuck somewhere in the vbios execution logic. I should have thought of this while we were talking on IRC, but can you try this again with

modprobe nouveau modeset=1 debug=trace

That should produce a bunch more logs that show exactly what is being executed. Hopefully.

Comment 3 Darcy Brás da Silva 2013-12-23 06:25:05 UTC

Hi, sorry for not being able to reply earlier. I am now attaching a syslog.1 with the debug=trace flag.
Feel free to ask me for any testing/data that may help you help me. :D
A copy of the log may also be found at http://cidadecool.com/z-tunes/debian/problem/syslog.1-debug=trace

PS: Dec 23 [around 5-6AM]

Comment 4 Darcy Brás da Silva 2013-12-23 06:27:40 UTC

Created attachment 91141 [details]
syslog with debug=trace flag

Comment 5 Ilia Mirkin 2013-12-23 16:45:23 UTC

Not sure why you assigned this to yourself... you shouldn't need to touch any of those fields.

Your latest log doesn't seem to contain any messages from nouveau. Did you remember to unload nouveau before running the modprobe command I had indicated?

Perhaps the logs didn't hit disk... not sure. Maybe waiting longer before rebooting would allow the flush to go through or something. (Like a minute.)

Comment 6 Darcy Brás da Silva 2013-12-25 06:53:47 UTC

Hi, I have followed all the previous steps and simply added the debug=trace flag.
I also waited for at least 2 minutes before rebooting. I will try a second run at it though. Thanks for the heads up.

Comment 7 Darcy Brás da Silva 2013-12-25 07:09:20 UTC

Created attachment 91182 [details]
syslog with debug=trace flag, more wait time to hit the disk

I think it hit the disk this time. the date is DEC25.
I am sorry if I seem slow in getting back to you, but I am replying as fast as I can :). I hope that does not put you off.

As usual the log file is also available under http://cidadecool.com/z-tunes/debian/problem/syslog.1-debug=trace-DEC25-

Comment 8 Ionut Radu 2014-01-04 17:36:23 UTC

It looks like I have the same issue. I'm a Fedora user,
and I was able to boot kernel 3.6.10 from the Fedora 18 live image, but I am not able to boot the Fedora 19 and Fedora 20 live images.
I'm also not able to boot kernel-3.13.0:

https://bugzilla.redhat.com/show_bug.cgi?id=1026073

Thanks,
Ionut Radu.

Comment 9 Ilia Mirkin 2014-01-06 00:00:02 UTC

(In reply to comment #7)
> Created attachment 91182 [details]
> syslog with debug=trace flag, more wait time to hit the disk
> 
> I think it hit the disk this time. the date is DEC25.

The only messages on Dec 25 are from 3.2. Perhaps you have a second computer and can use netconsole to send the messages?

It'd be really useful to get a log with the trace since that should immediately identify the failure.

Comment 10 Ilia Mirkin 2014-01-06 00:58:22 UTC

Created attachment 91531 [details] [review]
make jump execution conditional

Please try this patch, I'm pretty sure it will help things out. The problem VBIOS has the following snippet:

0xd9d0: 74 64 00                                       TIME     0x0064
0xd9d3: 75 10                                          CONDITION        0x10
0xd9d5: 38                                             NOT
0xd9d6: 6e 24 e8 00 00 ff ff ff ff 00 00 20 00         NV_REG   R[0x00e824] &= 0xffffffff |= 0x00200000
0xd9e3: 6e 20 e8 00 00 ff ff ff ff 00 00 00 80         NV_REG   R[0x00e820] &= 0xffffffff |= 0x80000000
0xd9f0: 6e 18 e8 00 00 ff ff ff ff 00 00 00 08         NV_REG   R[0x00e818] &= 0xffffffff |= 0x08000000
0xd9fd: 6e 18 e8 00 00 ff ff ff 7f 00 00 00 00         NV_REG   R[0x00e818] &= 0x7fffffff |= 0x00000000
0xda0a: 6e 18 e8 00 00 ff ff ff 7f 00 00 00 80         NV_REG   R[0x00e818] &= 0x7fffffff |= 0x80000000
0xda17: 74 64 00                                       TIME     0x0064
0xda1a: 5c d0 d9                                       JUMP     0xd9d0
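
As a reading aid for the listing above, here is a minimal Python sketch of how one NV_REG line maps from raw bytes to the decoded form. The byte layout (one opcode byte 0x6e followed by three little-endian 32-bit words: register, mask, data) is inferred from the listing itself, and the helper names decode_nv_reg and apply_nv_reg are hypothetical, not part of nouveau or envytools. The read-modify-write step is assumed from the "&= ... |= ..." notation.

```python
import struct

def decode_nv_reg(raw):
    """Split a 13-byte NV_REG instruction into (register, mask, data).

    Layout assumed from the listing: opcode byte, then three
    little-endian 32-bit words.
    """
    opcode, reg, mask, data = struct.unpack("<BIII", raw)
    if opcode != 0x6e:
        raise ValueError(f"not an NV_REG opcode: {opcode:#04x}")
    return reg, mask, data

def apply_nv_reg(old_value, mask, data):
    """Assumed read-modify-write: keep the masked bits, then OR in data."""
    return (old_value & mask) | data

# Bytes copied from the line at 0xda0a in the listing.
raw = bytes.fromhex("6e 18 e8 00 00 ff ff ff 7f 00 00 00 80")
reg, mask, data = decode_nv_reg(raw)
print(f"R[{reg:#08x}] &= {mask:#010x} |= {data:#010x}")

# Effect on an arbitrary example register value (made up for illustration).
print(f"{apply_nv_reg(0x12345678, mask, data):#010x}")
```

Read this way, the last two NV_REG lines first clear and then set bit 31 of R[0x00e818] on every pass through the loop.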

With the old code, the JUMP was always executed and so there was no way to break out of the loop. The new code makes JUMP conditional the same way NV_REG/etc are.
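
To illustrate why an unconditional JUMP cannot leave this loop, here is a toy Python model of a script interpreter. It is a sketch under stated assumptions, not nouveau's init.c: it assumes a single "execute" flag that CONDITION clears when the condition is not met and that NOT inverts, uses list indices instead of VBIOS byte offsets as jump targets, and treats running out of a step budget as a hang. The names run, SCRIPT and becomes_true_on_fourth_poll are hypothetical.

```python
# Toy model of an init-script interpreter. Hypothetical sketch, not nouveau's code.
# Jump targets are list indices here rather than VBIOS byte offsets.

def run(script, condition_met, jump_is_conditional, max_steps=1000):
    """Return the list of register writes, or None if the step budget runs out."""
    pc = 0            # index of the next instruction
    execute = True    # assumed flag: side effects are skipped while it is False
    polls = 0         # how many times CONDITION has been evaluated
    writes = []
    for _ in range(max_steps):
        if pc >= len(script):
            return writes              # fell off the end: the script terminated
        op, *args = script[pc]
        pc += 1
        if op == "CONDITION":
            if not condition_met(polls):
                execute = False        # assumed: a failed condition disables execution
            polls += 1
        elif op == "NOT":
            execute = not execute      # assumed: invert the flag
        elif op == "NV_REG":
            if execute:
                writes.append(tuple(args))
        elif op == "JUMP":
            if execute or not jump_is_conditional:
                pc = args[0]
        # TIME is a delay and has no effect in this model
    return None                        # budget exhausted: treated as a hang


# Same shape as the snippet above: poll, invert, poke registers, jump back.
SCRIPT = [
    ("TIME", 0x0064),
    ("CONDITION", 0x10),
    ("NOT",),
    ("NV_REG", 0x00e824, 0xffffffff, 0x00200000),
    ("NV_REG", 0x00e820, 0xffffffff, 0x80000000),
    ("NV_REG", 0x00e818, 0xffffffff, 0x08000000),
    ("NV_REG", 0x00e818, 0x7fffffff, 0x00000000),
    ("NV_REG", 0x00e818, 0x7fffffff, 0x80000000),
    ("TIME", 0x0064),
    ("JUMP", 0),
]

def becomes_true_on_fourth_poll(polls):
    # Stand-in for hardware state: the condition holds from the fourth poll on.
    return polls >= 3

old = run(SCRIPT, becomes_true_on_fourth_poll, jump_is_conditional=False)
new = run(SCRIPT, becomes_true_on_fourth_poll, jump_is_conditional=True)
print("old behaviour:", "never terminates" if old is None else f"{len(old)} writes")
print("new behaviour:", "never terminates" if new is None else f"{len(new)} writes")
```

In this model the script acts as "repeat the register writes until condition 0x10 holds". With the conditional JUMP the run ends on the first pass where the condition is met; with the unconditional JUMP the flag just keeps flipping and control always returns to the top.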

Comment 11 Ionut Radu 2014-01-07 19:17:23 UTC

Created attachment 91612 [details]
vmcore log excerpt


Hi,

For me the issue is not fixed.
Please see the attached vmcore-excerpt.log.
For vmcore, please check:

https://www.dropbox.com/sh/e77p700zr8g1v4z/y3ldY3npQB

thanks,
Ionut Radu.

Comment 12 Ilia Mirkin 2014-01-07 19:24:37 UTC

(In reply to comment #11)
> Created attachment 91612 [details]
> vmcore log excerpt
> 
> 
> Hi,
> 
> For me the issue is not fixed.
> Please see the attached vmcore-excerpt.log.

That's very unfortunate. Can you please triple-check that you booted a kernel with that patch applied? Assuming that you did, mind adding "nouveau.debug=trace" to the kernel cmdline? That should reveal where it's looping. (Or perhaps the condition just never becomes true?) Do note that it will produce *vast* amounts of log lines if it is indeed looping the way I think it is...

Comment 13 Ilia Mirkin 2014-01-07 19:28:48 UTC

Ionut, also, please upload a copy of your vbios (see http://nouveau.freedesktop.org/wiki/DumpingVideoBios/ for instructions on retrieving it... your issue might be different than Darcy's)

Comment 14 Ionut Radu 2014-01-07 20:01:10 UTC

Hi Ilia,

It's very likely to be the same issue. I have the same graphics card
as Darcy and a slightly different vbios version.
I'm sure I have applied the patch correctly. In fact the original kernel is 3.13.0...fc21, while I have compiled it on Fedora 20 and it got the fc20 suffix.
I'll try to obtain a vbios.rom file.
Regarding debug=trace, I have issues with journalctl flooding, and the vmcore log
gets truncated with no useful information added.
Can't you use the vmcore to debug the issue? It should contain enough information.

thanks,
Ionut Radu.

Comment 15 Ilia Mirkin 2014-01-08 01:28:04 UTC

Well, I downloaded

https://www.dropbox.com/sh/e77p700zr8g1v4z/kb_av1ZSS6/127.0.0.1-2014.01.07-20%3A43%3A35

Which includes

kernel-debuginfo-common-x86_64-3.13.0-0.rc6.git0.1.fc20.x86_64.rpm

Which contains the source. I can only assume that this is the source that you built from. And the drivers/gpu/drm/nouveau/core/subdev/bios/init.c in there does not appear to have the patch applied to init_jump.

So I'd like to ask again... can you check that the kernel you're booting is the kernel that has the patch applied to it? Am I misunderstanding the situation? You could prove it to yourself by modifying some common print, for example, that you would be able to identify from kernel messages. Or you could just build the kernel directly without fancy tools that obscure these things.

(I did try loading the nouveau.ko.debug in gdb, but I think it _only_ contains the debug symbols, since the init_jump function was full of add    %al,(%rax) instructions which I think is just opcode 0.)

Comment 16 Darcy Brás da Silva 2014-01-08 01:45:51 UTC

Unfortunately I don't have another computer available at the hours when I can test this. I will try to get one during this week, but I am very likely to need some assistance with getting the messages out using netconsole. Regarding the patch, to which source tree should I try to apply it? Was that meant for me at all?

Thanks in advance

Comment 17 Darcy Brás da Silva 2014-01-08 05:25:01 UTC

Hi, applying the patch provided by Ilia Mirkin to Linux kernel 3.13.0-rc6 fixes the reported bug/behavior on my side.

*/me is impressed with Ilia Mirkin's dedication*
Thanks a million.

Comment 18 Ionut Radu 2014-01-08 21:18:38 UTC

Hi Ilia,

I was wrong. The fix is good in my case too.
Great work. Thanks a lot.

Regards,
Ionut.

Comment 19 Ionut Radu 2014-01-09 09:08:19 UTC


So after all it was the same issue, as expected.
