Created attachment 88189: Kernel panic log. I regularly get kernel panics on my Alienware M14x R2, running an up-to-date Arch Linux: Linux alienarch 3.11.6-1-ARCH #1 SMP PREEMPT Fri Oct 18 23:22:36 CEST 2013 x86_64 GNU/Linux. Nouveau is at 9.2.2. The kernel panic log is attached. What else can I do to help? Thanks.
Created attachment 88193: Kernel log vw. Updated log; the previous one was not OK.
From the code in the trace, looks like node->mem is somehow null. Can you supply a full dmesg?
Created attachment 88210: Full log. Looking at the full log now, I can see a lot of other things that should be taken into account. Sorry for trimming the information earlier.
This can't be good. Happens after switcheroo turns the card off and it goes to D3cold. I suspect the crash is related to this.

------------[ cut here ]------------
WARNING: CPU: 6 PID: 401 at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xd0()
Watchdog detected hard LOCKUP on cpu 6
Modules linked in: joydev uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media nls_cp437 vfat fat snd_hda_codec_hdmi snd_hda_codec_ca0132 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crc32_pclmul crc32c_intel snd_hda_intel ghash_clmulni_intel aesni_intel aes_x86_64 snd_hda_codec lrw gf128mul glue_helper arc4 dell_wmi snd_hwdep snd_pcm sparse_keymap snd_page_alloc ath9k ath9k_common ath9k_hw ath mac80211 nouveau cfg80211 mxm_wmi iTCO_wdt rfkill ttm psmouse iTCO_vendor_support snd_timer atl1c serio_raw rtsx_pci_ms memstick ablk_helper cryptd snd fan microcode thermal wmi mperf ac evdev shpchp processor pcspkr soundcore battery lpc_ich i2c_i801 mei_me mei ext4 crc16 mbcache jbd2 hid_generic usbhid hid sr_mod sd_mod cdrom rtsx_pci_sdmmc ahci libahci libata sdhci_pci sdhci ehci_pci xhci_hcd ehci_hcd scsi_mod mmc_core rtsx_pci usbcore usb_common i915 video button i2c_algo_bit intel_agp intel_gtt drm_kms_helper drm i2c_core
CPU: 6 PID: 401 Comm: Xorg Not tainted 3.11.6-1-ARCH #1
Hardware name: Alienware M14xR2/M14xR2, BIOS A10 06/29/2012
 0000000000000009 ffff88025f386c10 ffffffff814dba02 ffff88025f386c58
 ffff88025f386c48 ffffffff8106193d ffff880253688000 0000000000000000
 ffff88025f386d78 0000000000000000 ffff88025f386ef8 ffff88025f386ca8
Call Trace:
 <NMI> [<ffffffff814dba02>] dump_stack+0x54/0x8d
 [<ffffffff8106193d>] warn_slowpath_common+0x7d/0xa0
 [<ffffffff810619ac>] warn_slowpath_fmt+0x4c/0x50
 [<ffffffff8101c665>] ? native_sched_clock+0x15/0x80
 [<ffffffff8101c6d9>] ? sched_clock+0x9/0x10
 [<ffffffff810e9950>] ? watchdog_enable_all_cpus.part.2+0x40/0x40
 [<ffffffff810e99ec>] watchdog_overflow_callback+0x9c/0xd0
 [<ffffffff8112962e>] __perf_event_overflow+0x8e/0x2b0
 [<ffffffff811284b7>] ? perf_event_update_userpage+0xe7/0x160
 [<ffffffff8112a1e4>] perf_event_overflow+0x14/0x20
 [<ffffffff8103072d>] intel_pmu_handle_irq+0x1bd/0x3c0
 [<ffffffff814e489b>] perf_event_nmi_handler+0x2b/0x50
 [<ffffffff814e3ea1>] nmi_handle.isra.3+0xa1/0x1d0
 [<ffffffff814e4139>] do_nmi+0x169/0x340
 [<ffffffff814e34f1>] end_repeat_nmi+0x1e/0x2e
 [<ffffffff81298a12>] ? ioread32+0x42/0x50
 [<ffffffff81298a12>] ? ioread32+0x42/0x50
 [<ffffffff81298a12>] ? ioread32+0x42/0x50
 <<EOE>> [<ffffffffa078d7cb>] ? nv04_timer_read+0x3b/0x70 [nouveau]
 [<ffffffffa078d574>] nouveau_timer_wait_eq+0x74/0xd0 [nouveau]
 [<ffffffffa076f362>] nv84_bar_flush+0x52/0x90 [nouveau]
 [<ffffffffa0790892>] nvc0_vm_flush+0x42/0x1a0 [nouveau]
 [<ffffffffa079061c>] ? nvc0_vm_map+0xfc/0x110 [nouveau]
 [<ffffffffa078e1c5>] nouveau_vm_map_at+0x165/0x1d0 [nouveau]
 [<ffffffffa078e243>] nouveau_vm_map+0x13/0x20 [nouveau]
 [<ffffffffa07cb09c>] nouveau_bo_move_ntfy+0xbc/0xd0 [nouveau]
 [<ffffffffa06b0f1e>] ttm_bo_handle_move_mem+0x20e/0x5c0 [ttm]
 [<ffffffffa06b19b9>] ? ttm_bo_mem_space+0x179/0x360 [ttm]
 [<ffffffffa06b1f97>] ttm_bo_move_buffer+0x117/0x130 [ttm]
 [<ffffffff8120364d>] ? proc_alloc_inode+0x1d/0xb0
 [<ffffffffa06b203a>] ttm_bo_validate+0x8a/0x100 [ttm]
 [<ffffffffa07cc5cc>] nouveau_bo_validate+0x1c/0x20 [nouveau]
 [<ffffffffa07ce159>] validate_list+0x69/0x310 [nouveau]
 [<ffffffffa07cf4ca>] nouveau_gem_ioctl_pushbuf+0x9aa/0x1560 [nouveau]
 [<ffffffff814df7ce>] ? mutex_unlock+0xe/0x10
 [<ffffffffa00111a2>] drm_ioctl+0x532/0x660 [drm]
 [<ffffffff81072aa7>] ? kill_pid_info+0x47/0x60
 [<ffffffff811b1c05>] do_vfs_ioctl+0x2e5/0x4d0
 [<ffffffff810711a2>] ? __set_task_blocked+0x32/0x70
 [<ffffffff811a15ee>] ? ____fput+0xe/0x10
 [<ffffffff811b1e71>] SyS_ioctl+0x81/0xa0
 [<ffffffff814e665e>] ? do_page_fault+0xe/0x10
 [<ffffffff814ea5dd>] system_call_fastpath+0x1a/0x1f
---[ end trace 7fcf10949e51422c ]---

Then, when turning the card back on,

nouveau E[ VM][0000:01:00.0] vm timeout 1: 0xbadf1200 1

Which probably leaves the vm uninitialized (?), and the BUG which happens due to node->mem being NULL:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
IP: [<ffffffffa07887ab>] nv50_instobj_wr32+0x2b/0xc0 [nouveau]
PGD 24b816067 PUD 2524ef067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: joydev uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media nls_cp437 vfat fat snd_hda_codec_hdmi snd_hda_codec_ca0132 x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crc32_pclmul crc32c_intel snd_hda_intel ghash_clmulni_intel aesni_intel aes_x86_64 snd_hda_codec lrw gf128mul glue_helper arc4 dell_wmi snd_hwdep snd_pcm sparse_keymap snd_page_alloc ath9k ath9k_common ath9k_hw ath mac80211 nouveau cfg80211 mxm_wmi iTCO_wdt rfkill ttm psmouse iTCO_vendor_support snd_timer atl1c serio_raw rtsx_pci_ms memstick ablk_helper cryptd snd fan microcode thermal wmi mperf ac evdev shpchp processor pcspkr soundcore battery lpc_ich i2c_i801 mei_me mei ext4 crc16 mbcache jbd2 hid_generic usbhid hid sr_mod sd_mod cdrom rtsx_pci_sdmmc ahci libahci libata sdhci_pci sdhci ehci_pci xhci_hcd ehci_hcd scsi_mod mmc_core rtsx_pci usbcore usb_common i915 video button i2c_algo_bit intel_agp intel_gtt drm_kms_helper drm i2c_core [last unloaded: coretemp]
CPU: 3 PID: 375 Comm: bumblebeed Tainted: G W 3.11.6-1-ARCH #1
Hardware name: Alienware M14xR2/M14xR2, BIOS A10 06/29/2012
task: ffff88024f82a1c0 ti: ffff88025251a000 task.ti: ffff88025251a000
RIP: 0010:[<ffffffffa07887ab>] [<ffffffffa07887ab>] nv50_instobj_wr32+0x2b/0xc0 [nouveau]
RSP: 0018:ffff88025251bc60 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff880252d8d900 RCX: ffffffffa0814ac0
RDX: 00000000ffeefeff RSI: 0000000000000000 RDI: ffff88024ee58060
RBP: ffff88025251bc90 R08: 0000000000000000 R09: ffffffff8116b8ca
R10: ffff88025251bfd8 R11: 0000000000000001 R12: ffff88024ee58060
R13: 00000ffffff00000 R14: 00000000ffeefeff R15: 0000000000000000
FS: 00007fc9e4f01700(0000) GS:ffff88025f2c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e0 CR3: 000000025230f000 CR4: 00000000001407e0
Stack:
 0000000000008000 0000000000000004 ffff88024ee58060 ffff880252d8d970
 ffff880252d8d920 0000000000000000 ffff88025251bcc0 ffffffffa0787f84
 ffff880252d8d900 0000000000000000 ffff88025354b000 0000000000000000
Call Trace:
 [<ffffffffa0787f84>] nouveau_instmem_init+0x84/0xc0 [nouveau]
 [<ffffffffa07880be>] _nouveau_instmem_init+0xe/0x10 [nouveau]
 [<ffffffffa076dffd>] nouveau_object_inc+0xbd/0x1b0 [nouveau]
 [<ffffffffa07937c5>] nouveau_device_init+0x25/0xa0 [nouveau]
 [<ffffffffa076dffd>] nouveau_object_inc+0xbd/0x1b0 [nouveau]
 [<ffffffffa076dfd7>] nouveau_object_inc+0x97/0x1b0 [nouveau]
 [<ffffffffa076c79b>] nouveau_handle_init+0x7b/0x230 [nouveau]
 [<ffffffffa076c831>] nouveau_handle_init+0x111/0x230 [nouveau]
 [<ffffffffa076b162>] nouveau_client_init+0x32/0x60 [nouveau]
 [<ffffffffa07c6744>] nouveau_do_resume+0x64/0x130 [nouveau]
 [<ffffffffa07c6870>] nouveau_pmops_resume+0x60/0x70 [nouveau]
 [<ffffffffa07c96c0>] nouveau_switcheroo_set_state+0x90/0xb0 [nouveau]
 [<ffffffff81371a95>] vga_switchon+0x35/0x50
 [<ffffffff81372328>] vga_switcheroo_debugfs_write+0x368/0x3b0
 [<ffffffff8119fafd>] vfs_write+0xbd/0x1e0
 [<ffffffff811a0559>] SyS_write+0x49/0xa0
 [<ffffffff814ea5dd>] system_call_fastpath+0x1a/0x1f
Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 89 d6 41 55 49 bd 00 00 f0 ff ff 0f 00 00 41 54 53 48 83 ec 08 48 8b 47 48 48 8b 5f 10 <48> 03 b0 e0 00 00 00 4c 8d a3 90 00 00 00 4c 89 e7 49 21 f5 81
RIP [<ffffffffa07887ab>] nv50_instobj_wr32+0x2b/0xc0 [nouveau]
RSP <ffff88025251bc60>
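For anyone wanting to double-check the NULL-pointer reading against the oops itself, here is a minimal Python sketch that decodes only the faulting instruction bytes from the Code: line (the ones starting at the <48> marker) and adds the displacement to the base register value from the register dump. The helper name decode_add_r64_mem_disp32 is made up for this illustration, and the decoder deliberately handles just this one x86-64 encoding (REX.W + 03 /r with a 32-bit displacement); it is not a general disassembler and says nothing about which struct field sits at that offset.

```python
import struct

# 64-bit register names indexed by their 3-bit ModRM encoding (no REX.R/REX.B extension).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_add_r64_mem_disp32(code: bytes):
    """Decode exactly one form: REX.W (0x48) + opcode 0x03 (ADD r64, r/m64)
    with ModRM mod=10 (base register + disp32) and no SIB byte.

    Returns (destination register, base register, displacement).
    Hypothetical helper written for this comment, not part of any library.
    """
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex != 0x48 or opcode != 0x03:
        raise ValueError("only REX.W + 03 /r is handled in this sketch")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only the [base + disp32] form without SIB is handled")
    (disp,) = struct.unpack_from("<i", code, 3)  # little-endian signed 32-bit
    return REG64[reg], REG64[rm], disp


# Bytes copied from the Code: line, starting at the <48> marker.
faulting = bytes.fromhex("4803b0e0000000")
dst, base, disp = decode_add_r64_mem_disp32(faulting)

# RAX as printed in the register dump of the oops.
registers = {"rax": 0x0000000000000000}
fault_addr = registers[base] + disp

print(f"add {dst}, [{base}+{disp:#x}] -> faulting address {fault_addr:#x}")
```

The computed address matches CR2 (00000000000000e0) and the address in the BUG line, which is consistent with a NULL pointer in RAX being dereferenced at offset 0xe0.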
Can you check whether this still happens with recent kernels? With 3.13.x the card should automatically power on/off as needed.
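A rough way to gather what is being asked for here, sketched in Python: compare the running kernel release against 3.13 and read the PCI runtime power-management status of the card. The function names are made up for this sketch; the device address 0000:01:00.0 is taken from the nouveau error line in the log above, and the sysfs path assumes a kernel that exposes the standard power/runtime_status attribute for PCI devices.

```python
import platform
import re
from pathlib import Path


def parse_release(release: str) -> tuple:
    """Turn a release string such as '3.11.6-1-ARCH' into (3, 11, 6)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in m.groups(default="0"))


def is_at_least(release: str, minimum=(3, 13)) -> bool:
    """True if the major.minor part of the release is >= minimum."""
    return parse_release(release)[:2] >= minimum


def runtime_status(device="0000:01:00.0", sysfs_root="/sys/bus/pci/devices"):
    """Return the runtime PM status string for a PCI device, or None if unreadable."""
    path = Path(sysfs_root) / device / "power" / "runtime_status"
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    release = platform.release()
    print("kernel release:", release)
    print("3.13 or newer:", is_at_least(release))
    print("runtime PM status of 0000:01:00.0:", runtime_status())
```

On the 3.11.6-1-ARCH kernel from the original report, is_at_least returns False, so a retest would need a newer kernel first.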
No response to retest request over a year ago