Bug 97438 - Running a lot of Firefox instances causes kernel page fault.
Status: REOPENED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2016-08-22 17:07 UTC by accelrtr
Modified: 2017-12-31 12:46 UTC



Attachments
dmesg output from startup with showed blocked processes (75.54 KB, text/plain)
2016-08-22 17:07 UTC, accelrtr

Description accelrtr 2016-08-22 17:07:55 UTC
Created attachment 125951
dmesg output from startup with showed blocked processes

I noticed that applications such as Firefox, mvp and other Qt5 apps sometimes stop launching. Applications that are already running are not affected.

Steps to reproduce:

Using Firefox is the fastest way to reproduce the bug.

1. Launch Firefox instances until the kernel oopses. The last instance freezes.

[  398.040470] resource sanity check: requesting [mem 0xf9fcf000-0xfa035fff], which spans more than 0000:01:00.0 [mem 0xf8000000-0xf9ffffff 64bit]
[  398.040518] caller nv50_instobj_boot+0xaa/0x120 [nouveau] mapping multiple BARs
[  398.040539] BUG: unable to handle kernel paging request at ffffc90000d30000
[  398.040567] IP: [<ffffffff813a73d1>] iowrite32+0x31/0x40
[  398.040581] PGD 12b085067 PUD 12b086067 PMD c9657067 PTE 0
[  398.040595] Oops: 0002 [#1] PREEMPT SMP
[  398.040600] Modules linked in: fuse nouveau wmi atl1c video pcspkr fbcon bitblit softcursor font i2c_algo_bit backlight drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea ttm drm agpgart fb fbdev
[  398.040663] CPU: 1 PID: 4704 Comm: firefox Not tainted 4.7.2-gentoo #1
[  398.040670] Hardware name: Gigabyte Technology Co., Ltd. G41M-Combo/G41M-Combo, BIOS FA 02/29/2012
[  398.040678] task: ffff88003c14a640 ti: ffff8801070b4000 task.ti: ffff8801070b4000
[  398.040685] RIP: 0010:[<ffffffff813a73d1>]  [<ffffffff813a73d1>] iowrite32+0x31/0x40
[  398.040695] RSP: 0018:ffff8801070b7838  EFLAGS: 00010292
[  398.040700] RAX: ffffffffa020b940 RBX: 0000000000010000 RCX: 0000000000000000
[  398.040707] RDX: ffffc90000d30000 RSI: ffffc90000d30000 RDI: 000000003defa281
[  398.040713] RBP: ffff8801070b7840 R08: 0000000000000067 R09: 000000003dec9000
[  398.040719] R10: ffff880046925900 R11: 000000003dec9000 R12: 0000000000000000
[  398.040726] R13: 000000003defa281 R14: 0000000000000100 R15: ffff88012a7daf40
[  398.040733] FS:  00007f1e9b8d9740(0000) GS:ffff88012fa80000(0000) knlGS:0000000000000000
[  398.040739] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  398.040745] CR2: ffffc90000d30000 CR3: 0000000119d6f000 CR4: 00000000000406e0
[  398.040750] Stack:
[  398.040754]  ffffffffa017cb6f ffff8801070b7898 ffffffffa0182663 ffff880046925900
[  398.040766]  ffff88005e153198 000000003df1a001 000000b000000000 0000000000000067
[  398.040776]  0000000000000000 0000000000002036 ffff88012a07ba00 ffff88005e153198
[  398.040787] Call Trace:
[  398.040811]  [<ffffffffa017cb6f>] ? nvkm_instobj_wr32+0xf/0x20 [nouveau]
[  398.040836]  [<ffffffffa0182663>] nv50_vm_map+0x153/0x1a0 [nouveau]
[  398.040860]  [<ffffffffa0180917>] nvkm_vm_map_at+0xd7/0x1c0 [nouveau]
[  398.040884]  [<ffffffffa017d7e6>] nv50_instobj_map+0x16/0x20 [nouveau]
[  398.040908]  [<ffffffffa017da50>] nv50_instobj_boot+0xc0/0x120 [nouveau]
[  398.040933]  [<ffffffffa017d966>] nv50_instobj_acquire+0x46/0x70 [nouveau]
[  398.040960]  [<ffffffffa017cae2>] nvkm_instobj_acquire_slow+0x12/0x30 [nouveau]
[  398.040984]  [<ffffffffa017ce26>] nvkm_instobj_new+0x56/0x150 [nouveau]
[  398.041003]  [<ffffffffa0140100>] nvkm_memory_new+0x30/0x50 [nouveau]
[  398.041010]  [<ffffffffa013ef99>] nvkm_gpuobj_new+0x119/0x200 [nouveau]
[  398.041010]  [<ffffffffa01b4657>] nv50_gr_chan_bind+0x27/0x60 [nouveau]
[  398.041010]  [<ffffffffa0140ee2>] nvkm_object_bind+0x12/0x20 [nouveau]
[  398.041010]  [<ffffffffa01a8804>] g84_fifo_chan_engine_ctor+0x34/0x40 [nouveau]
[  398.041010]  [<ffffffffa01a7b3f>] nvkm_fifo_chan_child_new+0x14f/0x270 [nouveau]
[  398.041010]  [<ffffffffa013fc53>] nvkm_ioctl_new+0x123/0x280 [nouveau]
[  398.041010]  [<ffffffff81197839>] ? ___slab_alloc+0x1e9/0x4a0
[  398.041010]  [<ffffffffa01a74cc>] ? nvkm_fifo_chan_child_get+0x5c/0x110 [nouveau]
[  398.041010]  [<ffffffffa01a79f0>] ? nvkm_fifo_chan_dtor+0xd0/0xd0 [nouveau]
[  398.041010]  [<ffffffffa0141570>] ? nvkm_object_new_+0x60/0x60 [nouveau]
[  398.041010]  [<ffffffffa013fffc>] nvkm_ioctl+0x1cc/0x240 [nouveau]
[  398.041010]  [<ffffffffa01d71ed>] nvkm_client_ioctl+0xd/0x10 [nouveau]
[  398.041010]  [<ffffffffa013d03d>] nvif_object_ioctl+0x3d/0x50 [nouveau]
[  398.041010]  [<ffffffffa013d59d>] nvif_object_init+0xbd/0x130 [nouveau]
[  398.041010]  [<ffffffffa01eae51>] nouveau_abi16_ioctl_grobj_alloc+0x141/0x2d0 [nouveau]
[  398.041010]  [<ffffffffa00276bd>] drm_ioctl+0x13d/0x560 [drm]
[  398.041010]  [<ffffffffa01ead10>] ? nouveau_abi16_ioctl_channel_free+0x90/0x90 [nouveau]
[  398.041010]  [<ffffffff81091fd5>] ? preempt_count_add+0xa5/0xc0
[  398.041010]  [<ffffffff818288eb>] ? _raw_spin_unlock_irqrestore+0x1b/0x30
[  398.041010]  [<ffffffffa01d4e32>] nouveau_drm_ioctl+0x62/0xb0 [nouveau]
[  398.041010]  [<ffffffff811b8b1b>] do_vfs_ioctl+0x8b/0x590
[  398.041010]  [<ffffffff811b9094>] SyS_ioctl+0x74/0x80
[  398.041010]  [<ffffffff81828e72>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[  398.041010] Code: 00 48 89 f2 77 25 48 81 fe 00 00 01 00 76 07 0f b7 d6 89 f8 ef c3 55 48 c7 c6 1c 72 bb 81 48 89 d7 48 89 e5 e8 51 fe ff ff 5d c3 <89> 3e c3 66 90 66 2e 0f 1f 84 00 00 00 00 00 48 81 ff ff ff 03 
[  398.041010] RIP  [<ffffffff813a73d1>] iowrite32+0x31/0x40
[  398.041010]  RSP <ffff8801070b7838>
[  398.041010] CR2: ffffc90000d30000
[  398.041010] ---[ end trace 0441d3b445912110 ]---
[  598.401939] sysrq: SysRq : Show Blocked State
[  598.401946]   task                        PC stack   pid father
[  598.402183] firefox         D ffff8801070b73f8 12256  4704   4699 0x00000002
[  598.402193]  ffff8801070b73f8 0000000000000000 ffff88012ab61980 ffff88003c14a640
[  598.402197]  ffff8801070b7420 ffff8801070b8000 ffff88002674d15c ffff88003c14a640
[  598.402200]  00000000ffffffff ffff88002674d160 ffff8801070b7410 ffffffff81824dea
[  598.402204] Call Trace:
[  598.402212]  [<ffffffff81824dea>] schedule+0x3a/0x90
[  598.402215]  [<ffffffff81825233>] schedule_preempt_disabled+0x13/0x20
[  598.402217]  [<ffffffff81826c50>] __mutex_lock_slowpath+0x90/0x120
[  598.402220]  [<ffffffff81826cf2>] mutex_lock+0x12/0x30
[  598.402250]  [<ffffffffa01d52b0>] nouveau_drm_preclose+0x30/0x90 [nouveau]
[  598.402259]  [<ffffffffa0025fc3>] drm_release+0xb3/0x4c0 [drm]
[  598.402263]  [<ffffffff811a7a89>] __fput+0xc9/0x1d0
[  598.402265]  [<ffffffff811a7bc9>] ____fput+0x9/0x10
[  598.402269]  [<ffffffff8108a18c>] task_work_run+0x7c/0xa0
[  598.402272]  [<ffffffff81070fe7>] do_exit+0x367/0xbf0
[  598.402276]  [<ffffffff8102fa07>] oops_end+0x97/0xd0
[  598.402279]  [<ffffffff8105ef7d>] no_context+0x10d/0x360
[  598.402282]  [<ffffffff8105f247>] __bad_area_nosemaphore+0x77/0x1b0
[  598.402285]  [<ffffffff8105f38f>] bad_area_nosemaphore+0xf/0x20
[  598.402287]  [<ffffffff8105fa38>] __do_page_fault+0x88/0x510
[  598.402291]  [<ffffffff810ba2b7>] ? print_time.part.13+0x67/0x90
[  598.402293]  [<ffffffff810ba341>] ? print_prefix+0x61/0xa0
[  598.402295]  [<ffffffff8105fee2>] do_page_fault+0x22/0x30
[  598.402298]  [<ffffffff8182ae88>] page_fault+0x28/0x30
[  598.402303]  [<ffffffff813a73d1>] ? iowrite32+0x31/0x40
[  598.402322]  [<ffffffffa017cb6f>] ? nvkm_instobj_wr32+0xf/0x20 [nouveau]
[  598.402341]  [<ffffffffa0182663>] nv50_vm_map+0x153/0x1a0 [nouveau]
[  598.402361]  [<ffffffffa0180917>] nvkm_vm_map_at+0xd7/0x1c0 [nouveau]
[  598.402380]  [<ffffffffa017d7e6>] nv50_instobj_map+0x16/0x20 [nouveau]
[  598.402399]  [<ffffffffa017da50>] nv50_instobj_boot+0xc0/0x120 [nouveau]
[  598.402418]  [<ffffffffa017d966>] nv50_instobj_acquire+0x46/0x70 [nouveau]
[  598.402437]  [<ffffffffa017cae2>] nvkm_instobj_acquire_slow+0x12/0x30 [nouveau]
[  598.402456]  [<ffffffffa017ce26>] nvkm_instobj_new+0x56/0x150 [nouveau]
[  598.402470]  [<ffffffffa0140100>] nvkm_memory_new+0x30/0x50 [nouveau]
[  598.402484]  [<ffffffffa013ef99>] nvkm_gpuobj_new+0x119/0x200 [nouveau]
[  598.402502]  [<ffffffffa01b4657>] nv50_gr_chan_bind+0x27/0x60 [nouveau]
[  598.402516]  [<ffffffffa0140ee2>] nvkm_object_bind+0x12/0x20 [nouveau]
[  598.402535]  [<ffffffffa01a8804>] g84_fifo_chan_engine_ctor+0x34/0x40 [nouveau]
[  598.402553]  [<ffffffffa01a7b3f>] nvkm_fifo_chan_child_new+0x14f/0x270 [nouveau]
[  598.402567]  [<ffffffffa013fc53>] nvkm_ioctl_new+0x123/0x280 [nouveau]
[  598.402570]  [<ffffffff81197839>] ? ___slab_alloc+0x1e9/0x4a0
[  598.402589]  [<ffffffffa01a74cc>] ? nvkm_fifo_chan_child_get+0x5c/0x110 [nouveau]
[  598.402607]  [<ffffffffa01a79f0>] ? nvkm_fifo_chan_dtor+0xd0/0xd0 [nouveau]
[  598.402622]  [<ffffffffa0141570>] ? nvkm_object_new_+0x60/0x60 [nouveau]
[  598.402636]  [<ffffffffa013fffc>] nvkm_ioctl+0x1cc/0x240 [nouveau]
[  598.402652]  [<ffffffffa01d71ed>] nvkm_client_ioctl+0xd/0x10 [nouveau]
[  598.402665]  [<ffffffffa013d03d>] nvif_object_ioctl+0x3d/0x50 [nouveau]
[  598.402678]  [<ffffffffa013d59d>] nvif_object_init+0xbd/0x130 [nouveau]
[  598.402694]  [<ffffffffa01eae51>] nouveau_abi16_ioctl_grobj_alloc+0x141/0x2d0 [nouveau]
[  598.402700]  [<ffffffffa00276bd>] drm_ioctl+0x13d/0x560 [drm]
[  598.402715]  [<ffffffffa01ead10>] ? nouveau_abi16_ioctl_channel_free+0x90/0x90 [nouveau]
[  598.402719]  [<ffffffff81091fd5>] ? preempt_count_add+0xa5/0xc0
[  598.402722]  [<ffffffff818288eb>] ? _raw_spin_unlock_irqrestore+0x1b/0x30
[  598.402738]  [<ffffffffa01d4e32>] nouveau_drm_ioctl+0x62/0xb0 [nouveau]
[  598.402741]  [<ffffffff811b8b1b>] do_vfs_ioctl+0x8b/0x590
[  598.402744]  [<ffffffff811b9094>] SyS_ioctl+0x74/0x80
[  598.402747]  [<ffffffff81828e72>] entry_SYSCALL_64_fastpath+0x1a/0xa4


3. Try to switch VT. The mouse pointer disappears, Xorg freezes, and switching terminals is no longer possible.


This bug is also reproducible on the 4.4.6-gentoo kernel.
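
The `resource sanity check` line in the oops above can be verified by hand. Here is a minimal Python sketch, assuming only the message format shown in this report (the helper name `check_span` is hypothetical and not part of any kernel tool). It parses the requested range and the BAR range and reports how far the request runs past the end of the BAR:

```python
import re

# Matches the two "[mem 0xSTART-0xEND" ranges in a "resource sanity check" line.
RANGE_RE = re.compile(r"\[mem 0x([0-9a-f]+)-0x([0-9a-f]+)")

def check_span(line):
    """Return (request_size, bar_size, overrun) in bytes for one dmesg line."""
    ranges = RANGE_RE.findall(line)
    if len(ranges) != 2:
        raise ValueError("expected a requested range and a BAR range")
    (req_start, req_end), (bar_start, bar_end) = [
        (int(s, 16), int(e, 16)) for s, e in ranges
    ]
    request_size = req_end - req_start + 1
    bar_size = bar_end - bar_start + 1
    # Bytes of the request that lie beyond the last byte of the BAR.
    overrun = max(0, req_end - bar_end)
    return request_size, bar_size, overrun

line = ("resource sanity check: requesting [mem 0xf9fcf000-0xfa035fff], "
        "which spans more than 0000:01:00.0 [mem 0xf8000000-0xf9ffffff 64bit]")
size, bar, over = check_span(line)
print(hex(size), hex(bar), hex(over))  # 0x67000 0x2000000 0x36000
```

For this trace the request is 0x67000 bytes long; it starts inside the 32 MiB BAR but ends 0x36000 bytes past it.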
Comment 1 accelrtr 2016-09-05 15:13:42 UTC
I destroyed my card. Please, delete this bug.
Comment 2 Ilia Mirkin 2016-09-05 15:18:13 UTC
Closing as requested.
Comment 3 freedesktop.loser137 2016-09-12 16:15:10 UTC
I have the same problem. Qt5 applications such as kontact, dolphin or systemsettings5 fail to start, and the following oops appears in dmesg:
[  279.775857] resource sanity check: requesting [mem 0xcff02000-0xd0001fff], which spans more than 0000:01:00.0 [mem 0xce000000-0xcfffffff 64bit pref]
[  279.775907] caller nv50_instobj_boot+0xac/0x120 [nouveau] mapping multiple BARs
[  279.776152] BUG: unable to handle kernel paging request at ffffc900009d0000
[  279.776461] IP: [<ffffffff81304b2e>] iowrite32+0x2e/0x40
[  279.776831] PGD 1334a0067 PUD 1334a1067 PMD 7f948067 PTE 0
[  279.779156] Oops: 0002 [#1] PREEMPT SMP
[  279.781473] Modules linked in: rfcomm bnep xt_owner iptable_filter btusb sha256_ssse3 sha256_generic cbc btrtl btbcm dm_crypt btintel bluetooth dm_mod hid_generic usbhid uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media uas hid arc4 iwlmvm mac80211 mousedev intel_powerclamp joydev coretemp acer_wmi kvm_intel iTCO_wdt iTCO_vendor_support sparse_keymap kvm iwlwifi irqbypass crc32c_intel mei_me mei snd_hda_codec_hdmi lpc_ich cfg80211 rfkill snd_hda_codec_generic intel_cstate psmouse input_leds led_class pcspkr serio_raw intel_ips fjes snd_hda_intel snd_hda_codec ac thermal battery intel_agp snd_hda_core shpchp intel_gtt evdev snd_hwdep snd_pcm i2c_i801 mac_hid acpi_cpufreq tpm_tis tpm sch_fq_codel snd_seq_dummy usbip_host usbip_core loop snd_seq_oss snd_seq_midi_event
[  279.783730]  snd_seq snd_seq_device snd_timer snd soundcore cuse fuse sg vhba(O) ip_tables x_tables ext4 crc16 jbd2 mbcache usb_storage sr_mod cdrom sd_mod atkbd libps2 ahci libahci libata ehci_pci ehci_hcd scsi_mod usbcore usb_common i8042 serio tg3 ptp pps_core broadcom bcm_phy_lib libphy nouveau button video mxm_wmi wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm
[  279.785630] CPU: 2 PID: 1722 Comm: systemsettings5 Tainted: G           O    4.7.2-1-ARCH #1
[  279.786280] Hardware name: Acer             TravelMate8572TG /BAP50-CP        , BIOS V1.27   04/25/2011
[  279.786929] task: ffff880049dcb400 ti: ffff88003e8e4000 task.ti: ffff88003e8e4000
[  279.787581] RIP: 0010:[<ffffffff81304b2e>]  [<ffffffff81304b2e>] iowrite32+0x2e/0x40
[  279.788238] RSP: 0018:ffff88003e8e7948  EFLAGS: 00010292
[  279.788896] RAX: ffffffffa01e0240 RBX: 0000000000010000 RCX: 0000000000000009
[  279.789558] RDX: 000000003dffb081 RSI: ffffc900009d0000 RDI: 000000003dffb081
[  279.790224] RBP: ffff88003e8e7950 R08: 0000000000000100 R09: 000000003defd000
[  279.790893] R10: ffff88003c137400 R11: 000000003defd000 R12: 0000000000000000
[  279.791555] R13: 000000003dffb081 R14: 0000000000000010 R15: ffff88007f93a440
[  279.792225] FS:  00007f1fab406180(0000) GS:ffff880137d00000(0000) knlGS:0000000000000000
[  279.792890] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  279.793556] CR2: ffffc900009d0000 CR3: 000000003c488000 CR4: 00000000000006e0
[  279.794237] Stack:
[  279.794910]  ffffffffa014cec4 ffff88003e8e79a8 ffffffffa0152e57 ffff88003c137400
[  279.795589]  ffff88004a293918 000000003dffd001 0000000000000000 0000000000000100
[  279.796283]  0000000000000000 0000000000002002 ffff8800b9b56000 ffff88004a293918
[  279.796966] Call Trace:
[  279.797668]  [<ffffffffa014cec4>] ? nvkm_instobj_wr32+0x14/0x20 [nouveau]
[  279.798344]  [<ffffffffa0152e57>] nv50_vm_map+0x157/0x1a0 [nouveau]
[  279.799032]  [<ffffffffa0150f6d>] nvkm_vm_map_at+0xdd/0x1d0 [nouveau]
[  279.799722]  [<ffffffffa014dc8b>] nv50_instobj_map+0x1b/0x20 [nouveau]
[  279.800412]  [<ffffffffa014def2>] nv50_instobj_boot+0xc2/0x120 [nouveau]
[  279.801100]  [<ffffffffa014de0b>] nv50_instobj_acquire+0x4b/0x70 [nouveau]
[  279.801789]  [<ffffffffa014cdf7>] nvkm_instobj_acquire_slow+0x17/0x30 [nouveau]
[  279.802482]  [<ffffffffa014d19a>] nvkm_instobj_new+0x6a/0x180 [nouveau]
[  279.803166]  [<ffffffffa01075d4>] nvkm_memory_new+0x44/0x80 [nouveau]
[  279.803859]  [<ffffffffa0151a6f>] nvkm_vm_get+0x15f/0x250 [nouveau]
[  279.804553]  [<ffffffffa01af152>] nouveau_bo_vma_add+0x32/0xa0 [nouveau]
[  279.805239]  [<ffffffffa01ad26e>] ? nouveau_bo_map+0x6e/0xb0 [nouveau]
[  279.805932]  [<ffffffffa01c0612>] nouveau_channel_prep+0x1d2/0x290 [nouveau]
[  279.806634]  [<ffffffffa01c0727>] nouveau_channel_new+0x57/0x6e0 [nouveau]
[  279.807334]  [<ffffffffa0104a2d>] ? nvif_device_init+0x2d/0x30 [nouveau]
[  279.808029]  [<ffffffff811d71b0>] ? kmem_cache_alloc_trace+0x1c0/0x1e0
[  279.808744]  [<ffffffffa01bf824>] nouveau_abi16_ioctl_channel_alloc+0xe4/0x340 [nouveau]
[  279.809456]  [<ffffffffa00059a2>] drm_ioctl+0x152/0x540 [drm]
[  279.810171]  [<ffffffffa01bf740>] ? nouveau_abi16_ioctl_setparam+0x10/0x10 [nouveau]
[  279.810891]  [<ffffffffa01a8554>] nouveau_drm_ioctl+0x74/0xc0 [nouveau]
[  279.811598]  [<ffffffff8120cd72>] do_vfs_ioctl+0xa2/0x5d0
[  279.812318]  [<ffffffff81217e07>] ? __fget+0x77/0xb0
[  279.813030]  [<ffffffff8120d319>] SyS_ioctl+0x79/0x90
[  279.813744]  [<ffffffff815de7b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[  279.814454] Code: ff ff 03 00 77 25 48 81 fe 00 00 01 00 76 07 0f b7 d6 89 f8 ef c3 55 48 89 f7 48 c7 c6 f2 37 73 81 48 89 e5 e8 54 fe ff ff 5d c3 <89> 3e c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 81 ff 
[  279.815989] RIP  [<ffffffff81304b2e>] iowrite32+0x2e/0x40
[  279.816741]  RSP <ffff88003e8e7948>
[  279.817490] CR2: ffffc900009d0000
[  279.822806] ---[ end trace ec9eb859880e3d78 ]---

Like accelrtr, I cannot switch to a VT afterwards.
Should I upload additional logs here or open a new bug?
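
The `Oops: 0002` value in these traces is the x86 page-fault error code. As a rough aid for reading it, the sketch below decodes the low bits using the standard x86 bit meanings; the helper name `decode_oops` is made up for illustration.

```python
# Low bits of the x86 page-fault error code, as printed after "Oops:".
# Each entry: (bit mask, meaning when set, meaning when clear or None).
FLAGS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]

def decode_oops(code_hex):
    """Describe a page-fault error code given as the hex string from dmesg."""
    code = int(code_hex, 16)
    parts = []
    for bit, if_set, if_clear in FLAGS:
        text = if_set if code & bit else if_clear
        if text:
            parts.append(text)
    return parts

print(decode_oops("0002"))
# ['page not present', 'write access', 'kernel mode']
```

That reading agrees with the `PTE 0` entry and the faulting `iowrite32` in the trace: a kernel-mode write to an address with no page mapped.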
Comment 4 Reuben 2016-09-21 01:24:48 UTC
Same issue.

Kernel 4.7.4
Mesa 12.0.3

[   35.241925] resource sanity check: requesting [mem 0xfbfbb000-0xfc020fff], which spans more than 0000:0a:00.0 [mem 0xfa000000-0xfbffffff 64bit]
[   35.241956] caller nv50_instobj_boot+0x94/0x100 [nouveau] mapping multiple BARs
[   35.241969] BUG: unable to handle kernel paging request at ffffc900039f0000
[   35.241981] IP: [<ffffffff81478f6b>] iowrite32+0x2b/0x40
[   35.241994] PGD 437025067 PUD 437026067 PMD 432772067 PTE 0
[   35.242001] Oops: 0002 [#1] PREEMPT SMP
[   35.242003] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_raw ip6table_mangle ip6table_security iptable_nat iptable_raw iptable_mangle iptable_security ebtable_filter ebtables hid_logitech_hidpp wacom hid_logitech_dj nouveau snd_hda_codec_realtek snd_hda_codec_generic k10temp snd_hda_intel snd_hda_codec 8250 snd_hwdep r8169 snd_rme96 snd_hda_core ttm 8250_base i2c_piix4 serial_core mii wmi sch_fq_codel tipc snd_aloop snd_pcm
[   35.242061] CPU: 2 PID: 1546 Comm: kded4 Not tainted 4.7.4-gentoo #1
[   35.242065] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8 02/24/2011
[   35.242069] task: ffff8803a49e8b80 ti: ffff88039f654000 task.ti: ffff88039f654000
[   35.242076] RIP: 0010:[<ffffffff81478f6b>]  [<ffffffff81478f6b>] iowrite32+0x2b/0x40
[   35.242082] RSP: 0018:ffff88039f657a48  EFLAGS: 00010292
[   35.242084] RAX: ffffffffa021ef20 RBX: 0000000000010000 RCX: 0000000000000009
[   35.242088] RDX: ffffc900039f0000 RSI: ffffc900039f0000 RDI: 000000001df13281
[   35.242092] RBP: 000000001df13281 R08: 0000000000000066 R09: 000000001dece000
[   35.242095] R10: ffffffffa0220060 R11: 000000001dece000 R12: 0000000000000000
[   35.242099] R13: 0000000000000008 R14: 0000000000000100 R15: ffff8804303089c0
[   35.242103] FS:  00007fe55c501800(0000) GS:ffff880447c80000(0000) knlGS:0000000000000000
[   35.242107] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   35.242110] CR2: ffffc900039f0000 CR3: 00000003a4bfa000 CR4: 00000000000006e0
[   35.242114] Stack:
[   35.242115]  ffffffffa019679b 000000001df33001 ffff8800b788a618 0000000000000010
[   35.242121]  0000000000000000 ffff88040f56af00 0000000000000066 0000000000000000
[   35.242127]  0000000000002021 ffff880433ca3000 0000000000000000 ffff8800b788a618
[   35.242133] Call Trace:
[   35.242148]  [<ffffffffa019679b>] ? nv50_vm_map+0x15b/0x1a0 [nouveau]
[   35.242164]  [<ffffffffa01949ec>] ? nvkm_vm_map_at+0xcc/0x1e0 [nouveau]
[   35.242179]  [<ffffffffa0191bea>] ? nv50_instobj_boot+0xaa/0x100 [nouveau]
[   35.242195]  [<ffffffffa0191b1e>] ? nv50_instobj_acquire+0x3e/0x60 [nouveau]
[   35.242210]  [<ffffffffa0190bce>] ? nvkm_instobj_acquire_slow+0xe/0x20 [nouveau]
[   35.242225]  [<ffffffffa0190f12>] ? nvkm_instobj_new+0x52/0x140 [nouveau]
[   35.242249]  [<ffffffffa015430c>] ? nvkm_memory_new+0x2c/0x60 [nouveau]
[   35.242260]  [<ffffffffa015320f>] ? nvkm_gpuobj_new+0x10f/0x200 [nouveau]
[   35.242271]  [<ffffffffa01528ee>] ? nvkm_engine_ref+0x2e/0x60 [nouveau]
[   35.242284]  [<ffffffffa01c8082>] ? nv50_gr_chan_bind+0x22/0x60 [nouveau]
[   35.242298]  [<ffffffffa01bb984>] ? nvkm_fifo_chan_child_new+0x124/0x240 [nouveau]
[   35.242309]  [<ffffffffa0153ade>] ? nvkm_ioctl_new+0x11e/0x260 [nouveau]
[   35.242321]  [<ffffffffa01fe3b6>] ? nouveau_abi16_ioctl_notifierobj_alloc+0x1b6/0x2a0 [nouveau]
[   35.242336]  [<ffffffffa01bb860>] ? nvkm_fifo_chan_dtor+0xc0/0xc0 [nouveau]
[   35.242347]  [<ffffffffa01557a0>] ? nvkm_object_new_+0x60/0x60 [nouveau]
[   35.242358]  [<ffffffffa0154117>] ? nvkm_ioctl+0xf7/0x240 [nouveau]
[   35.242371]  [<ffffffffa01ea353>] ? usif_ioctl+0x4f3/0x760 [nouveau]
[   35.242383]  [<ffffffffa01e7761>] ? nouveau_drm_ioctl+0xa1/0xc0 [nouveau]
[   35.242390]  [<ffffffff811b42aa>] ? do_vfs_ioctl+0x8a/0x5a0
[   35.242394]  [<ffffffff8110e6a8>] ? audit_filter_inodes+0xe8/0x100
[   35.242397]  [<ffffffff8110e0c2>] ? audit_filter_syscall+0xa2/0x100
[   35.242401]  [<ffffffff811b482f>] ? SyS_ioctl+0x6f/0x80
[   35.242405]  [<ffffffff81002691>] ? do_syscall_64+0x51/0x100
[   35.242409]  [<ffffffff8193393c>] ? entry_SYSCALL64_slow_path+0x25/0x25
[   35.242412] Code: 48 81 fe ff ff 03 00 48 89 f2 77 1f 48 81 fe 00 00 01 00 76 07 0f b7 d6 89 f8 ef c3 48 c7 c6 0b 81 be 81 48 89 d7 e9 55 fe ff ff <89> 3e c3 0f 1f 84 00 00 00 00 00 66 2e 0f 1f 84 00 00 00 00 00 
[   35.245837] RIP  [<ffffffff81478f6b>] iowrite32+0x2b/0x40
[   35.245841]  RSP <ffff88039f657a48>
[   35.245843] CR2: ffffc900039f0000
[   35.252148] ---[ end trace a033f52a09912e38 ]---
Comment 5 Reuben 2016-09-21 01:28:19 UTC
I can rebuild the kernel and libraries with debugging enabled if that would provide useful information.
Comment 6 Reuben 2016-10-08 21:12:49 UTC
This seems to be limited to NV50. I have the exact same setup on other systems with newer GPUs that do not exhibit this problem.
Comment 7 Ben Skeggs 2016-10-10 17:48:38 UTC
Give this[1] patch a try, it might be relevant.

[1] https://github.com/skeggsb/nouveau/commit/8135089922f84c4ef7b47aeaa707490a2ee29a1d
Comment 8 freedesktop.loser137 2016-10-14 06:31:43 UTC
I applied the patch from comment 7 to kernel 4.8.1 and the crashes are gone. There are new messages in dmesg:
[  230.257319] nouveau 0000:01:00.0: imem: PRAMIN exhausted
[  239.329904] nouveau 0000:01:00.0: imem: PRAMIN exhausted
Comment 9 Ben Skeggs 2016-10-14 06:51:55 UTC
(In reply to freedesktop.loser137 from comment #8)
> I applied the patch from comment 7 to kernel 4.8.1 and the crashes are gone.
> There are new messages in dmesg:
> [  230.257319] nouveau 0000:01:00.0: imem: PRAMIN exhausted
> [  239.329904] nouveau 0000:01:00.0: imem: PRAMIN exhausted

Yep, that's expected.  It's a limited resource, and it's all been used.  The driver will fall back to another (slower) method of accessing the data that it needs to map there.
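
The behaviour described above (a fixed-size window that can run out, with a slower path as the fallback) can be illustrated with a toy model. This is not the nouveau code; the names `Window` and `map_object` and the sizes used are hypothetical.

```python
class Window:
    """Toy model of a fixed-size mapping window (all names are hypothetical)."""

    def __init__(self, size):
        self.size = size
        self.used = 0

    def try_map(self, length):
        """Reserve `length` bytes; return the offset, or None if exhausted."""
        if self.used + length > self.size:
            return None
        offset = self.used
        self.used += length
        return offset


def map_object(window, length):
    """Prefer the fast window; fall back to a slow path when it is full."""
    offset = window.try_map(length)
    if offset is None:
        # Stands in for reporting "exhausted" rather than mapping past the end.
        return ("slow", None)
    return ("fast", offset)


win = Window(size=0x2000)
print(map_object(win, 0x1000))  # ('fast', 0)
print(map_object(win, 0x1000))  # ('fast', 4096)
print(map_object(win, 0x1000))  # ('slow', None)
```

The point of the sketch is only the bounds check in `try_map`: a request that does not fit is refused and handled elsewhere instead of being mapped beyond the end of the window.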
Comment 10 Maris Nartiss 2017-02-26 11:24:49 UTC
(In reply to Ben Skeggs from comment #7)
> Give this[1] patch a try, it might be relevant.
> 
> [1] https://github.com/skeggsb/nouveau/commit/8135089922f84c4ef7b47aeaa707490a2ee29a1d

I experienced a few crashes with similar-looking backtraces while many Firefox and KDE5 windows (Okular et al.) were open. I applied this patch to kernel 4.9.6 and have not had any issues since. I had no reliable way to reproduce the crash, so I cannot confirm the patch's effectiveness with 100% certainty, but at least it does not seem to cause any bad side effects.

Tested on ~AMD64 Gentoo with G98M [Quadro NVS 160M]

