Bug 90569 - [NV67] GUI freezes after startup of Ubuntu 15.04. on Aspire 7520p [BUG: unable to handle kernel paging request at f84c8000]
Status: RESOLVED INVALID
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium blocker
Assignee: Nouveau Project
QA Contact: Xorg Project Team
 
Reported: 2015-05-21 21:43 UTC by Jedet Ilrihm
Modified: 2016-02-23 07:39 UTC (History)

Attachments
System log file of 7 BUGs in nouveau (2.53 MB, text/plain)
2015-05-21 21:43 UTC, Jedet Ilrihm

Description Jedet Ilrihm 2015-05-21 21:43:01 UTC
Created attachment 115959
System log file of 7 BUGs in nouveau

Problem description:
--------------------
The GUI freezes. No user interaction (mouse, keyboard) is possible, and only a hard shutdown recovers the machine.

System specifications:
----------------------
- Operating system: Ubuntu 15.04
- Computer hardware: Acer Aspire 7520p
- Graphics card: Nvidia 7000M

Steps to reproduce:
-------------------
1. Start Ubuntu and log in
2. Wait, or accelerate the freeze by
2.1. starting firefox
2.2. starting a terminal

Why does it look like a nouveau bug?
------------------------------------
I see something like the following in /var/log/syslog every time the GUI freezes. (I compared the seconds on the clock with the log entries and they always match.)
============================================================================
Apr 29 22:16:06 h kernel: [   78.633151] BUG: unable to handle kernel paging request at f84c8000
Apr 29 22:16:06 h kernel: [   78.633265] IP: [<f8fb6be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
Apr 29 22:16:06 h kernel: [   78.633430] *pdpt = 0000000001b79001 *pde = 0000000034c88067 *pte = 0000000000000000 
Apr 29 22:16:06 h kernel: [   78.633546] Oops: 0002 [#1] SMP 
Apr 29 22:16:06 h kernel: [   78.633596] Modules linked in: ctr ccm rfcomm bnep snd_hda_codec_realtek snd_hda_codec_generic arc4 acer_wmi sparse_keymap snd_hda_intel snd_hda_controller kvm_amd snd_hda_codec snd_hwdep kvm nouveau ath5k ath snd_pcm mxm_wmi ttm mac80211 joydev serio_raw edac_core cfg80211 drm_kms_helper snd_seq_midi k8temp edac_mce_amd snd_seq_midi_event drm r852 sm_common nand nand_ecc nand_bch i2c_algo_bit bch shpchp nand_ids snd_rawmidi mtd r592 memstick btusb snd_seq bluetooth snd_seq_device snd_timer snd ir_lirc_codec lirc_dev ir_xmp_decoder soundcore ir_mce_kbd_decoder ir_sharp_decoder ir_sanyo_decoder ir_sony_decoder ir_jvc_decoder ir_rc6_decoder ir_rc5_decoder ir_nec_decoder rc_rc6_mce ene_ir rc_core mac_hid wmi video i2c_nforce2 parport_pc ppdev lp parport autofs4 pata_acpi firewire_ohci sdhci_pci psmouse forcedeth sdhci ahci firewire_core libahci crc_itu_t pata_amd
Apr 29 22:16:06 h kernel: [   78.634843] CPU: 1 PID: 777 Comm: Xorg Not tainted 3.19.0-15-generic #15-Ubuntu
Apr 29 22:16:06 h kernel: [   78.634942] Hardware name: Acer             Aspire 7520     /Fuquene, BIOS V1.30 12/31/2007
Apr 29 22:16:06 h kernel: [   78.635055] task: edd70000 ti: ec07e000 task.ti: ec07e000
Apr 29 22:16:06 h kernel: [   78.635127] EIP: 0060:[<f8fb6be9>] EFLAGS: 00213246 CPU: 1
Apr 29 22:16:06 h kernel: [   78.635246] EIP is at nouveau_bo_wr32+0x29/0x60 [nouveau]
Apr 29 22:16:06 h kernel: [   78.635317] EAX: c0164400 EBX: 00000000 ECX: 00044130 EDX: f84c8000
Apr 29 22:16:06 h kernel: [   78.635399] ESI: f84b8000 EDI: 00000000 EBP: ec07fd94 ESP: ec07fd8c
Apr 29 22:16:06 h kernel: [   78.635481]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Apr 29 22:16:06 h kernel: [   78.635552] CR0: 8005003b CR2: f84c8000 CR3: 32fa1000 CR4: 000007f0
Apr 29 22:16:06 h kernel: [   78.635633] Stack:
Apr 29 22:16:06 h kernel: [   78.635661]  f4231044 c0168000 ec07fdfc f8fc398b 00000001 00000000 a0ee6df4 ec161800
Apr 29 22:16:06 h kernel: [   78.635783]  01b20000 00000000 00000000 00000000 00000020 00000000 f4c01d01 edd9cf80
Apr 29 22:16:06 h kernel: [   78.635904]  ee280000 e94ce400 f22cd800 f4232000 c0168000 edd9cb40 ec161800 e784ac00
Apr 29 22:16:06 h kernel: [   78.636025] Call Trace:
Apr 29 22:16:06 h kernel: [   78.636109]  [<f8fc398b>] nouveau_crtc_page_flip+0x34b/0x780 [nouveau]
Apr 29 22:16:06 h kernel: [   78.636225]  [<f8a95328>] drm_mode_page_flip_ioctl+0x198/0x300 [drm]
Apr 29 22:16:06 h kernel: [   78.636328]  [<f8a95190>] ? drm_mode_gamma_get_ioctl+0xd0/0xd0 [drm]
Apr 29 22:16:06 h kernel: [   78.636425]  [<f8a861a5>] drm_ioctl+0x1f5/0x560 [drm]
Apr 29 22:16:06 h kernel: [   78.636425]  [<c11bf66f>] ? core_sys_select+0x18f/0x270
Apr 29 22:16:06 h kernel: [   78.636425]  [<f8a95190>] ? drm_mode_gamma_get_ioctl+0xd0/0xd0 [drm]
Apr 29 22:16:06 h kernel: [   78.636425]  [<c1348798>] ? timerqueue_add+0x58/0xc0
Apr 29 22:16:06 h kernel: [   78.636425]  [<c147b071>] ? __pm_runtime_resume+0x51/0x70
Apr 29 22:16:06 h kernel: [   78.636425]  [<f8faf7cb>] nouveau_drm_ioctl+0x5b/0xb0 [nouveau]
Apr 29 22:16:06 h kernel: [   78.636425]  [<f8faf770>] ? nouveau_pmops_thaw+0x20/0x20 [nouveau]
Apr 29 22:16:06 h kernel: [   78.636425]  [<c11bdf72>] do_vfs_ioctl+0x322/0x550
Apr 29 22:16:06 h kernel: [   78.636425]  [<c10c1247>] ? hrtimer_start+0x27/0x30
Apr 29 22:16:06 h kernel: [   78.636425]  [<c10c277b>] ? do_setitimer+0x26b/0x270
Apr 29 22:16:06 h kernel: [   78.636425]  [<c10c288b>] ? SyS_setitimer+0xab/0xe0
Apr 29 22:16:06 h kernel: [   78.636425]  [<c11bf7ea>] ? SyS_select+0x9a/0xd0
Apr 29 22:16:06 h kernel: [   78.636425]  [<c11be200>] SyS_ioctl+0x60/0x90
Apr 29 22:16:06 h kernel: [   78.636425]  [<c16ef5df>] sysenter_do_call+0x12/0x12
Apr 29 22:16:06 h kernel: [   78.636425] Code: 00 00 55 89 e5 56 53 3e 8d 74 26 00 8b 98 84 01 00 00 8b b0 7c 01 00 00 c1 e2 02 81 e3 80 00 00 00 85 f6 74 1d 01 f2 85 db 75 07 <89> 0a 5b 5e 5d c3 90 89 c8 e8 79 e8 39 c8 5b 5e 5d c3 90 8d 74
Apr 29 22:16:06 h kernel: [   78.674768] EIP: [<f8fb6be9>] nouveau_bo_wr32+0x29/0x60 [nouveau] SS:ESP 0068:ec07fd8c
Apr 29 22:16:06 h kernel: [   78.674768] CR2: 00000000f84c8000
Apr 29 22:16:06 h kernel: [   78.674768] ---[ end trace a915c30442145198 ]---
============================================================================
For more examples please look into the attached file (namely 'syslog_1').
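
As a reading aid, the `Oops: 0002` value in the trace above is the x86 page-fault error code. A minimal Python sketch of how its low bits decode follows. The helper name `decode_oops_code` is made up for illustration and is not part of any kernel tool.

```python
# Decode the x86 page-fault error code printed on the "Oops:" line.
# Each entry is (bit mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code):
    """Return a list of human-readable descriptions for an error code."""
    out = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            out.append(if_set)
        elif if_clear is not None:
            out.append(if_clear)
    return out


# "Oops: 0002" from the trace above; the value is hexadecimal.
print(", ".join(decode_oops_code(int("0002", 16))))
```

For 0002 this yields a write access, from kernel mode, to a page that is not present, which agrees with `*pte = 0000000000000000` in the same trace.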

The system is effectively unusable, because the freeze happens about 1-5 minutes after login.

If you need any help or have questions - please ask :)

Best regards
Jedet
Comment 1 Ilia Mirkin 2015-05-21 21:58:01 UTC
Curious observation:

[   78.633151] BUG: unable to handle kernel paging request at f84c8000
[   78.633265] IP: [<f8fb6be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[   78.633430] *pdpt = 0000000001b79001 *pde = 0000000034c88067 *pte = 0000000000000000 

[   50.751850] BUG: unable to handle kernel paging request at f84c8000
[   50.751967] IP: [<f8f51be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[   50.752139] *pdpt = 0000000001b79001 *pde = 0000000034c88067 *pte = 0000000000000000 

[   68.844824] BUG: unable to handle kernel paging request at f8598000
[   68.845087] IP: [<f8f17be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[   68.845475] *pdpt = 0000000001b79001 *pde = 0000000034c88067 *pte = 0000000000000000 

[  183.768120] BUG: unable to handle kernel paging request at f8498000
[  183.768233] IP: [<f8e91be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[  183.768402] *pdpt = 0000000001b79001 *pde = 0000000034c88067 *pte = 0000000000000000 

[  107.772418] BUG: unable to handle kernel paging request at f848d000
[  107.772538] IP: [<f8e4cbe9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[  107.772718] *pdpt = 0000000001b79001 *pde = 0000000034c88067 *pte = 0000000000000000 

The pdpt/pde/pte values are always the same. Perhaps that's just the unmapped page or something.

Other observations: 32-bit kernel. One instance of a cal_space exhaustion, but it apparently recovered.
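
The comparison above can also be done mechanically. The following is only a sketch: it uses the BUG and IP lines quoted in this comment as input, and the regular expressions are assumptions about this particular log format. It checks whether every oops faults at the same code location and whether the faulting addresses are page-aligned.

```python
import re

# BUG and IP header lines copied from the five oopses quoted above.
LOG = """\
[   78.633151] BUG: unable to handle kernel paging request at f84c8000
[   78.633265] IP: [<f8fb6be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[   50.751850] BUG: unable to handle kernel paging request at f84c8000
[   50.751967] IP: [<f8f51be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[   68.844824] BUG: unable to handle kernel paging request at f8598000
[   68.845087] IP: [<f8f17be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[  183.768120] BUG: unable to handle kernel paging request at f8498000
[  183.768233] IP: [<f8e91be9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
[  107.772418] BUG: unable to handle kernel paging request at f848d000
[  107.772538] IP: [<f8e4cbe9>] nouveau_bo_wr32+0x29/0x60 [nouveau]
"""

BUG_RE = re.compile(r"BUG: unable to handle kernel paging request at ([0-9a-f]+)")
IP_RE = re.compile(r"IP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/")

fault_addrs = [int(addr, 16) for addr in BUG_RE.findall(LOG)]
ips = [(int(ip, 16), fn, int(off, 16)) for ip, fn, off in IP_RE.findall(LOG)]

# Distinct (function, offset) pairs; one entry means the same instruction each time.
sites = {(fn, off) for _, fn, off in ips}
# A page-aligned address has its low 12 bits clear (4 KiB pages).
page_aligned = all(addr & 0xFFF == 0 for addr in fault_addrs)

for addr, (ip, fn, off) in zip(fault_addrs, ips):
    print(f"fault at {addr:#010x} in {fn}+{off:#x} (function start {ip - off:#010x})")
print("distinct fault sites:", sites)
print("all fault addresses page-aligned:", page_aligned)
```

On these five samples the fault site is always `nouveau_bo_wr32+0x29` and every faulting address sits on a page boundary. Only the module load address differs between boots.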
Comment 2 Ilia Mirkin 2015-06-29 08:27:20 UTC
Does this patch help at all?

http://lists.freedesktop.org/archives/nouveau/2015-June/021416.html

I believe the conditions are the same as the ones I hit...

  25:   01 f2                   add    %esi,%edx
...
  2b:*  89 0a                   mov    %ecx,(%edx)              <-- trapping instruction

ESI: f84b8000
EDX: f84c8000

So you're going over the 0x10000 size of the ring.
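
The arithmetic behind that conclusion, as a small Python sketch. The register values are taken from the oops in the description. The reading of `c1 e2 02` as `shl $0x2,%edx` comes from the Code: line of that oops, and the variable names are illustrative only.

```python
# Register values from the oops in the description.
ESI = 0xF84B8000     # base that is added to EDX at offset 0x25 (add %esi,%edx)
EDX = 0xF84C8000     # address of the trapping store; equals CR2
RING_SIZE = 0x10000  # ring size stated above

# Distance of the store from the base of the mapping.
byte_offset = EDX - ESI

# The bytes "c1 e2 02" earlier in the Code: line are `shl $0x2,%edx`,
# so the function received a 32-bit word index and scaled it by 4.
word_index = byte_offset // 4

print(f"byte offset from base: {byte_offset:#x}")
print(f"word index passed in:  {word_index:#x}")
print(f"store is past the end: {byte_offset >= RING_SIZE}")
```

The store lands at byte offset 0x10000, the first byte after a 0x10000-byte ring, which is consistent with the faulting address being the start of an unmapped page.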
Comment 3 Jedet Ilrihm 2015-06-29 21:38:35 UTC
How should I test your implementation/fix?
Comment 4 Ilia Mirkin 2015-06-29 21:46:02 UTC
Apply the patch to your kernel, compile, install, boot. I'm sure you can find some instructions online for your distro if you're not already familiar with the process.
Comment 5 Jedet Ilrihm 2015-08-01 12:21:14 UTC
Hi Ilia,

I tried to build a kernel with the patch applied, but I ran into several problems:
1. This bug occurred: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1460768
2. The build failed with the following error:
-----------------------
Note: Writing hsi_event.9
Warn: meta author : no refentry/info/author                        hsi_get_channel_id_by_name
Note: meta author : see http://docbook.sf.net/el/author            hsi_get_channel_id_by_name
Warn: meta author : no author data, so inserted a fixme            hsi_get_channel_id_by_name
Note: Writing hsi_get_channel_id_by_name.9
gzip -f Documentation/DocBook/man/*.9
make[3]: Leaving directory '/media/h/C_SSD/kernel2/linux-source-3.19.0'
debian/ruleset/targets/doc.mk:34: recipe for target 'debian/stamp/install/linux-doc-3.19.8-ckt2' failed
make[2]: *** [debian/stamp/install/linux-doc-3.19.8-ckt2] Error 2
make[2]: Leaving directory '/media/h/C_SSD/kernel2/linux-source-3.19.0'
debian/ruleset/common/targets.mk:357: recipe for target 'debian/stamp/do-install-indep' failed
make[1]: *** [debian/stamp/do-install-indep] Error 2
make[1]: *** Waiting for unfinished jobs....
make[3]: Leaving directory '/media/h/C_SSD/kernel2/linux-source-3.19.0'
====== making target debian/stamp/BIN/linux-image-3.19.8-ckt2-dbg [new prereqs: do-pre-bin-arch pre-linux-image-3.19.8-ckt2-dbg]======

====== making target debian/stamp/dep-binary-arch [new prereqs: pre-bin-arch linux-headers-3.19.8-ckt2 linux-image-3.19.8-ckt2 linux-image-3.19.8-ckt2-dbg linux-uml-3.19.8-ckt2]======
make[2]: Leaving directory '/media/h/C_SSD/kernel2/linux-source-3.19.0'
make[1]: Leaving directory '/media/h/C_SSD/kernel2/linux-source-3.19.0'
dpkg-buildpackage: error: fakeroot debian/rules binary gave error exit status 2
debian/ruleset/targets/common.mk:401: recipe for target 'debian/stamp/build/buildpackage' failed
make: *** [debian/stamp/build/buildpackage] Error 2
h@h:/media/h/C_SSD/kernel2/linux-source-3.19.0$ 
-----------------------

I think I tried to apply this patch up to 5 times, but it never worked.

Could you please provide a working kernel with the patch applied? That would be very helpful. I have no idea how to proceed otherwise. Everything I try ends in a dead end. By the way, the build took 11 hours, and seeing it fail afterwards is very frustrating.

I hope you can help me with this so that I can install a kernel with the patch applied.

Thanks in advance
Best regards
Jedet
Comment 6 Ilia Mirkin 2015-10-26 04:59:05 UTC
The patch in question should be included in kernels 4.2+ and potentially some of the older stable trees. The upstream patch is:

commit d108142c0840ce389cd9898aa76943b3fb430b83
Author: Ilia Mirkin <imirkin@alum.mit.edu>
Date:   Mon Jun 29 04:07:20 2015 -0400

    drm/nouveau/fbcon/nv11-: correctly account for ring space usage
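
A rough way to compare a kernel release string against the 4.2 threshold mentioned above, as a Python sketch. The function name `kernel_at_least` is hypothetical. A version comparison is only a heuristic here: as noted above, older stable or distribution kernels may carry the fix as a backport, which this check cannot see.

```python
import re


def kernel_at_least(release, minimum=(4, 2)):
    """Return True if a release string like '3.19.0-15-generic' is >= minimum."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release string: {release!r}")
    # Tuples compare element by element, so (4, 10) >= (4, 2) is True.
    return (int(m.group(1)), int(m.group(2))) >= minimum


# The kernel from the oops in the description predates 4.2.
print(kernel_at_least("3.19.0-15-generic"))
```

On a running system the string to pass in is what `uname -r` prints, which Python also exposes as `platform.release()`.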
Comment 7 Christopher M. Penalver 2016-02-23 07:39:36 UTC
Jedet Ilrihm, 15.04 is EOL as of February 4, 2016. For more on this, please see https://wiki.ubuntu.com/Releases .

If this is reproducible in a supported release, it would help immensely if you filed a new report with the Ubuntu repository kernel (not mainline/upstream) via a terminal:
ubuntu-bug linux

Please feel free to subscribe me to it.

For more on why this is helpful, please see https://wiki.ubuntu.com/ReportingBugs.

