Bug 93580

Summary:

intel_fbdev_restore_mode dereferences NULL pointer

Product:

DRI

Reporter:

gustav.fagerlind

Component:

DRM/Intel

Assignee:

Intel GFX Bugs mailing list <intel-gfx-bugs>

Status:

CLOSED FIXED

QA Contact:

Intel GFX Bugs mailing list <intel-gfx-bugs>

Severity:

normal

Priority:

medium

CC:

dxtr, franck.delache, freedesktop.org, freedesktop.org, intel-gfx-bugs

Version:

unspecified

Hardware:

x86-64 (AMD64)

OS:

Linux (All)

Whiteboard:

i915 platform:

HSW

i915 features:

display/Other

Attachments:

Description	Flags
New journal with patches applied	none

Description gustav.fagerlind 2016-01-04 18:26:22 UTC

kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
 kernel: IP: [<ffffffffa0538208>] intel_fbdev_restore_mode+0x48/0x80 [i915]
 kernel: PGD 7ab3a067 PUD 7925b067 PMD 0 
 kernel: Oops: 0000 [#1] PREEMPT SMP 
 kernel: Modules linked in: nvram msr joydev mousedev arc4 cyapatp crc_itu_t atmel_mxt_ts intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp ath9k kvm_intel ath9k_common ath9k_hw ath kvm mac80211 i915 coretemp iTCO_wdt snd_hda_codec_realtek iTCO_vendor_support drm_kms_helper snd_hda_codec_generic crct10dif_pclmul chromeos_laptop cfg80211 snd_hda_intel crc32_pclmul crc32c_intel evdev input_leds snd_hda_codec cryptd mac_hid serio_raw pcspkr snd_hda_core fan rfkill i2c_i801 thermal battery snd_hwdep ac snd_pcm drm dw_dmac gpio_lynxpoint fjes video snd_timer 8250_dw i2c_designware_platform snd shpchp spi_pxa2xx_platform intel_gtt soundcore syscopyarea dw_dmac_pci button sysfillrect sysimgblt fb_sys_fops i2c_algo_bit i2c_designware_pci i2c_designware_core dw_dmac_core lpc_ich acpi_cpufreq processor
 kernel:  sch_fq_codel ip_tables x_tables ext4 crc16 mbcache jbd2 sd_mod atkbd libps2 ahci libahci libata scsi_mod i8042 sdhci_acpi serio sdhci led_class mmc_core xhci_pci xhci_hcd usbcore usb_common
 kernel: CPU: 0 PID: 224 Comm: Xorg.wrap Not tainted 4.3.3-2-ARCH #1
 kernel: Hardware name: GOOGLE Peppy/Peppy, BIOS 4.0-6588-g4acd8ea-dirty 09/04/2014
 kernel: task: ffff88017487d940 ti: ffff88007a9d4000 task.ti: ffff88007a9d4000
 kernel: RIP: 0010:[<ffffffffa0538208>]  [<ffffffffa0538208>] intel_fbdev_restore_mode+0x48/0x80 [i915]
 kernel: RSP: 0018:ffff88007a9d7d38  EFLAGS: 00010246
 kernel: RAX: 0000000000000000 RBX: ffff880079151000 RCX: 0000000000000000
 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88007a9df860
 kernel: RBP: ffff88007a9d7d40 R08: 0000000000000000 R09: 0000000000000000
 kernel: R10: ffff880176c55000 R11: 0000000000000033 R12: ffff88007a9df800
 kernel: R13: ffff880176878ae0 R14: ffff88007a9df888 R15: ffff880176878ad8
 kernel: FS:  00007f43936f6700(0000) GS:ffff88017ca00000(0000) knlGS:0000000000000000
 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 kernel: CR2: 00000000000000a8 CR3: 000000007a98b000 CR4: 00000000000406f0
 kernel: Stack:
 kernel:  ffff88007a9df800 ffff88007a9d7d50 ffffffffa05618fe ffff88007a9d7d70
 kernel:  ffffffffa02e1a2e 0000000000000000 ffff88007a9df800 ffff88007a9d7dd0
 kernel:  ffffffffa02e1e56 0000000000000040 0000000000000246 ffff88007abb8cc0
 kernel: Call Trace:
 kernel:  [<ffffffffa05618fe>] i915_driver_lastclose+0xe/0x20 [i915]
 kernel:  [<ffffffffa02e1a2e>] drm_lastclose+0x2e/0x140 [drm]
 kernel:  [<ffffffffa02e1e56>] drm_release+0x316/0x500 [drm]
 kernel:  [<ffffffff811db2fc>] __fput+0x9c/0x1f0
 kernel:  [<ffffffff811db48e>] ____fput+0xe/0x10
 kernel:  [<ffffffff81091173>] task_work_run+0x73/0x90
 kernel:  [<ffffffff810039f6>] prepare_exit_to_usermode+0xd6/0x100
 kernel:  [<ffffffff81003aed>] syscall_return_slowpath+0xcd/0x1d0
 kernel:  [<ffffffff811eb5f5>] ? do_vfs_ioctl+0x295/0x480
 kernel:  [<ffffffff81091074>] ? task_work_add+0x44/0x60
 kernel:  [<ffffffff811db4d7>] ? fput+0x47/0x90
 kernel:  [<ffffffff811d7916>] ? filp_close+0x56/0x70
 kernel:  [<ffffffff811f5e5b>] ? __close_fd+0x8b/0xb0
 kernel:  [<ffffffff81583e8c>] int_ret_from_sys_call+0x25/0x8f
 kernel: Code: e8 ae b7 f6 ff 84 c0 74 0c f6 05 5f 30 de ff 01 75 35 5b 5d c3 48 8b 43 08 48 8d 78 60 e8 31 99 04 e1 48 8b 83 a0 00 00 00 31 f6 <48> 8b b8 a8 00 00 00 e8 6c 59 ff ff 48 8b 7b 08 48 83 c7 60 e8 
 kernel: RIP  [<ffffffffa0538208>] intel_fbdev_restore_mode+0x48/0x80 [i915]
 kernel:  RSP <ffff88007a9d7d38>
 kernel: CR2: 00000000000000a8
 kernel: ---[ end trace 919f6f89825da079 ]---
 systemd[1]: Started Load/Save Screen Backlight Brightness of backlight:intel_backlight.
 systemd[1]: x@vt7.service: Control process exited, code
 systemd[1]: Failed to start X on vt7.

uname -a
Linux ARCHBOOK 4.3.3-2-ARCH #1 SMP PREEMPT Wed Dec 23 20:09:18 CET 2015 x86_64 GNU/Linux

lspci -vnn | grep VGA -A 1
00:02.0 VGA compatible controller [0300]: Intel Corporation Haswell-ULT Integrated Graphics Controller [8086:0a06] (rev 0b) (prog-if 00 [VGA controller])
	Subsystem: Acer Incorporated [ALI] Device [1025:0a11]

Some other info: 
 This has happened once so far on Linux 4.3.3, during startup (so ~5%). Boot halted and the screen was blank. I do not have any other info; it's more of an FYI. Please let me know if any information is missing, and please give me pointers on how I can try to find the root cause.
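
A note on reading the oops above: the fault address (CR2) is 00000000000000a8 while RAX is 0, and the faulting instruction marked in the Code line, `<48> 8b b8 a8 00 00 00`, decodes to a load from RAX+0xa8. A small nonzero fault address like this is the usual signature of reading a struct member through a NULL base pointer. The sketch below only illustrates that arithmetic in userspace. `struct fake_fb`, its padding, and `member_fault_addr` are hypothetical names made up for the example; they are not the i915 structures, and the padding is chosen solely so that the member lands at 0xa8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the padding exists only to place `obj` at 0xa8. */
struct fake_fb {
	unsigned char pad[0xa8];
	void *obj;
};

/*
 * Address the CPU would touch when reading base->obj.
 * With base == 0 (a NULL pointer) this is just the member offset,
 * which is what shows up in CR2.
 */
static uintptr_t member_fault_addr(uintptr_t base)
{
	return base + offsetof(struct fake_fb, obj);
}
```

Calling `member_fault_addr(0)` yields 0xa8, matching the CR2 value in the log. Which pointer was actually NULL in the driver cannot be read off this sketch; that needs the matching kernel source and debug symbols.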

Comment 1 gustav.fagerlind 2016-01-05 06:19:57 UTC

It happened again. I could not switch TTY after it happened, but there was no kernel panic.

Comment 2 gustav.fagerlind 2016-01-07 20:55:18 UTC

I have verified the memory (well, the RAM) with memtest.

Comment 3 Chris Wilson 2016-02-03 20:39:07 UTC

*** Bug 93992 has been marked as a duplicate of this bug. ***

Comment 4 Jani Nikula 2016-04-25 09:48:28 UTC

Please try kernel v4.5 or later.

Comment 5 freedesktop.org 2016-05-28 16:15:01 UTC

I just had the same issue with kernel 4.5.4 on Arch Linux using a Lenovo ThinkPad T60.

$ uname -a
Linux thinkpad 4.5.4-1-ARCH #1 SMP PREEMPT Wed May 11 22:21:28 CEST 2016 x86_64 GNU/Linux

$ lspci -vnn | grep VGA -A 1
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller [8086:27a2] (rev 03) (prog-if 00 [VGA controller])
	Subsystem: Lenovo ThinkPad R60/T60/X60 series [17aa:201a]

kernel log:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [<ffffffffa08855a8>] intel_fbdev_restore_mode+0x48/0x80 [i915]
PGD 733a3067 PUD 733cc067 PMD 0 
Oops: 0000 [#1] PREEMPT SMP 
Modules linked in: mousedev iTCO_wdt iTCO_vendor_support ppdev i915 coretemp pcmcia kvm drm_
 af_alg dm_crypt dm_mod sd_mod sr_mod cdrom ata_generic pata_acpi atkbd libps2 ahci libahci 
CPU: 1 PID: 395 Comm: Xorg.wrap Not tainted 4.5.4-1-ARCH #1
Hardware name: LENOVO 1952VYF/1952VYF, BIOS 79ETD2WW (2.12 ) 04/12/2007
task: ffff88007a3db400 ti: ffff8800733c4000 task.ti: ffff8800733c4000
RIP: 0010:[<ffffffffa08855a8>]  [<ffffffffa08855a8>] intel_fbdev_restore_mode+0x48/0x80 [i91
RSP: 0018:ffff8800733c7dc0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880079edbc00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88000006c060
RBP: ffff8800733c7dc8 R08: 0000000000000000 R09: ffff88007ef157b0
R10: ffff88007a315600 R11: 0000000000000001 R12: ffff88000006c000
R13: ffff88007b410ad8 R14: ffff88000006c088 R15: ffff880079edbcc0
FS:  00007fea41061700(0000) GS:ffff88007ef00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000a0 CR3: 0000000073397000 CR4: 00000000000006e0
Stack:
 ffff88000006c000 ffff8800733c7dd8 ffffffffa08b139e ffff8800733c7df8
 ffffffffa04c1b8e 0000000000000000 ffff88000006c000 ffff8800733c7e68
 ffffffffa04c1fcf 0000000800000001 ffff88007abad768 0000000000000246
Call Trace:
 [<ffffffffa08b139e>] i915_driver_lastclose+0xe/0x20 [i915]
 [<ffffffffa04c1b8e>] drm_lastclose+0x2e/0x140 [drm]
 [<ffffffffa04c1fcf>] drm_release+0x32f/0x510 [drm]
 [<ffffffff811f18cf>] __fput+0x9f/0x1e0
 [<ffffffff811f1a4e>] ____fput+0xe/0x10
 [<ffffffff81095458>] task_work_run+0x78/0xa0
 [<ffffffff810036aa>] exit_to_usermode_loop+0xba/0xc0
 [<ffffffff81003bee>] syscall_return_slowpath+0x4e/0x60
 [<ffffffff815b5808>] int_ret_from_sys_call+0x25/0x8f
Code: e8 ae f4 dd ff 85 c0 74 0c f6 05 b3 6c c7 ff 01 75 35 5b 5d c3 48 8b 43 08 48 8d 78 60
RIP  [<ffffffffa08855a8>] intel_fbdev_restore_mode+0x48/0x80 [i915]
 RSP <ffff8800733c7dc0>
CR2: 00000000000000a0
---[ end trace c62f437557d31865 ]---

Comment 6 Chris Wilson 2016-06-17 18:17:16 UTC

*** Bug 96554 has been marked as a duplicate of this bug. ***

Comment 7 Kim Lidström 2016-06-20 23:23:50 UTC

Is it possible to get the patches https://patchwork.freedesktop.org/patch/93560/ and https://patchwork.freedesktop.org/patch/93561/ that you gave me in #96554 for 4.7-rc4?

They don't apply cleanly there.

This is the rejected hunk:

--- drivers/gpu/drm/i915/intel_fbdev.c
+++ drivers/gpu/drm/i915/intel_fbdev.c
@@ -551,12 +550,14 @@ static void intel_fbdev_destroy(struct drm_device *dev,
 	drm_fb_helper_fini(&ifbdev->helper);
 
 	if (ifbdev->fb) {
-		mutex_lock(&dev->struct_mutex);
+		mutex_lock(&ifbdev->helper.dev->struct_mutex);
 		intel_unpin_fb_obj(&ifbdev->fb->base, BIT(DRM_ROTATE_0));
-		mutex_unlock(&dev->struct_mutex);
+		mutex_unlock(&ifbdev->helper.dev->struct_mutex);
 
 		drm_framebuffer_remove(&ifbdev->fb->base);
 	}
+
+	kfree(ifbdev);
 }
 
 /

Comment 8 Kim Lidström 2016-06-20 23:25:19 UTC

I realized that wasn't very clear. The patch that isn't applied cleanly is https://patchwork.freedesktop.org/patch/93560/ .

Comment 9 Chris Wilson 2016-06-21 09:08:51 UTC

Here you go https://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=for-bug93580

Comment 10 Kim Lidström 2016-06-22 14:58:48 UTC

Unfortunately those patches didn't solve the issue for me.

What happens is that the external monitors go black, the display on the laptop sort of fades into noice, and the audio starts repeating the last few seconds (if I'm listening to music).

I still haven't found a pattern here.

Comment 11 Kim Lidström 2016-06-22 14:59:25 UTC

I obviously mean it fades to noise.

Spelling is hard :)

Comment 12 Chris Wilson 2016-06-22 20:00:52 UTC

Is there any change in the oops? That would be useful to know, could you please attach a fresh copy with the patches applied?

Comment 13 Jani Nikula 2016-06-23 08:24:09 UTC

(In reply to Kim Lidström from comment #10)
> Unfortunately those patches didn't solve the issue for me.

Does the patch fix the issue for other reporters? Maybe a different issue?

Comment 14 Chris Wilson 2016-06-23 08:30:41 UTC

(In reply to Jani Nikula from comment #13)
> (In reply to Kim Lidström from comment #10)
> > Unfortunately those patches didn't solve the issue for me.
> 
> Does the patch fix the issue for other reporters? Maybe a different issue?

I know that it fixes an oops that I was able to trigger locally, but as stated in the changelog I believe that was only possible due to how I enabled the whole module/builtin to load asynchronously.

Comment 15 Kim Lidström 2016-06-23 10:38:31 UTC

I suspect (and have suspected all along) that my issue is a separate issue.

In fact I wasn't all too sure if the message I posted was related to the crash or if it was just a coincidence that it was the last thing that happened before the box crashed.

The thing is that when the box freezes like this I can't do ANYTHING. The only solution is a hard reboot.

I will see if I can get some debug output from the module with drm.debug=0xe
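
For readers wondering what drm.debug=0xe selects: the value is a bitmask of DRM debug categories. The sketch below decodes it in userspace. The bit assignments in the table (0x01 core, 0x02 driver, 0x04 kms, 0x08 prime, 0x10 atomic, 0x20 vblank) reflect my understanding of kernels from this period and should be checked against the kernel actually in use; `describe_drm_debug` and `drm_debug_bits` are hypothetical names for this example, not kernel symbols.

```c
#include <stddef.h>
#include <stdio.h>

struct debug_bit {
	unsigned int mask;
	const char *name;
};

/* Assumed drm.debug bit meanings; verify against your kernel version. */
static const struct debug_bit drm_debug_bits[] = {
	{ 0x01, "core" },
	{ 0x02, "driver" },
	{ 0x04, "kms" },
	{ 0x08, "prime" },
	{ 0x10, "atomic" },
	{ 0x20, "vblank" },
};

/*
 * Write the space-separated names of the categories enabled in `value`
 * into `buf` and return the number of characters written.
 */
static size_t describe_drm_debug(unsigned int value, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < sizeof(drm_debug_bits) / sizeof(drm_debug_bits[0]); i++) {
		int n;

		if (!(value & drm_debug_bits[i].mask))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", drm_debug_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break; /* buffer too small: stop at the truncated name */
		used += (size_t)n;
	}
	return used;
}
```

Under those assumed bit meanings, 0xe enables the driver, kms and prime categories and leaves the core messages off.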

Comment 16 Kim Lidström 2016-06-28 11:37:45 UTC

Created attachment 124760 [details]
New journal with patches applied

Comment 17 Kim Lidström 2016-06-28 11:39:37 UTC

Hi!
I attached the log output I could retrieve from the last freeze.

I'm running 4.7.0-rc5 with the patches applied and drm.debug=0xe

Comment 18 Chris Wilson 2016-06-28 11:47:00 UTC

(In reply to Kim Lidström from comment #17)
> Hi!
> I attached the log output I could retrieve from the last freeze.
> 
> I'm running 4.7.0-rc5 with the patches applied and drm.debug=0xe

I don't see the fbdev issue in there, but lots of
kernel: WARNING: CPU: 3 PID: 970 at drivers/gpu/drm/i915/intel_pm.c:3647 skl_update_other_pipe_wm+0x177/0x180 [i915]
kernel: WARN_ON(!wm_changed)

Please do file a separate bug for that WARN and mention all steps that lead up to the freeze, even if just "it freezes at random with no forewarning".

I am going to tentatively close this fbdev one following:

commit e77018f7618960f7ec0e73e63868514ff16f8ddc
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jun 21 09:16:55 2016 +0100

    drm/i915/fbdev: Flush mode configuration before lastclose

Please do reopen if you can reproduce the fbdev oops on a recent (-nightly) kernel.
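
As a rough illustration of the idea suggested by the commit title and by comment 14 (setup that may still be pending when the last-close path runs), here is a single-threaded userspace sketch. It is not the actual patch and does not use the kernel's async machinery: `struct fake_fbdev`, `fake_fbdev_sync`, `fake_restore_mode` and `fake_init_ok` are all hypothetical names. It shows two things only: flush any outstanding setup before using its result, and skip the work if no framebuffer exists afterwards.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's fbdev state; not the i915 struct. */
struct fake_fbdev {
	/* Deferred setup still waiting to run, or NULL once it was flushed. */
	void (*pending_init)(struct fake_fbdev *self);
	/* Framebuffer created by the deferred setup; NULL until then. */
	void *fb;
};

static int fake_backing_store;

/* Example deferred setup that succeeds and publishes a framebuffer. */
static void fake_init_ok(struct fake_fbdev *self)
{
	self->fb = &fake_backing_store;
}

/* Run any setup that is still pending so `fb` is in its final state. */
static void fake_fbdev_sync(struct fake_fbdev *fbdev)
{
	void (*init)(struct fake_fbdev *) = fbdev->pending_init;

	if (init) {
		fbdev->pending_init = NULL;
		init(fbdev);
	}
}

/* Returns 0 if the framebuffer could be used, -1 if the restore was skipped. */
static int fake_restore_mode(struct fake_fbdev *fbdev)
{
	if (!fbdev)
		return -1;

	fake_fbdev_sync(fbdev);
	if (!fbdev->fb)
		return -1; /* setup never produced a framebuffer: touch nothing */

	/* Only past this point is it safe to read through fbdev->fb. */
	return 0;
}
```

With `pending_init` set to `fake_init_ok`, `fake_restore_mode` flushes the setup and returns 0; with no setup and no framebuffer it returns -1 instead of dereferencing NULL.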

Comment 19 Kim Lidström 2016-06-28 11:52:09 UTC

But isn't there already an issue about that warning? https://bugs.freedesktop.org/show_bug.cgi?id=89055

Comment 20 yann 2016-06-28 12:03:26 UTC

(In reply to Kim Lidström from comment #19)
> But isn't there already an issue about that warning?
> https://bugs.freedesktop.org/show_bug.cgi?id=89055

yes, you are correct Kim, and this should be fixed (landing in kernel 4.8).

Comment 21 Kim Lidström 2016-06-28 12:47:18 UTC

So aren't we back to square one now? :)

From what I can tell (By reading the reports and trying out the patches with earlier kernels) that warning is not the cause of these freezes.

Comment 22 Jani Nikula 2016-06-29 08:39:05 UTC

(In reply to Kim Lidström from comment #21)
> So aren't we back to square one now? :)
> 
> From what I can tell (By reading the reports and trying out the patches with
> earlier kernels) that warning is not the cause of these freezes.

We think we have all of the issues you're seeing fixed in drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel, and the fixes are headed to v4.8 kernel. Please reopen if you can reproduce these issues with drm-intel-nightly. Thanks.

Comment 23 Kim Lidström 2016-06-29 08:46:24 UTC

Alright then! This is good. I'll be trying the nightly today.
