Bug 101778 - Kernel Error on Lenovo P51 when setting graphics to hybrid (Nvidia Optimus with intel+nvidia)
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2017-07-13 08:53 UTC by Development
Modified: 2019-07-24 00:53 UTC

Attachments
/var/log/kernel.log (169.19 KB, text/x-log)
2017-07-13 13:30 UTC, Development
/var/log/messages.log (127.88 KB, text/x-log)
2017-07-13 13:30 UTC, Development
This makes the system stable, but does not fix detection of displays (911 bytes, patch)
2017-11-08 12:13 UTC, Josef Larsson

Description Development 2017-07-13 08:53:18 UTC
Hi,

I'm using a new Lenovo ThinkPad P51 with Nvidia Optimus. It has an integrated Intel GPU (Intel® HD Graphics 630) and a discrete Nvidia GPU (Nvidia Quadro M1200). The ThinkPad is used together with a Lenovo ThinkPad Workstation Dock (docking station).

My OS is Debian Buster (Debian Testing).

When using nouveau and i915 as drivers with both graphics cards active (BIOS set to hybrid graphics), a kernel error occurs:

------------[ cut here ]------------
WARNING: CPU: 7 PID: 3782 at /build/linux-C5oXKu/linux-4.11.6/drivers/gpu/drm/nouveau/nouveau_bo.c:137 nouveau_bo_del_ttm+0x77/0x80 [nouveau]
Modules linked in: ctr ccm acpi_call(O) veth xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter bridge stp llc overlay ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip
 mxm_wmi snd_hda_codec_realtek rtsx_pci_ms ttm kvm_intel snd_hda_codec_generic drm_kms_helper memstick kvm joydev cfg80211 snd_hda_intel drm snd_hda_codec irqbypass i2c_algo_bit efi_pstore intel_cstate evdev snd_hda_core snd_hwdep mei_me idma64 intel_
 ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd ahci libahci psmouse e1000e libata xhci_pci ptp i2c_i801 pps_core xhci_hcd rtsx_pci nvme mfd_core nvme_core scsi_mod usbcore usb_common thermal i2c_hid hid
CPU: 7 PID: 3782 Comm: Xorg Tainted: P           O    4.11.0-1-amd64 #1 Debian 4.11.6-1
Hardware name: LENOVO 20HH0014GE/20HH0014GE, BIOS N1UET31W (1.05 ) 02/13/2017
Call Trace:
 ? dump_stack+0x5c/0x78
 ? __warn+0xbe/0xe0
 ? nouveau_bo_del_ttm+0x77/0x80 [nouveau]
 ? ttm_bo_release_list+0xc8/0x1f0 [ttm]
 ? nouveau_gem_object_del+0x8d/0xe0 [nouveau]
 ? drm_gem_object_release_handle+0x50/0x90 [drm]
 ? drm_gem_handle_delete+0x57/0x80 [drm]
 ? drm_ioctl+0x1ef/0x440 [drm]
 ? drm_gem_handle_create+0x40/0x40 [drm]
 ? signal_setup_done+0x67/0xb0
 ? nouveau_drm_ioctl+0x66/0xc0 [nouveau]
 ? do_vfs_ioctl+0x9f/0x600
 ? do_munmap+0x353/0x430
 ? SyS_ioctl+0x74/0x80
 ? system_call_fast_compare_end+0xc/0x9b
---[ end trace 0f5b1e9bbdb82fb3 ]---

[   28.615917] Oops: 0000 [#1] SMP
[   28.615924] Modules linked in: cmac ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_n$
[   28.616065]  memstick ttm efi_pstore drm_kms_helper snd_hda_codec joydev evdev snd_hda_core mei_me snd_hwdep drm snd_pcm idma64 iTCO_wdt pcspkr serio_raw efivars iTCO_vendor_support snd_timer sg i2c_algo_bit mei shpchp intel_pch_thermal intel_lpss_$
[   28.616210]  nvme rtsx_pci mfd_core nvme_core usbcore scsi_mod usb_common thermal i2c_hid hid
[   28.616230] CPU: 0 PID: 57 Comm: kworker/0:1 Tainted: P           O    4.11.0-1-amd64 #1 Debian 4.11.6-1
[   28.616248] Hardware name: LENOVO 20HH0014GE/20HH0014GE, BIOS N1UET31W (1.05 ) 02/13/2017
[   28.616266] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[   28.616280] task: ffff9e947fd92100 task.stack: ffffb20306494000
[   28.616294] RIP: 0010:drm_fb_helper_add_one_connector+0x17/0xd0 [drm_kms_helper]
[   28.616308] RSP: 0018:ffffb20306497c60 EFLAGS: 00010202
[   28.616319] RAX: 0000000000000000 RBX: ffff9e946c19c010 RCX: 0000000000000000
[   28.616333] RDX: ffff9e947cfe8c58 RSI: ffff9e946c19c010 RDI: 0000000000000000
[   28.616347] RBP: ffff9e946c19c010 R08: ffff9e946c1050e0 R09: 0000000000000000
[   28.616360] R10: ffffb20306497c68 R11: 0000000000000001 R12: ffff9e947dc7c010
[   28.616374] R13: 0000000000000001 R14: ffff9e947a81a780 R15: ffff9e947dc7c000
[   28.616390] FS:  0000000000000000(0000) GS:ffff9e94af400000(0000) knlGS:0000000000000000
[   28.616412] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.616424] CR2: 0000000000000008 CR3: 0000000400209000 CR4: 00000000003406f0
[   28.616438] Call Trace:
[   28.616463]  ? nv50_mstm_register_connector+0x2c/0x50 [nouveau]
[   28.616477]  ? drm_dp_add_port+0x32d/0x460 [drm_kms_helper]
[   28.616501]  ? g94_i2c_aux_fini.isra.0+0x27/0x40 [nouveau]
[   28.616523]  ? g94_i2c_aux_xfer+0x6a2/0x7d0 [nouveau]
[   28.616544]  ? nvkm_i2c_aux_release+0x42/0x50 [nouveau]
[   28.616569]  ? nouveau_connector_aux_xfer+0x7f/0xc0 [nouveau]
[   28.616582]  ? drm_dp_dpcd_access+0xee/0x120 [drm_kms_helper]
[   28.616595]  ? drm_dp_send_link_address+0x16b/0x1f0 [drm_kms_helper]
[   28.616610]  ? drm_dp_check_and_send_link_address+0xab/0xb0 [drm_kms_helper]
[   28.616625]  ? drm_dp_mst_link_probe_work+0x4a/0x80 [drm_kms_helper]
[   28.616638]  ? process_one_work+0x197/0x430
[   28.616647]  ? worker_thread+0x4d/0x490
[   28.616656]  ? kthread+0xfc/0x130
[   28.616664]  ? process_one_work+0x430/0x430
[   28.616673]  ? kthread_create_on_node+0x70/0x70
[   28.616683]  ? ret_from_fork+0x26/0x40
[   28.616691] Code: bf f4 ff ff ff e9 7b ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 c0 80 3d 72 76 00 00 00 74 6b 41 54 55 48 89 f5 53 <48> 8b 47 08 48 89 fb 48 8b 80 08 02 00 00 48 a9 f8 ff ff ff 74
[   28.616741] RIP: drm_fb_helper_add_one_connector+0x17/0xd0 [drm_kms_helper] RSP: ffffb20306497c60
[   28.616757] CR2: 0000000000000008
[   28.621441] ---[ end trace f1588c8aea537ee6 ]---

When the graphics setting is switched to the discrete Nvidia card, everything works fine.

Also see https://bugzilla.kernel.org/show_bug.cgi?id=196341.
Comment 1 Ilia Mirkin 2017-07-13 12:36:37 UTC
Well, on the up side, MST is getting properly detected now. On the down side, it oopses.

Ignoring the WARNING for a minute, the oops is in

nv50_mstm_register_connector calling drm_fb_helper_add_one_connector.

The Quadro M1200 appears to be a GM107 based on pci.ids. Can you double-check? 

$ lspci -nn -d 10de:

Also can you supply the full dmesg from boot up to the oops, having booted with nouveau.debug=debug drm.debug=0x1e ?
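
For reference, drm.debug is a bitmask of DRM log categories, so 0x1e selects several at once. The sketch below decodes such a value. The bit assignments are taken from the DRM_UT_* definitions as they stood around Linux 4.11 and may differ on other kernel versions; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of DRM debug categories (values as of ~Linux 4.11). */
struct drm_debug_bit {
    unsigned int mask;
    const char *name;
};

static const struct drm_debug_bit drm_debug_bits[] = {
    { 0x01, "CORE" },  { 0x02, "DRIVER" }, { 0x04, "KMS" },
    { 0x08, "PRIME" }, { 0x10, "ATOMIC" }, { 0x20, "VBL" },
};

/*
 * Write the space-separated names of all categories set in `value`
 * into `buf` and return how many categories were set.
 */
static int drm_debug_describe(unsigned int value, char *buf, size_t len)
{
    int count = 0;
    size_t i;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (i = 0; i < sizeof(drm_debug_bits) / sizeof(drm_debug_bits[0]); i++) {
        if (!(value & drm_debug_bits[i].mask))
            continue;
        if (count > 0)
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, drm_debug_bits[i].name, len - strlen(buf) - 1);
        count++;
    }
    return count;
}
```

Under these assumptions, 0x1e enables the DRIVER, KMS, PRIME and ATOMIC categories and leaves the noisier CORE and VBL output off.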
Comment 2 Development 2017-07-13 13:30:11 UTC
Created attachment 132668
/var/log/kernel.log
Comment 3 Development 2017-07-13 13:30:34 UTC
Created attachment 132669
/var/log/messages.log
Comment 4 Development 2017-07-13 13:31:08 UTC
$ lspci -nn -d 10de:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GLM [Quadro M1200 Mobile] [10de:13b6] (rev a2)

The logs are attached, I hope they help :)
Comment 5 Development 2017-07-13 14:53:28 UTC
I encountered two more bugs:
1) after undocking and redocking, the external displays are gone
2) the internal display flickers after standby
Comment 6 Josef Larsson 2017-10-17 11:22:10 UTC
Same problem here with my Thinkpad P51 with docking station.

If I add a check that drm->fbcon is set in nv50_mstm_register_connector, the system hangs when Xorg starts instead of oopsing. Since that one line has this effect, I guess drm->fbcon == NULL?

if (drm->fbcon)
    drm_fb_helper_add_one_connector(&drm->fbcon->helper, connector);
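
The oops in the description shows RDI = 0 and CR2 = 0x8, which would be consistent with a NULL helper pointer being read at offset 8. The following user-space sketch models that arithmetic and the guard above. All struct and function names are hypothetical stand-ins, and the layout (helper as the first member, the dereferenced field one pointer in) is an assumption, not something confirmed from the kernel source in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fb helper; only the field offsets matter. */
struct mock_fb_helper {
    void *fb;   /* offset 0 */
    void *dev;  /* one pointer in; assumed to be the first field the callee reads */
};

/* Hypothetical stand-in for the fbcon state; helper assumed to be first. */
struct mock_fbdev {
    struct mock_fb_helper helper;
};

struct mock_drm {
    struct mock_fbdev *fbcon;
};

/* Address the callee would read when handed &fbcon->helper. */
static uintptr_t faulting_address(const struct mock_fbdev *fbcon)
{
    return (uintptr_t)fbcon + offsetof(struct mock_fbdev, helper)
                            + offsetof(struct mock_fb_helper, dev);
}

/* Guarded registration: returns 1 if the helper would be called, 0 if skipped. */
static int register_connector_guarded(const struct mock_drm *drm)
{
    assert(drm != NULL);
    if (!drm->fbcon)
        return 0;   /* no fbcon: skip instead of dereferencing NULL */
    return 1;
}
```

With a NULL fbcon, faulting_address() evaluates to the size of one pointer, which is 8 on x86-64 and matches the CR2 value in the log. The guard only avoids that dereference; it does not explain why fbcon is unset at that point.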
Comment 7 Josef Larsson 2017-10-23 09:41:49 UTC
The setup with the docking station does not work well with the proprietary nvidia driver either. When switching between a VT and Xorg, all monitors start blinking and changing resolution as if they had just been connected, and the display manager does not seem able to resolve the situation. In other words, the Lenovo ThinkPad docking station is quite useless regardless of which driver I choose...
Comment 8 Josef Larsson 2017-10-30 10:46:17 UTC
Same problem (kernel oops) with a different docking station, which uses a Thunderbolt connector instead.
Comment 9 Josef Larsson 2017-11-08 12:13:51 UTC
Created attachment 135298
This makes the system stable, but does not fix detection of displays
Comment 10 Karol Herbst 2019-06-26 19:35:34 UTC
Is this still an issue? We fixed a lot of DP and hybrid GPU stuff since 4.11 and I would hope that this is already fixed?
Comment 11 Josef Larsson 2019-07-24 00:53:23 UTC
Yes, this is fixed now. I'm running GNOME on Wayland with an external monitor connected to the DP port on the docking station, without instabilities.

