Bug 106159 - When connecting or disconnecting a displayport to a DP hub with 4.16.2+ kernel, hard freeze with frozen video output
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/AMDgpu
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2018-04-20 18:51 UTC by Joel Sass
Modified: 2019-11-19 08:35 UTC


Attachments
dpkg -l |grep mesa (10.76 KB, text/plain)
2018-04-20 18:51 UTC, Joel Sass
Xorg.0.log (93.10 KB, text/x-log)
2018-04-20 18:52 UTC, Joel Sass
dmesg output (90.71 KB, text/plain)
2018-04-20 18:52 UTC, Joel Sass
lshw output (66.12 KB, text/plain)
2018-04-20 18:53 UTC, Joel Sass
[PATCH 1/2] drm/amd/display: Update MST edid property every time (1.49 KB, patch)
2018-04-24 19:30 UTC, Harry Wentland
[PATCH 2/2] drm/amd/display: Check dc_sink every time in MST hotplug (2.27 KB, patch)
2018-04-24 19:31 UTC, Harry Wentland
amdgpu_dm.c patch I had to manually apply (1.36 KB, patch)
2018-04-25 15:24 UTC, Joel Sass
amdgpu_dm_mst_types.c patch I had to manually apply (1.74 KB, patch)
2018-04-25 15:25 UTC, Joel Sass
The modified file causing the problem in my comment. (13.46 KB, text/x-csrc)
2018-04-25 21:01 UTC, Joel Sass
This is the dmesg from an ssh session after attaching a monitor to my MST hub (86.54 KB, text/x-log)
2018-04-26 14:02 UTC, Joel Sass
Here's the kernel .config I used for making the kernel (173.12 KB, text/x-mpsub)
2018-04-26 14:06 UTC, Joel Sass


Description Joel Sass 2018-04-20 18:51:17 UTC
Created attachment 138956
dpkg -l |grep mesa

When I connect a DisplayPort monitor to an active DP hub that already has working outputs, my workstation hard-freezes and needs a cold reboot. Network services stop responding, including ping and SSH.

root@nope:~# uname -a
Linux nope 4.16.2+ #1 SMP Fri Apr 13 17:51:14 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux

root@nope:~# lsmod |grep -i amdgpu
amdgpu               2695168  2
chash                  16384  1 amdgpu
i2c_algo_bit           16384  1 amdgpu
gpu_sched              20480  1 amdgpu
ttm                    94208  1 amdgpu
drm_kms_helper        143360  1 amdgpu
drm                   348160  6 amdgpu,gpu_sched,ttm,drm_kms_helper
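
As a side note on reading the lsmod listing above: the last column names the modules that use each row's module, so the rows that mention amdgpu there are the modules amdgpu depends on. A minimal Python sketch of that reading follows. The function names are hypothetical, and the sample text is simply the output pasted above.

```python
def parse_lsmod(text):
    """Parse lsmod-style rows into {module: (size, use_count, [users])}."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header if present.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        name, size, use_count = fields[0], int(fields[1]), int(fields[2])
        users = fields[3].split(",") if len(fields) > 3 else []
        modules[name] = (size, use_count, users)
    return modules


def dependencies_of(modules, target):
    """Return the modules that list `target` in their "Used by" column."""
    return sorted(name for name, (_, _, users) in modules.items() if target in users)


# Output copied from the report above.
LSMOD_OUTPUT = """\
amdgpu               2695168  2
chash                  16384  1 amdgpu
i2c_algo_bit           16384  1 amdgpu
gpu_sched              20480  1 amdgpu
ttm                    94208  1 amdgpu
drm_kms_helper        143360  1 amdgpu
drm                   348160  6 amdgpu,gpu_sched,ttm,drm_kms_helper
"""

if __name__ == "__main__":
    print(dependencies_of(parse_lsmod(LSMOD_OUTPUT), "amdgpu"))
```

Note that amdgpu itself shows a use count of 2 with no named users, which is what one would expect when the users are not other kernel modules.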

Mesa drivers are from the padoka PPA.
Comment 1 Joel Sass 2018-04-20 18:52:32 UTC
Created attachment 138957
Xorg.0.log
Comment 2 Joel Sass 2018-04-20 18:52:58 UTC
Created attachment 138958
dmesg output
Comment 3 Joel Sass 2018-04-20 18:53:40 UTC
Created attachment 138959
lshw output
Comment 4 Harry Wentland 2018-04-24 19:30:22 UTC
Created attachment 139069
[PATCH 1/2] drm/amd/display: Update MST edid property every time
Comment 5 Harry Wentland 2018-04-24 19:31:11 UTC
Created attachment 139070
[PATCH 2/2] drm/amd/display: Check dc_sink every time in MST hotplug

Can you try patches 1 and 2?
Comment 6 Joel Sass 2018-04-24 20:34:19 UTC
Will do! Sorry, I haven't had much time for testing recently.
Comment 7 dwagner 2018-04-24 21:37:30 UTC
I cannot comment on how useful the above patches are for this bug report, but they do help with bug 103277:
https://bugs.freedesktop.org/show_bug.cgi?id=103277
Comment 8 Joel Sass 2018-04-25 15:23:10 UTC
Alright! First, I appreciate the help with this.

I decided to just go with what I was given and build a kernel after applying the patches you'd recommended. I downloaded the newest unstable source this morning from here: git://kernel.ubuntu.com/ubuntu/unstable.git

I ran make menuconfig to make sure that amdgpu.dc was included, along with the processor optimization for newer Core 2/Xeon processors, and then applied the patches you'd mentioned. Sadly, neither of them applied verbatim because I've taken too long to get back to working on this, so I had to apply them manually.

You'll find my diff patches attached. Nothing overtly different.

I'm going to start compiling this, and then go to lunch. Thanks!
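
For anyone else who hits patches that no longer apply verbatim, one way to find out before editing files by hand is a dry run with `git apply --check`, which reports conflicts without touching the tree. The following is a minimal Python sketch under that assumption. The helper names and the paths in the usage line are hypothetical.

```python
import os
import subprocess
import sys


def build_check_command(patch_path):
    """Build a dry-run command; --check tests the patch without modifying files."""
    return ["git", "apply", "--check", "--verbose", patch_path]


def patch_applies_cleanly(patch_path, repo_dir):
    """Return (ok, stderr) for a dry-run application of patch_path in repo_dir."""
    # Use an absolute path because the command runs with repo_dir as its cwd.
    result = subprocess.run(
        build_check_command(os.path.abspath(patch_path)),
        cwd=repo_dir,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical paths): python check_patch.py fix.patch ~/src/linux
    ok, details = patch_applies_cleanly(sys.argv[1], sys.argv[2])
    print("applies cleanly" if ok else "does not apply:\n" + details)
```

If the dry run fails, the stderr text names the hunks that did not match, which narrows down what has to be ported manually.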
Comment 9 Joel Sass 2018-04-25 15:24:36 UTC
Created attachment 139100
amdgpu_dm.c patch I had to manually apply
Comment 10 Joel Sass 2018-04-25 15:25:04 UTC
Created attachment 139101
amdgpu_dm_mst_types.c patch I had to manually apply
Comment 11 Joel Sass 2018-04-25 20:58:11 UTC
It appears that the patch I manually created isn't working out so hot. While compiling, I'm seeing these errors for amdgpu_dm_mst_types.c:

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c: In function ‘dm_dp_mst_dc_sink_create’:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:205:2: error: ‘dc_sink’ undeclared (first use in this function)
  dc_sink = dc_link_add_remote_sink(
  ^
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:205:2: note: each undeclared identifier is reported only once for each function it appears in
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:209:4: error: ‘init_params’ undeclared (first use in this function)
   &init_params);
    ^
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c: In function ‘dm_dp_mst_get_modes’:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:232:28: warning: unused variable ‘init_params’ [-Wunused-variable]
   struct dc_sink_init_data init_params = {
                            ^
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:231:19: warning: unused variable ‘dc_sink’ [-Wunused-variable]
   struct dc_sink *dc_sink;
                   ^
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:251:5: error: request for member ‘sink_signal’ in something not a structure or union
     .sink_signal = SIGNAL_TYPE_DISPLAY_PORT_MST };
     ^
scripts/Makefile.build:332: recipe for target 'drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.o' failed
make[4]: *** [drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.o] Error 1
scripts/Makefile.build:606: recipe for target 'drivers/gpu/drm/amd/amdgpu' failed
make[3]: *** [drivers/gpu/drm/amd/amdgpu] Error 2
scripts/Makefile.build:606: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:606: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
make[1]: *** Waiting for unfinished jobs....

Could someone take a look at the file I've attached please?
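
A quick way to read the log above is to group the diagnostics by the function GCC reports them in. The Python sketch below does that for the lines quoted in this comment; the function and constant names are hypothetical. Grouped this way, dc_sink and init_params show up as undeclared in dm_dp_mst_dc_sink_create and as unused in dm_dp_mst_get_modes, which may mean the declarations ended up in a different function than the code that uses them when the patch was applied by hand. That reading is an inference from the log and has not been confirmed in this thread.

```python
import re
from collections import defaultdict

# GCC diagnostic lines look like "path:line:col: kind: message".
DIAGNOSTIC = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<kind>error|warning|note): (?P<message>.*)$"
)
# Context lines look like "path: In function <quote>name<quote>:".
IN_FUNCTION = re.compile(r"In function \W(?P<name>\w+)\W")


def group_by_function(log_text):
    """Map each function name to its (kind, line, message) errors and warnings."""
    grouped = defaultdict(list)
    current = None
    for raw in log_text.splitlines():
        match = DIAGNOSTIC.match(raw)
        if match:
            if match.group("kind") != "note":  # notes only elaborate on errors
                grouped[current].append(
                    (match.group("kind"), int(match.group("line")), match.group("message"))
                )
            continue
        context = IN_FUNCTION.search(raw)
        if context:
            current = context.group("name")
    return dict(grouped)


SRC = "drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c"
# Diagnostics copied from the build log in this comment.
SAMPLE_LOG = "\n".join([
    f"{SRC}: In function ‘dm_dp_mst_dc_sink_create’:",
    f"{SRC}:205:2: error: ‘dc_sink’ undeclared (first use in this function)",
    "  dc_sink = dc_link_add_remote_sink(",
    f"{SRC}:209:4: error: ‘init_params’ undeclared (first use in this function)",
    f"{SRC}: In function ‘dm_dp_mst_get_modes’:",
    f"{SRC}:232:28: warning: unused variable ‘init_params’ [-Wunused-variable]",
    f"{SRC}:231:19: warning: unused variable ‘dc_sink’ [-Wunused-variable]",
    f"{SRC}:251:5: error: request for member ‘sink_signal’ in something not a structure or union",
])

if __name__ == "__main__":
    for function, items in group_by_function(SAMPLE_LOG).items():
        print(function)
        for kind, line, message in items:
            print(f"  {kind} at line {line}: {message}")
```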
Comment 12 Joel Sass 2018-04-25 21:01:15 UTC
Created attachment 139112
The modified file causing the problem in my comment.

Looks like some names used by the patch you'd suggested are missing from the kernel source I acquired from git://kernel.ubuntu.com/ubuntu/unstable.git
Comment 13 Alex Deucher 2018-04-25 21:15:33 UTC
Try the patches from this branch:
https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-fixes-4.17
Comment 14 Joel Sass 2018-04-26 13:57:30 UTC
Alex, I just rebooted to this kernel after building. This problem still exists, but it's not a hard freeze!

I'll attach the dmesg showing the error.
Comment 15 Joel Sass 2018-04-26 14:02:06 UTC
Created attachment 139132
This is the dmesg from an ssh session after attaching a monitor to my MST hub

root@nope:~/errors# uname -a
Linux nope 4.16.0-rc7+ #2 SMP Thu Apr 26 08:45:00 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

Kernel source obtained from: https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-fixes-4.17

Key message from dmesg:

[  526.900234] Call Trace:
[  526.900280]  dm_dp_mst_get_modes+0xce/0x120 [amdgpu]
[  526.900288]  drm_helper_probe_single_connector_modes+0x199/0x6c0 [drm_kms_helper]
[  526.900294]  ? jbd2_journal_stop+0xf3/0x3e0
[  526.900297]  ? __ext4_journal_stop+0x37/0xa0
[  526.900309]  drm_mode_getconnector+0x2c4/0x300 [drm]
[  526.900314]  ? _cond_resched+0x16/0x40
[  526.900324]  ? drm_mode_connector_property_set_ioctl+0x60/0x60 [drm]
[  526.900333]  drm_ioctl_kernel+0x67/0xb0 [drm]
[  526.900342]  drm_ioctl+0x2a9/0x350 [drm]
[  526.900352]  ? drm_mode_connector_property_set_ioctl+0x60/0x60 [drm]
[  526.900381]  amdgpu_drm_ioctl+0x46/0x80 [amdgpu]
[  526.900385]  do_vfs_ioctl+0xa2/0x5f0
[  526.900389]  ? vfs_write+0x162/0x1a0
[  526.900391]  SyS_ioctl+0x74/0x80
[  526.900395]  do_syscall_64+0x60/0x110
[  526.900399]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
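
When reading a trace like the one above, it helps to separate the frames the kernel is sure about from those prefixed with "?", which the unwinder found on the stack but could not confirm as part of the call chain. The Python sketch below parses a few of the lines quoted above under that convention. The function name and tuple layout are hypothetical choices for illustration.

```python
import re

# "[ time]  [? ]symbol+0xoffset/0xsize [module]"
FRAME = re.compile(
    r"^\[\s*(?P<time>\d+\.\d+)\]\s+"
    r"(?P<guess>\? )?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?$"
)


def parse_frames(trace_text, include_guesses=False):
    """Return (symbol, offset, module) per frame; module is None if no tag is shown."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME.match(line.strip())
        if not match:
            continue  # e.g. the "Call Trace:" header
        if match.group("guess") and not include_guesses:
            continue  # "?" frames are unverified stack entries
        frames.append(
            (match.group("symbol"), int(match.group("offset"), 16), match.group("module"))
        )
    return frames


# A subset of the trace quoted in this comment.
SAMPLE_TRACE = """\
[  526.900234] Call Trace:
[  526.900280]  dm_dp_mst_get_modes+0xce/0x120 [amdgpu]
[  526.900288]  drm_helper_probe_single_connector_modes+0x199/0x6c0 [drm_kms_helper]
[  526.900294]  ? jbd2_journal_stop+0xf3/0x3e0
[  526.900309]  drm_mode_getconnector+0x2c4/0x300 [drm]
[  526.900385]  do_vfs_ioctl+0xa2/0x5f0
"""

if __name__ == "__main__":
    for symbol, offset, module in parse_frames(SAMPLE_TRACE):
        print(f"{symbol}+{offset:#x} ({module or 'no module tag'})")
```

With the unverified frames dropped, the chain reads as an ioctl from userspace reaching drm_mode_getconnector, then the probe helper, then dm_dp_mst_get_modes in amdgpu at the top.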
Comment 16 Joel Sass 2018-04-26 14:06:51 UTC
Created attachment 139134
Here's the kernel .config I used for making the kernel

Key changes:

I added the AMDGPU module and checked all 4 boxes. It looks like the amdgpu.dc switch is gone; I'm assuming that's intentional.

I also checked the optimization for Intel Core 2/Xeon processors instead of generic x86.
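
To double-check choices like these without going back through menuconfig, the .config file can be read directly: enabled options appear as CONFIG_NAME=value and disabled ones as "# CONFIG_NAME is not set". The Python sketch below parses that format. The sample text is illustrative and is not taken from the attached file, and the option names in it (CONFIG_DRM_AMDGPU, CONFIG_DRM_AMD_DC, CONFIG_MCORE2, CONFIG_GENERIC_CPU) are my assumption about which symbols correspond to the checkboxes described above.

```python
import re

SET_LINE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_LINE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Return {option: value}; options marked "is not set" map to "n"."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = SET_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2).strip('"')
            continue
        match = UNSET_LINE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def amd_display_options(options):
    """Keep only the options whose names start with CONFIG_DRM_AMD."""
    return {name: value for name, value in options.items() if name.startswith("CONFIG_DRM_AMD")}


# Illustrative sample only; NOT the contents of the attached .config.
SAMPLE_CONFIG = """\
#
# Graphics support
#
CONFIG_DRM_AMDGPU=m
CONFIG_DRM_AMD_DC=y
CONFIG_MCORE2=y
# CONFIG_GENERIC_CPU is not set
"""

if __name__ == "__main__":
    for name, value in sorted(amd_display_options(parse_kconfig(SAMPLE_CONFIG)).items()):
        print(f"{name}={value}")
```

On a real tree the same function can be fed the text of the .config in the build directory.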
Comment 17 Harry Wentland 2018-06-27 15:25:33 UTC
I think we had a fix for that. Can you check whether this is still a problem on the amd-staging-drm-next branch? https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
Comment 18 Martin Peres 2019-11-19 08:35:49 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/348.

