Bug 89568 - [SNB bisected] igt/pm_rpm/universal-planes cannot be stopped and leaves the system unable to reboot.
Status: CLOSED NOTABUG
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-03-13 10:11 UTC by Ding Heng
Modified: 2017-10-06 14:31 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (119.05 KB, text/plain)
2015-03-13 10:11 UTC, Ding Heng

Description Ding Heng 2015-03-13 10:11:28 UTC
Created attachment 114281
dmesg

==System Environment==
--------------------------
Regression: Yes, bisect later

Non-working platforms: SNB

==kernel==
--------------------------
origin/drm-intel-next-queued: 0005372df9ee57e88797fe0e00fa2215595acd23(2015-03-10)

==Bug detailed description==
-----------------------------
Once this test case starts running, it never stops, and it cannot be interrupted with Ctrl+C either. Attempting to reboot the machine gets no response.

a8c6ecb3be7029881f7c95e5e201a629094a4e1a is the first bad commit

==Reproduce steps==
---------------------------- 
./pm_rpm --run-subtest universal-planes
Comment 1 Chris Wilson 2015-03-13 11:52:20 UTC
Can you get the netconsole output?
Comment 2 Ding Heng 2015-03-16 09:07:32 UTC
(In reply to Chris Wilson from comment #1)
> Can you get the netconsole output?

[root@glody-pc ~]# nc -l -u -4 -p 30000
[  634.671572] netpoll: netconsole: local port 6666
[  634.671613] netpoll: netconsole: local IPv4 address 10.239.156.111
[  634.671652] netpoll: netconsole: interface 'enp0s25'
[  634.671674] netpoll: netconsole: remote port 30000
[  634.671696] netpoll: netconsole: remote IPv4 address 10.239.156.13
[  634.671723] netpoll: netconsole: remote ethernet address 6c:3b:e5:41:24:d7
[  634.671792] console [netcon0] enabled
[  634.671821] netconsole: network logging started
[  660.482156] pm_rpm: executing
[  661.717822] pm_rpm: starting subtest universal-planes
[  701.152195] systemd[1]: Unit rpcbind.service entered failed state.
[  701.152521] systemd[1]: Unit mcelog.service entered failed state.
Comment 3 Paulo Zanoni 2015-03-18 21:29:28 UTC
Bug #89550 says pm_rpm/universal-planes causes a WARN on SNB. This bug says pm_rpm/universal-planes freezes the machine on SNB. And it was suggested that this bug somehow blocks the bisect of bug #89550. Can you please be more specific on what really is the behavior we get on SNB? Also, does the freeze happen only on SNB, or on other machines as well?
Comment 4 Paulo Zanoni 2015-03-18 21:32:41 UTC
(In reply to Ding Heng from comment #2)

I think there's a way to get all the debug output on netconsole, including the messages printed with drm.debug=0xe (which you should be using).
Comment 5 Ding Heng 2015-03-19 06:53:17 UTC
(In reply to Paulo Zanoni from comment #4)
> I think there's a way to get all the debug output on netconsole, even those
> that are printed by drm.debug=0xe (which you should be using).

output on device under test:
[root@x-hnr9 ~]# modprobe netconsole netconsole=6666@10.239.156.63/enp0s25,30000@10.239.156.13/6c:3b:e5:41:24:d7
[root@x-hnr9 ~]# modprobe ixgbe
[root@x-hnr9 ~]# rmmod ixgbe
[root@x-hnr9 ~]# lsmod |grep ixgbe
[root@x-hnr9 ~]# cat /proc/cmdline
BOOT_IMAGE=kernels//nightly_parents/2015_03_12/drm-intel-next-queued/3c0401df292f3cdf60435342a05bee3efb099ca0/bzImage_x86_64 root=/dev/sda3 drm.debug=0xe hostname=x-hnr9 modules_path=kernels//nightly_parents/2015_03_12/drm-intel-next-queued/3c0401df292f3cdf60435342a05bee3efb099ca0/modules_x86_64/lib/modules/4.0.0-rc3_drm-intel-next-queued_3c0401_20150312+ kexec_jump_back_entry=0x0a12326a
[root@x-hnr9 ~]# cd /GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests/
(reverse-i-search)`km': ./^Cs_universal_plane --run-subtest cursor-fb-leak-pipe-B
[root@x-hnr9 tests]# ./pm_rpm --run-subtest universal-planes
IGT-Version: 1.10-ga172676 (x86_64) (Linux: 4.0.0-rc3_drm-intel-next-queued_3c0401_20150312+ x86_64)
Runtime PM support: 1
PC8 residency support: 0

^C^C^C

^C^C^C^C

At this point I could still reach the machine over ssh. When I tried to reboot from a second ssh connection, I got the following message and the ssh service went away:

Failed to issue method call: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Warning: Unit file of reboot.target changed on disk, 'systemctl --system daemon-reload' recommended.

Then I checked the netconsole output; it still looked similar to comment 2:

[root@glody-pc ~]# nc -l -u -4 -p 30000
[  141.358408] dca service started, version 1.12.1
[  141.389118] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 4.0.1-k
[  141.389189] ixgbe: Copyright (c) 1999-2014 Intel Corporation.
[  207.887512] pm_rpm: executing
[  209.094113] pm_rpm: starting subtest universal-planes
[  283.107752] NET: Registered protocol family 10
[  283.109111] IPv6: ADDRCONF(NETDEV_UP): enp37s0u1u4: link is not ready
[  485.804193] systemd[1]: Unit rpcbind.service entered failed state.
[  485.805403] systemd[1]: Unit mcelog.service entered failed state.
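
For reference, the netconsole parameter on the modprobe line above follows the form src-port@src-ip/dev,dst-port@dst-ip/dst-mac. The sketch below rebuilds that string from its parts and prints, rather than runs, the commands one might use. The helper name `netconsole_param` and the `dmesg -n 8` step are assumptions that do not appear in this report. The reasoning behind the second one is that netconsole is a kernel console, so debug-level messages (such as those enabled by drm.debug=0xe) may be filtered out unless the console log level is raised.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): assemble a netconsole
# parameter of the form src-port@src-ip/dev,dst-port@dst-ip/dst-mac.
netconsole_param() {
    # $1 src port, $2 src IP, $3 interface, $4 dst port, $5 dst IP, $6 dst MAC
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Values copied from the modprobe line in this comment.
param=$(netconsole_param 6666 10.239.156.63 enp0s25 30000 10.239.156.13 6c:3b:e5:41:24:d7)

# Dry run: print the commands instead of executing them (they need root).
# Assumption: raising the console log level lets debug messages reach netconsole.
echo "dmesg -n 8"
echo "modprobe netconsole netconsole=$param"
```

Printing the commands keeps the sketch safe to run anywhere; on the device under test they would be run as root before starting pm_rpm.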
Comment 6 Ding Heng 2015-03-19 08:21:53 UTC
(In reply to Paulo Zanoni from comment #3)
> Bug #89550 says pm_rpm/universal-planes causes a WARN on SNB. This bug says
> pm_rpm/universal-planes freezes the machine on SNB. And it was suggested
> that this bug somehow blocks the bisect of bug #89550. Can you please be
> more specific on what really is the behavior we get on SNB? Also, does the
> freeze happen only on SNB, or on other machines as well?

Here is the story: I submitted bug 89550 while testing this case on the nightly branch. When I tried to bisect on that branch, I found that some of the commits that needed to be tested during the bisect were gone, so I bisected on the next-queued branch instead. The bug there is a little different, so I submitted this bug.
Comment 7 Paulo Zanoni 2015-03-19 14:26:40 UTC
(In reply to Ding Heng from comment #6)
> Here is the story: I submitted bug 89550 while testing this case on the
> nightly branch. When I tried to bisect on that branch, I found that some of
> the commits that needed to be tested during the bisect were gone, so I
> bisected on the next-queued branch instead. The bug there is a little
> different, so I submitted this bug.

Oh, ok, so just to be clear: this bug is present only on drm-intel-next-queued, but not on drm-intel-nightly?

In that case, we can just close it as not-a-bug. I'm sure there's a BKM about bugs on specific trees in QA's documentation (since I helped write it).
Comment 8 Ding Heng 2015-03-20 02:08:05 UTC
(In reply to Paulo Zanoni from comment #7)
> Oh, ok, so just to be clear: this bug is present only on
> drm-intel-next-queued, but not on drm-intel-nightly?
> 
> In that case, we can just close it as not-a-bug. I'm sure there's a BKM
> about bugs on specific trees in QA's documentation (since I helped write
> it).

Yes, this issue only happens on the next-queued branch. Closing the bug as not-a-bug.
Comment 9 Elizabeth 2017-10-06 14:31:05 UTC
Closing old verified.