On CI_DRM_2600, the machine fi-bsw-n3050 hung on igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.

This may be related to the thousands of "[drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x71450064" found in the logs.

Full logs: https://intel-gfx-ci.01.org/CI/CI_DRM_2600/fi-bsw-n3050/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html
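To put a number on "thousands", the timeout messages in a saved log can be counted per status value. This is only a minimal sketch: the regular expression is built from the single message quoted above, and the function name `count_aux_timeouts` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the message quoted in this report; the hex status may vary.
AUX_TIMEOUT_RE = re.compile(r"dp_aux_ch timeout status (0x[0-9a-fA-F]+)")


def count_aux_timeouts(log_lines):
    """Return a Counter mapping each timeout status value to its occurrence count."""
    counts = Counter()
    for line in log_lines:
        match = AUX_TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; in practice the lines would come from the CI dmesg log.
    sample = [
        "[drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x71450064",
        "[drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x71450064",
        "some unrelated kernel message",
    ]
    for status, count in count_aux_timeouts(sample).items():
        print(status, count)
```

Grouping by status value makes it easy to see whether all the timeouts share the same status or whether several different failures are mixed together.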
(In reply to Martin Peres from comment #0)
> On CI_DRM_2600, the machine fi-bsw-n3050 hung on
> igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.
>
> This may be related to the thousands of "[drm:intel_dp_aux_ch [i915]]
> dp_aux_ch timeout status 0x71450064" found in the logs.
>
> Full logs:
> https://intel-gfx-ci.01.org/CI/CI_DRM_2600/fi-bsw-n3050/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

The hpd task is completing OK as far as I understand. We can try the patch that I mentioned in Bug 100215 to see if it helps [1], but I don't think it will fix it.

Did the system hard-hang, or just the IGT task? Also, did it come back from the suspend?

I think I saw some hangs while reading CRC on SNB or SKL (not sure which) that I couldn't reproduce anymore. I'll give it another try.

Thanks,

[1] https://patchwork.freedesktop.org/patch/151486/
(In reply to krisman from comment #1)
> (In reply to Martin Peres from comment #0)
> > On CI_DRM_2600, the machine fi-bsw-n3050 hung on
> > igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.
> >
> > This may be related to the thousands of "[drm:intel_dp_aux_ch [i915]]
> > dp_aux_ch timeout status 0x71450064" found in the logs.
> >
> > Full logs:
> > https://intel-gfx-ci.01.org/CI/CI_DRM_2600/fi-bsw-n3050/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html
>
> The hpd task is completing ok as far as I understand. we can try the patch
> that I mentioned in Bug 100215 to see if it helps[1] but I don't think it
> will fix it.

Our system already tested it and found that it fixes the issue:
https://patchwork.freedesktop.org/series/23299/

But I guess you meant that we should apply it permanently. The only way to do this is to get it upstream :)

> Did the system hard hanged or just the IGT task? Also did it
> get back from the suspend?

The system hard-hung, or at least could not be reached from the network after resume, and the controller could not read the results anymore (https://intel-gfx-ci.01.org/CI/CI_DRM_2600/fi-bsw-n3050/igt.log).

> I think I saw some hangs while reading CRC that
> I couldn't reproduce anymore on SNB or SKLs (not sure), I'll give it another
> try.

Thanks for looking into it!
(In reply to Martin Peres from comment #2)
> (In reply to krisman from comment #1)
> > (In reply to Martin Peres from comment #0)
> > > On CI_DRM_2600, the machine fi-bsw-n3050 hung on
> > > igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.
> > >
> > > This may be related to the thousands of "[drm:intel_dp_aux_ch [i915]]
> > > dp_aux_ch timeout status 0x71450064" found in the logs.
> > >
> > > Full logs:
> > > https://intel-gfx-ci.01.org/CI/CI_DRM_2600/fi-bsw-n3050/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html
> >
> > The hpd task is completing ok as far as I understand. we can try the patch
> > that I mentioned in Bug 100215 to see if it helps[1] but I don't think it
> > will fix it.
>
> Our system already tested it and found it was fixing the issue:
> https://patchwork.freedesktop.org/series/23299/
>
> But I guess you meant that we should apply it permanently. The only way to
> do this is to get it upstream :)

Thanks Martin! I am working on a new version of that patch to go upstream. I'll make myself the assignee for this one.
Last seen: 2017-05-10
Statistics: Failure rate 1/88 run(s) (1%).
krisman, any ETA for patch to try?
(In reply to Jani Saarinen from comment #5)
> krisman, any ETA for patch to try?

Last week I submitted a new version under the name:

drm: i915: Don't try detecting sinks on ports already in use

It got feedback from Ville and will need more rework.

I wonder if this got resolved by Maarten's work for Bug 100215.
I think that IGT change was reverted?
I am confused about this bug. According to https://intel-gfx-ci.01.org/CI/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html, igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b had been skipped for a very long time, and then all of a sudden the result is incomplete on CI_DRM_2699.
The test is doing some suboptimal things. It first does the suspend/resume cycle and only then checks whether the tested pipe has valid connectors, so it suspends even when it is going to skip.

In the case of CI_DRM_2699, there were about 3 successful suspend/resumes before suspend-read-crc-pipe-b came along, where the DUT never recovered from the suspend. The point here is that it's the suspend/resume that jams, not this particular subtest.
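The ordering problem described above can be sketched as follows. This is a toy model under stated assumptions, not the IGT code: `FakeMachine`, `subtest_suspend_first` and `subtest_check_first` are hypothetical names, and the "suspend" only increments a counter.

```python
class FakeMachine:
    """Hypothetical stand-in for the DUT; it only counts suspend/resume cycles."""

    def __init__(self):
        self.suspend_count = 0

    def suspend_resume(self):
        self.suspend_count += 1


def subtest_suspend_first(machine, pipe_has_connector):
    """Ordering described above: suspend, then decide whether to skip."""
    machine.suspend_resume()
    if not pipe_has_connector:
        return "skip"
    return "pass"


def subtest_check_first(machine, pipe_has_connector):
    """Alternative ordering: decide whether to skip before suspending."""
    if not pipe_has_connector:
        return "skip"
    machine.suspend_resume()
    return "pass"


if __name__ == "__main__":
    for subtest in (subtest_suspend_first, subtest_check_first):
        machine = FakeMachine()
        result = subtest(machine, pipe_has_connector=False)
        print(subtest.__name__, result, machine.suspend_count)
```

Both orderings report a skip when the pipe has no connector, but only the second avoids the suspend/resume cycle in that case. When the pipe does have a connector, both still suspend, so the reordering does not help on a machine where the subtest actually runs.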
(In reply to Petri Latvala from comment #9)
> The test is doing some suboptimal things. It first does the suspend/resume
> cycle, and then checks if the tested pipe has valid connectors, so it will
> suspend anyway even when it will skip.

For the record, I submitted a patch to address the issue with the test suspending before skipping. Petri already reviewed it and pushed it to igt.

31f71d62d5ff ("igt/kms_pipe_crc_basic: Skip test before system suspend")

> In the case of CI_DRM_2699, there were about 3 successful suspend/resumes
> before suspend-read-crc-pipe-b came along where the DUT never recovered from
> the suspend. The point being here that it's the suspend/resume that jams,
> not this particular subtest.

Agreed. The issue is more related to recovering from the suspend than to read_crc itself. I just need to mention, though, that bsw-n3050 has connectors attached, as well as VGA, meaning that it's not a case where my patch will make igt skip.

We are also discussing on the list a way to reduce the overhead caused by the thousands of dp_aux_ch timeout messages like the one below:

[drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x71450064
(In reply to krisman from comment #10)
> (In reply to Petri Latvala from comment #9)
> > The test is doing some suboptimal things. It first does the suspend/resume
> > cycle, and then checks if the tested pipe has valid connectors, so it will
> > suspend anyway even when it will skip.
>
> For the record, I submitted a patch to address the issue with the test
> suspending before skipping. Petri already reviewed and pushed to igt.
>
> 31f71d62d5ff ("igt/kms_pipe_crc_basic: Skip test before system suspend")

Thanks for this! Reducing the noise and the execution time is a Yay from me :)

>
> > In the case of CI_DRM_2699, there were about 3 successful suspend/resumes
> > before suspend-read-crc-pipe-b came along where the DUT never recovered from
> > the suspend. The point being here that it's the suspend/resume that jams,
> > not this particular subtest.
>
> Agreed. The issue is more related to the recovery of the suspend than the
> read_crc itself. Just need to mention, though, that bsw-n3050 has
> connectors attached, as well as VGA, meaning that it's not a case where my
> patch will make igt skip.

Right, I will still close the bug as the issue is somewhere else than our driver...

>
> We are also discussing on the list a way to reduce the overhead provoked by
> the thousands of dp_aux_ch timeout messages below:
>
> [drm:intel_dp_aux_ch [i915]] dp_aux_ch timeout status 0x71450064

Right, any success on that?
Since the test is now skipping immediately, this is not a problem anymore for us.