The machine fi-skl-6700k produced a failure for the test igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a on CI_DRM_2387, and the failure has not been reproduced in the 7 runs since.
The main relevant part:
(kms_pipe_crc_basic:10156) igt-debugfs-CRITICAL: Test assertion failure function igt_assert_crc_equal, file igt_debugfs.c:312:
(kms_pipe_crc_basic:10156) igt-debugfs-CRITICAL: Failed assertion: a->crc[i] == b->crc[i]
(kms_pipe_crc_basic:10156) igt-debugfs-CRITICAL: Last errno: 9, Bad file descriptor
(kms_pipe_crc_basic:10156) igt-debugfs-CRITICAL: error: 0xbed119d0 != 0x521eeb85
Here are all the logs: https://intel-gfx-ci.01.org/CI/CI_DRM_2387/fi-skl-6700k/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
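For readers unfamiliar with the assertion, the failing check compares two CRC samples word by word. A minimal Python sketch of that comparison follows. The helper name `first_crc_mismatch` and the list-of-integers representation are illustrative assumptions, not IGT's actual igt_assert_crc_equal implementation; the two values are the ones from the log above.

```python
def first_crc_mismatch(a, b):
    """Return (index, a_word, b_word) for the first differing CRC word, or None.

    `a` and `b` are sequences of integers standing in for the CRC words of two
    samples (a hypothetical representation chosen for this sketch).
    """
    if len(a) != len(b):
        raise ValueError("CRC samples have a different number of words")
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, x, y
    return None


# Values taken from the log: the reference CRC and the CRC read after suspend.
reference = [0xbed119d0]
after_suspend = [0x521eeb85]

mismatch = first_crc_mismatch(reference, after_suspend)
if mismatch is not None:
    index, got, expected = mismatch
    print(f"error: word {index}: {got:#010x} != {expected:#010x}")
```

A single differing word is enough to fail the subtest, which is why the log reports only one pair of values.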
Setting the platform and elevating the priority because it involves our CI.
Statistics: Failure rate 3/94 run(s) (3%)
Also seen on a Patchwork run, on the pipe-c subtest: https://intel-gfx-ci.01.org/CI/Patchwork_4521/fi-skl-6700k/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html Marked this test for CI as well.
For pipe A: Statistics: Failure rate 3/153 run(s) (1%)
Created attachment 130942: spinlock shuffle
This may be a bit of a long shot, but please give this patch a go. It shuffles around where the spinlock is acquired and released. The failure rate is quite low, so we need to give this a long run.
(In reply to Mika Kahola from comment #5)
> Created attachment 130942
> spinlock shuffle
>
> This maybe a bit long shot but maybe give it go with this patch where we
> shuffle around the spinlock acquiring/releasing. Failure rate is quite low
> so we need to give this a long run.
copy_to_user may sleep, so you can't hold the spinlock across it. From a quick look, I'm not sure it will help for testing either: I think this patch will deadlock the interrupt handler when it adds a new CRC entry.
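The locking concern can be illustrated with a small user-space analogy. This is a Python sketch, not kernel code: `CrcQueue`, `add_entry` and `read_entries` are hypothetical names, `threading.Lock` stands in for the spinlock, and the `slow_copy` callback stands in for copy_to_user. The reader takes a snapshot under the lock and only does the potentially blocking copy after releasing it, so the producer (the interrupt handler in the real driver) is never kept waiting on a blocked reader.

```python
import threading
from collections import deque


class CrcQueue:
    """Hypothetical queue of CRC entries shared by a producer and a reader."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant, like a spinlock
        self._entries = deque()

    def add_entry(self, crc):
        # Producer side: must be able to take the lock promptly.
        with self._lock:
            self._entries.append(crc)

    def read_entries(self, slow_copy):
        # Hold the lock only long enough to detach the pending entries.
        with self._lock:
            snapshot = list(self._entries)
            self._entries.clear()
        # Lock released here: slow_copy may block without stalling add_entry.
        return [slow_copy(crc) for crc in snapshot]


queue = CrcQueue()
queue.add_entry(1)
queue.add_entry(2)


def copy_while_producing(crc):
    # A new entry arrives while the copy is in progress. If read_entries
    # still held the lock at this point, this call would deadlock.
    queue.add_entry(crc + 100)
    return crc


first_read = queue.read_entries(copy_while_producing)
second_read = queue.read_entries(lambda crc: crc)
print(first_read, second_read)
```

Moving the copy inside the `with self._lock:` block would make `copy_while_producing` hang on the first entry, which mirrors the deadlock described above.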
Adding tag into "Whiteboard" field - ReadyForDev
The bug is still active.
* Status is correct
* Platform is included
* Feature is included
* Priority and Severity correctly set
* Logs included
Created attachment 131406: Reset GPU before running test
This patch is related to another bug but could be tested with this bug as well. Because this bug occurs relatively rarely (<3%), it may be that the GPU is left in some weird state. The patch therefore resets the GPU before entering the subtests. Let's see what happens in CI when this patch is applied.
I forgot to mention that, running this test alone a couple of hundred times, I wasn't able to trigger the reported behavior.
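To put the "couple of hundred" clean runs in context, here is a rough Python sketch of how likely a clean streak is at the failure rates quoted in this thread. It assumes every run fails independently with the same fixed probability, which is a simplification, and the function names are made up for this illustration.

```python
import math


def prob_no_failure(p, n):
    """Probability of seeing zero failures in n independent runs at rate p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return (1.0 - p) ** n


def runs_needed(p, confidence=0.95):
    """Smallest n such that at least one failure shows up with the given confidence."""
    if not 0.0 < p < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# With a 1% failure rate, 200 clean runs still happen by chance fairly often.
print(f"P(no failure in 200 runs at 1%) = {prob_no_failure(0.01, 200):.3f}")
print(f"runs needed at 1% for 95% confidence: {runs_needed(0.01)}")
print(f"runs needed at 3% for 95% confidence: {runs_needed(0.03)}")
```

Under these assumptions a few hundred clean runs is weak evidence when the rate is near 1%, which is consistent with asking for a long run before drawing conclusions.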
(In reply to Mika Kahola from comment #8)
> Created attachment 131406
> Reset GPU before running test
>
> This patch is related to another bug but could be tested with this bug as
> well. Because the occurrence of this bug is relatively rare (<3%) it could
> be assumed that GPU may be left in some weird state. Therefore, the patch
> proposes to reset GPU before entering to the subtests. Let's see what
> happens in CI when this patch is applied.
Isn't that cheating? Why not reset the GPU before suspend, then?
In other news, today we hit the same bug on pipe B too: https://intel-gfx-ci.01.org/CI/CI_DRM_2634/fi-skl-6700k/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html
Well, I think we could start the test "fresh", hence the reset before running the tests.
These are really hard to reproduce. We may just need to wait for a few more runs and close this if it is not reproduced.
Unable to replicate the issue and the issue hasn't surfaced on CI runs either.
(In reply to Mika Kahola from comment #13)
> Unable to replicate the issue and the issue hasn't surfaced on CI runs
> either.
Guess who's back? https://intel-gfx-ci.01.org/CI/CI_DRM_2788/fi-skl-6700k/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
Oh, that surfaced again. Not so cool.
Test: Affected machines (Last seen on)
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c: fi-skl-6700k: CI_DRM_2998: 2017-08-24 / 87 runs ago, with result 'fail', failure rate of 8 / 795 runs (1 %)
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b: fi-skl-6700k: CI_DRM_2987: 2017-08-22 / 98 runs ago, with result 'fail', failure rate of 2 / 795 runs (0 %)
igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a: fi-skl-6700k: CI_DRM_2788: 2017-06-30 / 273 runs ago, with result 'fail', failure rate of 5 / 795 runs (1 %)
https://intel-gfx-ci.01.org/cibuglog/index.html%3Faction_failures_history=92.html
Really sporadic. Dropping priority.
On https://intel-gfx-ci.01.org/cibuglog/index.html%3Faction_failures_history=-1&failures_test=igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html is this only the atomic update failure that was fixed by Daniel?
https://intel-gfx-ci.01.org/cibuglog/index.html%3Faction_failures_history=-1&failures_test=igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html was last seen on the 15th of September.
And https://intel-gfx-ci.01.org/cibuglog/index.html%3Faction_failures_history=-1&failures_test=igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html was last seen on CI_DRM_2998: 2017-08-24 / 227 runs ago.
Are we actually good to resolve this as "works for me"?