Bug 109635 - [CI][BAT] igt@kms_chamelium@hdmi-crc|cmp-* - fail - Chamelium RPC call failed: RPC failed at server. <class 'chameleond.devices.input_flow.InputFlowError'>:Video input not stable
Summary: [CI][BAT] igt@kms_chamelium@hdmi-crc|cmp-* - fail - Chamelium RPC call failed...
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: emersion
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-14 17:47 UTC by Lakshmi
Modified: 2019-08-16 12:33 UTC

See Also:
i915 platform: ICL
i915 features: display/Other


Attachments
attachment-20827-0.html (3.15 KB, text/html)
2019-05-09 21:10 UTC, Stuart Summers
attachment-27415-0.html (3.15 KB, text/html)
2019-05-20 12:03 UTC, Stuart Summers

Description Lakshmi 2019-02-14 17:47:32 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5601/fi-icl-u2/igt@kms_chamelium@hdmi-crc-fast.html

Starting subtest: hdmi-crc-fast
(kms_chamelium:2989) igt_chamelium-CRITICAL: Test assertion failure function chamelium_rpc, file ../lib/igt_chamelium.c:303:
(kms_chamelium:2989) igt_chamelium-CRITICAL: Failed assertion: !chamelium->env.fault_occurred
(kms_chamelium:2989) igt_chamelium-CRITICAL: Chamelium RPC call failed: RPC failed at server.  <class 'chameleond.devices.input_flow.InputFlowError'>:Video input not stable.
Subtest hdmi-crc-fast failed.
Comment 3 Stuart Summers 2019-02-20 17:39:05 UTC
The failure is basically saying the chamelium is trying to read an I2C slave address and never gets the expected value indicating the video input status reads stable. I'm wondering if there's something faulty with the i2c lines on one of the chamelium boards in CI? There are a couple of I2C related sightings open right now.

I haven't been able to reproduce this or https://bugs.freedesktop.org/show_bug.cgi?id=109483 with a Chamelium board I'm running locally (only running HDMI to a kbl NUC). All the failures I'm seeing are ICL related. That said, the I2C timeout is on the Chamelium side, not the GPU side. Maybe ICL is exposing some other timing constraint here?
Comment 4 James Ausmus 2019-03-04 23:44:51 UTC
I'm seeing that the Chamelium is complaining that the video input isn't stable, but the test never appears to call chamelium_port_wait_video_input_stable - I wonder if we're just trying to capture CRCs before the Chamelium has had a chance to decide the input is stable?

Might be worth trying to just throw in a call to chamelium_port_wait_video_input_stable before the chamelium_capture call.
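
A minimal sketch of the ordering suggested above: wait for the input to be reported stable, and only then capture. Everything here is hypothetical and for illustration only. `FakeChamelium`, `VideoNotStable` and `capture_with_wait` are not IGT or chameleond APIs, and the fake port simply becomes stable after a fixed number of polls.

```python
class VideoNotStable(Exception):
    """Stands in for chameleond's InputFlowError in this sketch."""


class FakeChamelium:
    """Hypothetical in-memory stand-in for a Chamelium port (not the real API)."""

    def __init__(self, stable_after_polls):
        # Number of polls before the fake receiver reports a stable signal;
        # None means the signal never becomes stable.
        self.stable_after_polls = stable_after_polls
        self.polls = 0

    def is_stable(self):
        self.polls += 1
        return (self.stable_after_polls is not None
                and self.polls > self.stable_after_polls)

    def wait_video_input_stable(self, max_polls):
        # Poll up to max_polls times, then give up.
        for _ in range(max_polls):
            if self.is_stable():
                return
        raise VideoNotStable("Timeout waiting video input stable")

    def capture_crc(self):
        # Capturing while the input is unstable fails immediately.
        if not self.is_stable():
            raise VideoNotStable("Video input not stable")
        return 0xABCD  # placeholder CRC value


def capture_with_wait(port, max_polls=10):
    """Wait for a stable input first, then capture."""
    port.wait_video_input_stable(max_polls)
    return port.capture_crc()
```

If the source never sends a usable signal, the wait itself times out and the error just moves from the capture call to the wait call.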
Comment 5 CI Bug Log 2019-03-13 10:36:33 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ICL: igt@kms_chamelium@hdmi-crc|cmp-* - fail - Failed assertion: !chamelium->env.fault_occurred -}
{+ KBL ICL: igt@kms_chamelium@hdmi-crc|cmp-* - fail - Failed assertion: !chamelium->env.fault_occurred +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_240/fi-kbl-7500u/igt@kms_chamelium@hdmi-crc-planes-random.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_240/fi-kbl-7567u/igt@kms_chamelium@hdmi-crc-planes-random.html
Comment 6 Neel 2019-03-21 19:20:48 UTC
(In reply to James Ausmus from comment #4)
> I'm seeing that the Chamelium is complaining that the video input isn't
> stable, but the test never appears to call
> chamelium_port_wait_video_input_stable - I wonder if we're just trying to
> capture CRCs before the Chamelium has had a chance to decide the input is
> stable?
> 
> Might be worth trying to just throw in a call to
> chamelium_port_wait_video_input_stable before the chamelium_capture call.

James, adding the call to chamelium_port_wait_video_input_stable() does not help. 

chamelium_port_wait_video_input_stable() with a timeout of 20s itself times out. 

(kms_chamelium:17683) igt_debugfs-DEBUG: Opening debugfs directory '/sys/kernel/debug/dri/0'
(kms_chamelium:17683) igt_chamelium-DEBUG: Waiting for video input to stabalize on DP-1
(kms_chamelium:17683) igt_chamelium-CRITICAL: Test assertion failure function chamelium_rpc, file ../lib/igt_chamelium.c:358:
(kms_chamelium:17683) igt_chamelium-CRITICAL: Failed assertion: !chamelium->env.fault_occurred
(kms_chamelium:17683) igt_chamelium-CRITICAL: Chamelium RPC call WaitVideoInputStable(ii) failed: RPC failed at server.  <class 'chameleond.devices.input_flow.InputFlowError'>:Timeout waiting video output stable
(kms_chamelium:17683) igt_core-INFO: Stack trace:
(kms_chamelium:17683) igt_core-INFO:   #0 ../lib/igt_core.c:1474 __igt_fail_assert()
(kms_chamelium:17683) igt_core-INFO:   #1 ../lib/igt_chamelium.c:361 chamelium_rpc()
(kms_chamelium:17683) igt_core-INFO:   #2 ../lib/igt_chamelium.c:438 chamelium_port_wait_video_input_stable()
(kms_chamelium:17683) igt_core-INFO:   #3 ../tests/kms_chamelium.c:568 do_test_display()
(kms_chamelium:17683) igt_core-INFO:   #4 ../tests/kms_chamelium.c:626 test_display_one_mode()
(kms_chamelium:17683) igt_core-INFO:   #5 ../tests/kms_chamelium.c:934 __real_main783()
(kms_chamelium:17683) igt_core-INFO:   #6 ../tests/kms_chamelium.c:783 main()
(kms_chamelium:17683) igt_core-INFO:   #7 [__libc_start_main+0xf3]
(kms_chamelium:17683) igt_core-INFO:   #8 [_start+0x2e]
****  END  ****
Comment 7 Neel 2019-03-21 19:24:26 UTC
I also see the following error after I reboot the chamelium and run setup hdmi.

Starting subtest: hdmi-crc-fast
(kms_chamelium:19878) igt_chamelium-CRITICAL: Test assertion failure function chamelium_rpc, file ../lib/igt_chamelium.c:358:
(kms_chamelium:19878) igt_chamelium-CRITICAL: Failed assertion: !chamelium->env.fault_occurred
(kms_chamelium:19878) igt_chamelium-CRITICAL: Chamelium RPC call Reset() failed: RPC failed at server.  <class 'chameleond.utils.audio_utils.AudioCaptureManagerError'>:No audio data was captured. Perhaps this input is not plugged ?
Stack trace:
  #0 ../lib/igt_core.c:1474 __igt_fail_assert()
  #1 ../lib/igt_chamelium.c:361 chamelium_rpc()
  #2 ../lib/igt_chamelium.c:1607 chamelium_reset()
  #3 ../tests/kms_chamelium.c:214 reset_state()
  #4 ../tests/kms_chamelium.c:614 test_display_one_mode()
  #5 ../tests/kms_chamelium.c:934 __real_main783()
  #6 ../tests/kms_chamelium.c:783 main()
  #7 [__libc_start_main+0xf3]
  #8 [_start+0x2e]
Subtest hdmi-crc-fast failed.


This only happens once after a reboot. I do not see the error in subsequent runs.
Comment 8 CI Bug Log 2019-04-16 07:41:25 UTC
A CI Bug Log filter associated to this bug has been updated:

{- KBL ICL: igt@kms_chamelium@hdmi-crc|cmp-* - fail - Failed assertion: !chamelium->env.fault_occurred -}
{+ KBL ICL: igt@kms_chamelium@* - fail - Failed assertion: !chamelium->env.fault_occurred +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_255/fi-kbl-7500u/igt@kms_chamelium@dp-frame-dump.html
Comment 9 Martin Peres 2019-04-23 12:52:38 UTC
This looks very interesting! To me, this might mean that we failed to do a proper modeset...

The issue is visible over 80% of the time on ICL, and 17% of the time on two KBL machines. This is quite worrying, don't you think?

Bumping the priority because this needs more attention!
Comment 10 emersion 2019-04-23 15:12:11 UTC
(In reply to Stuart Summers from comment #3)
> The failure is basically saying the chamelium is trying to read an I2C slave
> address and never gets the expected value indicating the video input status
> reads stable. I'm wondering if there's something faulty with the i2c lines
> on one of the chamelium boards in CI? There are a couple of I2C related
> sightings open right now.

I believe this isn't an I2C issue.

The receiver on the Chamelium has an I2C register that indicates whether there's a stable video signal received from the DUT. The Chamelium device will wait for this register to indicate that video is stable. That is, it will poll the register each second and check whether video is stable. If after 10 seconds the register still indicates video isn't stable, the Chamelium server will give up and report this error. So this isn't an I2C timeout, it's just the Chamelium device not receiving stable video even after waiting for 10 seconds.

This error usually means that no data is received.
Comment 11 emersion 2019-04-23 15:13:07 UTC
> This error usually means that no data is received.

Gah. I meant: this error usually means that no data is sent.
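
The polling behaviour described in comment 10 (check the receiver's "video stable" status about once per second, give up after 10 seconds) can be sketched roughly as follows. This is not the chameleond implementation. `read_stable` is a hypothetical callable standing in for the I2C register read, `InputFlowError` is a local stand-in class, and the sleep and clock functions are injectable only so the loop can be exercised without real delays.

```python
import time


class InputFlowError(Exception):
    """Local stand-in for chameleond.devices.input_flow.InputFlowError."""


def wait_video_input_stable(read_stable, timeout=10.0, interval=1.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll read_stable() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if read_stable():
            return
        if clock() >= deadline:
            # The register was read successfully every time; it simply never
            # reported a stable signal. This is a video timeout, not an I2C one.
            raise InputFlowError("Video input not stable")
        sleep(interval)
```

The loop makes the point of comment 10 concrete: every register read succeeds, so the error says nothing about the I2C bus itself.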
Comment 12 CI Bug Log 2019-04-26 08:18:34 UTC
A CI Bug Log filter associated to this bug has been updated:

{- KBL ICL: igt@kms_chamelium@* - fail - Failed assertion: !chamelium->env.fault_occurred -}
{+ KBL ICL: igt@kms_chamelium@* - fail - Failed assertion: !chamelium->env.fault_occurred +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_261/fi-kbl-7500u/igt@kms_chamelium@dp-crc-single.html
Comment 13 Maarten Lankhorst 2019-04-26 10:15:33 UTC
So this could be a Chamelium issue, but it could also be that we fail to set up HDMI properly and the test fails because no image is shown on screen.

The worst case for the user is ending up with a black screen.
Comment 14 Maarten Lankhorst 2019-04-26 10:17:04 UTC
Does it also happen after a suspend/resume cycle while the chamelium is disabled before suspend?
Comment 15 Jani Saarinen 2019-04-26 13:29:36 UTC
So can someone check whether we have a Chamelium configuration issue or a real issue in the driver?
Comment 16 Jani Saarinen 2019-04-27 08:16:32 UTC
Also, in https://bugs.freedesktop.org/show_bug.cgi?id=108896 we fail the same test but with a different symptom?
Comment 17 CI Bug Log 2019-04-29 08:33:34 UTC
A CI Bug Log filter associated to this bug has been updated:

{- KBL ICL: igt@kms_chamelium@* - fail - Failed assertion: !chamelium->env.fault_occurred -}
{+ KBL ICL: igt@kms_chamelium@* - fail - Failed assertion: !chamelium->env.fault_occurred +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_266/fi-kbl-7500u/igt@kms_chamelium@dp-audio.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_268/fi-kbl-7500u/igt@kms_chamelium@dp-audio.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_269/fi-kbl-7500u/igt@kms_chamelium@dp-audio.html
Comment 18 emersion 2019-04-30 07:06:02 UTC
(In reply to CI Bug Log from comment #17)
> A CI Bug Log filter associated to this bug has been updated:
> 
> {- KBL ICL: igt@kms_chamelium@* - fail - Failed assertion:
> !chamelium->env.fault_occurred -}
> {+ KBL ICL: igt@kms_chamelium@* - fail - Failed assertion:
> !chamelium->env.fault_occurred +}
> 
> New failures caught by the filter:
> 
>   *
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_266/fi-kbl-7500u/
> igt@kms_chamelium@dp-audio.html
>   *
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_268/fi-kbl-7500u/
> igt@kms_chamelium@dp-audio.html
>   *
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_269/fi-kbl-7500u/
> igt@kms_chamelium@dp-audio.html

Note that these failures are completely unrelated, and are caused because the Chamelium devices need to be updated to support audio tests.
Comment 19 emersion 2019-04-30 07:10:28 UTC
(In reply to Neel from comment #7)
> I also see the following error after I reboot the chamelium and run setup
> hdmi.
> 
> Starting subtest: hdmi-crc-fast
> (kms_chamelium:19878) igt_chamelium-CRITICAL: Test assertion failure
> function chamelium_rpc, file ../lib/igt_chamelium.c:358:
> (kms_chamelium:19878) igt_chamelium-CRITICAL: Failed assertion:
> !chamelium->env.fault_occurred
> (kms_chamelium:19878) igt_chamelium-CRITICAL: Chamelium RPC call Reset()
> failed: RPC failed at server.  <class
> 'chameleond.utils.audio_utils.AudioCaptureManagerError'>:No audio data was
> captured. Perhaps this input is not plugged ?
> Stack trace:
>   #0 ../lib/igt_core.c:1474 __igt_fail_assert()
>   #1 ../lib/igt_chamelium.c:361 chamelium_rpc()
>   #2 ../lib/igt_chamelium.c:1607 chamelium_reset()
>   #3 ../tests/kms_chamelium.c:214 reset_state()
>   #4 ../tests/kms_chamelium.c:614 test_display_one_mode()
>   #5 ../tests/kms_chamelium.c:934 __real_main783()
>   #6 ../tests/kms_chamelium.c:783 main()
>   #7 [__libc_start_main+0xf3]
>   #8 [_start+0x2e]
> Subtest hdmi-crc-fast failed.
> 
> 
> This only happens once after a reboot. I do not see the error in subsequent
> runs

Oh, and this one is unrelated too. This particular one happens because Reset is buggy, I have a patch to fix it available at https://chromium-review.googlesource.com/c/chromiumos/platform/chameleon/+/1583770
Comment 20 Jani Saarinen 2019-05-06 06:56:15 UTC
These tests now pass on systems where the Chamelium is connected properly. Maybe this was just a config issue?
Comment 21 CI Bug Log 2019-05-09 08:01:03 UTC
A CI Bug Log filter associated to this bug has been updated:

{- KBL ICL: igt@kms_chamelium@* - fail - Failed assertion: !chamelium->env.fault_occurred -}
{+ KBL ICL: igt@kms_chamelium@* - fail - Failed assertion: !chamelium->env.fault_occurred +}


  No new failures caught with the new filter
Comment 22 Lakshmi 2019-05-09 08:07:35 UTC
(In reply to emersion from comment #18)
> (In reply to CI Bug Log from comment #17)
> > A CI Bug Log filter associated to this bug has been updated:
> > 
> > {- KBL ICL: igt@kms_chamelium@* - fail - Failed assertion:
> > !chamelium->env.fault_occurred -}
> > {+ KBL ICL: igt@kms_chamelium@* - fail - Failed assertion:
> > !chamelium->env.fault_occurred +}
> > 
> > New failures caught by the filter:
> > 
> >   *
> > https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_266/fi-kbl-7500u/
> > igt@kms_chamelium@dp-audio.html
> >   *
> > https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_268/fi-kbl-7500u/
> > igt@kms_chamelium@dp-audio.html
> >   *
> > https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_269/fi-kbl-7500u/
> > igt@kms_chamelium@dp-audio.html
> 
> Note that these failures are completely unrelated, and are caused because
> the Chamelium devices need to be updated to support audio tests.

Thanks Simon. I have created a separate bug 110651 to capture these failures.
Comment 23 Stuart Summers 2019-05-09 21:10:00 UTC
Created attachment 144209 [details]
attachment-20827-0.html
Comment 24 emersion 2019-05-10 11:56:29 UTC
Here are the Chamelium logs:

2018-06-04 10:34:38,691 INFO Apply EDID #1185 to port #1
2018-06-04 10:34:38,722 INFO Plug port #1                            
2018-06-04 10:34:39,827 INFO Select FpgaInputFlow #1. 
2018-06-04 10:34:39,832 INFO Initialize FpgaInputFlow #1.               
2018-06-04 10:34:39,939 INFO Initialize DisplayPort RX chip.
2018-06-04 10:34:44,983 WARNING Timeout on waiting for condition IsVideoInputStable == True
2018-06-04 10:34:44,983 INFO Send DP HPD pulse to reset source...
2018-06-04 10:34:50,096 WARNING Timeout on waiting for condition IsVideoInputStable == True
2018-06-04 10:34:50,098 ERROR DP FSM failed
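
A small sketch for reading the timing out of the Chamelium log lines above. It assumes each wait starts at the log line immediately preceding its timeout warning, which is a guess from the excerpt rather than something chameleond documents. `LOG`, `parse_timestamp` and `wait_durations` are names made up for this example.

```python
from datetime import datetime

LOG = """\
2018-06-04 10:34:39,939 INFO Initialize DisplayPort RX chip.
2018-06-04 10:34:44,983 WARNING Timeout on waiting for condition IsVideoInputStable == True
2018-06-04 10:34:44,983 INFO Send DP HPD pulse to reset source...
2018-06-04 10:34:50,096 WARNING Timeout on waiting for condition IsVideoInputStable == True
"""


def parse_timestamp(line):
    # The first 23 characters hold "YYYY-MM-DD HH:MM:SS,mmm".
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def wait_durations(log_text):
    """Seconds between each timeout warning and the log line before it."""
    durations = []
    previous = None
    for line in log_text.splitlines():
        stamp = parse_timestamp(line)
        if "WARNING Timeout" in line and previous is not None:
            durations.append((stamp - previous).total_seconds())
        previous = stamp
    return durations


if __name__ == "__main__":
    for seconds in wait_durations(LOG):
        print(f"waited {seconds:.3f} s before timing out")
```

On this excerpt the two waits come out at roughly five seconds each, with the HPD pulse in between, so the receiver saw no stable video for about ten seconds in total.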

Here are some excerpts of the kernel logs:

<4> [821.574471] snd_hda_codec_hdmi hdaudioC0D2: HDMI: pin nid 5 not registered
<3> [834.609638] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
<3> [834.783652] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
<3> [834.791391] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun

This looks like a genuine driver bug.

User impact: screen stays black after plugging in a monitor. I don't know whether replugging the monitor helps.
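
A rough sketch of how the kernel log excerpt above can be filtered by log level to pull out the FIFO underruns. The `<N>` prefix is the kernel log level (3 is error, 4 is warning; lower numbers are more severe). `KERNEL_LOG`, `parse_line` and `underrun_pipes` are hypothetical names for this example, and the regular expressions are written only against the line format shown in this comment.

```python
import re

KERNEL_LOG = """\
<4> [821.574471] snd_hda_codec_hdmi hdaudioC0D2: HDMI: pin nid 5 not registered
<3> [834.609638] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
<3> [834.783652] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
<3> [834.791391] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun
"""

LINE_RE = re.compile(r"^<(\d)>\s*\[\s*(\d+\.\d+)\]\s*(.*)$")
UNDERRUN_RE = re.compile(r"CPU pipe (\w) FIFO underrun")


def parse_line(line):
    """Return (level, timestamp, message), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2)), match.group(3)


def underrun_pipes(log_text, max_level=3):
    """Pipes that reported a FIFO underrun at log level max_level or more severe."""
    pipes = []
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        level, _timestamp, message = parsed
        if level > max_level:
            continue  # less severe than requested, e.g. the <4> audio warning
        found = UNDERRUN_RE.search(message)
        if found:
            pipes.append(found.group(1))
    return pipes
```

With the default `max_level=3`, the `<4>` audio warning is skipped and only the three underrun errors remain.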
Comment 25 Mika Kahola 2019-05-10 12:34:53 UTC
I guess you don't see the HDMI sound card in /proc/asound either?

I have been working on bug

https://bugs.freedesktop.org/show_bug.cgi?id=102370

where I see a similar failure during sound card initialization:

snd_hda_codec_hdmi hdaudioC0D2: HDMI: pin nid 5 not registered

(see sound/pci/hda/patch_hdmi.c line 265)

I suspect that this is the root cause for both of these bugs.
Comment 26 emersion 2019-05-10 12:45:40 UTC
(In reply to Mika Kahola from comment #25)
> I guess you don't see HDMI sound card in /proc/asound either?
> 
> I have been working on with the bug 
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=102370
> 
> where I can see similar failure on sound card initialization 
> 
> snd_hda_codec_hdmi hdaudioC0D2: HDMI: pin nid 5 not registered
> 
> (see sound/pci/hda/patch_hdmi.c line 265)
> 
> I suspect that this is the root cause for both of these bugs.

Oh, sorry, I should've been more explicit. The line about audio was only quoted because it has loglevel <4>; it's not actually relevant for this bug.

These Chamelium tests don't care at all about audio, so I don't think fixing this would also fix this bug. Looking at the code confirms that even if pin_id_to_pin_index fails, video shouldn't be affected.

tl;dr I think it's a separate driver bug.
Comment 27 Stanislav Lisovskiy 2019-05-20 12:03:28 UTC
Obviously FIFO underruns can affect video stability. I wonder whether we have a bandwidth/watermark issue here again, as there is a 3840x2160 eDP panel and two connectors configured at 1920x1080. However, we don't use that many planes here.

However, I can't reproduce it on my Icelake with a similar configuration.
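
For a sense of scale, here is a back-of-the-envelope estimate of the raw scanout rate for the configuration mentioned above. It is not how i915 computes bandwidth limits or watermarks. It assumes 60 Hz refresh, 4 bytes per pixel, one full-screen plane per pipe, and ignores blanking; none of those values are stated in this bug.

```python
def scanout_bytes_per_second(width, height, refresh_hz=60, bytes_per_pixel=4):
    """Raw pixel fetch rate for one full-screen plane (assumed values)."""
    return width * height * refresh_hz * bytes_per_pixel


# One 3840x2160 eDP panel plus two 1920x1080 outputs, as in the comment above.
MODES = [(3840, 2160), (1920, 1080), (1920, 1080)]

TOTAL_BYTES_PER_SECOND = sum(scanout_bytes_per_second(w, h) for w, h in MODES)

if __name__ == "__main__":
    print(f"{TOTAL_BYTES_PER_SECOND / 1e9:.2f} GB/s across all three pipes")
```

Under these assumptions the 4K panel alone accounts for two thirds of the total, since it has four times the pixels of each 1080p output.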
Comment 28 Stuart Summers 2019-05-20 12:03:34 UTC
Created attachment 144305 [details]
attachment-27415-0.html
Comment 29 Jani Saarinen 2019-06-07 13:10:52 UTC
After the BW series landed, this issue has not been seen in ~2 weeks. Dropping priority.
Comment 30 emersion 2019-08-16 12:33:40 UTC
Not seen in two months. Closing.

