Bug 109679 - [CI][BAT] CHAMELIUM: igt@kms_chamelium@hdmi-crc-fast - fail - Chamelium RPC call failed: RPC failed at server. <type 'exceptions.ZeroDivisionError'>:integer division or
Alias: None
Product: DRI
Classification: Unclassified
Component: IGT
Version: XOrg git
Hardware: Other All
Importance: high normal
Assignee: Stuart Summers
QA Contact:
Whiteboard: ReadyForDev
Depends on:
Reported: 2019-02-19 16:16 UTC by Martin Peres
Modified: 2019-06-14 12:27 UTC

See Also:
i915 platform: KBL
i915 features: display/HDMI


Description Martin Peres 2019-02-19 16:16:36 UTC

Starting subtest: hdmi-crc-fast
(kms_chamelium:2877) igt_chamelium-CRITICAL: Test assertion failure function chamelium_rpc, file ../lib/igt_chamelium.c:303:
(kms_chamelium:2877) igt_chamelium-CRITICAL: Failed assertion: !chamelium->env.fault_occurred
(kms_chamelium:2877) igt_chamelium-CRITICAL: Chamelium RPC call failed: RPC failed at server.  <type 'exceptions.ZeroDivisionError'>:integer division or modulo by zero
Subtest hdmi-crc-fast failed.
Comment 1 Martin Peres 2019-02-19 16:18:41 UTC
Actually, moving this to IGT as this is unlikely a problem of i915.
Comment 2 CI Bug Log 2019-02-19 16:19:05 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* CHAMELIUM: igt@kms_chamelium@hdmi-crc-fast - fail - Chamelium RPC call failed: RPC failed at server.  <type 'exceptions.ZeroDivisionError'>:integer division or
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5629/fi-kbl-7500u/igt@kms_chamelium@hdmi-crc-fast.html
Comment 3 Stuart Summers 2019-02-19 18:10:53 UTC
Agree this looks like an issue in IGT or possibly in the chameleond daemon. At a quick glance, there aren't a whole lot of places in the CaptureVideo path that might trigger this exception in chameleond. It does look feasible that if we were to pass a 0 for width and height, we could potentially get a div-by-zero exception when chameleond prepares the video output:
CaptureVideo -> _PrepareCapturingVideo/_captured_params['max_frame_limit']/flow_manager.GetMaxFrameLimit -> input_flow.GetMaxFrameLimit -> frame_manager.GetMaxFrameLimit -> field_manager.GetMaxFieldLimit -> VideoDumper.GetMaxFieldLimit:
  def GetMaxFieldLimit(cls, width, height):
    """Returns of the maximal number of fields which can be dumped."""
    BYTE_PER_PIXEL = 3
    PAGE_SIZE = 4096
    field_size = width * height * BYTE_PER_PIXEL
    field_size = ((field_size - 1) / PAGE_SIZE + 1) * PAGE_SIZE
    return cls._DUMP_BUFFER_SIZE / field_size
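
To make the arithmetic easier to check, here is a minimal standalone sketch of the same calculation. It is not the chameleond code: the function name and the DUMP_BUFFER_SIZE value are assumptions made for illustration, and `//` is used because the snippet above relies on Python 2 integer division, which floors.

```python
# Standalone sketch of the page-rounding arithmetic quoted above.
# DUMP_BUFFER_SIZE is a hypothetical value; the real _DUMP_BUFFER_SIZE
# used by chameleond is not shown in this report.
BYTE_PER_PIXEL = 3
PAGE_SIZE = 4096
DUMP_BUFFER_SIZE = 512 * 1024 * 1024  # assumption, for illustration only


def get_max_field_limit(width, height, dump_buffer_size=DUMP_BUFFER_SIZE):
    """Return how many page-aligned fields fit in the dump buffer."""
    field_size = width * height * BYTE_PER_PIXEL
    # Round up to a whole number of pages. With floor division,
    # (0 - 1) // PAGE_SIZE is -1, so a zero-sized field rounds to 0 pages.
    field_size = ((field_size - 1) // PAGE_SIZE + 1) * PAGE_SIZE
    # Raises ZeroDivisionError when field_size is 0.
    return dump_buffer_size // field_size


if __name__ == "__main__":
    print(get_max_field_limit(1920, 1080))
    try:
        get_max_field_limit(0, 0)
    except ZeroDivisionError as exc:
        print("0x0 capture region:", type(exc).__name__)
```

Under these assumptions, a 0x0 region is the input that makes this particular function divide by zero; any non-zero width and height gives a field size of at least one page.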

That said, running this myself locally does not reproduce the failure from this sighting, and from kms_chamelium.c, during the CRC check, we are passing 0 for width and height:
  chamelium_capture(data->chamelium, port, 0, 0, 0, 0, count);

So I'd expect this to fail all the time if the issue were in the Python call I indicated above.

I'd also expect CI to have seen this already if that were the case, given this code has been present for some time.
Comment 4 Stuart Summers 2019-02-19 18:31:09 UTC
Sent that previous comment a little too quickly... Besides my off-by-one miss (the code takes 0 - 1, not 0), it looks like we aren't actually sending those 0's in the first place:
  (w && h) ? "(iiiiii)" : "(iinnnn)",

From chameleond, this will autofill x, y, width, and height based on the current resolution. Maybe the resolution was misconfigured? Or perhaps there's an issue in the chamelium FPGA?
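
As a rough illustration of that path, the sketch below mirrors the format-string choice and a server-side autofill. Everything in it is hypothetical: the function names, the argument order (port, frame count, x, y, width, height) and the autofill behavior are assumptions based only on the description in this comment, not on the actual IGT or chameleond sources.

```python
# Hypothetical sketch of the nil-argument path described above.
# None stands in for an XML-RPC nil ("n" in the format string).


def build_capture_args(port_id, frame_count, x, y, w, h):
    """Mirror (w && h) ? "(iiiiii)" : "(iinnnn)": send nils unless w and h are both non-zero."""
    if w and h:
        return (port_id, frame_count, x, y, w, h)
    return (port_id, frame_count, None, None, None, None)


def resolve_region(x, y, w, h, detected_resolution):
    """Assumed server-side autofill: fall back to the detected resolution."""
    if w is None or h is None:
        detected_w, detected_h = detected_resolution
        return (0, 0, detected_w, detected_h)
    return (x, y, w, h)


if __name__ == "__main__":
    args = build_capture_args(1, 1, 0, 0, 0, 0)
    print(args)
    # A healthy link would report a real mode here.
    print(resolve_region(*args[2:], detected_resolution=(1920, 1080)))
    # If the receiver ever reported 0x0, the zeros would come from the
    # server side rather than from IGT.
    print(resolve_region(*args[2:], detected_resolution=(0, 0)))
```

If the autofill works roughly like this, a 0x0 detected resolution would be one way for zeros to reach the field-size calculation even though IGT never sends them. Whether chameleond can actually report such a resolution is not confirmed here.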

I still think it would be interesting here to get the chameleond log (see request here: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/29 + the recent patch I posted to igt). There is quite a bit of state logging in chameleond with that patch run locally with --d.

The only other interesting thing I see in the logs is this, just before the error:
(kms_chamelium:2877) igt_chamelium-DEBUG: Chamelium needs FSM, handling

I don't see that when I run locally. Maybe we're getting a spurious hotplug event while trying to capture video? Or maybe the video capture is taking too long and/or hung for some reason, causing the hotplug to time out?
Comment 5 Jani Saarinen 2019-04-22 13:24:16 UTC
Seen only once on KBL. Not making an assessment, but proposing to close.
Comment 6 CI Bug Log 2019-06-14 12:27:14 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.
