Bug 94100 - [HSW] compute indirect dispatch with 0 work groups causes gpu hang
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
 
Reported: 2016-02-11 19:47 UTC by Ilia Mirkin
Modified: 2016-02-17 18:13 UTC (History)

Attachments
dirty patch to conditionally terminate batch on hsw (1.26 KB, text/plain)
2016-02-13 01:51 UTC, Ilia Mirkin

Description Ilia Mirkin 2016-02-11 19:47:59 UTC
From the ARB_compute_shader spec:

        The command

        void DispatchComputeIndirect(intptr indirect);

    is equivalent (assuming no errors are generated) to calling
    DispatchCompute with <num_groups_x>, <num_groups_y> and <num_groups_z>
    initialized with the three uint values contained in the buffer currently
    bound to the DISPATCH_INDIRECT_BUFFER binding at an offset, in basic
    machine units, specified by <indirect>.  
    ...
    If any of <num_groups_x>, <num_groups_y> or <num_groups_z>
    is greater than MAX_COMPUTE_WORK_GROUP_COUNT for the corresponding
    dimension then the results are undefined.

And earlier there:

"If the work group count in any dimension is zero, no work groups are dispatched."

I suspect that GLES 3.1 says something similar, as there is a dEQP test that exercises this:

dEQP-GLES31.functional.compute.indirect_dispatch.upload_buffer.empty_command
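
Read together, the spec excerpts quoted above split an indirect command into three cases: a count of zero in any dimension (nothing is dispatched), a count above the per-dimension maximum (undefined results), and everything else. A minimal CPU-side sketch of that classification follows. The names sketch_classify_indirect and sketch_dispatch_kind are hypothetical, not GL or Mesa API, and checking the "undefined" case before the "empty" case is my reading of the quoted text rather than something the spec spells out. The empty_command test presumably exercises the first case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification of an indirect dispatch command. */
enum sketch_dispatch_kind {
    SKETCH_DISPATCH_EMPTY,     /* some count is zero: no work groups run */
    SKETCH_DISPATCH_UNDEFINED, /* some count exceeds the per-dimension max */
    SKETCH_DISPATCH_NORMAL
};

/* buf:       CPU copy of the buffer bound to DISPATCH_INDIRECT_BUFFER
 * indirect:  byte offset of the three uint group counts
 * max_count: per-dimension MAX_COMPUTE_WORK_GROUP_COUNT (assumed to have
 *            been queried elsewhere; passed in here to stay self-contained) */
static enum sketch_dispatch_kind
sketch_classify_indirect(const unsigned char *buf, size_t indirect,
                         const uint32_t max_count[3])
{
    uint32_t n[3];

    /* memcpy avoids alignment assumptions about the offset. */
    memcpy(n, buf + indirect, sizeof n);

    for (int i = 0; i < 3; i++)
        if (n[i] > max_count[i])
            return SKETCH_DISPATCH_UNDEFINED;

    for (int i = 0; i < 3; i++)
        if (n[i] == 0)
            return SKETCH_DISPATCH_EMPTY;

    return SKETCH_DISPATCH_NORMAL;
}
```

A driver cannot do this check on the CPU for a real indirect dispatch without reading the buffer back, which is why the zero case has to be handled on the GPU side.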

Running it on a HSW with kernel 4.4.1 results in a GPU hang:

[17904.184491] [drm] stuck on render ring
[17904.185787] [drm] GPU HANG: ecode 7:0:0x8fd0ffff, in deqp-gles31 [17164], reason: Ring hung, action: reset
[17904.185861] ------------[ cut here ]------------
[17904.185880] WARNING: CPU: 2 PID: 15876 at drivers/gpu/drm/i915/intel_display.c:11289 intel_mmio_flip_work_func+0x37d/0x3c0()
[17904.185883] WARN_ON(__i915_wait_request(mmio_flip->req, mmio_flip->crtc->reset_counter, false, NULL, &mmio_flip->i915->rps.mmioflips))
[17904.185886] Modules linked in:
[17904.185890]  cfg80211 bcma x86_pkg_temp_thermal
[17904.185900] CPU: 2 PID: 15876 Comm: kworker/2:0 Tainted: G        W       4.4.1 #6
[17904.185903] Hardware name: Dell Inc. XPS 8700/0KWVT8, BIOS A08 04/16/2014
[17904.185910] Workqueue: events intel_mmio_flip_work_func
[17904.185914]  ffffffff81c1b398 ffff8801cb457d30 ffffffff813d684f ffff8801cb457d78
[17904.185920]  ffff8801cb457d68 ffffffff810fd966 ffff8801fb7bfa80 ffff8801ac5fa240
[17904.185925]  ffff88022fa94d00 0000000000000080 ffff88022fa99400 ffff8801cb457dc8
[17904.185931] Call Trace:
[17904.185942]  [<ffffffff813d684f>] dump_stack+0x44/0x55
[17904.185951]  [<ffffffff810fd966>] warn_slowpath_common+0x86/0xc0
[17904.185957]  [<ffffffff810fd9ec>] warn_slowpath_fmt+0x4c/0x50
[17904.185963]  [<ffffffff81566d1d>] intel_mmio_flip_work_func+0x37d/0x3c0
[17904.185973]  [<ffffffff81113dcc>] process_one_work+0x14c/0x3d0
[17904.185979]  [<ffffffff8111436b>] worker_thread+0x4b/0x440
[17904.185986]  [<ffffffff81114320>] ? rescuer_thread+0x2d0/0x2d0
[17904.185991]  [<ffffffff81118f09>] kthread+0xc9/0xe0
[17904.185996]  [<ffffffff81118e40>] ? kthread_park+0x60/0x60
[17904.186002]  [<ffffffff818df89f>] ret_from_fork+0x3f/0x70
[17904.186006]  [<ffffffff81118e40>] ? kthread_park+0x60/0x60
[17904.186010] ---[ end trace 1c4e38c670f10909 ]---
[17904.187422] drm/i915: Resetting chip after gpu hang

Please let me know if any additional information is required.
Comment 1 Ilia Mirkin 2016-02-13 01:51:19 UTC
Created attachment 121729
dirty patch to conditionally terminate batch on hsw

This very, very dirty patch looks like it does the trick. 0x36 is MI_CONDITIONAL_BATCH_BUFFER_END. This will need a helper to deal with RELOC64 on gen8+, of course. Or perhaps gen8 can deal with 0-sized workgroups directly.
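
For readers unfamiliar with where a bare 0x36 ends up in the batch, here is a small sketch of how an MI opcode is packed into a command header dword. It assumes the usual MI layout with the opcode in bits 28:23 of the first dword. The SKETCH_ macros and helper names are made up for illustration, and the remaining header bits and operand dwords of the command are deliberately left out; this is not the attached patch.

```c
#include <stdint.h>

/* Assumption: MI commands carry their opcode in bits 28:23 of dword 0. */
#define SKETCH_MI_OPCODE_SHIFT 23
#define SKETCH_MI_OPCODE_MASK  0x3fu

/* Opcode value taken from the comment above. */
#define SKETCH_MI_CONDITIONAL_BATCH_BUFFER_END 0x36u

/* Build a header dword from an opcode and the low (flag/length) bits. */
static uint32_t sketch_mi_header(uint32_t opcode, uint32_t low_bits)
{
    return ((opcode & SKETCH_MI_OPCODE_MASK) << SKETCH_MI_OPCODE_SHIFT) |
           low_bits;
}

/* Recover the opcode from a header dword. */
static uint32_t sketch_mi_opcode(uint32_t header)
{
    return (header >> SKETCH_MI_OPCODE_SHIFT) & SKETCH_MI_OPCODE_MASK;
}
```

With no low bits set, sketch_mi_header(SKETCH_MI_CONDITIONAL_BATCH_BUFFER_END, 0) gives 0x1b000000.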
Comment 2 Jordan Justen 2016-02-17 18:13:25 UTC
commit 9a939ebb47a0d37a6b29e3dbb1b20bdc9538a721

    i965/gen7: Use predicated rendering for indirect compute
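
Only the commit title is quoted here, so the following is a CPU-side model of what predicating the dispatch on the group counts could look like, not the contents of that commit. All names are hypothetical; on real hardware the comparison would presumably be performed by the command streamer against the indirect buffer rather than in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the predicate: true means "execute the compute walker". */
static bool sketch_compute_predicate(const uint32_t groups[3])
{
    bool any_zero = (groups[0] == 0); /* start from the X count */
    any_zero |= (groups[1] == 0);     /* OR in the Y count */
    any_zero |= (groups[2] == 0);     /* OR in the Z count */
    return !any_zero;                 /* invert: run only if none is zero */
}

/* Model of a predicated walker command. Returns 1 if it ran, 0 if it was
 * skipped; walker_runs counts how many times the walker actually ran. */
static int sketch_predicated_walker(const uint32_t groups[3],
                                    unsigned *walker_runs)
{
    if (!sketch_compute_predicate(groups))
        return 0; /* skipped: nothing reaches the hardware walker */

    (*walker_runs)++;
    return 1;
}
```

The point of the model is that an empty command is dropped before the walker sees it, while the rest of the batch continues to execute. That differs from the conditional batch end in comment 1, which stops the batch at that point.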

