Bug 111486 - [kbl/apl] [drm] GPU hang while running dEQP test - multithread gen_delete
Summary: [kbl/apl] [drm] GPU hang while running dEQP test - multithread gen_delete
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 19.0
Hardware: x86-64 (AMD64) All
: not set not set
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-08-26 10:05 UTC by Ren Chenglei
Modified: 2019-09-25 20:35 UTC

See Also:
i915 platform:
i915 features:


Attachments
GPU error state log (39.12 KB, text/plain)
2019-08-26 10:06 UTC, Ren Chenglei
GPU error decode with aubinator_error_decode (4.20 MB, text/plain)
2019-08-26 10:07 UTC, Ren Chenglei
Kernel Batch buffer log (11.44 MB, text/plain)
2019-08-26 10:13 UTC, Ren Chenglei
One simple change to help print batch from kernel (2.55 KB, patch)
2019-08-26 10:14 UTC, Ren Chenglei

Description Ren Chenglei 2019-08-26 10:05:30 UTC
When we run the single dEQP test case dEQP-EGL.functional.sharing.gles2.multithread.random.textures.gen_delete.15, we encounter a GPU hang on Android with APL/KBL.
Note: The hang reproduces 100% of the time when we run just this one test at a time.
Comment 1 Ren Chenglei 2019-08-26 10:06:43 UTC
Created attachment 145164 [details]
GPU error state log
Comment 2 Ren Chenglei 2019-08-26 10:07:58 UTC
Created attachment 145165 [details]
GPU error decode with aubinator_error_decode
Comment 3 Ren Chenglei 2019-08-26 10:13:20 UTC
Created attachment 145166 [details]
Kernel Batch buffer log
Comment 4 Ren Chenglei 2019-08-26 10:14:15 UTC
Created attachment 145167 [details] [review]
One simple change to help print batch from kernel
Comment 5 Ren Chenglei 2019-08-28 03:24:16 UTC
From the kernel log, we can see that a NULL (all-zero) batch was submitted:
deqp:testercore-25325 [003] ....  3541.743874: i915_gem_do_execbuffer: Android - kernel - bb obj 00000000b23761e1 start addr fffee2014000:
deqp:testercore-25325 [003] ....  3541.743876: i915_gem_do_execbuffer: Android - kernel -    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
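For a batch buffer log as large as the attached one, a small script can help find every all-zero batch. The following is a minimal sketch in Python, assuming the dump format is exactly what the two trace lines above show (a `bb obj ... start addr ...:` header followed by lines of 8-digit hex dwords). The function name `find_zero_batches` and both regular expressions are illustrative and are not part of the attached kernel patch.

```python
import re

# Assumed formats, inferred only from the two trace lines quoted above.
HEADER_RE = re.compile(
    r"bb obj (?P<obj>[0-9a-f]+) start addr (?P<addr>[0-9a-f]+):\s*$"
)
DWORDS_RE = re.compile(
    r"Android - kernel -\s+(?P<dwords>(?:[0-9a-f]{8}\s*)+)$"
)


def find_zero_batches(lines):
    """Return (obj, start_addr) for each dumped batch whose dwords are all zero."""
    batches = []  # one (obj, addr, dwords) tuple per header line seen
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            batches.append((header.group("obj"), header.group("addr"), []))
            continue
        data = DWORDS_RE.search(line)
        if data and batches:
            # Attach the dwords to the most recent batch header.
            batches[-1][2].extend(data.group("dwords").split())
    return [
        (obj, addr)
        for obj, addr, dwords in batches
        if dwords and all(d == "00000000" for d in dwords)
    ]


# The two lines from the kernel log above, used as sample input.
SAMPLE = [
    "deqp:testercore-25325 [003] ....  3541.743874: i915_gem_do_execbuffer: "
    "Android - kernel - bb obj 00000000b23761e1 start addr fffee2014000:",
    "deqp:testercore-25325 [003] ....  3541.743876: i915_gem_do_execbuffer: "
    "Android - kernel -    00000000 00000000 00000000 00000000 "
    "00000000 00000000 00000000 00000000",
]

if __name__ == "__main__":
    for obj, addr in find_zero_batches(SAMPLE):
        print(f"all-zero batch: obj {obj} at {addr}")
```

To scan a real log, pass an open file object instead of `SAMPLE`; the function only iterates over lines.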
Comment 6 Tapani Pälli 2019-08-29 05:36:37 UTC
One thing that bothers me is that this bug cannot be reproduced on desktop Linux. However, I've investigated these multithread tests on desktop with the helgrind tool and found a couple of possible data races, one in Mesa and one in the dEQP framework. I've sent patches for both; unfortunately, they don't seem to fix this bug.
Comment 7 Mark Janes 2019-08-29 06:18:42 UTC
Since you can reproduce it, you should be able to bisect it.

If you can't bisect it, then perhaps there is a bug elsewhere in your stack?

The test passes reliably on Debian for all platforms:

https://mesa-ci.01.org/mesa_master_daily/builds/5203/group/b22ced51e2e7542022a306128586f887
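
If the bisect is automated with `git bisect run`, each step has to report its result through the exit code: 0 marks the commit good, 125 asks git to skip it, and any other code from 1 to 127 marks it bad. The following is a minimal sketch in Python of that mapping, assuming the caller has already built Mesa, run the test on the device, and captured the kernel log as text (none of which is shown). The helper names are hypothetical, and matching on the `GPU HANG` string that i915 prints to the kernel log is an assumption about how the hang would be detected.

```python
# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def hang_in_log(kernel_log_text):
    """Assumption: a hang shows up as an i915 'GPU HANG' line in the kernel log."""
    return "GPU HANG" in kernel_log_text


def bisect_exit_code(build_ok, kernel_log_text):
    """Map one bisect step's outcome to a `git bisect run` exit code."""
    if not build_ok:
        # A commit that does not build cannot be judged either way.
        return SKIP
    return BAD if hang_in_log(kernel_log_text) else GOOD
```

A wrapper script would end with `sys.exit(bisect_exit_code(...))`, so that git can drive the whole bisect unattended.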
Comment 8 Ren Chenglei 2019-08-29 06:57:05 UTC
Thanks Tapani & Mark. I just pulled the latest Mesa and built it on Android, but the issue is still there, so there must be something Android-specific. :(
Comment 9 Tapani Pälli 2019-08-29 07:05:05 UTC
(In reply to Ren Chenglei from comment #8)
> Thanks Tapani & Mark. I just pull latest mesa and build on Android. But
> issue is still here, there should be some specific issues on Android. :(

Did we always have this issue with these tests, also with Mesa 18.x?
Comment 10 Ren Chenglei 2019-08-29 07:11:05 UTC
(In reply to Tapani Pälli from comment #9)
> (In reply to Ren Chenglei from comment #8)
> > Thanks Tapani & Mark. I just pull latest mesa and build on Android. But
> > issue is still here, there should be some specific issues on Android. :(
> 
> Did we always have this issue with these tests, also with Mesa 18.x?

Yes, this issue can also be reproduced on 18.2.
Comment 11 GitLab Migration User 2019-09-25 20:35:06 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1830.

