Created attachment 139411 [details] logcat of few bootanimation cycles

Hi,

I'm testing Android 8.1 (oreo-x86 branch of android-x86) with the following gfx stack:
- drm_hwcomposer from freedesktop.org (hwctwo enabled, also tested with the robherring handle-rework branch)
- latest gbm_gralloc (robherring handle-rework branch, but the problem also occurs with branches prior to handle-rework)
- latest libdrm, but it happens with all releases from 2.4.89 (and earlier) to 2.4.91
- latest kernel 4.17rc4, but it happens with all kernels tested

Hardware: Lenovo T460 laptop with Skylake GT2

Symptoms: the bootanimation completes, but then the Android GUI hangs, the surfaceflinger service is killed, and the bootanimation restarts (GUI bootloop).

It was very difficult to get debug logs showing what causes the signal 1 (SIGHUP) that kills surfaceflinger, but I could finally trace it:

05-07 21:36:58.689 0 0 I : [drm] GPU HANG: ecode 9:0:0x00280000, in surfaceflinger [2497], reason: No progress on rcs0, action: reset
05-07 21:36:58.689 0 0 I : [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
05-07 21:36:58.689 0 0 I : [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
05-07 21:36:58.689 0 0 I : [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
05-07 21:36:58.689 0 0 I : [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
05-07 21:36:58.689 0 0 I : [drm] GPU crash dump saved to /sys/class/drm/card0/error
05-07 21:36:58.689 0 0 I i915 0000: 00:02.0: Resetting rcs0 after gpu hang

logcat, dmesg and the GPU crash dump from /sys/class/drm/card0/error are provided.

Please help to identify the root cause, as this is probably the only show stopper preventing drm_hwcomposer (hwctwo) + gbm_gralloc + libdrm from booting Android oreo-x86 with a full freedesktop stack.

I am available to support further investigation and to test patches.

Mauro Rossi
android-x86 team
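When several boot cycles are captured in one logcat, it can help to pull the hang messages out mechanically and compare them. The following is a minimal sketch, assuming the "GPU HANG" line always has the shape quoted in the description above. The function name `parse_gpu_hang` and the field names are hypothetical, chosen only for illustration, and the ecode is kept as an opaque string rather than interpreted.

```python
import re

# Regex modelled on the single "GPU HANG" line quoted in this report.
# Assumption: other kernels may word this message differently.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG log line as a dict, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


sample = (
    "05-07 21:36:58.689 0 0 I : [drm] GPU HANG: ecode 9:0:0x00280000, "
    "in surfaceflinger [2497], reason: No progress on rcs0, action: reset"
)
hang = parse_gpu_hang(sample)
print(hang)
```

Running this over every line of a logcat and counting distinct `ecode` and `reason` values would show whether each restart is the same hang or a different one.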
Created attachment 139412 [details] dmesg of few bootanimation cycles
Created attachment 139413 [details] Dump of /sys/class/drm/card0/error
The batch has been overwritten by pixel data. Instinct would be that the (foreign?) buffer allocation didn't match expectations.
Have you tried bisecting Mesa to see whether a particular commit causes this? It could also be some difference between cros_gralloc and gbm_gralloc. Unfortunately I can't test anything on Android at the moment, but I would advise trying a bisect first.
Created attachment 140072 [details] dmesg of several Android GUI restarts

Hi,

as an update, I have run further tests on Mesa branches 18.0 and 18.1 (also including the Android-IA patches) and on mesa-dev before starting a blind bisect, which would take ages with my setup.

The result is that with all versions from 18.0 to mesa-dev I get the same problem: Android boot does not complete. However, the GUI restart does not always happen at the same time or in the same way. In most cases the bootanimation is interrupted; in some others the GUI freezes while the Status Bar is being drawn.

After letting the GUI restart repeatedly for a while, I collected dmesg and logcat with drm.debug=30 in order to trace the problems occurring across several attempts.

Could you please have a look? Because the GPU hang is not systematic, I'd like to understand what is causing the SurfaceFlinger process to hang.

Mauro
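For anyone reading the attached logs, it may be useful to know which DRM debug categories drm.debug=30 turns on. The sketch below decodes the bitmask, assuming the value is read as decimal 30 and using the category bits as I understand them from the kernel's drm.debug parameter description (only the low six bits are listed). The helper name `decode_drm_debug` is hypothetical.

```python
# Low bits of the drm.debug mask, per the kernel's parameter description.
# Assumption: higher bits exist in some kernels and are omitted here.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(value):
    """List the debug categories whose bit is set in value, lowest bit first."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if value & bit]


enabled = decode_drm_debug(30)
print(enabled)  # ['DRIVER', 'KMS', 'PRIME', 'ATOMIC']
```

So the logs contain driver, modesetting, PRIME and atomic messages, but not the CORE or VBL categories.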
Created attachment 140073 [details] logcat with drm.debug=30 zipped
Since Android-IA works OK, one option would be to simply go through the differences in kernel, Mesa, and minigbm vs gbm_gralloc. Have you tried using the Android-IA kernel tree?
Hi,

this one can be closed: with latest mesa-dev, which supports dma-bufs (prime fd) by default, oreo-x86 boots.

I think the problem came from combining gbm_gralloc, which supports prime fd, with Mesa at a commit that supported flink.

Mauro
resolving, see comment #8