On CI_DRM_3063, the machine shard-hsw hits the following assert when running igt@gem_flink_race@flink_close:

(gem_flink_race:1665) CRITICAL: Test assertion failure function test_flink_close, file gem_flink_race.c:180:
(gem_flink_race:1665) CRITICAL: Failed assertion: obj_count == 0
(gem_flink_race:1665) CRITICAL: Last errno: 9, Bad file descriptor
(gem_flink_race:1665) CRITICAL: error: -1 != 0

Full logs: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3063/shard-hsw3/igt@gem_flink_race@flink_close.html
(gem_flink_race:1665) INFO: leaked -1 objects

If only all of our tests were that efficient! No more OOM!
Given all the floating references held on objects, I'm not sure we can do a "stable_obj_count" without the loop and sleeping. I don't know all the places where such objects are held by a timer, so getting the interval right or adding a flush control is hard. E.g. one thing we are not flushing before counting is the kms workqueues, which hold a reference on the old object until they finish.
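For reference, the "loop and sleeping" approach can be sketched roughly as below. This is only an illustration, not the IGT library code: stable_count(), count_fn, struct fake_driver and fake_read() are hypothetical names, and the fake driver merely stands in for a kernel that releases deferred references over time. The idea is to re-read the count after a short sleep until two consecutive reads agree.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Hypothetical callback: returns the current object count. */
typedef int (*count_fn)(void *ctx);

/*
 * Poll until two consecutive reads agree, sleeping between reads so that
 * deferred frees (timers, workqueues) get a chance to run.
 * Returns the settled count, or the last value read if max_tries runs out.
 */
static int stable_count(count_fn read_count, void *ctx,
			int max_tries, long sleep_ms)
{
	struct timespec delay = {
		.tv_sec = sleep_ms / 1000,
		.tv_nsec = (sleep_ms % 1000) * 1000000L,
	};
	int prev, cur;

	assert(read_count != NULL);
	assert(max_tries > 0);

	prev = read_count(ctx);
	while (max_tries--) {
		nanosleep(&delay, NULL);
		cur = read_count(ctx);
		if (cur == prev)
			return cur;
		prev = cur;
	}
	return prev;
}

/* Toy stand-in for the driver: drops one object per read down to a floor. */
struct fake_driver {
	int objects;
	int floor;
};

static int fake_read(void *ctx)
{
	struct fake_driver *d = ctx;
	int seen = d->objects;

	if (d->objects > d->floor)
		d->objects--;
	return seen;
}
```

The weakness is the one described above: two equal reads only show that nothing changed during one sleep interval, so a reference released by a slower timer or worker can still land between the "before" and "after" counts and produce a negative leak.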
*** Bug 102696 has been marked as a duplicate of this bug. ***
Also, CI_DRM_3223 HSW-shards

(prime_self_import:1525) CRITICAL: Test assertion failure function test_export_close_race, file prime_self_import.c:363:
(prime_self_import:1525) CRITICAL: Failed assertion: obj_count == 0
(prime_self_import:1525) CRITICAL: Last errno: 9, Bad file descriptor
(prime_self_import:1525) CRITICAL: error: -32 != 0
Subtest export-vs-gem_close-race failed.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3223/shard-hsw4/igt@prime_self_import@export-vs-gem_close-race.html

and:

https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_339/shard-hsw3/igt@prime_self_import@export-vs-gem_close-race.html
Had a thought:

drm/i915: Flush the idle-worker for debugfs/i915_drop_caches
https://patchwork.freedesktop.org/patch/183116/
This should help:

commit 8d03573de74ebd38d1047131a698a2068605efed
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Oct 18 13:28:14 2017 +0100

    lib: Flush the driver's internal cache of objects before counting

    As the driver itself keeps a cache of objects, these too need to be
    flushed prior to producing a stable count of objects.

There is still an open issue for dropping framebuffers and their ilk, but it is unlikely to be affecting this test. Treating as fixed; we will know if it recurs.