Description
Sitsofe Wheeler
2010-08-28 10:18:17 UTC
Created attachment 38242 [details]
modetest output when no battery is in the laptop
Created attachment 38243 [details]
2.6.36rc2 modetest output when a battery is in the laptop
Created attachment 38244 [details]
2.6.35 modetest output (with or without a battery in the laptop)
Created attachment 38245 [details]
2.6.34 (and earlier kernels) modetest output (with or without a battery in the laptop)
Created attachment 38246 [details] [review]
debug patch
Can you try this debug patch? I've converted the BUG_ON to a WARN, so the system should continue to function (I also increased the size of pin_count to prevent problems). As soon as the first few WARN_ONs have hit, please attach the complete dmesg.
Created attachment 38247 [details]
dmesg produced with debugging patch
Things got a bit weird after enough suspend/resume cycles. The screen eventually flickered white and stayed black while the system continued to respond. I've chopped off the head of the dmesg and started from the drm messages.
Created attachment 38248 [details] [review]
new debug patch
Use this one instead of the old one. Hopefully this spits out something interesting.
Created attachment 38249 [details]
dmesg from tester with the new debug patch
Two things to note:
13 times "no obj to unpin": once on boot-up is all right, the other 12 are surplus.
pincount at the end = 14. Subtracting 12 yields 2, which looks like the correct value (one pin for the kernel fb console, one pin because the fb console is the current scanout buffer).
The other hilarity is how often set_base gets called with the dev->mode_config.mutex ...
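The pincount arithmetic in the two notes above can be written out as a small sketch. This is only a toy tally, not i915 code: the struct and function names are hypothetical, and it assumes, as the comment does, that each surplus "no obj to unpin" message corresponds to one leaked pin.

```c
/* Toy tally only: these names are hypothetical and do not exist in i915. */
struct pin_tally {
    int final_pin_count;    /* pincount reported at the end of the dmesg */
    int unpin_without_obj;  /* number of "no obj to unpin" messages seen */
    int expected_at_boot;   /* "no obj to unpin" hits that are legitimate */
};

/* "no obj to unpin" messages beyond the ones expected at boot-up. */
int surplus_unpins(const struct pin_tally *t)
{
    return t->unpin_without_obj - t->expected_at_boot;
}

/* Pin count left over once every surplus message is counted as one leaked pin. */
int corrected_pin_count(const struct pin_tally *t)
{
    return t->final_pin_count - surplus_unpins(t);
}
```

With the figures from the log (final pincount 14, 13 messages, 1 expected at boot-up) this gives 12 surplus unpins and a corrected pin count of 2, matching the fb console pin plus the scanout pin.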
Created attachment 38250 [details]
dmesg of initial boot with drm.debug=0xe
Created attachment 38256 [details]
dmesg of initial boot with drm.debug=0xe where the pincount rises to 3
This boot was done with init=/bin/bash to double check that there wasn't any chance of anything else being run.
Created attachment 38257 [details]
dmesg of boot followed by suspend/resume with drm.debug=0xe (pincount rises quickly)
This log actually stops at pincount 6 and is gzip'd due to its size.
Created attachment 38557 [details] [review]
Only decouple fb when calling mode_set*()
Created attachment 38559 [details] [review]
Drop fb pin on DPMS_ON
Just testing the theory. We need to rewrite our prepare/commit to not use DPMS (or enable/disable) so that we can move the pin/unpin into enable/disable.
I've pushed a revised pair of patches in -staging.
Created attachment 38567 [details]
Oops in intel_crtc_disable
After reverting 300387c0b57d75e5218e2881d6ad2720657a8bcf to make the issue easier to reproduce, things blew up with drm-intel-staging after a number of suspend/resume cycles, producing the attached oops.
As the fb pin leak appears to have been exacerbated by the intel_wait_for_vblank() regression, I'm not planning to push for a fix in 2.6.36. Obviously, I will re-assess its priority if we hit the same fb-pincount-leak BUG in -fixes.
Created attachment 38589 [details]
dmesg produced of -staging with debugging patch
I had to manually fix up the debugging patch to apply to -staging but I think the change was trivial so I hopefully did it correctly.
OK, I have just been retesting -staging as I am not sure I was using it with my previous comment. With commits b7ffdc988523fb57ac1ef454b77d6ecc01dda4d3 (drm: Use a nondestructive mode for output detect when polling) and 300387c0b57d75e5218e2881d6ad2720657a8bcf (drm/i915: Clear the vblank status bit before polling for the next vblank) in place I can't reproduce the issue because modetest no longer returns quickly. Without these, I get the oops mentioned in comment #15.
Sitsofe, thank you for clarifying that, I was going to ask you later. :) Concerning the priority, have you seen the fb-pin OOPS just with -fixes?
(In reply to comment #19)
> Concerning the priority, have you seen the fb-pin OOPS just with -fixes?
Sorry, only just realised you asked me a question! Unless I revert 300387c0b57d75e5218e2881d6ad2720657a8bcf (drm/i915: Clear the vblank status bit before polling for the next vblank) I can't reproduce this issue in -fixes (I did 30 suspend/resume cycles after echoing devices in /sys/power/pm_test while running modetest in a loop). When modetest doesn't return quickly I think this issue is going to be incredibly hard to trigger.
Hmm, something more subtle is happening that I can't quite get a handle on: do only disconnects cause the unpin leak, and if so, why?
Ok, this is definitely -next material.
See also #29325.
That should have been bug 29230
*** Bug 32776 has been marked as a duplicate of this bug. ***
Created attachment 42634 [details] [review]
Don't switch fb after a no-op
commit 9334ef755f060e251f3f395caeda1a58b6834ea3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jan 28 11:53:03 2011 +0000
drm: Don't switch fb when disabling an output
In drm_crtc_helper_set_config, we call drm_crtc_helper_set_mode which may return early and do no operation if the crtc is to be disabled. In this case we merrily swap to the new fb, discarding the old_fb believing that it has been cleaned up.
However, due to the early return, the old_fb was not presented to the backend for correct reaping, and nor was the new one - which is about to be reaped via the drm_helper_disable_unused_functions(), leading to incorrect refcounting of the pinned objects.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=27722
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29857
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29230
Tested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
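The refcounting imbalance described in the commit message can be illustrated with a toy model. This is a sketch under loud assumptions, not the DRM helper code: every type and function below (toy_fb, toy_crtc, toy_set_mode, toy_disable_unused, toy_set_config) is hypothetical, the pin count is a plain signed int so that underflow is visible, and the swap_unconditionally flag exists only to switch between the behaviour described as buggy and the behaviour the patch title asks for.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all types and functions here are hypothetical. */
struct toy_fb {
    int pin_count;          /* signed so that an underflow is visible */
};

struct toy_crtc {
    struct toy_fb *fb;      /* framebuffer currently attached */
    bool enabled;
};

/*
 * Stand-in for a mode set: on a real mode set, pin the new fb, unpin the
 * old one and attach the new one. When the crtc is being disabled it
 * returns early and neither fb is handed to the backend.
 */
bool toy_set_mode(struct toy_crtc *crtc, struct toy_fb *new_fb, bool enable)
{
    if (!enable || new_fb == NULL)
        return false;       /* early return: nothing pinned or unpinned */
    new_fb->pin_count++;
    if (crtc->fb != NULL)
        crtc->fb->pin_count--;
    crtc->fb = new_fb;
    crtc->enabled = true;
    return true;
}

/* Stand-in for reaping a disabled crtc: drop the pin on the attached fb. */
void toy_disable_unused(struct toy_crtc *crtc)
{
    if (crtc->fb != NULL)
        crtc->fb->pin_count--;
    crtc->fb = NULL;
    crtc->enabled = false;
}

/*
 * Stand-in for set_config. With swap_unconditionally set, the fb pointer
 * is switched even though the mode set was a no-op, so the reaper later
 * unpins the wrong (never pinned) fb and the old fb keeps its pin.
 */
void toy_set_config(struct toy_crtc *crtc, struct toy_fb *new_fb,
                    bool enable, bool swap_unconditionally)
{
    bool did_mode_set = toy_set_mode(crtc, new_fb, enable);

    if (!did_mode_set && swap_unconditionally)
        crtc->fb = new_fb;  /* the swap the commit message calls out */
    if (!enable)
        toy_disable_unused(crtc);
}
```

In this model, enabling a crtc with one fb and then disabling it with a second fb leaves the first fb pinned once (the leak) and drives the second fb's count below zero when the swap is unconditional. When the swap is skipped after a no-op mode set, both counts end at zero.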