Bug 61011 - [ILK] system stalls with vaapi rendering
Status: CLOSED WONTFIX
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2013-02-17 17:33 UTC by Tobias Jakobi
Modified: 2017-07-24 22:58 UTC (History)

Attachments
Xorg log (28.03 KB, text/plain)
2013-02-22 22:08 UTC, Tobias Jakobi
dmesg after boot (35.55 KB, text/plain)
2013-02-22 22:08 UTC, Tobias Jakobi

Description Tobias Jakobi 2013-02-17 17:33:50 UTC
Hello,

First of all, an overview of the system components in use:
xf86-video-intel git (43ba22ef4a4142f334e9ae2d926250988ecbe8bc)
libdrm git (36d18211b196cad4761ac70c4fd08aba323f5b0d)
libva-1.1.0
intel-driver git (7803fcae62c868c26f24425df04aee0405342563)
vanilla kernel 3.7.8

The DDX is using the SNA backend, however switching to UXA doesn't help.

The issue appears for me in combination with h264 playback through vaapi (mplayer-vaapi -vo vaapi -va vaapi). I can't exclude that it also happens in other situations.

The issue: playback stalls for several hundred milliseconds. The stalls are very noticeable and always accompanied by sound dropouts (the audio backend is PulseAudio on a remote machine).

mplayer itself doesn't notice these stalls. If they were caused by CPU usage spikes, it would display the usual "system too slow" message. It doesn't, and monitoring CPU usage with top shows that nothing is really happening during the stalls. The XFCE CPU monitor graph even shows the CPU load dropping to zero while they last.

I fiddled around with different versions of the DDX, libva's intel-driver and SNA/UXA -- to no avail.
I could restore the original (stall-free) behaviour by going back to vanilla 3.4.31, so I presume that the issue was introduced by changes to the DRM code.

Any ideas on how to triage this (apart from bisecting of course)? E.g. I have no idea how to find out what exactly is stalling playback -- which would be a good starting point.

Greets,
Tobias

PS: Oh yes, nothing interesting in dmesg, Xorg.log or syslog in general.
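
One possible way to narrow down what is stalling would be a small wake-up latency probe running next to the player. This is only a sketch and was not used in this report; the function names, the 10 ms sleep interval and the 100 ms threshold are assumptions chosen for illustration. It sleeps in short steps, records when each wake-up actually happened, and reports the wake-ups that arrived late.

```python
import time


def find_stalls(timestamps, interval, threshold):
    """Return (index, overshoot) for each wake-up that arrived late.

    timestamps: monotonic wake-up times in seconds
    interval:   intended sleep between wake-ups in seconds
    threshold:  lateness in seconds that counts as a stall
    """
    stalls = []
    for i in range(1, len(timestamps)):
        overshoot = (timestamps[i] - timestamps[i - 1]) - interval
        if overshoot > threshold:
            stalls.append((i, overshoot))
    return stalls


def sample(duration=60.0, interval=0.01):
    """Sleep in short steps and record when each wake-up really happened."""
    stamps = [time.monotonic()]
    while stamps[-1] - stamps[0] < duration:
        time.sleep(interval)
        stamps.append(time.monotonic())
    return stamps
```

Running `find_stalls(sample(), 0.01, 0.1)` during playback would list late wake-ups. If they line up with the visible stalls, ordinary processes are being held up as well; if the probe stays quiet, the stall is more likely confined to the graphics path.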
Comment 1 Daniel Vetter 2013-02-17 20:33:38 UTC
Please attach dmesg and Xorg.log, thanks.
Comment 2 Tobias Jakobi 2013-02-22 22:08:26 UTC
Created attachment 75378
Xorg log

I had to downgrade xf86-video-intel to 2.20.19, since git is unstable for me and X randomly crashing is not nice at all.
Comment 3 Tobias Jakobi 2013-02-22 22:08:50 UTC
Created attachment 75379
dmesg after boot
Comment 4 Tobias Jakobi 2013-02-22 22:09:30 UTC
@Daniel: Logs added!
Comment 5 Chris Wilson 2013-02-22 22:17:30 UTC
(In reply to comment #2) 
> I had to downgrade xf86-video-intel to 2.20.19, since git is unstable for me
> and X randomly crashing is not nice at all.

WHAT? Please do report critical bugs...
Comment 6 Tobias Jakobi 2013-02-22 22:26:21 UTC
I'm not even sure when the crash was introduced. Currently it looks like 2.21.0 is already not working for me. However, I don't have time to triage this (and the 3.7 kernel also introduces ACPI bugs for me).
Comment 7 Tobias Jakobi 2013-02-22 22:28:28 UTC
Forgot to mention: I had the IOMMU disabled because another report mentioned stalls on 3.2 with ILK (though accompanied by output from the kernel). That doesn't seem to be the issue here: with the IOMMU enabled or disabled, the stalls don't change.
Comment 8 Chris Wilson 2013-02-22 22:32:16 UTC
(In reply to comment #6)
> I'm not even sure when the crash was introduced. Currently it looks like
> 2.21.0 is already not working for me. However, I don't have time to
> triage this (and the 3.7 kernel also introduces ACPI bugs for me).

Just the Xorg.0.log with the crash would be sufficient (ok, I may prefer a symbolic stacktrace) but any information about a crash is better than none. Even just a heads up is useful.
Comment 9 Chris Wilson 2013-03-08 11:17:57 UTC
Please retest without 'i915.i915_enable_rc6=1 i915.lvds_downclock=1'. If you do have a downclocking mode available, then you need a bug fix to prevent one frequent stall.
Comment 10 Tobias Jakobi 2013-03-08 23:43:54 UTC
liquid@leena ~ $ cat /proc/cmdline 
BOOT_IMAGE=/kernel-3.7.10-vanilla root=/dev/sda4 ro rootfstype=ext4 usbhid.mousepoll=4 i915.i915_enable_rc6=0 i915.lvds_downclock=0 video.brightness_switch_enabled=0

No change at all.
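
To double-check which i915 options a boot line actually carries, the command line can be parsed. A minimal sketch follows; the helper name `i915_params` is made up for this example, and the string is the `/proc/cmdline` content quoted above.

```python
def i915_params(cmdline):
    """Collect i915.* module parameters from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            key, value = token[len("i915."):].split("=", 1)
            params[key] = value
    return params


# Command line as reported in this comment.
cmdline = (
    "BOOT_IMAGE=/kernel-3.7.10-vanilla root=/dev/sda4 ro rootfstype=ext4 "
    "usbhid.mousepoll=4 i915.i915_enable_rc6=0 i915.lvds_downclock=0 "
    "video.brightness_switch_enabled=0"
)
print(i915_params(cmdline))
# prints {'i915_enable_rc6': '0', 'lvds_downclock': '0'}
```

Both options come back as '0', so RC6 and LVDS downclocking were requested off for this test.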
Comment 11 Tobias Jakobi 2013-03-09 16:31:32 UTC
I checked for the issue with the kernel versions in between:

3.5.7: GOOD
3.6.11: BAD

So it looks like it was introduced between 3.5 and 3.6.
Comment 12 Daniel Vetter 2013-03-11 18:26:55 UTC
Can you please attempt to bisect down to a commit with git? My favourite howto is

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
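
For orientation, `git bisect` automates a binary search over the commits between a known-good and a known-bad revision (for example `git bisect start`, then `git bisect good v3.5` and `git bisect bad v3.6`, assuming the results from the stable releases carry over to those tags). The search itself can be sketched as below; the function name and the integer "commits" are purely illustrative and not part of git.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    commits: list ordered oldest to newest; commits[0] is known good,
             commits[-1] is known bad.
    is_bad:  callable returning True if the stall reproduces on a commit.
    Returns the first bad commit and the number of builds that were tested.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # stall reproduces: culprit is at mid or earlier
        else:
            good = mid  # no stall: culprit is later than mid
        steps += 1
    return commits[bad], steps


# Hypothetical history of 100 commits where commit 37 introduced the stall.
culprit, steps = first_bad(list(range(100)), lambda c: c >= 37)
print(culprit, steps)
```

The number of kernels to build and test grows only with the logarithm of the range, but every step relies on a trustworthy good/bad verdict, so an intermittent stall has to be observed for long enough at each step.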
Comment 13 Chris Wilson 2013-06-07 18:28:33 UTC
Pretty please?
Comment 14 Tobias Jakobi 2013-06-07 19:22:13 UTC
Sorry for the late reply. I tried to bisect this twice in the last few weeks, but due to the nature of the bug without any success. I didn't try again because it was already very time-consuming (not the compiling, but verifying whether the stalling is actually there).

With the numerous bugs (ACPI, etc.) that seem to creep into recent kernel versions, and the fact that Arrandale/Ironlake doesn't really benefit from them anyway, I decided to stay on the 3.4.x branch until this system goes out of service. Since 3.4 will get fixes until mid-2014 (?), this seems like a good solution for me.

Feel free to set this bug to abandoned.
Comment 15 Chris Wilson 2013-06-25 14:37:21 UTC
Please do reopen if you are able to characterise the lag or find the specific commit introducing the stall.