76564 – [AMD Fusion E-350] HDMI refresh rates doesn't match expectations

Status:	RESOLVED FIXED

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Radeon
Version:	unspecified
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	medium normal
Assignee:	Default DRI bug account

Reported:	2014-03-24 17:32 UTC by jeroen
Modified:	2016-06-15 12:03 UTC
CC List:	4 users

Attachments
dmesg (59.56 KB, text/plain) 2014-03-24 17:32 UTC, jeroen
lspci output (60.55 KB, text/plain) 2014-03-24 17:34 UTC, jeroen
drm_debug log (917.59 KB, application/octet-stream) 2014-03-25 19:16 UTC, jeroen
drm debug log with kms and driver level (775.23 KB, application/octet-stream) 2014-03-25 20:14 UTC, jeroen
fgrlx xrandr (15.19 KB, text/plain) 2014-03-27 18:22 UTC, jeroen
oss xrandr (5.90 KB, text/plain) 2014-03-27 18:23 UTC, jeroen
Possible fix (2.67 KB, patch) 2014-03-28 17:56 UTC, Christian König
Possible fix v2 (2.70 KB, patch) 2014-03-28 18:06 UTC, Christian König
Possible fix v3 (7.66 KB, patch) 2014-03-29 21:10 UTC, Christian König
logs working and not (16.83 KB, application/octet-stream) 2014-04-02 16:20 UTC, Garrett
xrandr 1920x1080x24p and OE crashed lcd (6.78 KB, application/octet-stream) 2014-04-03 03:24 UTC, Garrett
xrandr 1920x1080x23.98 crashed (4.24 KB, application/octet-stream) 2014-04-03 03:34 UTC, Garrett
dmesg xrandr old code working 23.98 (4.24 KB, application/octet-stream) 2014-04-03 04:16 UTC, Garrett
dmesg after switching to 25hz and back to 24hz (61.63 KB, text/plain) 2014-04-03 19:26 UTC, Rackow, Detlev
Xorg.0.log after switching to 25hz and back to 24hz (89.77 KB, text/plain) 2014-04-03 19:29 UTC, Rackow, Detlev
dmesg xrandr new patch slow 18fps on 24p (139.00 KB, text/plain) 2014-04-24 13:49 UTC, Garrett
adb76 dmesg output (53.04 KB, text/plain) 2014-05-04 08:40 UTC, adb76
adb76 lspci output (2.79 KB, text/plain) 2014-05-04 08:41 UTC, adb76
adb76 xorg.log (50.01 KB, text/plain) 2014-05-04 08:41 UTC, adb76
adb76 xbmc-xrandr output (3.04 KB, text/plain) 2014-05-04 08:42 UTC, adb76

Description jeroen 2014-03-24 17:32:02 UTC

Created attachment 96302 [details]
dmesg

I'm currently experiencing problems when playing videos on my AMD Fusion E-350 with HD 6310 graphics (r600) under OpenELEC 4 beta2, which uses Mesa 10.1 and the latest Linux 3.13 kernel. The system is connected to my television through HDMI.
Either the video is being decoded too slowly, which causes skipped frames, or it is being decoded too fast, which causes missed frames.

23.976fps becomes 23.92/23.95 and sometimes goes to 22.93fps
25fps becomes 25.02 or 25.05fps
29.97 interlaced decodes with around 58fps instead of 59.94fps.

It is as if the clock that is used for decoding is all over the place (PLL issue?).

I checked with the old OpenELEC 3, which uses AMD's fglrx: everything plays perfectly and the fps is spot on, i.e. 23.98fps, 25fps and 59.94fps.

I also tried disabling the new VDPAU hardware acceleration and VDPAU mixer, but with no effect.
All tests were done with the television frame rate being matched to the content, so for example in the case of 23.976fps content the television is at the same frequency.

Comment 1 jeroen 2014-03-24 17:34:30 UTC

Created attachment 96303 [details]
lspci output

Comment 2 Christian König 2014-03-24 17:39:24 UTC

The HDMI refresh rate seems to be off; UVD decoding speed is completely unrelated to this.

Comment 3 Christian König 2014-03-24 17:42:46 UTC

(In reply to comment #0)
> 23.976fps becomes 23.92/23.95 and sometimes goes to 22.93fps
> 25fps becomes 25.02 or 25.05fps
> 29.97 interlaced decodes with around 58fps instead of 59.94fps.

The refresh frequency might be a bit off, but not by that much.

How did you measure those numbers?

Comment 4 jeroen 2014-03-24 18:49:25 UTC

Sorry for not providing that information.

OpenELEC uses XBMC to play the videos. XBMC has the option to display stats (http://wiki.xbmc.org/?title=Codecinfo). One of the stats is the actual video frame rate. Other useful stats are the number of frames dropped and skipped.

When I look at, for example, the numbers 23.95 and 23.92, they seem to be off by about 1/1000 and 2/1000 respectively.

Comment 5 Alex Deucher 2014-03-24 19:06:14 UTC

Probably a duplicate of bug 71753.

Comment 6 jeroen 2014-03-24 19:16:21 UTC

(In reply to comment #5)
> Probably a duplicate of bug 71753.

Yes, I read that report. What I don't understand is how the audio clock, UVD clock, HDMI clock, etc. relate to each other.

For audio I use the Realtek chip and its S/PDIF output. I guess the audio clock mentioned there is the HDMI audio clock, which is not used in my case?

Comment 7 Alex Deucher 2014-03-24 19:40:22 UTC

(In reply to comment #6)
> (In reply to comment #5)
> > Probably a duplicate of bug 71753.
> 
> Yes I read that report. What I don't understand is how the audio clock, uvd
> clock, hdmi clock, etc relate to each other.
> 
> For audio I use the realtek chip and it's SPDIF. I guess with the audio
> clock, the HDMI audio clock is used? Which in my case is not used I guess.

They are not really related on the hw side.  UVD decodes as fast as it can based on its own clocks.  When the decoded frame is displayed is up to the application.  The audio chip has its own clock and the display has its own clock.  The HDMI audio information is embedded in the display stream.  The monitor uses special packets that the GPU embeds in the display stream to reconstruct the audio stream on the monitor based on the display clock.  There seem to be cases where the HDMI stream is not set up properly, so the audio clock is not recovered properly on the monitor side.

Comment 8 jeroen 2014-03-24 20:39:31 UTC

(In reply to comment #7)
> They are not really related on the hw side.  UVD decodes as fast as it can
> based on it's own clocks.  When the decoded frame is displayed is up to the
> application.  The audio chip has it's own clock and the display has it's own
> clock.  The hdmi audio information is embedded in the display stream.  The
> monitor uses special packets that the GPU embeds in the display stream to
> reconstruct the audio stream on the monitor based on the display clock. 
> There seem to be cases where the hdmi stream is not set up properly so the
> audio clock is not recovered properly on the monitor side.

Okay, but doesn't that mean that in this case it is a problem with the display (HDMI?) clock, as I am not using HDMI audio?

Is there a way I could get more detailed logging of what is happening on my system?

Comment 9 jeroen 2014-03-25 19:16:13 UTC

Created attachment 96379
drm_debug log

This big drm log was created while starting XBMC with the display set to 50fps, then playing a 23.976fps test clip and afterwards switching back to 50fps.

In the log I notice:
[    1.982022] [drm:drm_calc_timestamping_constants], crtc 12: clock 148500 kHz framedur 20000000 linedur 17777, pixeldur 6
[   32.608813] [drm:drm_calc_timestamping_constants], crtc 12: clock 74176 kHz framedur 41708234 linedur 37073, pixeldur 13
[  124.623308] [drm:drm_calc_timestamping_constants], crtc 12: clock 148500 kHz framedur 20000000 linedur 17777, pixeldur 6
These correspond to the 50fps -> 23.976fps -> 50fps mode switches
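
For reference, the three durations in these log lines are consistent with plain integer division of the mode totals by the pixel clock. The following is a small Python sketch of that arithmetic, not the kernel code itself. The function name is made up for illustration, and the totals (2640x1125 for the 50Hz mode, 2750x1125 for the 24Hz mode) are taken from the xrandr output quoted in comment 21.

```python
def timestamping_constants(clock_khz, htotal, vtotal):
    """Return (framedur_ns, linedur_ns, pixeldur_ns) for a display mode.

    Illustrative only: uses truncating integer division, which happens to
    reproduce the numbers printed in the drm debug log above.
    """
    framedur_ns = htotal * vtotal * 1_000_000 // clock_khz
    linedur_ns = htotal * 1_000_000 // clock_khz
    pixeldur_ns = 1_000_000 // clock_khz
    return framedur_ns, linedur_ns, pixeldur_ns


# 50Hz mode: 148500 kHz, 2640 x 1125 total
print(timestamping_constants(148500, 2640, 1125))
# 23.976Hz mode: 74176 kHz, 2750 x 1125 total
print(timestamping_constants(74176, 2750, 1125))
```

Note that these values describe the requested mode clock, not necessarily the clock the PLL ends up generating.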

Does this log tell you guys what is going wrong?

Comment 10 Alex Deucher 2014-03-25 19:55:24 UTC

(In reply to comment #8)
> Okay, but doesnt that mean in this case it is a problem with the display
> (HDMI?) clock, as I am not using HDMI audio?
> 
> Is there a way I could get more detailed logging of what is happening on my
> system?

I don't know how XBMC calculates the frame rate off hand.  The PLL used to generate the display clock may not always match the exact pixel clock of the monitor.  The driver calculates pll dividers to get as close as possible to the pixel clock of the display mode.  See radeon_compute_pll_avivo() in radeon_display.c

Comment 11 jeroen 2014-03-25 20:14:56 UTC

Created attachment 96380 [details]
drm debug log with kms and driver level

I found out that KMS-level debugging needs to be enabled to see the PLL numbers. So I redid the test and attached a new log file that shows the PLL information.

Comment 12 Christian König 2014-03-25 20:21:56 UTC

(In reply to comment #10)
> I don't know how XBMC calculates the frame rate off hand.  The PLL used to
> generate the display clock may not always match the exact pixel clock of the
> monitor.  The driver calculates pll dividers to get as close as possible to
> the pixel clock of the display mode.  See radeon_compute_pll_avivo() in
> radeon_display.c

Yeah, but on multiple occasions I had the feeling that fglrx might do a better job on this than the radeon kernel module.

We might want to take a second look at this and try to compare the settings fglrx and radeon use for the same mode.

Comment 13 jeroen 2014-03-26 19:30:04 UTC

Do the PLL values in the log files I posted indicate a problem, or are they okay?

How can you see the PLL values fglrx is using?

Comment 14 Alex Deucher 2014-03-26 20:50:38 UTC

(In reply to comment #13)
> Do the PLL values in the log files I posted indicate a problem, or are they
> okay?

[drm:radeon_compute_pll_avivo], 14875, pll dividers - fb: 23.8 ref: 2, post 8
[drm:radeon_compute_pll_avivo], 7406, pll dividers - fb: 23.7 ref: 2, post 16

The display PLL looks fine to me. The clock formula is:

pixel_clock = (reference_frequency * feedback_divider) / (reference_divider * post_divider)

The reference frequency is 100 MHz, so:

(100MHz * 23.8) / (2 * 8) = 148.75MHz

(100MHz * 23.7) / (2 * 16) = 74.0625MHz
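
The formula above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name is hypothetical, and the feedback divider is passed as a string so that the decimal fraction is handled exactly.

```python
from fractions import Fraction


def pll_output_mhz(fb_div, ref_div, post_div, ref_mhz=100):
    """Pixel clock in MHz for the given dividers.

    fb_div is a decimal string such as "23.8" (whole part plus one
    fractional digit, as printed in the drm debug log).
    """
    fb = Fraction(fb_div)
    return float(ref_mhz * fb / (ref_div * post_div))


print(pll_output_mhz("23.8", 2, 8))   # dividers logged for the 50Hz mode
print(pll_output_mhz("23.7", 2, 16))  # dividers logged for the 24Hz mode
```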

> 
> How can you see the PLL values fglrx is using?

You'd need to dump the PLL registers using radeonreg (http://cgit.freedesktop.org/~airlied/radeontool/).
PPLL1
0x400 - ref div - bits 9:0
0x404 - fb div - whole part bits 26:16, fractional part bits 3:0
0x408 - post div - bits 6:0
PPLL2
0x440 - ref div - bits 9:0
0x444 - fb div - whole part bits 26:16, fractional part bits 3:0
0x448 - post div - bits 6:0

e.g., ./radeonreg regmatch 0x400
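
Once the three registers have been read, the divider fields can be extracted with the bit ranges listed above. The Python sketch below only illustrates the masking; the function name and the raw register values are made up for the example and are not real dumps.

```python
def decode_ppll(ref_reg, fb_reg, post_reg):
    """Extract PLL divider fields from raw 32-bit register values."""
    ref_div = ref_reg & 0x3FF          # ref div: bits 9:0
    fb_whole = (fb_reg >> 16) & 0x7FF  # fb div, whole part: bits 26:16
    fb_frac = fb_reg & 0xF             # fb div, fractional part: bits 3:0
    post_div = post_reg & 0x7F         # post div: bits 6:0
    return ref_div, fb_whole, fb_frac, post_div


# Hypothetical raw values encoding ref=2, fb=23.7, post=8
ref_div, fb_whole, fb_frac, post_div = decode_ppll(0x00000002, 0x00170007, 0x00000008)
print(f"fb={fb_whole}.{fb_frac} ref={ref_div} post={post_div}")
```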

Comment 15 jeroen 2014-03-27 17:14:31 UTC

(In reply to comment #14)
> > How can you see the PLL values fglrx is using?
> 
> You'd need to dump the PLL registers using radeonreg
> (http://cgit.freedesktop.org/~airlied/radeontool/).
> PPLL1
> 0x400 - ref div - bits 9:0
> 0x404 - fb div - whole part bits 26:16, fractional part bits 3:0
> 0x408 - post div - bits 6:0
> PPLL2
> 0x440 - ref div - bits 9:0
> 0x444 - fb div - whole part bits 26:16, fractional part bits 3:0
> 0x448 - post div - bits 6:0
> 
> e.g., ./radeonreg regmatch 0x400

I got the results with fglrx:
PPLL1@50Hz:     fb=23.7    ref=2    post=6
PPLL2@50Hz:     fb=296.16  ref=7    post=6
PPLL1@23.976Hz: fb=23.7    ref=2    post=12
PPLL2@23.976Hz: fb=296.16  ref=7    post=6

PPLL2 does not seem to be used, as it does not change. PPLL1 has different values than with the radeon OSS driver. What does this mean?

Comment 16 Alex Deucher 2014-03-27 17:53:04 UTC

(In reply to comment #15)
> I got the results with fglrx:
> PPLL1@50Hz:     fb=23.7    ref=2    post=6
> PPLL1@23.976Hz: fb=23.7    ref=2    post=12
> 
> PPLL2 does seem to be used as it does not change. PPLL1 has different values
> than with the radeon OSS driver. What does this mean?


(100MHz * 23.7) / (2 * 6) = 197.5MHz

(100MHz * 23.7) / (2 * 12) = 98.75MHz

Maybe fglrx is using different modelines with a different display clock?  Can you print what modes are being used with fglrx and radeon?  E.g., xrandr --verbose.

Comment 17 jeroen 2014-03-27 18:22:53 UTC

Created attachment 96471 [details]
fgrlx xrandr

Comment 18 jeroen 2014-03-27 18:23:18 UTC

Created attachment 96473 [details]
oss xrandr

Comment 19 jeroen 2014-03-27 18:24:24 UTC

(In reply to comment #16)
> Maybe fglrx is using different modelines with a different display clock? 
> Can you print what modes are being used with fglrx and radeon?  E.g., xrandr
> --verbose.

I attached both xrandr outputs.

I also noticed that for fglrx the PLL values are the same for 50Hz, 60Hz and 59.94Hz output.

Comment 20 jeroen 2014-03-27 18:53:17 UTC

(In reply to comment #19)
> I attached both xrandr outputs.
> 
> I also noticed that for fgrlx the PLL values are the same for 50Hz,60Hz and
> 59.94Hz output.
 
When I look at the xrandr output, I wonder whether the reference frequency is 75MHz for fglrx. Can the reference even change, or is it fixed by the hardware?

Comment 21 Alex Deucher 2014-03-28 14:08:17 UTC

(In reply to comment #20)
>  
> When I look at the xrandr output I wonder if the reference frequency is not
> 75MHz for fgrlx? Can the reference even change is this not fixed by the
> hardware?

As far as I know, it's fixed.  I'm not really sure what fglrx is doing.  Anyway, it's probably easier to just fix the open source driver.

The modes are:
  1920x1080 (0x55)  148.5MHz +HSync +VSync *current +preferred
        h: width  1920 start 2448 end 2492 total 2640 skew    0 clock   56.2KHz
        v: height 1080 start 1084 end 1089 total 1125           clock   50.0Hz

  1920x1080 (0x5a)   74.2MHz +HSync +VSync
        h: width  1920 start 2558 end 2602 total 2750 skew    0 clock   27.0KHz
        v: height 1080 start 1084 end 1089 total 1125           clock   24.0Hz

and the driver ends up calculating the dividers as such:

for 148.5MHz target clock:
(100MHz * 23.8) / (2 * 8) = 148.75MHz

for 74.2MHz target clock:
(100MHz * 23.7) / (2 * 16) = 74.0625MHz

One would need to tweak radeon_compute_pll_avivo() in radeon_display.c to try and get dividers that are closer to the target clock.
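
To see what this clock error means in terms of refresh rate, the generated clock can be divided by the total number of pixels per frame. The Python sketch below uses the totals from the two modes above; the function name is hypothetical.

```python
def refresh_hz(pixel_clock_mhz, htotal, vtotal):
    """Vertical refresh rate for a pixel clock and the mode's total size."""
    return pixel_clock_mhz * 1e6 / (htotal * vtotal)


# Clocks the driver actually generates, per the divider calculation above
print(round(refresh_hz(148.75, 2640, 1125), 3))   # 50Hz mode
print(round(refresh_hz(74.0625, 2750, 1125), 3))  # 24Hz mode
```

This gives roughly 50.08Hz and 23.94Hz. The latter falls between the 23.92 and 23.95fps figures from the description, which suggests (but does not prove) that the numbers XBMC reports follow the display clock.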

Comment 22 Christian König 2014-03-28 14:34:52 UTC

(In reply to comment #21)
> One would need to tweak radeon_compute_pll_avivo() in radeon_display.c to
> try and get dividers that are closer to the target clock.

That was some kind of calculus, or rather functional analysis, problem of finding the sweet spot for getting the frequency exact without toasting the hardware, wasn't it?

Comment 23 jeroen 2014-03-28 14:37:43 UTC

(In reply to comment #21)
> One would need to tweak radeon_compute_pll_avivo() in radeon_display.c to
> try and get dividers that are closer to the target clock.

Isn't that what the OSS driver is currently doing? If you look in the post history, those are the exact values that are currently being used.

Comment 24 Christian König 2014-03-28 14:42:12 UTC

(In reply to comment #23)
> Isn't that what the OSS driver is currently doing? If you look in the post
> history those are the exact values that are currently being used

The problem is that the frequencies are exact enough that the display device (monitor/TV/whatever) accepts them, but they are not 100% precise.

E.g. for the 50Hz mode we wanted a 148.5MHz pixel clock, but got 148.75MHz instead. And for the 24Hz mode we wanted 74.2MHz but got 74.0625MHz instead.

So as Alex said, somebody would need to dig into that and try to improve the numbers without toasting the hardware.

Comment 25 jeroen 2014-03-28 15:18:18 UTC

(In reply to comment #24)
> The problem is that the frequencys are exact enough so that the display
> device (Monitor/TV/Whatever) accepts them, but not 100% precise.
> 
> E.g. for the 50Hz mode we wanted 148.5MHz pixel clock, but got 148.75Mhz
> instead. And for the 24Hz mode we wanted 74.2MHz but got 74.0625Mhz instead.
> 
> So as Alex said somebody would need to dig into that and try to improve the
> numbers without toasting the hardware.

So that would mean, for example, using fb=29.7, ref=2, post=10?

Or would that fry the hardware?
Why must it exactly match? For fglrx the clock seems to be roughly 30% higher than needed.

Comment 26 Alex Deucher 2014-03-28 16:04:08 UTC

(In reply to comment #25)
> So that would mean for example using fb=29.7   Ref=2   post=10?
> 
> Or would that fry the hardware?

That should work.  You aren't likely to fry the hw.  You just don't want to set a 400 MHz clock, as your monitor probably won't like it.  The hard part is adjusting the algorithm to reliably calculate a good value for a wide range of clocks.

> Why must it exactly match?

You want the clock to accurately match what userspace expects.  So if userspace expects 148.5MHz and the clock is actually 148.75MHz, the actual and expected frame rates will be slightly off.

Comment 27 Christian König 2014-03-28 16:22:42 UTC

(In reply to comment #26)
> > So that would mean for example using fb=29.7   Ref=2   post=10?
> > 
> > Or would that fry the hardware?
> 
> That should work.  You aren't likely to fry the hw.  You just don't want to
> set a 400 Mhz clock as you monitor properly won't like it.  The hard part is
> adjusting the algorithm to reliably calculate a good value for a wide range
> of clocks.

I'm not sure if those values would work. A post divider of 10 might result in a VCO frequency that is too high, and that could indeed damage the hardware (even if that's rather unlikely).

Essentially the target clock multiplied by the post divider must be in a certain range. I think between pll->pll_out_min and pll->pll_out_max.

I think the problem is that in avivo_get_post_div we don't try to choose a value that matches the target frequency as closely as possible, but just a value that matches either the maximum or the minimum VCO frequency.

Comment 28 jeroen 2014-03-28 17:02:22 UTC

(In reply to comment #27)
> I'm not sure if those values would work. A post divider of 10 might result
> in a to high VCO and that could indeed damage the hardware (even if that's
> rather unlikely).
> 
> Essentially the target clock multiplied with the post divider must be in a
> certain range. I think between pll->pll_out_max and pll->pll_out_min.
> 
> I think the problem is that we don't try to choose a good value to match the
> target frequency as close as possible in avivo_get_post_div, but just a
> value that either matches the maximum or minimum VCO frequency.

Perhaps before somebody modifies the algorithm, it is a good idea to verify that this is indeed the problem.
If this is the problem, why don't more people run into it?

If I know which values for fb, ref and post to use for 23.976fps, I can hard-code these in a patch and test whether it indeed works.

So which values are safe to get a clock of 74.2MHz?

Comment 29 Christian König 2014-03-28 17:06:07 UTC

(In reply to comment #28)
> Perhaps before somebody is going to modify the algorithm, it is a good idea
> to verify that this is indeed the problem.
> If this is the problem why don't more people have these problems?
> 
> If I know which values for fb,ref and post to use for 23.976fps I can hard
> code these In a patch and test if it indeed works.
> 
> So which values are save to get a clock of 74.2MHz?

I've already hacked together a patch that from initial testing seems to work fine.

Just give me about an hour to clean that up and then I will attach it so you can test.

Comment 30 Christian König 2014-03-28 17:56:45 UTC

Created attachment 96561
Possible fix

Please check whether the attached patch helps.

Comment 31 Christian König 2014-03-28 18:06:29 UTC

Created attachment 96562
Possible fix v2

Sorry, I just found a stupid typo in the last patch; attached is a new one.

Comment 32 jeroen 2014-03-29 11:20:00 UTC

(In reply to comment #31)
> Created attachment 96562
> Possible fix v2
> 
> Sorry just found a stupid typo in the last patch, attached is a new one.

I've tested the patch and it is definitely better, as in fewer frames are dropped/skipped. The fps indicated by XBMC is now 23.98fps as expected, but every so many seconds it drops to 22fps. Without the patch it was 23.95fps and also dropped to 22fps every so many seconds.

The PLL values are now:
23.976 content: [drm:radeon_compute_pll_avivo], 7416, pll dividers - fb: 17.8 ref: 2, post 12
50fps content: [drm:radeon_compute_pll_avivo], 14833, pll dividers - fb: 17.8 ref: 2, post 6

So compared to the unpatched driver the clock is closer to the perfect value, but still not there. Since the frame rate is closer now, I expect the only way to get it to work properly is to get a perfect match.

Comment 33 jeroen 2014-03-29 12:42:03 UTC

The PLL struct values are on my system:

pll_in_min=675,
pll_in_max=5000, 
pll_out_min=64800, 
pll_out_max=120000, 
lcd_pll_out_min=0, 
lcd_pll_out_max=120000, 
min_ref_div=2, 
max_ref_div=1023, 
min_post_div=2, 
max_post_div=127, 
min_feedback_div=4, 
max_feedback_div=2047, 
min_frac_feedback_div=0, 
max_frac_feedback_div=9, 
best_vco=0, 
reference_freq=10000, 
reference_div=0, 
post_div=0, 
flags=1040

I guess this means that for 148.5MHz the post_div has to be 5, 6, 7 or 8? Also, if I understand the code correctly, the ref_div is always 2 because the fractional fb divider is used?
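
The guess about the post divider can be checked against the rule from comment 27, i.e. that the target clock multiplied by the post divider must lie between pll_out_min and pll_out_max. The Python sketch below assumes that rule and uses the struct values listed above, with all clocks in the same 10kHz units as the log; the function name is made up.

```python
def valid_post_dividers(target, pll_out_min=64800, pll_out_max=120000,
                        min_post_div=2, max_post_div=127):
    """List post dividers for which target * post_div stays in the VCO range.

    target, pll_out_min and pll_out_max are in units of 10 kHz.
    """
    lowest = max(min_post_div, (pll_out_min + target - 1) // target)  # ceil
    highest = min(max_post_div, pll_out_max // target)                # floor
    return list(range(lowest, highest + 1))


print(valid_post_dividers(14850))  # 148.5 MHz target
print(valid_post_dividers(7417))   # 74.17 MHz target
```

Under that assumption the 148.5MHz mode allows 5 through 8, and the 74.17MHz mode allows 9 through 16. The post dividers of 8 and 16 seen without the patch are the upper ends of those ranges.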

Comment 34 jeroen 2014-03-29 14:03:32 UTC

Why not increase the ref_div? Would that fry the hardware? That way you can get the clock exactly right.

For example: 148.5 = (100 * 29.7) / (4 * 5)
             74.2  = (100 * 37.1) / (5 * 10)
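
As a rough illustration of this idea, a naive brute-force search over ref_div and post_div finds these same exact matches. This Python sketch is not the driver's algorithm and not the code of any attached patch. It only honours the divider limits and VCO range from comment 33, ignores pll_in_min/pll_in_max, and uses a made-up function name.

```python
def find_exact_dividers(target, ref_freq=10000,
                        pll_out_min=64800, pll_out_max=120000,
                        max_ref_div=1023, max_post_div=127,
                        min_fb=4, max_fb=2047):
    """Return (fb_tenths, ref_div, post_div) hitting target exactly, or None.

    Clocks are in units of 10 kHz; the feedback divider is returned in
    tenths (297 means 29.7). The smallest ref_div is preferred.
    """
    for ref_div in range(2, max_ref_div + 1):
        for post_div in range(2, max_post_div + 1):
            if not pll_out_min <= target * post_div <= pll_out_max:
                continue  # VCO frequency would be out of range
            numerator = target * 10 * ref_div * post_div
            if numerator % ref_freq:
                continue  # feedback divider would need more than one decimal
            fb_tenths = numerator // ref_freq
            if min_fb * 10 <= fb_tenths <= max_fb * 10 + 9:
                return fb_tenths, ref_div, post_div
    return None


print(find_exact_dividers(14850))  # 148.5 MHz
print(find_exact_dividers(7420))   # 74.2 MHz
```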

Comment 35 jeroen 2014-03-29 16:29:19 UTC

(In reply to comment #34)
> Why not increase the ref_div? hardware frying? That way you can get the
> clock exactly right.
> 
> For example: 148.5 = 100 * 29.7 / 4 * 5
>              74.2  = 100 * 37.1 / 5 * 10

I just noticed that the clock values shown by xrandr are already rounded. So the 74.2MHz for 23.976fps actually is 74.17MHz. I guess this means the PLL can never exactly generate the clock that is requested by the television.

Perhaps something else is wrong with the PLL then.

Comment 36 Christian König 2014-03-29 18:18:29 UTC

(In reply to comment #35)
> (In reply to comment #34)
> > Why not increase the ref_div? hardware frying? That way you can get the
> > clock exactly right.
> > 
> > For example: 148.5 = 100 * 29.7 / 4 * 5
> >              74.2  = 100 * 37.1 / 5 * 10
> 
> Just noticed that the clock values shown by xrandr are already rounded. So
> the 74.2MHz for 23.976fps actually is 74.17MHz. This would mean the PLL can
> never exactly generate the clock that is requested by the television I guess.
> 
> Perhaps something else is wrong with the PLL then

The PLL is fine; some frequencies simply can't be represented exactly.

I'm already digging into making more use of the ref divider, just give me some time to get the algorithm straight.

Comment 37 Christian König 2014-03-29 21:10:20 UTC

Created attachment 96604 [details] [review]
Possible fix v3

Please try this one, it's a complete rewrite of finding the right PLL numbers.

Comment 38 Peter Frühberger 2014-03-30 12:33:19 UTC

Here is an OpenELEC image with kernel 3.14-rc8+ with that PLL patch included, so no need to compile manually: http://saraev.ca/OpenELEC-Generic.x86_64-devel-20140330151700-r18049-g02739c3.tar

Comment 39 jeroen 2014-03-30 13:39:31 UTC

(In reply to comment #37)
> Created attachment 96604 [details] [review] [review]
> Possible fix v3
> 
> Please try this one, it's a complete rewrite of finding the right PLL
> numbers.

No problem Peter I was already compiling it manually for the previous tests.

I just tested the patch using OpenElec 4.
50fps:     14850, pll dividers - fb: 29.7 ref: 4, post 5
23.976fps: 7417, pll dividers - fb: 741.7 ref: 100, post 10
59.94fps:  14835, pll dividers - fb: 296.7 ref: 25, post 8

50fps is spot on and no frames are dropped. The dividers are optimal.

23.976fps is better, but I still count on average a skipped frame every 28sec. Since the skipped frames are not at a regular interval I took the average. Sometimes a frame is skipped after 10sec, but it sometimes also takes 40 seconds.
When a frame is skipped I see the fps in XBMC drop to around 22.98fps.
What I do not understand is why it still happens so often, as the error in the clock would suggest a drop only every ~9min.
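
Note: the ~9min figure can be reproduced roughly as follows. This is a sketch under assumptions: the 1080p24 timing is taken as htotal=2750 and vtotal=1125 (the values in the modeline quoted in comment 67), the content runs at 24000/1001 fps, and a frame is lost once the two rates have drifted apart by one full frame.

```python
from fractions import Fraction

def seconds_per_frame_slip(content_fps, display_fps):
    """Time until content and display drift apart by one full frame."""
    return 1 / abs(Fraction(content_fps) - Fraction(display_fps))

HTOTAL, VTOTAL = 2750, 1125                  # assumed 1080p24 timing
content_fps = Fraction(24000, 1001)          # 23.976 fps material
display_fps = Fraction(74_170_000, HTOTAL * VTOTAL)  # PLL at 74.17 MHz

slip_minutes = float(seconds_per_frame_slip(content_fps, display_fps)) / 60
print(round(slip_minutes, 1))
```

A skip every 28 seconds is far more frequent than this estimate, which suggests the clock error alone does not explain the observation.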

59.94fps is also much better. Here XBMC reports a missed frame maybe every minute or so.

This patch is definitely an improvement, but it is still not as good as fglrx, which could play for many minutes without ever dropping/skipping/missing frames.

Perhaps Peter, who is apparently also following this thread, could give some insight into the synchronisation in XBMC?
I have the idea the sync behaves differently in OE4 compared to OE3. For example in OE3, the player (P) in the codec info overlay never had the field skipped, only dropped?
Is the VBlank moment in XBMC derived from the pixel clock from the radeon or the pixel clock the TV is expecting?

Could it be that when the phase between the two pixel clocks becomes too big, this is handled differently by fglrx and OSS radeon?

Comment 40 Rainer Hochecker 2014-03-30 15:27:38 UTC

If you set "sync playback to display" in XBMC, an inaccurate clock has no impact on dropped or skipped frames. Suppose you only have a 24Hz mode and play material which is 23.976. It would slightly speed up playback: every vblank interval a frame is rendered.
If you observe skipped frames, the render thread may have been blocked too long or a vertical retrace was missed.

Comment 41 jeroen 2014-03-30 16:18:04 UTC

(In reply to comment #40)
> If you set "sync playback to display" in XBMC, an inaccurate clock has no
> impact on dropped or skipped frames. Suppose you only have a 24Hz mode and
> play material which is 23.976. It would slightly speed up playback: every
> vblank interval a frame is rendered.
> If you observe skipped frames, the render thread may have been blocked too
> long or a vertical retrace was missed.
 
Hello FernetMenta,

Thanks for commenting, as you are one of the experts on this subject in XBMC.

"sync playback to display" is definitely enabled on my system and still I am seeing skipped or missed frames depending on if the clock is too slow or too fast, respectively.
Also, the patches from Christian already proved that a clock that is closer to the television display clock DOES have an influence on skipping/missing frames. If the clock had no impact there wouldn't be a problem in the first place.

The posted xrandr logs also show my television does have a 23.976 mode.

Comment 42 Christian König 2014-03-31 07:49:38 UTC

The only other option I'm aware of would be to adjust the modes to have a doable pixel clock.

On modern displays we could for example increase the vertical blanking period slightly to make the mode hit a pixel clock that is exactly representable.
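
Note: a sketch of what such a search could look like, not what the driver does. It assumes htotal stays at 2750, the target refresh is exactly 24000/1001 Hz, and "exactly representable" means the pixel clock is a whole multiple of 10 kHz. Whether a given display accepts the resulting timing is a separate question this does not answer.

```python
from fractions import Fraction

def exact_vtotals(refresh_hz, htotal, vtotal_min, vtotal_max, step_hz=10_000):
    """Yield (vtotal, clock_hz) pairs whose pixel clock is a multiple of step_hz."""
    for vtotal in range(vtotal_min, vtotal_max + 1):
        clock_hz = Fraction(refresh_hz) * htotal * vtotal
        if clock_hz.denominator == 1 and clock_hz % step_hz == 0:
            yield vtotal, int(clock_hz)

# Search upward from the standard vtotal of 1125 lines.
matches = list(exact_vtotals(Fraction(24000, 1001), 2750, 1125, 1200))
print(matches)
```

In this range the only hit is vtotal=1183 at 78 MHz, i.e. 58 additional blanking lines.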

Comment 43 Alex Deucher 2014-03-31 14:17:08 UTC

We could also update the adjusted mode clock to the actual clock set by the pll so that drm_calc_timestamping_constants() uses the actual clock value on the PLL.  E.g.,

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index daa4dd3..2a2da82 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -1085,6 +1085,7 @@ static void atombios_crtc_set_pll(struct drm_crtc *crtc, struct drm_display_mode
                atombios_crtc_program_ss(rdev, ATOM_ENABLE, radeon_crtc->pll_id,
                                         radeon_crtc->crtc_id, &radeon_crtc->ss);
        }
+       mode->clock = pll_clock * 10;
 }
 
 static int dce4_crtc_do_set_base(struct drm_crtc *crtc,
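
Note: to illustrate why the clock value matters for timestamping, here is a rough sketch of the kind of arithmetic involved. It is not the kernel code; it only assumes that the frame duration is derived from the mode clock in kHz and the total frame size, with htotal=2750 and vtotal=1125 for the 1080p 23.976 mode.

```python
def frame_duration_ns(clock_khz, htotal, vtotal):
    """Duration of one full frame in nanoseconds for a given pixel clock."""
    return htotal * vtotal * 1_000_000 // clock_khz

nominal_ns = frame_duration_ns(74176, 2750, 1125)  # clock requested by the mode
actual_ns = frame_duration_ns(74170, 2750, 1125)   # clock the PLL really produces
print(nominal_ns, actual_ns, actual_ns - nominal_ns)
```

If the constants are computed from the requested clock while the hardware runs at the PLL clock, every frame is estimated a few microseconds too short.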

Comment 44 jeroen 2014-03-31 15:59:56 UTC

(In reply to comment #43)
> We could also update the adjusted mode clock to the actual clock set by the
> pll so that drm_calc_timestamping_constants() uses the actual clock value on
> the PLL.  E.g.,
> 
> diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c
> b/drivers/gpu/drm/radeon/atombios_crtc.c
> index daa4dd3..2a2da82 100644
> --- a/drivers/gpu/drm/radeon/atombios_crtc.c
> +++ b/drivers/gpu/drm/radeon/atombios_crtc.c
> @@ -1085,6 +1085,7 @@ static void atombios_crtc_set_pll(struct drm_crtc
> *crtc, struct drm_display_mode
>                 atombios_crtc_program_ss(rdev, ATOM_ENABLE,
> radeon_crtc->pll_id,
>                                          radeon_crtc->crtc_id,
> &radeon_crtc->ss);
>         }
> +       mode->clock = pll_clock * 10;
>  }
>  
>  static int dce4_crtc_do_set_base(struct drm_crtc *crtc,

I think that would only help if radeon_compute_pll_avivo could not compute an exact match. In the case of 23.976Hz the target clock is 74170kHz and the PLL is set exactly to this value.
This does raise another question: why is the target clock's last digit always zero? For example, for 23.976Hz the target clock should be 74176kHz (with correct rounding). I looked through the source code, but the target clock seems to come all the way from some deep generic drm code.

74176kHz could be matched by the PLL using fb=927.2, post_div=10 and ref_div=125

Comment 45 Christian König 2014-03-31 18:09:56 UTC

(In reply to comment #44)
> (In reply to comment #43)
> > We could also update the adjusted mode clock to the actual clock set by the
> > pll so that drm_calc_timestamping_constants() uses the actual clock value on
> > the PLL.  E.g.,
> > 
> > diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c
> > b/drivers/gpu/drm/radeon/atombios_crtc.c
> > index daa4dd3..2a2da82 100644
> > --- a/drivers/gpu/drm/radeon/atombios_crtc.c
> > +++ b/drivers/gpu/drm/radeon/atombios_crtc.c
> > @@ -1085,6 +1085,7 @@ static void atombios_crtc_set_pll(struct drm_crtc
> > *crtc, struct drm_display_mode
> >                 atombios_crtc_program_ss(rdev, ATOM_ENABLE,
> > radeon_crtc->pll_id,
> >                                          radeon_crtc->crtc_id,
> > &radeon_crtc->ss);
> >         }
> > +       mode->clock = pll_clock * 10;
> >  }
> >  
> >  static int dce4_crtc_do_set_base(struct drm_crtc *crtc,
> 
> I think that would only help if radeon_compute_pll_avivo could not compute
> an exact match. In the case of 23.976Hz the target clock is 74170kHz and the
> PLL is set exactly to this value.
> This does raise another question why the target clock' last digit is always
> zero? For example, for 23.976Hz the target clock should be 74176kHz (with
> correct rounding). I looked through the source code, but the target clock
> seems to come all the way from some deep generic drm code.
> 
> 74176kHz could be matched by the PLL using fb=927.2, post_div=10 and
> ref_div=125

You might want to take a look at atombios_adjust_pll which does the mode fixup before a mode is actually used.

Atombios always works with pixel clocks in units of 10kHz, which always sets the target clock's last digit to zero.
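
Note: a minimal illustration of the 10 kHz granularity. It assumes the conversion discards the remainder rather than rounding, which is consistent with the value 7417 logged for the 23.976 mode in comment 39 (rounding would give 7418). The helper name is hypothetical.

```python
def to_atom_units(clock_khz):
    """Convert kHz to 10 kHz units, discarding the remainder."""
    return clock_khz // 10

requested_khz = 74176
programmed_khz = to_atom_units(requested_khz) * 10
print(requested_khz, programmed_khz, requested_khz - programmed_khz)
```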

Comment 46 jeroen 2014-03-31 20:42:37 UTC

(In reply to comment #45)
> (In reply to comment #44)
> > (In reply to comment #43)
> > > We could also update the adjusted mode clock to the actual clock set by the
> > > pll so that drm_calc_timestamping_constants() uses the actual clock value on
> > > the PLL.  E.g.,
> > > 
> > > diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c
> > > b/drivers/gpu/drm/radeon/atombios_crtc.c
> > > index daa4dd3..2a2da82 100644
> > > --- a/drivers/gpu/drm/radeon/atombios_crtc.c
> > > +++ b/drivers/gpu/drm/radeon/atombios_crtc.c
> > > @@ -1085,6 +1085,7 @@ static void atombios_crtc_set_pll(struct drm_crtc
> > > *crtc, struct drm_display_mode
> > >                 atombios_crtc_program_ss(rdev, ATOM_ENABLE,
> > > radeon_crtc->pll_id,
> > >                                          radeon_crtc->crtc_id,
> > > &radeon_crtc->ss);
> > >         }
> > > +       mode->clock = pll_clock * 10;
> > >  }
> > >  
> > >  static int dce4_crtc_do_set_base(struct drm_crtc *crtc,
> > 
> > I think that would only help if radeon_compute_pll_avivo could not compute
> > an exact match. In the case of 23.976Hz the target clock is 74170kHz and the
> > PLL is set exactly to this value.
> > This does raise another question why the target clock' last digit is always
> > zero? For example, for 23.976Hz the target clock should be 74176kHz (with
> > correct rounding). I looked through the source code, but the target clock
> > seems to come all the way from some deep generic drm code.
> > 
> > 74176kHz could be matched by the PLL using fb=927.2, post_div=10 and
> > ref_div=125
> 
> You might want to take a look at atombios_adjust_pll which does the mode
> fixup before a mode is actually used.
> 
> Since atombios always works with 10khz pixel clock which always sets the
> target clocks last digit to zero.

atombios_adjust_pll seems to do nothing to compensate for the 10kHz pixel clock, or didn't you mean that?

When I look at drm_calc_timestamping_constants(), does this mean the vblank moment is calculated by the OSS driver?

What about Alex' idea in comment 43? Would that help, Christian?

Comment 47 Rainer Hochecker 2014-04-02 15:18:46 UTC

(In reply to comment #41)
> (In reply to comment #40)
> > If you set "sync playback to display" in XBMC, an inaccurate clock has no
> > impact on dropped or skipped frames. Suppose you only have a 24Hz mode and
> > play material which is 23.976. It would slightly speed up playback: every
> > vblank interval a frame is rendered.
> > If you observe skipped frames, the render thread may have been blocked too
> > long or a vertical retrace was missed.
>  
> Hello FernetMenta,
> 
> Thanks for commenting, as you are one of the experts on this subject in XBMC.
> 
> "sync playback to display" is definately enabled on my system and still I am
> seeing skipped or missed frames depending on if the clock is too slow or too
> fast, respectively.
> Also, the patches from Christian already proved that a clock that is closer
> to the television display clock DOES have an influence on skipping/missing
> frames. If the clock had no impact there wouldn't be a problem in the first
> place.
> 
> The posted xrandr logs also show my television does have a 23.976 mode.

Again, a wrong speed does NOT have direct influence on dropped or skipped frames. If you see some kind of relationship you have to look for the missing piece.

Comment 48 Garrett 2014-04-02 16:20:49 UTC

Created attachment 96792 [details]
logs working and not

This patch appears to have broken my 24P playback, which has been fine until now.  My set is a Sony KDL 40NX711 NTSC set, A4-3400 - HDMI to set for vid/audio stereo out speakers/no avr.  Using Openelec- gotham nightlies (this patch is now committed to OE-nightlies):
Playing 29.97 plays fine enough (a few skips- why I am testing this patch), 23.976 playback breaks the TV display- it starts to buzz then says "no signal", I press stop and the menu/screen resumes at ~30P like normal. I am attaching dmesg/xorg logs  drm.debug=0xe both working w/o patch and broken w/patch.  let me know if you need more.

Comment 49 Christian König 2014-04-02 17:50:08 UTC

(In reply to comment #48)
> Created attachment 96792 [details]
> logs working and not
> 
> This patch appears to have broken my 24P playback which has been fine until
> now.  My set is a Sony KDL 40NX711 NTSC set, A4-3400 - HDMI to set for
> vid/audio stereo out speakers/no avr.  Using Openelec- gotham nightlies
> (this patch is now comitted to OE-nightlies):
> Playing 29.97 plays fine enough (a few skips- why I am testing this patch),
> 23.976 playback breaks the TV display- it starts to buzz then says "no
> signal", I press stop and the menu/screen resumes at ~30P like normal. I am
> attaching dmesg/xorg logs  drm.debug=0xe both working w/o patch and broken
> w/patch.  let me know if you need more.

Are you sure your logs are valid? Both show only a single mode switch shortly after boot:

bad/dmesg_24pbroken.log:[   11.148315] [drm:radeon_compute_pll_avivo], 297000 - 29700, pll dividers - fb: 29.7 ref: 2, post 5

good/dmesg_24pOK.log:[   11.400765] [drm:radeon_compute_pll_avivo], 29700, pll dividers - fb: 29.7 ref: 2, post 5

You might try to ssh into the box and change the modes using xrandr directly.

Comment 50 jeroen 2014-04-02 19:59:45 UTC

(In reply to comment #47)
> (In reply to comment #41)
> > (In reply to comment #40)
> > > If you set "sync playback to display" in XBMC, an inaccurate clock has no
> > > impact on dropped or skipped frames. Suppose you only have a 24Hz mode and
> > > play material which is 23.976. It would slightly speed up playback: every
> > > vblank interval a frame is rendered.
> > > If you observe skipped frames, the render thread may have been blocked too
> > > long or a vertical retrace was missed.
> >  
> > Hello FernetMenta,
> > 
> > Thanks for commenting, as you are one of the experts on this subject in XBMC.
> > 
> > "sync playback to display" is definately enabled on my system and still I am
> > seeing skipped or missed frames depending on if the clock is too slow or too
> > fast, respectively.
> > Also, the patches from Christian already proved that a clock that is closer
> > to the television display clock DOES have an influence on skipping/missing
> > frames. If the clock had no impact there wouldn't be a problem in the first
> > place.
> > 
> > The posted xrandr logs also show my television does have a 23.976 mode.
> 
> Again, a wrong speed does NOT have direct influence on dropped or skipped
> frames. If you see a some kind of relationship you have to look for the
> missing piece.

Okay, but some more information would be helpful. This way the bug report becomes more constructive in finding the root cause. It would help me (and perhaps others) to find the missing piece if it is clear how radeon OSS and XBMC work with respect to the vblank timing etc.

For example the XBMC wiki is not very thorough on what the missing/skipping/dropping really means. Therefore, I already read a lot of threads on the XBMC forum. In http://forum.xbmc.org/showthread.php?tid=178173&pid=1551907#pid1551907 you state that skipping MAY be caused by refresh rate problems.

So what I got together in terms of definitions in XBMC:
- Skipping: The renderer is late
- Dropping: The decoder is late
- Missing: A vblank interrupt was missed

If the 'sync to display' option is on in XBMC the video pixels clock is master and I guess it then uses the vblank interrupt generated by the OSS driver. These interrupts are generated using the same clock settings that were used to set the PLL parameters. Why are there then ever skips reported, because the renderer cannot be late as it is the master and just puts a frame out for each vblank interrupt? or do I misunderstand something?
Are missed vblanks reported by the OSS driver to XBMC or does XBMC keep some shadow administration to see if vblank interrupts arrive at the expected time?

In my opinion there were a lot of bad comments on fglrx, but at least it got the core rendering of frames done without stuttering. The XVBA part was not that ideal though.

Comment 51 Peter Frühberger 2014-04-02 20:07:58 UTC

This bugtracker is not about fglrx. Not a single thing will change for the radeon oss driver cause of such statements.

At the end - before we dropped support for it (xvba + fglrx) - it was not even able to hold vsync without one core at 100%. So that's not any alternative at all.

Btw. such comments are really distracting, as we never had that good support from AMD as we get from christian and alex right now. </offtopic>

Comment 52 jeroen 2014-04-02 20:25:38 UTC

(In reply to comment #51)
> This bugtracker is not about fglrx. Not a single thing will change for the
> radeon oss driver cause of such statements.
> 
> At the end - before we dropped support for it (xvba + fglrx) - it was not
> even able to hold vsync without one core at 100%. So that's not any
> alternative at all.
> 
> Btw. such comments are really distracting, as we never had that good support
> from AMD as we get from christian and alex right now. </offtopic>

Then you misunderstand me! I think it is really good that radeon OSS is getting so much support. I never said I want fglrx back.

It is just that comments like "a wrong speed does NOT have direct influence on dropped or skipped frames." are not helping without any explanation and are not constructive. Most people on this mailing list have probably no idea how XBMC internally works making it difficult to help us. </offtopic>

Comment 53 Garrett 2014-04-03 03:24:52 UTC

Created attachment 96817 [details]
xrandr 1920x1080x24p and OE crashed lcd

Sorry.  Here you go: dmesg just after- "xrandr --output HDMI-0 --mode 1920x1080 --rate 24". It did change refresh w/o blank screen: 
[2465.347782] [drm:radeon_compute_pll_avivo], 148500 - 14850, pll dividers - fb: 29.7 ref: 2, post 10.    My LCD shows 1080/24P now pressing remote "DISPLAY" button.

Opening OE 24p video >> crashes the screen.  dmesg while on a crashed screen.
[958.084974] [drm:radeon_compute_pll_avivo], 148340 - 14834, pll dividers - fb: 741.7 ref: 50, post 10

Comment 54 Garrett 2014-04-03 03:34:31 UTC

Created attachment 96819 [details]
xrandr 1920x1080x23.98 crashed

ok "xrandr --output HDMI-0 --mode 1920x1080 --rate 23.98" crashes the screen. Same as the OE when playing a 24p video.
[ 4592.767636] [drm:radeon_compute_pll_avivo], 148340 - 14834, pll dividers - fb: 741.7 ref: 50, post 10
LMK if you need more.

Comment 55 Garrett 2014-04-03 04:16:22 UTC

Created attachment 96820 [details]
dmesg xrandr old code working 23.98

xrandr prepatch 1920x1080px23.98 OK LCD display.
"xrandr --output HDMI-0 --mode 1920x1080 --rate 23.98"
[  183.991214] [drm:radeon_compute_pll_avivo], 14818, pll dividers - fb: 32.6 ref: 2, post 11
LCD = "DISPLAY" button 1080/24p.
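
Note: the radeon_compute_pll_avivo lines in these logs can be cross-checked against their own dividers. A small sketch, assuming the two line formats seen in this thread and a 100 MHz reference (reference_freq=10000 in 10 kHz units, as reported in comment 33); the regular expression and function name are made up for this purpose.

```python
import re

LINE_RE = re.compile(
    r"radeon_compute_pll_avivo\],\s*(?:\d+\s*-\s*)?(?P<clock>\d+),"
    r"\s*pll dividers - fb:\s*(?P<fb_int>\d+)\.(?P<fb_frac>\d)"
    r"\s*ref:\s*(?P<ref>\d+),\s*post\s*(?P<post>\d+)"
)

REFERENCE_10KHZ = 10000  # assumed reference clock, in 10 kHz units

def check_pll_line(line):
    """Return (logged clock, clock recomputed from dividers) in 10 kHz units."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    fb_tenths = int(m["fb_int"]) * 10 + int(m["fb_frac"])
    divisor = 10 * int(m["ref"]) * int(m["post"])
    return int(m["clock"]), REFERENCE_10KHZ * fb_tenths // divisor

old = "[  183.991214] [drm:radeon_compute_pll_avivo], 14818, pll dividers - fb: 32.6 ref: 2, post 11"
new = "[ 4592.767636] [drm:radeon_compute_pll_avivo], 148340 - 14834, pll dividers - fb: 741.7 ref: 50, post 10"
print(check_pll_line(old), check_pll_line(new))
```

Both lines are self-consistent: the old code lands on 148.18 MHz and the patched code on 148.34 MHz, so the difference between working and failing here is the divider choice, not a mismatch between the log and the hardware setting.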

Comment 56 Christian König 2014-04-03 07:39:21 UTC

@Garrett:

That looks rather interesting. First of all please open up a new bug report, I want to separate this problem from the discussion here.

To this new bug report please add the output of "xrandr --verbose" and your dmesg logs of the 23.98 mode in the working and not working case.

I have a pretty good idea what's going wrong here, but you need to test a couple of patches to make sure.

Comment 57 Rainer Hochecker 2014-04-03 16:59:59 UTC

(In reply to comment #50)

> If the 'sync to display' option is on in XBMC the video pixels clock is
> master and I guess it then uses the vblank interrupt generated by the OSS
> driver. These interrupts are generated using the same clock settings that
> were used to set the PLL parameters. Why are there then ever skips reported,
> because the renderer cannot be late as it is the master and just puts a
> frame out for each vblank interrupt? or do I misunderstand something?
> Are missed vblanks reported by the OSS driver to XBMC or does XBMC keep some
> shadow adminstration to see if vblank interrupts arrive at the expected time?
> 

Almost correct. At the application level we don't see interrupts. We just render the frames, and we can only render one frame per vblank interval. Decoding should be faster than rendering, hence the queue of ready frames fills and frames wait to be picked up. At this point (when the render thread comes by) we check the timestamp attached to the frame. If the time has already passed and there is more than a single frame in the queue, the next frame is skipped. This means the render thread is late by at least one frame time, 41ms when running at 23.976.

So even if we run at wrong speed, the render thread should not get that late.
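
Note: the skip rule described above can be sketched like this. It is an illustration of the description only, not XBMC's actual code; the queue layout, the function name and the exact lateness test (at least one frame time past the timestamp) are assumptions.

```python
from collections import deque

def next_frame_to_render(ready, now_ms, frame_time_ms):
    """Pop the frame to show at this vblank; return (frame, skipped)."""
    skipped = False
    pts_ms, _ = ready[0]
    # Skip only if the head is at least one frame time late and
    # another frame is waiting behind it.
    if len(ready) > 1 and now_ms - pts_ms >= frame_time_ms:
        ready.popleft()
        skipped = True
    _, frame = ready.popleft()
    return frame, skipped

FRAME_MS = 1001 / 24  # about 41.7 ms at 23.976 fps
queue = deque([(0.0, "f0"), (FRAME_MS, "f1"), (2 * FRAME_MS, "f2")])
on_time = next_frame_to_render(queue, 5.0, FRAME_MS)
late = next_frame_to_render(queue, FRAME_MS + 50.0, FRAME_MS)
print(on_time, late)
```

In this model a small refresh-rate error never triggers a skip by itself; only a render pass that arrives a whole frame time late does.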

Comment 58 Rackow, Detlev 2014-04-03 19:26:56 UTC

Created attachment 96869 [details]
dmesg after switching to 25hz and back to 24hz

Comment 59 Rackow, Detlev 2014-04-03 19:27:50 UTC

Hi, I also have issues with OE 3.95.x and Radeon 6320 (AMD E-450). On my device, issues happened with all fractional frequencies (23.9x, 29.9x, 59.9x hz)

The test-version which Peter Fruehberger posted and which contains your preliminary patch changed the behaviour.

With that new patch fractionalmodes (23.976, ... , ... ) are now working
fine, but with 25hz I have a problem. (Peter supposes that it's actually 50i and I believe this too, but I'm just a user and I can only report the frequency that I select in the XBMC-settings)

When I set the rate to 25Hz, the picture begins to shiver up and down a few millimeters. When I don't acknowledge the new rate, XBMC switches back to the old rate, and the picture is immediately stable as ever. 

This effect used to happen with all fractional rates in OE 3.95.x, while 25Hz worked fine. With the mentioned test-version it is gone on the fractional rates, but now it happens on 25Hz (or 50i, as Peter says).

As instructed, I booted OE with the kernel-parameter drm.debug=0xe and took dmesg and Xorg.0.log immediately after switching to 25hz and falling back.

This is my first post on this site, I hope I don't mess it ;)

Comment 60 Rackow, Detlev 2014-04-03 19:29:18 UTC

Created attachment 96870 [details]
Xorg.0.log after switching to 25hz and back to 24hz

Comment 61 Christian König 2014-04-04 08:06:44 UTC

Detlev, please attach your logs to this bug instead: https://bugs.freedesktop.org/show_bug.cgi?id=77009

It's essentially a different problem and I want to keep it separated from the discussion here.

Thanks,
Christian.

Comment 62 Rackow, Detlev 2014-04-04 17:50:52 UTC

Thanks for your fast reply, it's done.

Regards,

Detlev

Comment 63 Christian König 2014-04-23 08:46:52 UTC

Please try http://cgit.freedesktop.org/~deathsimple/linux/log/?h=drm-next-3.16.

This branch might fix the remaining frame drop problems.

Comment 64 jeroen 2014-04-23 15:43:34 UTC

(In reply to comment #63)
> Please try
> http://cgit.freedesktop.org/~deathsimple/linux/log/?h=drm-next-3.16.
> 
> This branch might fix the remaining frame drop problems.

Hi Christian,

I see multiple commits, are you specifically referring to the page flip commits?  Problems with the vblank handling could explain skipped frames in XBMC as this indicates the render thread is late as explained by Rainer.

I will see if I can create a patch for OpenElec to pull in all the changes made to drm/radeon and test it

Comment 65 Christian König 2014-04-23 15:45:51 UTC

(In reply to comment #64)
> I will see if I can create a patch for OpenElec to pull in all the changes
> made to drm/radeon and test it

Peter already did so, you should contact him to get the merged patch.

Comment 66 Peter Frühberger 2014-04-24 06:37:06 UTC

I gave the patches I extracted a try (http://sprunge.us/ZRBT).

It seems to have major issues with fractional modes. When watching 24p content, it jumps all the way from 0.89 fps to 18fps and 24fps, the skip counter is running monotonically.

I think jeroen, who also has the issue, will post some logfiles later. It's nothing subtle, so I think you can directly see it when running a fractional mode.

Might also be, that I missed a patch (see above).

Comment 67 Garrett 2014-04-24 13:49:17 UTC

Created attachment 97901 [details]
dmesg xrandr new patch slow 18fps on 24p

I tested this new patch.  it plays back very slowly in xbmc (reverting the kernel fixes it- i have that dmesg too if you need it).  attached is the dmesg (error level 14) after cmd:

xrandr --output HDMI-0 --mode 1920x1080 --rate 23.98

in log note:  
[   45.594953] [drm:drm_mode_debug_printmodeline], Modeline 24:"1920x1080" 60 148500 1920 2008 2052 2200 1080 1084 1089 1125 0x48 0x5
[   45.594955] [drm:drm_mode_debug_printmodeline], Modeline 31:"" 0 74176 1920 2558 2602 2750 1080 1084 1089 1125 0x0 0x5
[   45.594958] [drm:drm_crtc_helper_set_config], [CONNECTOR:18:HDMI-A-1] to [CRTC:12]
[   45.594960] [drm:drm_crtc_helper_set_config], attempting to set mode from userspace
[   45.594962] [drm:drm_mode_debug_printmodeline], Modeline 31:"" 0 74176 1920 2558 2602 2750 1080 1084 1089 1125 0x0 0x5

Not sure where 31 came from.
Let me know if you need to see anything else.
system: a4-3400 hdmi to tv direct a/v both.  xubuntu64 14.04 full desktop.
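
Note: the refresh rate implied by each modeline can be computed from its clock and totals. A minimal sketch, assuming the columns after the name are vrefresh, clock in kHz, four horizontal values, four vertical values, type and flags. As far as I understand the debug format, the leading number (24, 31) is the mode object's id rather than a rate, which would explain the "31".

```python
def refresh_from_modeline(fields):
    """fields: the modeline columns after the name, as strings."""
    clock_khz = int(fields[1])
    htotal = int(fields[5])
    vtotal = int(fields[9])
    return clock_khz * 1000 / (htotal * vtotal)

mode_60 = "60 148500 1920 2008 2052 2200 1080 1084 1089 1125 0x48 0x5".split()
mode_24 = "0 74176 1920 2558 2602 2750 1080 1084 1089 1125 0x0 0x5".split()
print(refresh_from_modeline(mode_60), refresh_from_modeline(mode_24))
```

The second mode works out to about 23.976 Hz, so the requested mode itself looks right.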

Comment 68 jeroen 2014-04-24 15:53:05 UTC

dmesg output with drm debug on: http://sprunge.us/gSWc

I tested it too and have similar results as Peter. When playing 23.976 content the average framerate is around 21fps and XBMC reports constant 'skipped frames', meaning the render thread is late.

This was tested using OpenElec 4.0 master on a AMD E-350

Comment 69 Christian König 2014-04-27 13:12:21 UTC

Hi Peter & Jeroen,

please give this commit a try: http://cgit.freedesktop.org/~deathsimple/linux/commit/?h=drm-fixes-3.15-wip&id=cebb0b66645d8d18982c160521c164a51d0f1fd9

It should now use the pflip irq to avoid problems with missed flips.

Comment 70 Peter Frühberger 2014-04-27 13:25:23 UTC

@Jeroen: Here is the updated patch: http://sprunge.us/DOjf

I am currently building debian packages, so drop me a mail if you test on debian based system to save you that work.

@Christian: Thx much as usual.

Comment 71 Peter Frühberger 2014-04-27 14:42:26 UTC

Works perfectly for me.

Tested 50hz, 24.0 hz and 23.976 hz. No skips, rock solid fps.

Comment 72 Garrett 2014-04-27 14:58:10 UTC

(In reply to comment #69)
> Hi Peter & Jereon,
> 
> please give this commit a try:
> http://cgit.freedesktop.org/~deathsimple/linux/commit/?h=drm-fixes-3.15-
> wip&id=cebb0b66645d8d18982c160521c164a51d0f1fd9
> 
> It should now use the pflip irq to avoid problems with missed flips.

@Christian, 
Thanks so much!  I have NO (very rare) additional skips now in OE, A4-3400 HDMI to Sony LCD.  On 1080ix29.97 h264.  I have been trying to fix this for a long time.  I have some really hard to play panning vids that used to skip a lot, and it is now none.  1080px23.976 h.264 = perfect.  @Peter thanks for posting the patch.  PQ is amazing!  No judders at all.  

I forgot to mention the 210 PLL limit works great too.  Thanks for that also.

@Jeroen.   I hope that this fixes your issues.  This bug report has made a huge difference for my systems.

Garrett

Comment 73 jeroen 2014-04-27 15:27:24 UTC

(In reply to comment #70)
> @Jereon: Here is the updated patch: http://sprunge.us/DOjf
> 
> I am currently building debian packages, so drop me a mail if you test on
> debian based system to save you that work.
> 
> @Christian: Thx much as usual.

Thanks for providing the patch. I am currently building OE 4.0 master with it and hopefully can report similar results as you guys later.

Comment 74 Peter Frühberger 2014-04-27 15:37:55 UTC

Wait a moment, the last one has changed a bit to support more than R600, please use that one: http://sprunge.us/XcQW

Comment 75 Garrett 2014-04-27 17:15:17 UTC

(In reply to comment #74)
> Wait a moment, the last one has changed a bit to support more than R600,
> please use that one: http://sprunge.us/XcQW

OK built it and works great!  A4-3400 all good, like before.

Now I can play 1080ix29.97 on my Zotac AQ01 (A4-5000 APU) 59.94fps stable.  Even VDPAU Temporal de-interlacing no skips!  It failed to play before.  I got ~54-58FPS before, and only plain BOB worked kind of, not even VDPAU BOB.  This is a good patch even for new chips.

I am using 3.14.0 Kernel from OE (r18133) git, removed older PLL patches, applied these:  http://sprunge.us/XcQW

Garrett

Comment 76 jeroen 2014-04-27 17:48:05 UTC

(In reply to comment #74)
> Wait a moment, the last one has changed a bit to support more than R600,
> please use that one: http://sprunge.us/XcQW

I just tested with this patch in combination with OE 4.0 master and it is like you said 'rock solid'. I tested 23.976fps content for 20 min and not a single skipped or dropped frame.

I will test some more the coming days, but this patch seems to be the solution. 

Thanks Christian!

Now, I can ditch OE 3.0 and switch to OE 4.0 finally.

Comment 77 Christian König 2014-04-28 08:23:07 UTC

I'm trying to get that patch into 3.15 and going to create a cleaned up solution for 3.16.

Thanks for all the help,
Christian.

Comment 78 adb76 2014-05-04 08:39:36 UTC

I've tested the attached patch, which is now included in OpenELEC 4.0 Beta 7, on my AMD Fusion E-350 with Radeon HD 6310 system (see adb76_lspci.txt). There still seems to be a problem, about which I was already in contact with Peter Frühberger (fritsch) on the github site for OpenELEC: https://github.com/OpenELEC/OpenELEC.tv/issues/3163 . Peter asked me to attach my information to this bug:

The problem is that when I watch videos with XBMC on OpenELEC 4.0 Beta 7 the "missed frames" counter (not the skipped frames counter!) increases constantly during playback. In around 45 minutes there are approximately 30 "missed frames". The missed frames are recognisable, so when I see a stuttering I look afterwards on the OSD of XBMC and the missed frame counter increased by +1 or +2. I previously had installed OpenELEC 3.2 where I didn't get any missed frames during the full playback. The assumption of Peter is that "the driver did not do swaps".

My TV is displaying the framerate of all the tested videos natively: 1920x1080@25fps and 1280x720@25fps. See also the attached file adb76_xrandr.txt for the display properties.

On XBMC side I have set the following preferences (according to the suggestions of Peter):

Enable Adjust Refreshrate to match video (On Start / Stop)
Enable Sync Playback to Display Method Video Clock (Drop / Dupe)
Deinterlace: Auto
Deinterlace Method: Bob
Scaling: Bilinear
Vertical Blank Setting: Let Driver Decide
Enable HQ Scaler: above 20%

I have attached multiple logs from my system (adb76_*). What further information is required to narrow down the problem?

Comment 79 adb76 2014-05-04 08:40:44 UTC

Created attachment 98406 [details]
adb76 dmesg output

Comment 80 adb76 2014-05-04 08:41:10 UTC

Created attachment 98407 [details]
adb76 lspci output

Comment 81 adb76 2014-05-04 08:41:34 UTC

Created attachment 98408 [details]
adb76 xorg.log

Comment 82 adb76 2014-05-04 08:42:16 UTC

Created attachment 98409 [details]
adb76 xbmc-xrandr output

Comment 83 Christian König 2014-05-04 11:02:59 UTC

(In reply to comment #79)
> Created attachment 98406 [details]
> adb76 dmesg output

Please provide a dmesg output generated with drm.debug=0xE.

Thanks,
Christian.

Comment 84 adb76 2014-05-04 13:48:09 UTC

Since the dmesg log buffer seems very small, I called "dmesg | pastebinit" directly after the missed frame counter increased. I did this three times:

http://sprunge.us/gCRU
http://sprunge.us/EHNG
http://sprunge.us/MWEY

Comment 85 Christian König 2014-05-04 13:58:36 UTC

(In reply to comment #84)
> Since the dmesg log buffer seems very small, I called "dmesg | pastebinit"
> directly afterwards the missed frame counter increased. I made this three
> times:
> 
> http://sprunge.us/gCRU
> http://sprunge.us/EHNG
> http://sprunge.us/MWEY

The dmesg after the missed frame is uninteresting. I need the dmesg of the boot process with drm.debug=0xE.

Comment 86 adb76 2014-05-04 14:45:45 UTC

Sorry. Here it is: 

http://sprunge.us/NhCD

Because of the small dmesg log buffer I can't get all of the output from the second 0. Hope the necessary information is included.

Regards,

André

Comment 87 adb76 2014-05-05 19:59:00 UTC

I've retested this evening the playback with OE 3.2 + xvba: 0 missed frames in 45 minutes for 1280x720@25 fps. For the same video file with OE 4 beta 7 + vdpau: 38 missed frames.

Comment 88 Christian König 2016-06-15 12:03:12 UTC

We should probably close this bug now. The original problem is clearly fixed and the remaining frame drops have different causes.

If you still have issues with some modes/hw combinations feel free to open up a new bug report.
