Bug 59572 - [845G regression] 8bpp fbcon modes broken
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Ville Syrjala
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-01-18 21:59 UTC by mlsemon35
Modified: 2017-07-24 22:59 UTC

See Also:
i915 platform:
i915 features:


Attachments
Full dmesg, drm.debug=6 (37.97 KB, text/plain)
2013-01-18 21:59 UTC, mlsemon35
boot to fbcon using Fedora 18 3.6 kernel (72.26 KB, text/plain)
2013-01-19 13:05 UTC, mlsemon35
boot to blank screen, no fbcon present, using 3.8 (53.41 KB, text/plain)
2013-01-19 13:06 UTC, mlsemon35
regdump from good 8bpp fbcon, indirect, needed intel_reg_snapshot (14.09 KB, text/plain)
2013-01-21 10:47 UTC, mlsemon35
regdump from blind screen, indirect, needed intel_reg_snapshot (14.08 KB, text/plain)
2013-01-21 10:48 UTC, mlsemon35
Patch #1 (915 bytes, patch)
2013-01-31 14:20 UTC, Ville Syrjala
Patch #2 (1.18 KB, patch)
2013-01-31 14:21 UTC, Ville Syrjala

Description mlsemon35 2013-01-18 21:59:09 UTC
Created attachment 73254
Full dmesg, drm.debug=6

Hi!  I have an old PC with an 845G card.  I've been exiled from my very nice Linux 3.6 kernel by the EOL tag, and I'm treading rough waters with 3.7 and 3.8.  I used to boot to a framebuffer console with "video=1024x768M-8@75", I think, but since I've started using kernel 3.7, I've had to add or remove parts until I get a usable console.  "video=1024x768M" seems to be a safe choice right now.

At random, on boot or reboot--and regardless of my video= line--my monitor will go blank and show "Auto Adjust in Progress" for 6-10 seconds.  The system isn't hanging--I can type my crypto password to my blank, auto-adjusting screen and keep moving--but I wonder if this is good for the monitor.  This is common to every 3.7 and 3.8 kernel I've used, and all was fine for the other kernels I used on this PC (3.2 up to 3.6.11).  In 3.7 kernels, this happened on every boot, and there was a longer pause on reboots than for a regular boot.  In 3.8-rc4, this might happen 3 boots in a row, then do it only once every 3 boots for a while.

Attached is my boot dmesg, and maybe somebody out there is smart enough to tell me which problem is the real problem.  I certainly don't know!  I have another PC with the same kind of host-bridge-window boot issue and motherboard-shared-interrupts craziness, but it has 865G Intel video and therefore shares only 10% of the glitches that the 845G does.

If I've contacted the wrong forum, please feel free to point me to the correct forum.  Thanks!

Michael
Comment 1 Daniel Vetter 2013-01-18 22:48:09 UTC
I guess the timings of your resolution changed slightly due to the slightly different video mode you're now using, which can take the screen longer to adjust to. Questions to check:
- does the screen eventually show the console, or does that sometimes fail?
- if you have time, bisecting why the old video mode doesn't work on the kernel cmdline would be interesting.
Comment 2 mlsemon35 2013-01-19 02:21:25 UTC
(In reply to comment #1)
> I guess the timings of your resolution slightly changed due to the slightly
> different video mode you're now using. Which can take the screen longer to
> adjust to. Questions to check:

I guess it's best put this way:  For kernels 3.6.11 and prior, the default fbcon boot goes like this:

> - does the screen eventually show the console, or does that sometimes fail?

I think I get the same results every time: A good cmdline leads to text on a screen at the specified resolution.  A bad cmdline leads to no text on a blank console.  Some settings of resolution and Hz seem to cause the driver to default to the monitor's native resolution.  [Read more.]

What seems to change is the "same results" from kernel to kernel, and I've forgotten what worked on my 3.6.11-EOL diaspora journey.  I think it went 3.6.11 to 3.7.{0,1} to 3.8-rc{1,2,3,4}.

> - if you have time, bisecting why the old video mode doesn't work on the
> kernel cmdline would be interesting.

Yes, that was interesting!  I didn't get too scientific with it, but it looks like the -8 switch leads to a blank screen each time.  -15, -16, -24, and -32 work, though -24 seems to lead to the same results as -32.  The @Hz works, so long as it matches a valid number for Hz.  Resolutions work, too, so long as they match a valid number.  I didn't see a difference between using "M" and not using "M", and really, I didn't notice it much when I thought it made a difference.  [It's a holdover from my time using i810fb on an older PC.]
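
For reference, the video= strings tried here follow the kernel's documented mode syntax, <xres>x<yres>[M][R][-<bpp>][@<refresh>]. What follows is only a minimal Python sketch of that subset, not the kernel's own parser; the names VideoMode and parse_video_mode are hypothetical, and the trailing i/m/e/D/d options of the full syntax are deliberately left out.

```python
import re
from typing import NamedTuple, Optional


class VideoMode(NamedTuple):
    """Fields of a video= mode string (hypothetical helper type)."""
    xres: Optional[int]
    yres: Optional[int]
    cvt: bool               # "M": timings calculated with CVT
    reduced_blanking: bool  # "R": reduced blanking
    bpp: Optional[int]      # "-8", "-16", ...
    refresh: Optional[int]  # "@75", ...


# Subset of <xres>x<yres>[M][R][-<bpp>][@<refresh>]; resolution is optional
# so that a bare "-16" (as tried in this comment) is accepted too.
_MODE_RE = re.compile(
    r"^(?:(?P<xres>\d+)x(?P<yres>\d+))?"
    r"(?P<flags>[MR]*)"
    r"(?:-(?P<bpp>\d+))?"
    r"(?:@(?P<refresh>\d+))?$"
)


def parse_video_mode(spec: str) -> VideoMode:
    """Split a mode string into its parts; raise ValueError if malformed."""
    match = _MODE_RE.match(spec)
    if match is None:
        raise ValueError(f"unrecognised mode string: {spec!r}")

    def to_int(name: str) -> Optional[int]:
        value = match.group(name)
        return int(value) if value is not None else None

    flags = match.group("flags")
    return VideoMode(
        xres=to_int("xres"),
        yres=to_int("yres"),
        cvt="M" in flags,
        reduced_blanking="R" in flags,
        bpp=to_int("bpp"),
        refresh=to_int("refresh"),
    )


if __name__ == "__main__":
    for spec in ("1024x768M-8@75", "800x600-16", "-16"):
        print(spec, "->", parse_video_mode(spec))
```

The sketch only checks syntax. Whether a given resolution, depth or refresh rate is actually usable is decided by the driver and the monitor, which is what the tests in this comment were probing.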

As for blank times and such, it went like this:

1024x768-32: 10s delay, rgba 8/16,8/8,8/0,0/0 (fbset)

800x600-16: 2s "Auto Adjust in Progress" shows over unblanked screen, real blank time not noticed, rgba 5/11,6/5,5/0,0/0

800x600-8: booted to blank screen
640x480-8: booted to blank screen

640x480-15: 10s delay, rgba 5/10,5/5,5/0,1/15

1024x768M-16@75: OK, blank time was variable

1152x768-16: 2s blank

-16: 1280x1024@60, 16bpp, blank time not noticed

Hmmm...maybe my questions should include "What happened to 8bpp?"
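
The rgba figures that fbset printed above are length/offset pairs, in bits, for the red, green, blue and transparency fields of a pixel. A small Python sketch of how such a layout packs a colour into a pixel value follows; parse_rgba and pack_pixel are hypothetical helper names, not part of fbset.

```python
def parse_rgba(spec):
    """Turn an fbset-style 'len/off,len/off,len/off,len/off' string into a dict."""
    fields = []
    for part in spec.split(","):
        length, offset = part.split("/")
        fields.append((int(length), int(offset)))
    if len(fields) != 4:
        raise ValueError(f"expected 4 length/offset pairs, got {len(fields)}")
    return dict(zip(("red", "green", "blue", "transp"), fields))


def pack_pixel(layout, red, green, blue, transp=0):
    """Place each channel value into its bit field; extra high bits are masked off."""
    pixel = 0
    channels = (("red", red), ("green", green), ("blue", blue), ("transp", transp))
    for name, value in channels:
        length, offset = layout[name]
        mask = (1 << length) - 1
        pixel |= (value & mask) << offset
    return pixel


if __name__ == "__main__":
    rgb565 = parse_rgba("5/11,6/5,5/0,0/0")      # the 800x600-16 result
    print(hex(pack_pixel(rgb565, 31, 63, 31)))   # all channels at maximum
    argb1555 = parse_rgba("5/10,5/5,5/0,1/15")   # the 640x480-15 result
    print(hex(pack_pixel(argb1555, 31, 0, 0, 1)))
```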
Comment 3 mlsemon35 2013-01-19 02:29:52 UTC
Oops, I left out what the behavior was for kernel 3.6.11 and this particular monitor.  It goes like this:

POST
LILO
kernel messages in VGA
0.5-1.5s blank time
kernel messages in graphics mode

It really shouldn't go up to 10s of blank time.

Also noteworthy to me is that switching from framebuffer to X and back is as fast as always.
Comment 4 Chris Wilson 2013-01-19 11:59:01 UTC
Right, the pause is just the monitor adjusting, there does not seem to be any problem such as the driver spinning for 10s waiting for a state change, for example.

What I think would be interesting would be an intel_reg_dump (see the intel-gpu-tools package) of 3.6.11 vs 3.8, and also the drm.debug=6 dmesg from 3.6.11.
Comment 5 mlsemon35 2013-01-19 13:05:33 UTC
Created attachment 73283
boot to fbcon using Fedora 18 3.6 kernel
Comment 6 mlsemon35 2013-01-19 13:06:47 UTC
Created attachment 73284
boot to blank screen, no fbcon present, using 3.8
Comment 7 mlsemon35 2013-01-19 13:20:22 UTC
(In reply to comment #4)
> Right, the pause is just the monitor adjusting, there does not seem to be
> any problem such as the driver spinning for 10s waiting for a state change,
> for example.
> 
> What I think would be interesting would be an intel_reg_dump (see the
> intel-gpu-tools package) of 3.6.11 vs 3.8, and also the drm.debug=6 dmesg
> from 3.6.11.

I think my comments were deleted when I made the attachments.  Oops.  Short version: My glibc is compiled against 3.8 headers, so I resorted to a Fedora 18 disc to get 3.6.10 kernel results.  The resulting dmesg looks very accurate to me, including the host bridge window errors and shared interrupts, so the dmesg is attached.

For the 3.8 setup, I got the idea of booting to the video=1024x768M-8@75 screen that I wanted to have.  I took the dmesg by booting to a blank screen, starting X, and using an xterm to dump the message buffer.  There's a stack trace in there that looks interesting.  fbset states that /dev/fb0 does not exist.

I'll try to get intel-gpu-tools working from 3.6 somehow.  [They're already installed on my 3.8 system.]  That will be later, though.  I must get some sleep.  Thanks for the quick reply!
Comment 8 Daniel Vetter 2013-01-19 17:02:50 UTC
(In reply to comment #2)
> Hmmm...maybe my questions should include "What happened to 8bpp?"

Yeah, I think that's the interesting thing here - the 10s delay for the screen adjusting seems to be annoying, but I don't see anything where the driver blocks for that long. In my own test setups the screen also sometimes takes awfully long to adjust.

For the 8bpp support I've looked through git logs a bit, and it seems that we've always parsed this. So it's more likely that something in the 8bpp support broke (rarely tested unfortunately), and not that it never worked, but for some odd reason we've ignored that on 3.6 kernels. Git bisect of the 8bpp issue would be really interesting ...
Comment 9 mlsemon35 2013-01-20 07:09:46 UTC
(In reply to comment #8)
> (In reply to comment #2)
> > Hmmm...maybe my questions should include "What happened to 8bpp?"
> 
> Yeah, I think that's the interesting thing here - the 10s delay for the
> screen adjusting seems to be annoying, but I don't see anything where the
> driver blocks for that long. This sometimes also happens in my own
> test-setups that the screen seems to take awfully long to adjust.
> 
> For the 8bpp support I've looked through git logs a bit, and it seems that
> we've parsed this always. So it's more likely that something in the 8bpp
> support broke (rarely tested unfortunately), and not that it never worked,
> but for some odd reason we've ignored that on 3.6 kernels. Git bisect of the
> 8bpp issue would be really interesting ...

Yes, boot time is unchanged, so the driver isn't waiting on the monitor, but (rhetorically) what is it telling the monitor?

For 1024x768-16, my monitor has settled down like it has resigned to storing it as a preset.  This problem is variable with kernel version, so I'll have to start with 3.6.11 on a new partition and work my way up.  It's the best way to get the requested intel_reg_dump for Chris, anyway.

I read about git bisects only yesterday and don't know how to do it yet.

The 8bpp fbcon worked without problem for kernels 3.2-3.6 on 845G, and 3.7 needs to be rechecked.  8bpp works without problem for kernels 3.6-3.7 for 865G.  I use 8bpp because it's about 50% faster than 16bpp on the 845G...and it is baggage from using the old i810fb driver on i810.
Comment 10 Daniel Vetter 2013-01-20 11:42:37 UTC
(In reply to comment #9)
> I read about git bisects only yesterday and don't know how to do it yet.

This should get you started: http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
Comment 11 mlsemon35 2013-01-21 09:54:21 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > I read about git bisects only yesterday and don't know how to do it yet.
> 
> This should get you started:
> http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-
> buggy-kernel-patches/

Thanks, that was quite helpful.

I haven't tested the monitor stuff yet, but this is the bisect for the 8bpp part.  I tested it first because the test was easy:  If I booted to a blank screen, the bisect should be marked bad; else, it should be marked good.

bash-4.2# git bisect bad
57779d06367a915ee03e6cb918d7575f0a46e419 is the first bad commit
commit 57779d06367a915ee03e6cb918d7575f0a46e419
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Wed Oct 31 17:50:14 2012 +0200

    drm/i915: Fix display pixel format handling
    
    Fix support for all RGB/BGR pixel formats (except the 16:16:16:16 float
    format).
    
    Fix intel_init_framebuffer() to match hardware and driver limitations:
    * RGB332 is not supported at all
    * CI8 is supported
    * XRGB1555 & co. are supported on Gen3 and earlier
    * XRGB210101010 & co. are supported from Gen4 onwards
    * BGR formats are supported from Gen4 onwards
    * YUV formats are supported from Gen5 onwards (driver limitation)
    
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

:040000 040000 334e1536b3513d0c329a8bb6360593d12065b71d bf0996ec13cbee07156c5e9f98dcdee30200e658 M      drivers
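
The good/bad rule described above (blank screen means bad, anything else means good) is all a bisect needs. As a rough illustration, here is a Python sketch of the underlying binary search over a strictly linear history. Real git bisect also copes with merges and branching history, which this sketch does not; the commit names and the is_bad test are made up.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit in a list ordered oldest to newest.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    Returns (commit, number_of_commits_tested).
    """
    lo, hi = 0, len(commits) - 1   # lo: newest known good, hi: oldest known bad
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if is_bad(commits[mid]):   # e.g. "booted to a blank screen"
            hi = mid
        else:
            lo = mid
    return commits[hi], steps


if __name__ == "__main__":
    history = [f"c{i}" for i in range(100)]        # made-up commit names
    breaks_8bpp = lambda c: int(c[1:]) >= 57       # made-up regression point
    commit, steps = first_bad(history, breaks_8bpp)
    print(f"first bad commit: {commit}, found after {steps} test boots")
```

Because each test boot halves the remaining range, the number of boots grows only with the logarithm of the number of commits between the good and bad kernels.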
Comment 12 Daniel Vetter 2013-01-21 10:09:43 UTC
Thanks a lot for doing the bisect. On a quick look I don't immediately see how that broke things. Can you please boot both with a working and broken kernel in the 8bpp mode and grab the output of intel_reg_dumper from http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/ (if your distro has it you can use that, too - no relevant changes for i845g dump support in ages).

I've also adjusted the bug summary to reflect the actual bug/regression we're hunting here (since I think the 6-10s delay is something we can't do much about).
Comment 13 mlsemon35 2013-01-21 10:47:34 UTC
Created attachment 73371
regdump from good 8bpp fbcon, indirect, needed intel_reg_snapshot
Comment 14 mlsemon35 2013-01-21 10:48:33 UTC
Created attachment 73372
regdump from blind screen, indirect, needed intel_reg_snapshot
Comment 15 mlsemon35 2013-01-21 11:15:27 UTC
(In reply to comment #12)
> Thanks a lot for doing the bisect. On a quick look I don't immediately see
> how that broke things. Can you please boot both with a working and broken
> kernel in the 8bpp mode and grab the output of intel_reg_dumper from
> http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/ (if your distro has it
> you can use that, too - no relevant changes for i845g dump support in ages).
> 
> I've also adjusted the bug summary to reflect the actual bug/regression
> we're hunting here (since I think the 6-10s delay is something we can't do
> much about).

The attachments are from the last good and bad kernels booted before the end of the bisect.  I had to use intel_reg_snapshot first because calling intel_reg_dumper with no arguments gave the message "Gen2/3 Ranges are not supported. Please use unsafe mode.Aborting".
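
To compare two such dumps, something like the following Python sketch could list the registers whose values differ. It assumes each dump line looks like "NAME: 0xVALUE (description)"; that layout is an assumption about the attachments, and the register values in the example are invented, not taken from them.

```python
import re

# Assumed line layout: optional indent, register name, colon, hex value.
_REG_LINE = re.compile(r"^\s*(?P<name>[A-Za-z0-9_]+):\s*(?P<value>0x[0-9a-fA-F]+)")


def parse_dump(text):
    """Map register name -> integer value for every line that matches."""
    regs = {}
    for line in text.splitlines():
        match = _REG_LINE.match(line)
        if match:
            regs[match.group("name")] = int(match.group("value"), 16)
    return regs


def diff_dumps(good_text, bad_text):
    """Return {name: (good_value, bad_value)} for registers that differ.

    A register missing from one dump shows up with None on that side.
    """
    good, bad = parse_dump(good_text), parse_dump(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name)
    }


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    good = "  DSPACNTR: 0x88000000 (example)\n  PIPEACONF: 0x80000000 (example)\n"
    bad = "  DSPACNTR: 0x00000000 (example)\n  PIPEACONF: 0x80000000 (example)\n"
    for name, (g, b) in diff_dumps(good, bad).items():
        print(f"{name}: good={g:#010x} bad={b:#010x}")
```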

There are some different checks in there having to do with 8-bit formats, and the new code might check for slightly different things than the old code.  However, it should be noted that 8bpp broke here for the 845G, and the 8bpp for the 865G is still going well as of kernel 3.7.3, and to my limited knowledge, it has not faltered yet.

If I explore the monitor issue again, I'll open another bug and start with the bisect results first.  Using the `git bisect visualize` command for the first time, it looked like a lot of drm/i915 commits were made in the same time period, and any of those could have caused the monitor quirkiness.

Good luck!
Comment 16 Ville Syrjala 2013-01-31 14:20:47 UTC
Created attachment 73994
Patch #1
Comment 17 Ville Syrjala 2013-01-31 14:21:57 UTC
Created attachment 73995
Patch #2

Try these two patches. I think they should help.

Here's a git tree with both patches if that's easier:
https://gitorious.org/vsyrjala/linux/commits/use_c8_format
Comment 18 mlsemon35 2013-01-31 17:30:45 UTC
Comment on attachment 73995 (Patch #2):

This works.  8bpp is back, and that extra blast of speed is back as well.  Thanks!
Comment 19 Daniel Vetter 2013-02-13 23:44:47 UTC
Fix finally landed somewhere:

commit 2a9280a1f3e1c574dc89a6f870dc363f17cc5a40
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Thu Jan 31 19:43:38 2013 +0200

    drm: Use C8 instead of RGB332 when determining the format from depth/bpp

Should land in 3.9 and then get backported.
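
As a rough illustration of what the commit title describes, here is a Python sketch of a depth/bpp to pixel-format lookup in which the 8bpp case yields C8 rather than RGB332. It is modelled on the kernel's legacy format helper as I understand it, but it is not the kernel code: the function name legacy_fb_format and the use_c8 switch are hypothetical, and the non-8bpp cases are included from memory for context only. Read next to the bisected commit's message (RGB332 not supported, CI8 supported), it suggests why an 8bpp fbcon request stopped working until this change.

```python
def fourcc(a, b, c, d):
    """Pack four characters into a little-endian 32-bit format code."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


DRM_FORMAT_C8 = fourcc("C", "8", " ", " ")           # 8-bit palette index
DRM_FORMAT_RGB332 = fourcc("R", "G", "B", "8")       # 3:3:2 direct colour
DRM_FORMAT_XRGB1555 = fourcc("X", "R", "1", "5")
DRM_FORMAT_RGB565 = fourcc("R", "G", "1", "6")
DRM_FORMAT_RGB888 = fourcc("R", "G", "2", "4")
DRM_FORMAT_XRGB8888 = fourcc("X", "R", "2", "4")
DRM_FORMAT_XRGB2101010 = fourcc("X", "R", "3", "0")
DRM_FORMAT_ARGB8888 = fourcc("A", "R", "2", "4")


def legacy_fb_format(bpp, depth, use_c8=True):
    """Pick a format code from bits-per-pixel and colour depth.

    use_c8=False mimics the behaviour before the fix (8bpp -> RGB332);
    use_c8=True mimics the behaviour after it (8bpp -> C8).
    """
    if bpp == 8:
        return DRM_FORMAT_C8 if use_c8 else DRM_FORMAT_RGB332
    if bpp == 16:
        return DRM_FORMAT_XRGB1555 if depth == 15 else DRM_FORMAT_RGB565
    if bpp == 24:
        return DRM_FORMAT_RGB888
    if bpp == 32:
        if depth == 24:
            return DRM_FORMAT_XRGB8888
        if depth == 30:
            return DRM_FORMAT_XRGB2101010
        return DRM_FORMAT_ARGB8888
    # The sketch simply rejects anything else.
    raise ValueError(f"unsupported bpp: {bpp}")


if __name__ == "__main__":
    before = legacy_fb_format(8, 8, use_c8=False)
    after = legacy_fb_format(8, 8)
    print(f"8bpp before the fix: {before:#010x}, after: {after:#010x}")
```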
Comment 20 mlsemon35 2013-02-14 22:15:01 UTC
(In reply to comment #19)
> Fix finally landed somewhere:
> 
> commit 2a9280a1f3e1c574dc89a6f870dc363f17cc5a40
> Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Date:   Thu Jan 31 19:43:38 2013 +0200
> 
>     drm: Use C8 instead of RGB332 when determining the format from depth/bpp
> 
> Should land in 3.9 and then get backported.

Very good!  In the meantime, I'll keep these patches for patching 3.8 kernels.
Comment 21 Florian Mickler 2013-03-04 22:58:15 UTC
A patch referencing this bug report has been merged in Linux v3.9-rc1:

commit d84f031bd230fdf9c3b7734940c859bf28b90219
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Thu Jan 31 19:43:38 2013 +0200

    drm: Use C8 instead of RGB332 when determining the format from depth/bpp

