81140 – [BYT/HSW/BDW/BSW]system sporadically boot fail because efifb

Status:	CLOSED FIXED

Alias:	None

Product:	DRI
Classification:	Unclassified
Component:	DRM/Intel
Version:	XOrg git
Hardware:	All Linux (All)

Importance:	highest blocker
Assignee:	Intel GFX Bugs mailing list
QA Contact:	Intel GFX Bugs mailing list

Reported:	2014-07-10 02:42 UTC by lu hua
Modified:	2017-10-06 14:37 UTC
CC List:	4 users

Attachments
boot log (35.04 KB, text/plain) 2014-07-10 02:42 UTC, lu hua
Trace fb comings and goings (2.19 KB, patch) 2014-07-12 12:46 UTC, Chris Wilson
normal config (103.13 KB, text/plain) 2014-07-16 05:52 UTC, liulei
debug config (107.91 KB, text/plain) 2014-07-16 05:52 UTC, liulei
diff_configs (10.22 KB, text/plain) 2014-07-16 06:22 UTC, liulei
diff-u_configs (23.39 KB, text/plain) 2014-07-16 07:39 UTC, liulei
normal.config for attempt to reproduce 1bb9e63 (118.21 KB, text/plain) 2014-07-22 00:56 UTC, Ben Widawsky
insert delay (849 bytes, patch) 2014-07-23 10:56 UTC, Daniel Vetter
Avoid fbcon memory corruption (5.22 KB, patch) 2014-08-05 14:23 UTC, Chris Wilson
boot log(patch) (33.93 KB, text/plain) 2014-08-06 02:51 UTC, lu hua

Description lu hua 2014-07-10 02:42:22 UTC

Created attachment 102512 [details]
boot log

System Environment:
--------------------------
Platform: Broadwell
Kernel: drm-intel-nightly/ed4d04defe2c6962efe8f4ba3587a8e69e06d2dd

Bug detailed description:
------------------------- 
On a clean boot, the system fails to boot in 3 out of 5 cycles. This happens on Broadwell with the -nightly and -fixes kernels. On the -queued kernel, all 5 cycles boot successfully.


Reproduce steps:
-------------------------
1. Clean boot the system.

Comment 1 Chris Wilson 2014-07-10 06:44:54 UTC

Regression in the -fixes tree? Please bisect.

Comment 2 lu hua 2014-07-11 01:20:13 UTC

It also happens on BYT.

Comment 3 liulei 2014-07-11 02:08:23 UTC

This issue exists on HSW, too.

Comment 4 Chris Wilson 2014-07-11 06:37:34 UTC

Also what exactly is the failure mode? The boot log freezes at the second console takeover, so check you have:

commit 1bb9e632a0aeee1121e652ee4dc80e5e6f14bcd2
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jul 8 10:02:43 2014 +0200

    drm/i915: Only unbind vgacon, not other console drivers

Comment 5 liulei 2014-07-11 07:35:01 UTC

(In reply to comment #4)
> Also what exactly is the failure mode? The boot log freezes at the second
> console takeover, so check you have:
> 
> commit 1bb9e632a0aeee1121e652ee4dc80e5e6f14bcd2
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date:   Tue Jul 8 10:02:43 2014 +0200
> 
>     drm/i915: Only unbind vgacon, not other console drivers
We have this commit.

Comment 6 liulei 2014-07-11 07:42:47 UTC

==Bisect results==
----------------------------
Bisect shows: 01527b3127997ef6370d5ad4fa25d96847fbf12a is the first bad commit
01527b3127997ef6370d5ad4fa25d96847fbf12a is the first bad commit
commit 01527b3127997ef6370d5ad4fa25d96847fbf12a
Author: Clint Taylor <clinton.a.taylor@intel.com>
Date:   Mon Jul 7 13:01:46 2014 -0700

    drm/i915/vlv: T12 eDP panel timing enforcement during reboot

    The panel power sequencer on vlv doesn't appear to accept changes to its
    T12 power down duration during warm reboots. This change forces a delay
    for warm reboots to the T12 panel timing as defined in the VBT table for
    the connected panel.

    Ver2: removed redundant pr_crit(), commented magic value for pp_div_reg

    Ver3: moved SYS_RESTART check earlier, new name for pp_div.

    Ver4: Minor issue changes

    Ver5: Move registration of reboot notifier to edp_connector_init,
          Added warning comment to handler about lack of PM notification.

Comment 7 liulei 2014-07-11 07:54:20 UTC

Adding details of this commit:
commit 01527b3127997ef6370d5ad4fa25d96847fbf12a
Author:     Clint Taylor <clinton.a.taylor@intel.com>
AuthorDate: Mon Jul 7 13:01:46 2014 -0700
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Wed Jul 9 09:52:14 2014 +0200

Comment 8 Gavin Hindman 2014-07-11 17:36:08 UTC

This patch is VLV specific - why would it be causing boot failures on other platforms?

Comment 9 Jesse Barnes 2014-07-11 19:09:03 UTC

Comment on attachment 102512 [details]
boot log

Yeah I'm not seeing how this patch would affect non-BDW either.  All the structures it checks in the pre VLV check look like they ought to be allocated and present.  And the log doesn't have a crash in it...

So either this is timing related or this is the wrong bisect result.

Comment 10 liulei 2014-07-12 05:46:54 UTC

(In reply to comment #9)
> Comment on attachment 102512 [details]
> boot log
> 
> Yeah I'm not seeing how this patch would affect non-BDW either.  All the
> structures it checks in the pre VLV check look like they ought to be
> allocated and present.  And the log doesn't have a crash in it...
> 
> So either this is timing related or this is the wrong bisect result.
As the bug title says, this issue is not 100% reproducible. For each commit I boot 5 times. If the machine boots successfully every time in that round, I consider the tested commit good. Eventually the bisect showed this commit as the first bad commit. I also tested its parent commit five times and did not get a crash. Since you both think this bisect is wrong, I will run more boots per commit to get a more credible result, and I will reply after my second bisect.

Comment 11 liulei 2014-07-12 09:22:31 UTC

Here is the first bad commit. I must have been very lucky yesterday, because I tested this commit then and it worked well!
commit 1bb9e632a0aeee1121e652ee4dc80e5e6f14bcd2
Author:     Daniel Vetter <daniel.vetter@ffwll.ch>
AuthorDate: Tue Jul 8 10:02:43 2014 +0200
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Wed Jul 9 09:52:13 2014 +0200

    drm/i915: Only unbind vgacon, not other console drivers

    The console subsystem only provides a function to switch to a given
    console, but we want to actually only switch away from vgacon.
    Unconditionally switching to the dummy console resulted in switching
    away from fbcon in multi-gpu setups when other gpu drivers are loaded
    before i915.

    Then either the reinitialization of fbcon when i915 registers its
    fbdev emulation or the teardown of the fbcon driver killed the
    machine. So only switch to the dummy console when it's required.

    Kudos to Chris for the original idea, I've only refined it a bit to
    still unregister vgacon even when it's currently unused.

    This regression has been introduced in

Comment 12 liulei 2014-07-12 10:55:10 UTC

My simple debugging results are below.
With the bad commit, execution takes the wrong path in the following file
(-fixes branch: 92ae62076957c5904509f755eea0075ad60f74c6):
drivers/video/fbdev/core/fbmem.c line:1696

The function "static int do_unregister_framebuffer(struct fb_info *fb_info)"
returns -EINVAL because of this line of code:
if (i < 0 || i >= FB_MAX || registered_fb[i] != fb_info)
More precisely, it is the "registered_fb[i] != fb_info" condition that causes the function to return -EINVAL.

Comment 13 liulei 2014-07-12 10:55:46 UTC

My simple debugging results are below; maybe this will help.
With the bad commit, execution takes the wrong path in the following file
(-fixes branch: 92ae62076957c5904509f755eea0075ad60f74c6):
drivers/video/fbdev/core/fbmem.c line:1696

The function "static int do_unregister_framebuffer(struct fb_info *fb_info)"
returns -EINVAL because of this line of code:
if (i < 0 || i >= FB_MAX || registered_fb[i] != fb_info)
More precisely, it is the "registered_fb[i] != fb_info" condition that causes the function to return -EINVAL.

Comment 14 liulei 2014-07-12 12:11:06 UTC

Excuse my crude debugging method: I added two log messages.
printk(KERN_WARNING "fb_info %p registered_fb  %p",fb_info,registered_fb[i]);
if (i < 0 || i >= FB_MAX || ( registered_fb[i] != fb_info ))
                return -EINVAL;
printk(KERN_WARNING "Pass this line?" );

On HSW, log shows 
[   2.082948]fb_info ffff8801454fc800 registered_fb ffff8801454fc800 (this is last line)

On BDW, log shows 
[   27.535818] fb: switching to inteldrmfb from EFI VGA
[   27.535821] fb_info ffff880149e06000 registered_fb  ffff880149e06000
[   27.673416] Pass this line?
[   27.673418] Console: switching to colour dummy device 80x25

My last comment was based on debugging on the HSW machine, so it may not be accurate. I would be very grateful if someone could take a minute to explain why this happens.
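
For readers following the check quoted in this comment, here is a minimal Python model (not kernel code) of the guard at the top of do_unregister_framebuffer(). The FbInfo class, the FB_MAX = 32 value, and the integer return codes are assumptions made for illustration; only the three-part condition is taken from the comment above.

```python
# Simplified model of the guard in do_unregister_framebuffer().
# FbInfo, FB_MAX = 32 and the integer return codes are assumptions made for
# illustration; the real code lives in drivers/video/fbdev/core/fbmem.c.
EINVAL = 22
FB_MAX = 32  # assumed table size


class FbInfo:
    """Stand-in for struct fb_info; only the slot index ("node") is modelled."""

    def __init__(self, node):
        self.node = node


def do_unregister_framebuffer(registered_fb, fb_info):
    """Return -EINVAL if fb_info is not the object stored in its own slot."""
    i = fb_info.node
    if i < 0 or i >= FB_MAX or registered_fb[i] is not fb_info:
        return -EINVAL
    registered_fb[i] = None  # slot released; real teardown omitted
    return 0


registered_fb = [None] * FB_MAX
efifb = FbInfo(node=0)
registered_fb[0] = efifb

stale = FbInfo(node=0)  # same slot index, different object
print(do_unregister_framebuffer(registered_fb, stale))   # mismatch -> -22
print(do_unregister_framebuffer(registered_fb, efifb))   # match -> 0
print(do_unregister_framebuffer(registered_fb, efifb))   # already gone -> -22
```

In both logs in this comment the two printed pointers are equal, which in this model corresponds to the matching case rather than the -EINVAL path.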

Comment 15 Chris Wilson 2014-07-12 12:46:03 UTC

Created attachment 102667 [details] [review]
Trace fb comings and goings

Try this patch to see what tale it gives for the efifb.

Comment 16 liulei 2014-07-13 07:36:31 UTC

(In reply to comment #15)
> Created attachment 102667 [details] [review] [review]
> Trace fb comings and goings
> 
> Try this patch to see what tale it gives for the efifb.
The output is no different. If I redirect the output to the serial port, it shows the following:

[   26.895983] ACPI: Power Button [PWRF]
[   26.898922] [drm] Memory usable by graphics device = 4096M
[   26.898923] [drm] Replacing VGA console driver
[   27.449211] fb: switching to inteldrmfb from EFI VGA
[   27.509560] Unregistering EFI VGA framebuffer
[   27.509577] Console: switching to colour dummy device 80x25

Comment 17 Daniel Vetter 2014-07-14 09:01:36 UTC

Please retest with latest drm-intel-fixes, that has a fix for vgacon unbinding.

Note that the fbcon setup is done with the console_lock held, so if the machine dies in there no printk will reach netconsole or anything else really. You can try to debug with CONFIG_DRM_I915_FBDEV=n to avoid some of the fun, but this will likely not help in this case. Still worth a shot.

Comment 18 liulei 2014-07-15 03:34:50 UTC

(In reply to comment #17)
> Please retest with latest drm-intel-fixes, that has a fix for vgacon
> unbinding.

It still fails to boot.

> Note that the fbcon setup is done with the console_lock held, so if the
> machine dies in there no printk will reach netconsole or anything else
> really. You can try to debug with CONFIG_DRM_I915_FBDEV=n to avoid some of
> the fun, but this will likely not help in this case. Still worth a shot.

Comment 19 Daniel Vetter 2014-07-15 07:08:03 UTC

Sorry I've confused myself. Are you sure about the bisect result in comment #11 ?

That patch changes the code back to how it was on older kernels in some situations (which fixed a regression). If that helps it means older kernels (e.g. 3.15) should also have failed to boot.

Comment 20 liulei 2014-07-15 08:10:00 UTC

(In reply to comment #19)
> Sorry I've confused myself. Are you sure about the bisect result in comment
> #11 ?
> 
I tested its parent commit (f1e1c2129b79cfdaf07bca37c5a10569fe021abe) 10 times, and all boots succeeded. Starting from this commit (1bb9e632a0aeee1121e652ee4dc80e5e6f14bcd2), the failure rate becomes very high.

Comment 21 liulei 2014-07-16 05:51:40 UTC

If I build the affected kernel with the "debug config", the machine boots successfully. I am attaching both the debug config and the normal config.

Comment 22 liulei 2014-07-16 05:52:07 UTC

Created attachment 102893 [details]
normal config

Comment 23 liulei 2014-07-16 05:52:32 UTC

Created attachment 102894 [details]
debug config

Comment 24 liulei 2014-07-16 06:04:43 UTC

I noticed some differences between the debug kernel dmesg and the normal kernel dmesg.
dmesg when booting the debug kernel (boots successfully):
[    5.295520] [drm] Memory usable by graphics device = 2048M
[    5.299528] [drm:i915_gem_gtt_init] GMADR size = 256M
[    5.299531] [drm:i915_gem_gtt_init] GTT stolen size = 32M
[    5.299533] [drm:i915_gem_gtt_init] ppgtt mode: 1
[    5.299535] [drm] Replacing VGA console driver
[    5.314627] checking generic (b0000000 7e9000) vs hw (b0000000 10000000)
[    5.314629] fb: switching to inteldrmfb from EFI VGA
[    5.317939] Console: switching to colour dummy device 80x25
[    5.318122] usb 1-5: new high-speed USB device number 3 using xhci_hcd

dmesg when booting the normal kernel (boot failure):
[   26.898922] [drm] Memory usable by graphics device = 2048M
[   26.898923] [drm] Replacing VGA console driver
[   27.449211] fb: switching to inteldrmfb from EFI VGA

Comment 25 liulei 2014-07-16 06:22:03 UTC

Created attachment 102897 [details]
diff_configs

Comment 26 Gordon Jin 2014-07-16 07:08:16 UTC

Can any developer reproduce this?

Comment 27 Daniel Vetter 2014-07-16 07:22:27 UTC

Note: This seems to be only reproducible with the normal config.

Lu, please don't forget to add such crucial information from internal threads here.

(In reply to comment #25)
> Created attachment 102897 [details]
> diff_configs

Please attach unified diff (i.e. diff -u) since I can't read traditional diff output ;-)

A bunch of things to test:
- Please test 4c2e0990ade3251c9b5770aa8f06b06375b66f9f with the normal config extensively and make sure it really works. This is the parent of the commit which the offending patch tried to fix, so should have the same behaviour really.
- Please test a4de05268e674e8ed31df6348269e22d6c6a1803 with the normal config extensively and make sure it really works. This is the patch which introduced the regression the offending patch claims to fix.
- Please take the normal config and change config options step-by-step (yeah, this will take time so lower priority) until the kernel is stable. Then report which config option makes the kernel stable. Since it usually just takes a few reboots to hit the problem on broken kernels it's better to start with the broken .config: That way you don't have to boot 10+ times to make sure it really works, but can proceed to the next config option after the first hang.
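
The step-by-step approach in the last bullet starts from the set of options that differ between the two configs. A minimal sketch of how that set could be computed is below. The parse_config and config_diff helpers are hypothetical names, and the parser only assumes the usual "CONFIG_FOO=value" and "# CONFIG_FOO is not set" line forms of a kernel .config. The sample fragments are made up and are not the attached configs.

```python
import re

# Matches "CONFIG_FOO=y" style assignments and "# CONFIG_FOO is not set".
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def config_diff(broken, stable):
    """Return {option: (broken_value, stable_value)} for differing options.

    An option missing from one file is treated as 'n'.
    """
    diff = {}
    for name in sorted(set(broken) | set(stable)):
        a, b = broken.get(name, "n"), stable.get(name, "n")
        if a != b:
            diff[name] = (a, b)
    return diff


# Tiny made-up fragments, not the attached configs.
normal = "CONFIG_FB_EFI=y\n# CONFIG_PROVE_LOCKING is not set\nCONFIG_FB=y\n"
debug = "CONFIG_FB_EFI=y\nCONFIG_PROVE_LOCKING=y\nCONFIG_FB=y\n"

for name, (a, b) in config_diff(parse_config(normal), parse_config(debug)).items():
    print(f"{name}: {a} -> {b}")  # CONFIG_PROVE_LOCKING: n -> y
```

Each differing option can then be flipped in the broken config one at a time, rebooting until the first hang before moving on to the next option.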

Comment 28 liulei 2014-07-16 07:39:16 UTC

> (In reply to comment #25)
> > Created attachment 102897 [details]
> > diff_configs
> 
> Please attach unified diff (i.e. diff -u) since I can't read traditional
> diff output ;-)
> 
diff-u_configs.log is attached. :)

Comment 29 liulei 2014-07-16 07:39:59 UTC

Created attachment 102900 [details]
diff-u_configs

Comment 30 liulei 2014-07-16 10:14:42 UTC

(In reply to comment #27)

> A bunch of things to test:
> - Please test 4c2e0990ade3251c9b5770aa8f06b06375b66f9f with the normal
> config extensively and make sure it really works. This is the parent of the
> commit which the offending patch tried to fix, so should have the same
> behaviour really.

On the -fixes branch (4c2e0990ade3251c9b5770aa8f06b06375b66f9f) I rebooted the machine at least 10 times and did not hit the issue.
> - Please test a4de05268e674e8ed31df6348269e22d6c6a1803 with the normal
> config extensively and make sure it really works. This is the patch
> which introduced the regression the offending patch claims to fix.
On the -fixes branch (a4de05268e674e8ed31df6348269e22d6c6a1803) I rebooted the machine at least 10 times and did not hit the issue.

Comment 31 wendy.wang 2014-07-18 08:14:50 UTC

More information about the failure rate of this issue:
The four platforms BYT/HSW/BDW/BSW have similar failure rates:
1. With the debug mode config, the failure rate is about 2%
2. With the non-debug mode config, the failure rate is about 50%

Comment 32 Daniel Vetter 2014-07-18 09:13:35 UTC

(In reply to comment #31)
> Update more info about this issue fail rate:
> BYT/HSW/BDW/BSW four platforms have the similar fail rate as below:
> 1. With debug mode config, the fail rate is about 2%
> 2. With non-debug mode config, the fail rate is about 50%

So can we please recheck the bisect? Afaik we've only done about 10 boot tests thus far each time, and it looks like a 2% failure rate could have crept through.
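
To put rough numbers on that concern: if each boot fails independently with probability p, the chance that n boots in a row all pass is (1 - p)^n. The sketch below evaluates this for the two failure rates from comment #31. The independence of boots and the helper name prob_all_pass are assumptions.

```python
def prob_all_pass(fail_rate, boots):
    """Probability that `boots` independent boots all succeed."""
    if not 0.0 <= fail_rate <= 1.0:
        raise ValueError("fail_rate must be in [0, 1]")
    if boots < 0:
        raise ValueError("boots must be non-negative")
    return (1.0 - fail_rate) ** boots


for label, rate in (("debug config, ~2%", 0.02), ("normal config, ~50%", 0.50)):
    for boots in (5, 10):
        p = prob_all_pass(rate, boots)
        print(f"{label}: {boots} clean boots with probability {p:.3f}")
```

Under these assumptions a 2% failure rate goes undetected in 10 boots about 82% of the time, while a 50% failure rate goes undetected in 5 boots about 3% of the time.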

Comment 33 Gordon Jin 2014-07-21 05:34:59 UTC

(In reply to comment #32)
> (In reply to comment #31)
> > Update more info about this issue fail rate:
> > BYT/HSW/BDW/BSW four platforms have the similar fail rate as below:
> > 1. With debug mode config, the fail rate is about 2%
> > 2. With non-debug mode config, the fail rate is about 50%
> 
> So can we please recheck the bisect? Afaik we've only done about 10 boot
> tests thus far each time, and it looks like a 2% failure rate could have
> crept through.

I suggest focusing on non-debug mode, in which case 10 boots should be enough.
2% is too hard to bisect.
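
As a rough check of these two statements, one can estimate how many clean boots are needed before a commit can be called good with a given confidence. This sketch assumes independent boots; boots_needed and the chosen confidence levels are illustrative and were not agreed in this thread.

```python
import math


def boots_needed(fail_rate, miss_prob):
    """Smallest n with (1 - fail_rate)**n <= miss_prob.

    miss_prob is the accepted chance of seeing only clean boots on a
    kernel that really fails at `fail_rate`.
    """
    if not 0.0 < fail_rate < 1.0:
        raise ValueError("fail_rate must be strictly between 0 and 1")
    if not 0.0 < miss_prob < 1.0:
        raise ValueError("miss_prob must be strictly between 0 and 1")
    return math.ceil(math.log(miss_prob) / math.log(1.0 - fail_rate))


print(boots_needed(0.50, 0.001))  # 10
print(boots_needed(0.02, 0.05))   # 149
```

So 10 boots per bisect step is a reasonable budget at a failure rate of about 50%, whereas a failure rate of about 2% would need on the order of 150 boots per step for 95% confidence.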

Comment 34 Ben Widawsky 2014-07-21 21:29:55 UTC

I am confused.

You've provided configs and diffs which do not just work on any of the mentioned SHAs in this thread that I can find.

There has been a lot of information. I would like the following table filled out, with the config files (not a config + a diff) attached.

Fill in the table:
        SHA        |       config             | failures per 10 boots
--------------------------------------------------------------------
<SHA of failing>   | <config name of failing> | <pass rate of failing>
<SHA of good>      | <config name of good>    | <pass rate of good>
<SHA of any others>| <config names>           | <pass rate>

Comment 35 Ben Widawsky 2014-07-21 21:35:11 UTC

And please list the BIOS versions and platforms you are using in that table.

Comment 36 Ben Widawsky 2014-07-22 00:56:34 UTC

Created attachment 103243 [details]
normal.config for attempt to reproduce 1bb9e63

The attached config file was originally based off of https://bugs.freedesktop.org/attachment.cgi?id=102893. That config with the diff seemed incomplete.

1bb9e63 | normal.config | 10/10 pass

Comment 37 Ben Widawsky 2014-07-22 00:57:15 UTC

(In reply to comment #36)
> Created attachment 103243 [details]
> normal.config for attempt to reproduce 1bb9e63
> 
> The attached config file was originally based off of
> https://bugs.freedesktop.org/attachment.cgi?id=102893. That config with the
> diff seemed incomplete.
> 
> 1bb9e63 | normal.config | 10/10 pass

Bios version: 77
Platform: WTM2

Comment 38 liulei 2014-07-22 05:50:52 UTC

-fixes
f1e1c212 |  normal.config | 79/79 pass

-nightly
411fa8b2 |  debug.config  | 10/10 pass
411fa8b2 |  normal.config | 3 /10 pass
411fa8b2 |  distri.config | 10/10 pass

BIOS : HSWLPTU1.86C.0135.R01.1311020052
Platform: SDS
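
A clean run such as 79/79 still leaves room for a small failure rate. As a rough sketch (assuming independent boots; max_fail_rate is a hypothetical helper), the largest per-boot failure rate that would still produce n clean boots with probability alpha is 1 - alpha^(1/n):

```python
def max_fail_rate(clean_boots, alpha=0.05):
    """Upper bound on the per-boot failure rate after `clean_boots` passes.

    Solves (1 - p)**clean_boots = alpha for p, i.e. the failure rate at
    which an all-pass run would still occur with probability alpha.
    """
    if clean_boots <= 0:
        raise ValueError("clean_boots must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return 1.0 - alpha ** (1.0 / clean_boots)


for n in (10, 79):
    print(f"{n}/{n} pass: failure rate below {max_fail_rate(n):.1%} (95% bound)")
```

Under this model the 79/79 result rules out anything like the 7-in-10 failures seen with normal.config on 411fa8b2, but it does not rule out a failure rate of a few percent.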

Comment 39 liulei 2014-07-22 15:20:38 UTC

> - Please take the normal config and change config options step-by-step
> (yeah, this will take time so lower priority) until the kernel is stable.
> Then report which config option makes the kernel stable. Since it usually
> just takes a few reboots to hit the problem on broken kernels it's better to
> start with the broken .config: That way you don't have to boot 10+ times to
> make sure it really works, but can proceed to the next config option after
> the first hang.

Hi, according to my preliminary config bisect, the two options below decrease the boot failure rate when added to "normal.config":
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_KERNEL=y

Comment 40 liulei 2014-07-23 02:12:27 UTC

I tested on one machine:
-fixes
f1e1c212 |  normal.config | 21/21  pass
1bb9e632 |  normal.config | 13/14 pass

-nightly
411fa8b2 |  normal.config |  18/19 pass

BIOS:      V80.R01
platform : WTM

On another machine:
1bb9e632 |  normal.config | 2/5 pass

BIOS:      V83.R00
platform:  WTM

Comment 41 liulei 2014-07-23 03:34:45 UTC

(In reply to comment #40)
> I tested on one machine:
> -fixes
> f1e1c212 |  normal.config | 21/21  pass
> 1bb9e632 |  normal.config | 13/14 pass
> 
> -nightly
> 411fa8b2 |  normal.config |  18/19 pass
> 
> BIOS:      V80.R01
> platform : WTM
> 
> On another machine:
> 1bb9e632 |  normal.config | 2/5 pass
> 
> BIOS:      V83.R00
> platform:  WTM

Update machine info.
I tested on one BDW machine:
-fixes
f1e1c212 |  normal.config | 21/21  pass
1bb9e632 |  normal.config | 13/14 pass

-nightly
411fa8b2 |  normal.config |  18/19 pass

BIOS:      V80.R01
platform : WTM2

On another BDW machine:
1bb9e632 |  normal.config | 2/5 pass

BIOS:      V83.R00
platform:  WTM2

Comment 42 liulei 2014-07-23 09:39:03 UTC

(In reply to comment #41) 
> Update machine info.
> I tested on one BDW machine:
> -fixes
> f1e1c212 |  normal.config | 21/21  pass
> 1bb9e632 |  normal.config | 13/14 pass
> 
> -nightly
> 411fa8b2 |  normal.config |  18/19 pass
> 
> BIOS:      V80.R01
> platform : WTM2
> 
> On another BDW machine:
> 1bb9e632 |  normal.config | 2/5 pass
> 
> BIOS:      V83.R00
> platform:  WTM2
This time I have confirmed that all the machine info is correct. I am very sorry for my mistake.
-fixes
 f1e1c212 |  normal.config | 21/21  pass
 1bb9e632 |  normal.config | 13/14 pass

-nightly
 411fa8b2 |  normal.config |  18/19 pass
 BIOS:      V80
 platform : WTM1

 On another BDW machine:
 1bb9e632 |  normal.config | 2/5 pass
 BIOS:      V82.R00
 platform:  STP

 Third BDW machine:
 1bb9e632 |  normal.config | 19/20 pass
 BIOS:      V82.R00
 platform:  STP

Comment 43 Daniel Vetter 2014-07-23 10:56:07 UTC

Created attachment 103329 [details] [review]
insert delay

Please test this quick hack.

Comment 44 Daniel Vetter 2014-07-23 13:19:25 UTC

Also please test an otherwise broken kernel config with CONFIG_FB_EFI=n.

Comment 45 liulei 2014-07-23 15:24:24 UTC

(In reply to comment #43)
> Created attachment 103329 [details] [review] [review]
> insert delay
> 
> Please test this quick hack.
This patch didn't work. When it crashed, I didn't see any of the log messages you added in the patch, or I missed them because the screen output scrolled by too quickly. If needed, I can attach the output later.

Comment 46 liulei 2014-07-24 02:04:19 UTC

(In reply to comment #44)
> Also please test an otherwise broken kernel config with CONFIG_FB_EFI=n.
This amazing config works. I booted about 10 times without a failure using CONFIG_FB_EFI=n. After changing from CONFIG_FB_EFI=n back to CONFIG_FB_EFI=y, I booted 3 times and all 3 failed.

Comment 47 Daniel Vetter 2014-07-24 08:43:10 UTC

(In reply to comment #46)
> (In reply to comment #44)
> > Also please test an otherwise broken kernel config with CONFIG_FB_EFI=n.
> This amazing config works. I tested about 10 times without boot failure when
> using CONFIG_FB_EFI=n. I tested 3 times all failed, after I change to
> CONFIG_FB_EFI=y from CONFIG_FB_EFI=n.

Can you please make _really_ sure that this helps on all machines? This is a tricky bug so I don't want to jump to conclusions.

Comment 48 liulei 2014-07-25 07:27:30 UTC

(In reply to comment #47)
> (In reply to comment #46)
> > (In reply to comment #44)
> > > Also please test an otherwise broken kernel config with CONFIG_FB_EFI=n.
> > This amazing config works. I tested about 10 times without boot failure when
> > using CONFIG_FB_EFI=n. I tested 3 times all failed, after I change to
> > CONFIG_FB_EFI=y from CONFIG_FB_EFI=n.
> 
> Can you please make _really_ sure that this helps on all machines? This is a
> tricky bug so I don't want to jump to conclusion.
I retested a total of 272 times with CONFIG_FB_EFI=n on the machine that reproduces this bug easily, and 10 times on the other two machines; all of them booted successfully. Excuse me, but I think it is hard to make sure this config works on _all_ machines, because the issue does not reproduce on every reboot, even on the machine that reproduces it easily. If this is not enough to prove the config works, I will need some time to satisfy your requirement. :)

Comment 49 liulei 2014-07-29 06:35:20 UTC

These days this issue no longer bothers us, since we build the kernel with CONFIG_FB_EFI=n. But we must also set CONFIG_FB_VESA=n to avoid boot failures on some machines. I think we are just hiding the issue instead of fixing it.

Comment 50 dog 2014-08-05 14:11:37 UTC

Given comment 48 in this bug, why is the bug not closed?  If not closed, what is the next step to close it?

Comment 51 Chris Wilson 2014-08-05 14:23:44 UTC

Created attachment 104073 [details] [review]
Avoid fbcon memory corruption

Considering that the bug has been narrowed down to a timing issue in takeover from efifb, it could just be bug 72765 - and should be fixed by the attached.

Comment 52 lu hua 2014-08-06 02:51:38 UTC

Created attachment 104117 [details]
boot log(patch)

(In reply to comment #51)
> Created attachment 104073 [details] [review] [review]
> Avoid fbcon memory corruption
> 
> Considering that the bug has been narrowed down to a timing issue in
> takeover from efifb, it could just be bug 72765 - and should be fixed by the
> attached.

With this patch applied, it still fails.

Comment 53 Gordon Jin 2014-09-16 08:34:59 UTC

Daniel, what do you think about the bisected patch (mentioned in comment #11)?

Comment 54 Daniel Vetter 2014-10-01 15:51:49 UTC

(In reply to comment #53)
> Daniel, how do you think about the bisected patch (mentioned in comment#11)?

It's a red herring, I think it's clear now that EFIFB is the culprit and we've just been unlucky with the timing.

Comment 55 Daniel Vetter 2014-10-20 08:05:35 UTC

Adding Peter Jones, who is the efifb maintainer.

Comment 56 Daniel Vetter 2014-10-20 12:10:07 UTC

Dropping the regression marker on this one since it's just EFIFB being broken, and my patch just made it a bit easier to hit on a few machines. Still treat it as P1 since it's blocking QA.

Comment 57 Daniel Vetter 2014-10-20 12:10:51 UTC

And de-assigning since (due to lack of an efi system here that's already set up) I'm probably not the best guy to look at this right now.

Comment 58 Paulo Zanoni 2014-10-23 18:35:36 UTC

Possible duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=86671

Comment 59 Jani Nikula 2015-01-30 06:56:42 UTC

Please retest with current drm-intel-nightly, I'm guessing this may be fixed by

commit 0485c9dc24ec0939b42ca5104c0373297506b555
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Nov 14 10:09:49 2014 +0100

    drm/i915: Kick fbdev before vgacon

also see bug 82439.

Comment 60 lu hua 2015-02-02 07:30:48 UTC

Tested 10 cycles on BYT and BDW; it works well. Closing.

Comment 61 lu hua 2015-02-02 07:31:22 UTC

Verified. Fixed.

Comment 62 Elizabeth 2017-10-06 14:37:20 UTC

Closing old verified.