Bug 59408

Summary: [ivb] memory corruption after kexec
Product: DRI
Reporter: Jiri Slaby <jirislaby>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform:
i915 features:
Attachments:
lspci -vvnnxxxs 2.0 (flags: none)
dmesg from kexec run (flags: none)

Description Jiri Slaby 2013-01-15 10:34:18 UTC
Created attachment 73071 [details]
lspci -vvnnxxxs 2.0

Using KMS, when I kexec from kernel 3.7.2 (to the same one), memory gets corrupted. I see 0x0720 (white space on a black background) all over the structures. I do not start X at all. This is a pure fb console.
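For readers unfamiliar with the 0x0720 pattern: it matches the 16-bit text-console cell layout, where the low byte is the character code and the high byte is the colour attribute. The sketch below decodes such a cell and counts how often it appears in a raw buffer. The names `decode_vga_cell` and `count_cells` are hypothetical helpers made up for illustration, and reading the corruption as console cells is an assumption based on the reporter's own parenthetical, not a confirmed diagnosis.

```python
import struct

# Standard 16-colour text-mode palette names, indexed by the 4-bit colour value.
VGA_COLORS = [
    "black", "blue", "green", "cyan", "red", "magenta", "brown", "light grey",
    "dark grey", "light blue", "light green", "light cyan", "light red",
    "light magenta", "yellow", "white",
]


def decode_vga_cell(cell):
    """Split a 16-bit text-mode cell into character and attribute fields."""
    char_code = cell & 0xFF          # low byte: character code
    attr = (cell >> 8) & 0xFF        # high byte: colour attribute
    return {
        "char": chr(char_code),
        "foreground": VGA_COLORS[attr & 0x0F],
        "background": VGA_COLORS[(attr >> 4) & 0x07],
        # Bit 7 means blink or bright background, depending on the mode.
        "blink_or_bright_bg": bool(attr & 0x80),
    }


def count_cells(buf, cell=0x0720):
    """Count 16-bit little-endian words in buf that equal the given cell."""
    usable = len(buf) - (len(buf) % 2)   # ignore a trailing odd byte
    words = struct.unpack("<%dH" % (usable // 2), buf[:usable])
    return sum(1 for w in words if w == cell)


if __name__ == "__main__":
    print(decode_vga_cell(0x0720))
    # On a little-endian machine the cell 0x0720 is stored as bytes 20 07.
    print(count_cells(bytes.fromhex("2007200700002007")))
```

Decoding 0x0720 gives a space character (0x20) with attribute 0x07, that is, light grey on black, which is what a cleared console line looks like.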

My IOMMU also complains in the kexec'ed kernel:
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [00:02.0] fault addr fffff000 
DMAR:[fault reason 06] PTE Read access is not set

This happens until i915 is loaded in the new kernel:
dmar: DRHD: handling fault status reg 3
dmar: DMAR:[DMA Read] Request device [00:02.0] fault addr ff34c000 
DMAR:[fault reason 06] PTE Read access is not set
[drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
fbcon: inteldrmfb (fb0) is primary device
<here it stops accessing invalid memory>
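
When comparing several kexec runs, it can help to pull the faulting device and address out of the DMAR lines automatically. The following is a minimal sketch that assumes the exact message format quoted above; other kernel versions may word these messages differently, and `parse_dmar_faults` is a hypothetical helper name, not an existing tool.

```python
import re

# Matches lines such as:
#   dmar: DMAR:[DMA Read] Request device [00:02.0] fault addr fffff000
# The pattern is derived only from the log lines quoted in this report.
FAULT_RE = re.compile(
    r"DMAR:\[(?P<op>DMA (?:Read|Write))\] "
    r"Request device \[(?P<dev>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"fault addr (?P<addr>[0-9a-f]+)"
)


def parse_dmar_faults(log):
    """Return (operation, device, fault address) for each DMAR fault line."""
    return [
        (m.group("op"), m.group("dev"), int(m.group("addr"), 16))
        for m in FAULT_RE.finditer(log)
    ]


if __name__ == "__main__":
    sample = (
        "dmar: DRHD: handling fault status reg 3\n"
        "dmar: DMAR:[DMA Read] Request device [00:02.0] fault addr ff34c000 \n"
        "DMAR:[fault reason 06] PTE Read access is not set\n"
    )
    for op, dev, addr in parse_dmar_faults(sample):
        print(op, dev, hex(addr))
```

In both excerpts above, the faulting device is 00:02.0, the same device the attached lspci output was taken from (`lspci -vvnnxxxs 2.0`).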


There is no pci_driver->shutdown method, and apparently there should be one that stops the card in some manner. I tried adding a .shutdown hook that did the same as .remove, but it had no effect. It was a shot in the dark, though.
Comment 1 Daniel Vetter 2013-01-15 11:47:46 UTC
We probably need to restore the GTT PTEs to the values the BIOS expects - otherwise vesafb will scribble all over memory ...
Comment 2 Daniel Vetter 2013-02-10 19:43:05 UTC
For fun, can you please attach the complete dmesg of the kexec'ed kernel? Just to figure out whether it really might be vesafb/efifb scribbling all over main memory ...
Comment 3 Jiri Slaby 2013-02-11 18:27:33 UTC
Created attachment 74636 [details]
dmesg from kexec run
Comment 4 Daniel Vetter 2013-08-06 06:43:48 UTC
Sorry for the long delay here, but I still don't have an idea. Can you please retest on the latest upstream drm-intel-nightly or linux-next trees?
Comment 5 Chris Wilson 2013-08-07 09:08:33 UTC
I wonder if we need to do a TLB flush for the PDE as well. Is there such a beast, or maybe it is just part of the ring TLBs?

I would certainly retest with https://patchwork.kernel.org/patch/2839512/
Comment 6 Daniel Vetter 2013-09-09 09:16:33 UTC
Ping for test results. Note that the patch Chris referenced is merged already.
Comment 7 Jiri Slaby 2013-09-09 09:29:49 UTC
(In reply to comment #6)
> Ping for test results. Note that the patch Chris referenced is merged
> already.

Ok, I'm running 3.11 already. I will try if it works...
Comment 8 Jiri Slaby 2013-09-15 15:52:05 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > Ping for test results. Note that the patch Chris referenced is merged
> > already.
> 
> Ok, I'm running 3.11 already. I will try if it works...

It seems so...
