| Summary: | [BDW][drm:gen8_irq_handler [i915]] *ERROR* The master control interrupt lied (SDE)! | | |
|---|---|---|---|
| Product: | DRI | Reporter: | wendy.wang |
| Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | normal | | |
| Priority: | medium | CC: | benoitg, c.affolter, cs_gon, darkbasic, intel-gfx-bugs, jairo.daniel.miramontes.caton, jwboyer, morphuspam, nbowler, nmcveity, n.schnelle, sickvolo |
| Version: | DRI git | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| i915 platform: | | i915 features: | |
Description
wendy.wang
2015-09-23 07:32:07 UTC
(In reply to wendy.wang from comment #0)
> Dmesg log attached.

There's no attachment.

Created attachment 118408 [details]
dmesg info with drm.debug=0xe
This bug also exists on the latest drm-intel-fixes and drm-intel-next-queued branches.

Hmm, I'm wondering if we're handling this correctly, and if that could cause the issue:

"For each bit, the IIR can store a second pending interrupt if two or more of the same interrupt conditions occur before the first condition is cleared. Upon clearing the interrupt, the IIR bit will momentarily go low, then return high to indicate there is another interrupt pending. Only the rising edge of the PCH Display interrupt will cause the North Display IIR (DEIIR) PCH Display Interrupt event bit to be set, so all PCH Display Interrupts, including back to back interrupts, must be cleared here before a new PCH Display Interrupt can cause the DEIIR to be set."

Good commit:

commit b42fa27abff5970649ff07b0ce1691f6464097f3
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Jul 8 21:48:11 2015 +0200

    drm-intel-nightly: 2015y-07m-08d-19h-47m-26s UTC integration manifest

Bad commit:

commit 5b4a647fe39cf42753761b7d4ee20d695eec589c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Jul 9 21:57:35 2015 +0200

    drm-intel-nightly: 2015y-07m-09d-19h-56m-44s UTC integration manifest

On my bdw after this commit the messages started to appear:

commit aaf5ec2e51ab1d9c5e962b4728a1107ed3ff7a3e
Author: Sonika Jindal <sonika.jindal@intel.com>
Date:   Wed Jul 8 17:07:47 2015 +0530

    drm/i915: Handle HPD when it has actually occurred

But when I just tried with latest nightly, I couldn't reproduce any dmesg errors.

Created attachment 118478 [details]
dmesg (without drm.debug)
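The IIR behaviour described in the spec excerpt quoted above can be modelled in a few lines of C. This is only a toy sketch and not i915 code: `struct iir_bit`, `iir_raise`, `iir_clear` and `iir_drain` are hypothetical names, and the one-deep queue is an assumption read off the quoted wording ("can store a second pending interrupt"). It shows why a single clear may leave the bit high, and why a handler that wants the bit low has to keep clearing until it stays low.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one IIR bit with a single backing slot (hypothetical). */
struct iir_bit {
	bool visible;	/* what a register read would return */
	bool queued;	/* second event latched behind the first */
};

/* Hardware side: the interrupt condition occurs. */
static void iir_raise(struct iir_bit *b)
{
	if (b->visible)
		b->queued = true;	/* a third or later event is not counted */
	else
		b->visible = true;
}

/* Software side: write 1 to clear. The bit drops, and a queued event
 * (if there is one) raises it again straight away. */
static void iir_clear(struct iir_bit *b)
{
	b->visible = b->queued;
	b->queued = false;
}

/* Clear until the bit stays low; returns the number of clears issued. */
static unsigned int iir_drain(struct iir_bit *b)
{
	unsigned int clears = 0;

	while (b->visible) {
		iir_clear(b);
		clears++;
	}
	return clears;
}
```

With two back-to-back events, one `iir_clear()` leaves the bit set, and only the second clear brings it low. In the real hardware that matters because, per the quote, DEIIR only latches on a rising edge of the PCH interrupt.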
I was going to report the very same issue for my XPS 13 2015 9343 (Broadwell) when I saw this.
I attached dmesg (without drm.debug). Kernel is 4.3.0-rc3-mainline.
(In reply to Mika Kuoppala from comment #6)
> But when I just tried with latest nightly, I couldn't reproduce the any
> dmesg errors.

I still see this on BDW.

Created attachment 118559 [details]
dmesg reproducing the problem on bdw

Curiously the errors are next to DP aux traffic.
Random idea:

--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2345,6 +2345,7 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 		u32 pch_iir = I915_READ(SDEIIR);
 		if (pch_iir) {
 			I915_WRITE(SDEIIR, pch_iir);
+			POSTING_READ(SDEIIR);
 			ret = IRQ_HANDLED;

(In reply to Ville Syrjala from comment #10)
> +			POSTING_READ(SDEIIR);

Does not help.

However this helps. We're missing something.

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 76bd40e13391..0d524034abd7 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1827,6 +1827,9 @@ static void ibx_hpd_irq_handler(struct drm_device *dev, u32 hotplug_trigger,
 	dig_hotplug_reg = I915_READ(PCH_PORT_HOTPLUG);
 	I915_WRITE(PCH_PORT_HOTPLUG, dig_hotplug_reg);
 
+	if (!hotplug_trigger)
+		return;
+
 	intel_get_hpd_pins(&pin_mask, &long_mask, hotplug_trigger,
 			   dig_hotplug_reg, hpd,
 			   pch_port_hotplug_long_detect);
@@ -1934,8 +1937,7 @@ static void cpt_irq_handler(struct drm_device *dev, u32 pch_iir)
 	int pipe;
 	u32 hotplug_trigger = pch_iir & SDE_HOTPLUG_MASK_CPT;
 
-	if (hotplug_trigger)
-		ibx_hpd_irq_handler(dev, hotplug_trigger, hpd_cpt);
+	ibx_hpd_irq_handler(dev, hotplug_trigger, hpd_cpt);
 
 	if (pch_iir & SDE_AUDIO_POWER_MASK_CPT) {
 		int port = ffs((pch_iir & SDE_AUDIO_POWER_MASK_CPT) >>

(In reply to Jani Nikula from comment #12)
> However this helps. We're missing something.
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c
> b/drivers/gpu/drm/i915/i915_irq.c
> index 76bd40e13391..0d524034abd7 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1827,6 +1827,9 @@ static void ibx_hpd_irq_handler(struct drm_device
> *dev, u32 hotplug_trigger,
>  	dig_hotplug_reg = I915_READ(PCH_PORT_HOTPLUG);
>  	I915_WRITE(PCH_PORT_HOTPLUG, dig_hotplug_reg);

Is the read alone enough, or do you need the write too?

(In reply to Jani Nikula from comment #12)
> However this helps. We're missing something.
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c
> b/drivers/gpu/drm/i915/i915_irq.c
> index 76bd40e13391..0d524034abd7 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1827,6 +1827,9 @@ static void ibx_hpd_irq_handler(struct drm_device
> *dev, u32 hotplug_trigger,
>  	dig_hotplug_reg = I915_READ(PCH_PORT_HOTPLUG);
>  	I915_WRITE(PCH_PORT_HOTPLUG, dig_hotplug_reg);
>
> +	if (!hotplug_trigger)
> +		return;
> +

Oh, but the whole point of the patch (drm/i915: Handle HPD when it has actually occurred) was to disallow writing to the PCH_PORT_HOTPLUG register when the HPD did not occur. And this shows stable HPD with SKL for me and was in line with the handling of other interrupts.

Do these "[drm:gen8_irq_handler [i915]] *ERROR* The master control interrupt lied (SDE)!" messages have any effect on HPD?

(In reply to Ville Syrjala from comment #13)
> (In reply to Jani Nikula from comment #12)
> > However this helps. We're missing something.
> >
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c
> > b/drivers/gpu/drm/i915/i915_irq.c
> > index 76bd40e13391..0d524034abd7 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1827,6 +1827,9 @@ static void ibx_hpd_irq_handler(struct drm_device
> > *dev, u32 hotplug_trigger,
> >  	dig_hotplug_reg = I915_READ(PCH_PORT_HOTPLUG);
> >  	I915_WRITE(PCH_PORT_HOTPLUG, dig_hotplug_reg);
>
> Is the read alone enough, or do you need the write too?

Moving the write below the !hotplug_trigger check brings the problem back, i.e. the write is also needed.

(In reply to Jani Nikula from comment #15)
> (In reply to Ville Syrjala from comment #13)
> > (In reply to Jani Nikula from comment #12)
> > > However this helps. We're missing something.
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c
> > > b/drivers/gpu/drm/i915/i915_irq.c
> > > index 76bd40e13391..0d524034abd7 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -1827,6 +1827,9 @@ static void ibx_hpd_irq_handler(struct drm_device
> > > *dev, u32 hotplug_trigger,
> > >  	dig_hotplug_reg = I915_READ(PCH_PORT_HOTPLUG);
> > >  	I915_WRITE(PCH_PORT_HOTPLUG, dig_hotplug_reg);
> >
> > Is the read alone enough, or do you need the write too?
>
> Moving the write below the !hotplug_trigger check brings the problem back,
> i.e. the write is also needed.

Are the status bits actually showing long/short pulses when this happens? Maybe we can just do something like this:

dig_hotplug_reg = I915_READ(PCH_PORT_HOTPLUG);
if (!hotplug_trigger)
	dig_hotplug_reg &= ~(*_HOTPLUG_STATUS_MASK);
I915_WRITE(PCH_PORT_HOTPLUG, dig_hotplug_reg);
if (!hotplug_trigger)
	return;

*** Bug 92454 has been marked as a duplicate of this bug. ***

commit 97e5ed1111dcc5300a0f59a55248cd243937a8ab
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Oct 23 10:56:12 2015 +0200

    drm/i915: shut up gen8+ SDE irq dmesg noise

This also gets rid of the messages, and does *not* print the ### debug msg, i.e. the status bits are clear.

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 6e0a5683bbdc..7716181473dc 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1825,7 +1825,21 @@ static void ibx_hpd_irq_handler(struct drm_device *dev, u32 hotplug_trigger,
 	u32 dig_hotplug_reg, pin_mask = 0, long_mask = 0;
 
 	dig_hotplug_reg = I915_READ(PCH_PORT_HOTPLUG);
+	if (!hotplug_trigger) {
+		u32 mask = PORTA_HOTPLUG_STATUS_MASK |
+			PORTD_HOTPLUG_STATUS_MASK |
+			PORTC_HOTPLUG_STATUS_MASK |
+			PORTB_HOTPLUG_STATUS_MASK;
+
+		if (dig_hotplug_reg & mask)
+			DRM_DEBUG_KMS("### %08x\n", dig_hotplug_reg & mask);
+
+		dig_hotplug_reg &= ~mask;
+	}
+
 	I915_WRITE(PCH_PORT_HOTPLUG, dig_hotplug_reg);
+	if (!hotplug_trigger)
+		return;
 
 	intel_get_hpd_pins(&pin_mask, &long_mask, hotplug_trigger,
 			   dig_hotplug_reg, hpd,
@@ -1840,8 +1854,7 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
 	int pipe;
 	u32 hotplug_trigger = pch_iir & SDE_HOTPLUG_MASK;
 
-	if (hotplug_trigger)
-		ibx_hpd_irq_handler(dev, hotplug_trigger, hpd_ibx);
+	ibx_hpd_irq_handler(dev, hotplug_trigger, hpd_ibx);
 
 	if (pch_iir & SDE_AUDIO_POWER_MASK) {
 		int port = ffs((pch_iir & SDE_AUDIO_POWER_MASK) >>
@@ -1934,8 +1947,7 @@ static void cpt_irq_handler(struct drm_device *dev, u32 pch_iir)
 	int pipe;
 	u32 hotplug_trigger = pch_iir & SDE_HOTPLUG_MASK_CPT;
 
-	if (hotplug_trigger)
-		ibx_hpd_irq_handler(dev, hotplug_trigger, hpd_cpt);
+	ibx_hpd_irq_handler(dev, hotplug_trigger, hpd_cpt);
 
 	if (pch_iir & SDE_AUDIO_POWER_MASK_CPT) {
 		int port = ffs((pch_iir & SDE_AUDIO_POWER_MASK_CPT) >>

commit 6a39d7c986be4fd18eb019e9cdbf774ec36c9f77
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Wed Nov 25 16:47:22 2015 +0200
    drm/i915: fix the SDE irq dmesg warnings properly

Still present as of

commit 0035ecf934fae0492c2d90390f88b8c79e806ffa
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Dec 14 10:41:10 2015 +0100

    drm-intel-nightly: 2015y-12m-14d-09h-40m-37s UTC integration manifest

Will look into the debug printk next time I have the box connected to a display.

Hello. Just wanted to let you know (if you were unaware) this bug does also make the system freeze randomly. I have tried to figure out what triggers it, but it seems just random. If anything it happens after 2-3 hours of activity and then 2-5 mins idle? On kernel 3.13 it works. Please feel free to ask me to test fixes if you need. I can compile myself (with a patch) or I can test a kernel you compile.

Hope it's okay to write here as a non-dev.

(In reply to Joakim Koed from comment #23)
> Just wanted to let you know (if you were unaware) this bug does also make
> the system freeze randomly. I have tried to figure out what triggers it, but
> it seems just random. If anything it happens after 2-3 hours of activity and
> then 2-5 mins idle?

I don't think what you're seeing has anything to do with this bug. Please file a new bug report for the symptoms you're seeing.

> Hope its okay to write here as a non-dev.

Absolutely.

commit 2dfb0b816d224379efc534764388745c474abeb4
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Thu Jan 7 10:29:10 2016 +0200

    drm/i915: shut up gen8+ SDE irq dmesg noise, again

When did you shut up the messages for the first time? I always saw them in every RC, including latest 4.4-rc8.

(In reply to darkbasic from comment #26)
> When did you shut up the messages for the first time? I always saw them in
> every RC, including latest 4.4-rc8.

drm-next, not Linus' upstream.

Please see this bug report:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1520040

This bug makes my Dell Latitude laptop shut down (power off). In that bug report I provided a lot of information about my system, logs and tests I did. Currently, only kernel "3.13.0-36-generic" running module "/lib/modules/3.13.0-36-generic/kernel/ubuntu/i915/i915_bdw.ko" doesn't have this bug.

Hence, this bug is far from "solved" and shutting up dmesg definitely won't solve it.

(In reply to Yuri from comment #28)
> Please see this bug report:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1520040
>
> This bug makes my Dell Latitude laptop shutdown (power off). In that bug
> report I provided a lot of information about my system, logs and tests I
> did. Currently, only kernel "3.13.0-36-generic" running module
> "/lib/modules/3.13.0-36-generic/kernel/ubuntu/i915/i915_bdw.ko" doesn't have
> this bug.
>
> Hence, this bug is far from "solved" and shutting up dmesg definitely won't
> solve it.

That has nothing to do with these error messages.
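For readers following the PCH_PORT_HOTPLUG discussion earlier in the thread (comments #12 to #16), here is a small standalone C model of the "always read and write, but mask the status bits when there is no trigger" idea. It is a sketch under assumptions, not the driver code: it assumes the status bits are sticky and cleared by writing 1, and the bit layout (`TOY_STATUS_MASK`), `struct toy_reg`, `toy_read`, `toy_write` and `toy_hpd_ack` are all hypothetical names made up for illustration.

```c
#include <stdint.h>

/* Hypothetical layout: the low byte holds sticky status bits (write 1 to
 * clear), everything else is ordinary read/write control state. */
#define TOY_STATUS_MASK 0x000000ffu

struct toy_reg {
	uint32_t val;
};

static uint32_t toy_read(const struct toy_reg *r)
{
	return r->val;
}

static void toy_write(struct toy_reg *r, uint32_t v)
{
	/* Status bits survive unless a 1 is written to them. */
	uint32_t status = r->val & TOY_STATUS_MASK & ~v;

	r->val = (v & ~TOY_STATUS_MASK) | status;
}

/* Same shape as the suggestion in the thread: the write always happens,
 * but with no hotplug trigger the status field is written as zeros, so
 * no pending status gets acknowledged by accident.
 * Returns the status bits the caller should go on to process. */
static uint32_t toy_hpd_ack(struct toy_reg *r, uint32_t hotplug_trigger)
{
	uint32_t v = toy_read(r);

	if (!hotplug_trigger)
		v &= ~TOY_STATUS_MASK;

	toy_write(r, v);

	return hotplug_trigger ? (v & TOY_STATUS_MASK) : 0;
}
```

In this model a call with no trigger leaves pending status bits in place for a later call that does have a trigger, whereas blindly writing back the value that was read would clear them. Whether the real register behaves this way on every PCH is not something this thread establishes.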