Summary: | Warning PID: 1150 at drivers/gpu/drm/i915/intel_cdclk.c:835
---|---
Product: | DRI
Component: | DRM/Intel
Status: | CLOSED WORKSFORME
Severity: | normal
Priority: | medium
Version: | unspecified
Hardware: | x86-64 (AMD64)
OS: | Linux (All)
Whiteboard: | Triaged
i915 platform: | CFL
i915 features: |
Reporter: | Valerio Vanni <valerio.vanni>
Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs>
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
CC: | intel-gfx-bugs

Description
Valerio Vanni
2018-12-17 16:14:08 UTC
I would disable the GuC. It has not been properly validated for Mesa.

Valerio, can you attach the full dmesg log from boot with the kernel parameters drm.debug=0x1e log_buf_len=4M? Also, can you verify with kernel 4.20?

(In reply to Lakshmi from comment #2)
> Valerio, can you attach the full dmesg log from boot with the kernel parameters
> drm.debug=0x1e log_buf_len=4M?
> Also, can you verify with kernel 4.20?

I've tried, but it fills the buffer with [drm:] entries and I lose all the early boot ones. I've tried increasing log_buf_len, but the result was no better.

Created attachment 142848 [details]
dmesg with drm.debug=0x1e (working state)
This is a dmesg of the working (no error) state. I'll also post one when the error occurs.
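For readers wondering what drm.debug=0x1e actually enables: the value is a bitmask of DRM debug categories. The sketch below decodes it, assuming the category bit values from the kernel's DRM print header of roughly that era; the dictionary and the function name `decode_drm_debug` are illustrative and not part of any kernel tool, and other kernel versions may define additional bits.

```python
# Assumed DRM debug category bits (from the kernel's DRM print header around
# the 4.19/4.20 time frame; newer kernels may define more categories).
DRM_DEBUG_CATEGORIES = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_CATEGORIES.items() if mask & bit]


if __name__ == "__main__":
    # 0x1e = 0x02 | 0x04 | 0x08 | 0x10
    print(decode_drm_debug(0x1E))  # ['DRIVER', 'KMS', 'PRIME', 'ATOMIC']
```

Under these assumptions, 0x1e leaves out the CORE category (0x01) but still enables four verbose categories, which is consistent with the log buffer filling up with [drm:] entries as described above.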
Created attachment 142873 [details]
dmesg with drm.debug=0x1e (warning state)
Comment on attachment 142873 [details]
dmesg with drm.debug=0x1e (warning state)
Today I got the warning
Created attachment 142938 [details]
dmesg with drm.debug=0x1e (warning state - kernel 4.20)
And now I got the warning with kernel 4.20
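Because a dmesg captured with drm.debug set is dominated by [drm:] entries, it can help to pull out just the kernel WARNING header lines. The following is a minimal sketch that matches lines of the form quoted in this report (`WARNING: CPU: ... PID: ... at file:line function+offset/size [module]`); the regular expression and the function name `find_warnings` are assumptions made for illustration, and the pattern has only been checked against the single line quoted in this thread.

```python
import re

# Pattern for a kernel WARNING header line, modelled on the one in this report.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def find_warnings(dmesg_text):
    """Return one dict per WARNING header line found in a dmesg dump."""
    hits = []
    for match in WARNING_RE.finditer(dmesg_text):
        info = match.groupdict()
        info["cpu"] = int(info["cpu"])
        info["pid"] = int(info["pid"])
        info["line"] = int(info["line"])
        hits.append(info)
    return hits


if __name__ == "__main__":
    # The WARNING line as quoted in this bug report.
    sample = (
        "WARNING: CPU: 1 PID: 1150 at drivers/gpu/drm/i915/intel_cdclk.c:835 "
        "skl_get_cdclk+0x211/0x250 [i915]\n"
    )
    for hit in find_warnings(sample):
        print(hit["file"], hit["line"], hit["func"], hit["module"])
```

Running it over a saved log (for example the text of one of the attachments) would list where each warning was raised without scrolling through the debug output.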
Created attachment 143173 [details]
warning with 4.20.3 -
(In reply to Valerio Vanni from comment #8)
> Created attachment 143173 [details]
> warning with 4.20.3 -

Can you try comment 1 and check if you still see the warning?

Valerio, any updates?

I haven't seen the warning for some time now. I can't relate this change to anything. I'm now using kernel 4.20.3, but the warning disappeared later (I mean: it also happened with this kernel). I didn't change hardware. Debian Stretch got many updates; that is the only change I can see. The GuC has always been active.

It just happened. Now I'm disabling the GuC.

And now it happened even with the GuC disabled.

Valerio, I couldn't find the motherboard you describe. Do you mean the B360M-A?
https://www.newegg.com/Product/Product.aspx?Item=N82E16813119086

Things that I would try:
- eliminate hardware as a root cause
- check for a BIOS update
- verify that RAM passes memcheck

What is it about the error message that causes you to think this is a graphics bug? The stack trace makes me think there is something wrong with power management. The symptoms (won't wake up) seem to be similar too, although from your description, it doesn't sound like the machine is asleep.

> Valerio, I couldn't find the motherboard you describe. Do you mean the B360M-A?
>
> https://www.newegg.com/Product/Product.aspx?Item=N82E16813119086

Yes, I typed it wrong.

> Things that I would try:
>
> - eliminate hardware as a root cause
> - check for a BIOS update

The BIOS is version 1602, updated in November. Now I see that there's a newer version; I'll try it immediately.

> - verify that RAM passes memcheck

I'll try this too. I ran it after I built the machine, and it was OK.

> What is it about the error message that causes you to think this is a graphics
> bug? The stack trace makes me think there is something wrong with power
> management.

I suspected it from the first line:
WARNING: CPU: 1 PID: 1150 at drivers/gpu/drm/i915/intel_cdclk.c:835 skl_get_cdclk+0x211/0x250 [i915]

Where in the stack trace do you find a power management issue?
I'm not very good at analyzing stack traces; as I said, I focused on the first lines.

> The symptoms (won't wake up) seem to be similar too, although from your
> description, it doesn't sound like the machine is asleep.

But perhaps they are not related.

Sometimes (but even less frequently) the machine starts up (power LED on) but with a blank screen, no response to ping, etc. I can't say whether it stops in the BIOS or during Linux boot, because it usually happens when I'm on another machine with a KVM switch: I power this one on, and when I point the KVM switch at it some time later, I find it dead.

I have to fix my serial port bracket, because it has the wrong pinout for this motherboard. Perhaps with a serial console I can see something on the serial port, if the Linux boot has at least started.

When I see this warning, instead, the machine is alive and (seems to be) working.

(In reply to Valerio Vanni from comment #15)
Any further updates here?

Soon after my last message, I updated the BIOS to version 2012. Since then, the issue has not occurred again.

The issue happened only after a reboot or shutdown, never after a suspend (S3) and resume. And for all this time I've always done full reboots, to test better with more triggering events.

It seems that the BIOS update fixed the issue.

(In reply to Valerio Vanni from comment #17)
> Soon after my last message, I updated the BIOS to version 2012.
> Since then, the issue has not occurred again.
>
> The issue happened only after a reboot or shutdown, never after a suspend
> (S3) and resume.
> And for all this time I've always done full reboots, to test better with
> more triggering events.
>
> It seems that the BIOS update fixed the issue.

Thanks for the feedback. Closing this bug as WORKSFORME.