Bug 38694 (katherine)

Summary: Server freezes with latest commit on 22/06/2011
Product: DRI
Reporter: Emanuele <tomasi>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED MOVED
QA Contact:
Severity: critical    
Priority: high    
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: All   
Whiteboard:
i915 platform:
i915 features:
Attachments:
Xorg.log (flags: none)
dmesg on boot (flags: none)

Description Emanuele 2011-06-26 13:13:20 UTC
With this commit my PC sometimes freezes when entering or leaving standby:
$> xset dpms force standby

I have always had these problems, please refer to:
https://bugzilla.kernel.org/show_bug.cgi?id=spina

but recently they occurred only rarely. With your commit of 17/06/2011 (commit 122b471f734aa07427b01d4bec35ff1ac28290b5) the problem seemed to be solved. However, with the commit named in the summary it always happens.

I can see a blinking screen, but I have to halt the PC with the magic key.

Best regards,
Emanuele
Comment 1 Emanuele 2011-07-08 14:47:08 UTC
Hi,
I have tried some earlier commits, working backwards:
- Jun 17: same problem.
- Jun 16: same problem, but less often.
- Jun 13: so far so good.

Emanuele
Comment 2 Alex Deucher 2011-07-08 14:51:38 UTC
Please attach your xorg log and dmesg output.
Comment 3 Emanuele 2011-07-08 15:12:25 UTC
Created attachment 48905 [details]
Xorg.log
Comment 4 Emanuele 2011-07-08 15:21:05 UTC
Created attachment 48906 [details]
dmesg on boot

As I already said in the other Bugzilla report, when I run:
  xset dpms force standby
or
  echo "low" > /sys/class/drm/card0/device/power_profile
I get this message the first time:
---
NMI: PCI system error (SERR) for reason a1 on CPU 0.
Dazed and confused, but trying to continue
---
but if I follow your suggestion, it disappears:
---
You can disable the PCIE lane changes by removing the
call to radeon_set_pcie_lanes() in rs600_pm_misc() in rs600.c.
---
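
The check described in this comment can be made repeatable by counting the SERR reports in the kernel log before and after changing the profile. A minimal sketch: the function name count_serr is hypothetical, and the match string is taken from the message quoted above.

```shell
#!/bin/sh
# Count kernel log lines that report the PCI SERR NMI quoted above.
# Reads log text on stdin, so it works with `dmesg` output or a saved file.
count_serr() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status at success.
    grep -cF 'NMI: PCI system error (SERR)' || true
}
```

For example, compare the output of `dmesg | count_serr` before and after running `xset dpms force standby`; a higher count afterwards means a new SERR was logged.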
Comment 5 Emanuele 2011-07-12 11:17:40 UTC
Hi Alex,
I saw your latest commit a minute ago. Have you found out what is happening on my PC yet? Do you need more information?

Best regards,
Emanuele
Comment 6 Alex Deucher 2011-07-12 11:43:33 UTC
Have you changed the kernel you are using or just the ddx?  With KMS, the ddx doesn't really do much with respect to modesetting.  It just calls into the kernel.
Comment 7 Emanuele 2011-07-13 03:02:49 UTC
Alex,
I'm using the latest vanilla stable kernel (2.6.39.3). Sorry, but what is the ddx?

Emanuele
Comment 8 Alex Deucher 2011-07-13 07:28:26 UTC
(In reply to comment #7)
> Alex,
> I'm using the latest vanilla stable kernel (2.6.39.3). Sorry, but what is the
> ddx?

The ddx is the radeon X driver (xf86-video-ati).
Comment 9 Emanuele 2011-07-14 13:50:25 UTC
Thanks for your reply. So, is there a bug in the kernel? I compile the kernel myself; did I forget something?

But how is it possible that different ddx commits behave differently?

Thanks a lot,
Emanuele
Comment 10 Alex Deucher 2011-07-14 14:09:43 UTC
It's hard to say.  That's why I was trying to figure out which components you changed (just ddx; ddx and kernel; ddx, kernel, and mesa; etc.), as they could all potentially be to blame.  It might be a 3D screen saver that kicks in and hangs the card due to a bug in the 3D driver.
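
One way to record which components are in play is to note the kernel release and the ddx version with each test. A minimal sketch under assumptions: the function names are hypothetical, and the parser assumes the Xorg log contains a "Module radeon:" line followed by a line with "module version = X.Y.Z", which may not match every log.

```shell
#!/bin/sh
# Print the running kernel release (for example 2.6.39.3).
kernel_version() {
    uname -r
}

# Print the radeon ddx version recorded in an Xorg log file ($1).
# Assumes a "Module radeon:" line followed by a line containing
# "module version = X.Y.Z"; adjust the patterns if your log differs.
ddx_version_from_log() {
    awk '
        /Module radeon:/ { found = 1; next }
        found && /module version = / {
            sub(/.*module version = /, "")
            print
            exit
        }
    ' "$1"
}
```

A typical call would be `ddx_version_from_log /var/log/Xorg.0.log`; that path is the common default and is also an assumption.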
Comment 11 Emanuele 2011-07-15 14:15:03 UTC
I have never used a screen saver. My X server is configured this way:
xset s 300
xset s blank

When I want to force standby I use:
xset dpms force standby

I can run any tests you want,
Emanuele
Comment 12 Emanuele 2011-09-18 14:31:18 UTC
Hi Alex,
I'm now using kernel v3.0.4. I noticed that I sometimes also get a freeze when I plug the power cable into the socket.
Remember that I get this message on standby or when I run:
---
echo "low" > /sys/class/drm/card0/device/power_profile
---
NMI: PCI system error (SERR) for reason b1 on CPU 0.
Dazed and confused, but trying to continue
---
and I do this when the power cable is unplugged, whereas I run:
---
echo "auto" > /sys/class/drm/card0/device/power_profile
---
when it is plugged in.
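
The manual routine above (low on battery, auto on AC power) could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the profile names other than low and auto are the ones the radeon KMS driver is understood to accept, and writing to the sysfs file needs root.

```shell
#!/bin/sh
# Map an AC adapter state (1 = plugged in, 0 = on battery) to a radeon
# power profile, following the manual routine described above.
profile_for_ac_state() {
    case "$1" in
        1) echo auto ;;
        0) echo low ;;
        *) echo "unknown AC state: $1" >&2; return 1 ;;
    esac
}

# Write a profile name ($1) to the power_profile file ($2).
# $2 defaults to the sysfs path used in this report.
set_power_profile() {
    profile=$1
    target=${2:-/sys/class/drm/card0/device/power_profile}
    case "$profile" in
        # Names other than low and auto are an assumption about the driver.
        default|auto|low|mid|high) printf '%s\n' "$profile" > "$target" ;;
        *) echo "invalid profile: $profile" >&2; return 1 ;;
    esac
}
```

On many laptops the adapter state can be read from a file such as /sys/class/power_supply/AC/online, but that path is an assumption and varies between machines.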

I have also noticed that I sometimes get that message at boot time:
---
[drm] Initialized drm 1.1.0 20060810
[drm] radeon defaulting to kernel modesetting.
[drm] radeon kernel modesetting enabled.
radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
radeon 0000:01:00.0: setting latency timer to 64
[drm] initializing kernel modesetting (RV515 0x1002:0x7145 0x1028:0x2003).
[drm] register mmio base: 0xEFDF0000
[drm] register mmio size: 65536
ATOM BIOS: M54P
[drm] Generation 2 PCI interface, using max accessible memory
radeon 0000:01:00.0: VRAM: 256M 0x0000000000000000 - 0x000000000FFFFFFF (128M used)
radeon 0000:01:00.0: GTT: 512M 0x0000000010000000 - 0x000000002FFFFFFF
[drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[drm] Driver supports precise vblank timestamp query.
radeon 0000:01:00.0: irq 43 for MSI/MSI-X
radeon 0000:01:00.0: radeon: using MSI.
[drm] radeon: irq initialized.
[drm] Detected VRAM RAM=256M, BAR=256M
[drm] RAM width 64bits DDR
[TTM] Zone  kernel: Available graphics memory: 1028024 kiB.
[TTM] Initializing pool allocator.
[drm] radeon: 128M of VRAM memory ready
[drm] radeon: 512M of GTT memory ready.
[drm] GART: num cpu pages 131072, num gpu pages 131072
NMI: PCI system error (SERR) for reason b1 on CPU 0.
Dazed and confused, but trying to continue
[drm] radeon: 1 quad pipes, 1 z pipes initialized.
[drm] PCIE GART of 512M enabled (table at 0x00040000).
radeon 0000:01:00.0: WB enabled
[drm] Loading R500 Microcode
[drm] radeon: ring at 0x0000000010001000
[drm] ring test succeeded in 10 usecs
[drm] radeon: ib pool ready.
[drm] ib test succeeded in 0 usecs
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   VGA
[drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm] Connector 1:
[drm]   LVDS
[drm]   DDC: 0x7e30 0x7e30 0x7e34 0x7e34 0x7e38 0x7e38 0x7e3c 0x7e3c
[drm]   Encoders:
[drm]     LCD1: INTERNAL_LVTM1
[drm] Connector 2:
[drm]   S-video
[drm]   Encoders:
[drm]     TV1: INTERNAL_KLDSCP_DAC2
[drm] Radeon display connector VGA-1: No monitor connected or invalid EDID
[drm] Radeon display connector LVDS-1: Found valid EDID
[drm] radeon: power management initialized
[drm] fb mappable at 0xD00C0000
[drm] vram apper at 0xD0000000
[drm] size 4096000
[drm] fb depth is 24
[drm]    pitch is 5120
fbcon: radeondrmfb (fb0) is primary device
Console: switching to colour frame buffer device 160x50
fb0: radeondrmfb frame buffer device
drm: registered panic notifier
[drm] Initialized radeon 2.10.0 20080528 for 0000:01:00.0 on minor 0
---
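
To see which initialisation step the NMI lands next to in a log like the one above, the line preceding each SERR report can be pulled out. A minimal sketch: line_before_serr is a hypothetical name, and because an NMI is asynchronous the neighbouring line is only a rough hint about the cause.

```shell
#!/bin/sh
# Print the log line that immediately precedes each PCI SERR NMI report.
# Reads kernel log text on stdin.
line_before_serr() {
    # Remember the previous line; print it whenever the current line matches.
    awk '/NMI: PCI system error \(SERR\)/ { if (NR > 1) print prev } { prev = $0 }'
}
```

Fed the boot log above, it prints the "[drm] GART: num cpu pages 131072, num gpu pages 131072" line.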

I hope this helps you investigate,
Emanuele
Comment 13 Michel Dänzer 2011-10-03 08:54:06 UTC
(In reply to comment #4)
> but if I follow your suggestion, it disappears:
> ---
> You can disable the PCIE lane changes by removing the
> call to radeon_set_pcie_lanes() in rs600_pm_misc() in rs600.c.

Did removing that call only cause the NMI message to disappear, or also the freezes? Is it still the case with a current kernel?
Comment 14 Michel Dänzer 2011-10-03 08:55:01 UTC
BTW, please refer to Git commits by their commit IDs rather than by date.
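
For tests that were tracked by date, the matching commit IDs can be recovered from the repository. A minimal sketch, assuming it is run inside a clone of the tree under test; the function names are hypothetical, and `--before` filters on the committer date.

```shell
#!/bin/sh
# Print the full ID of the newest commit on HEAD that was committed
# before the given date ($1). Run inside the repository being tested.
commit_before() {
    git rev-list -1 --before="$1" HEAD
}

# Print the short ID and subject of a commit ($1), ready to paste
# into a bug report.
describe_commit() {
    git log -1 --format='%h %s' "$1"
}
```

With a known good and a known bad ID in hand, `git bisect start`, `git bisect bad <id>` and `git bisect good <id>` can narrow the regression down to a single commit.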
Comment 15 Emanuele 2011-10-03 16:05:33 UTC
Hi Michel,
(In reply to comment #13)
> (In reply to comment #4)
> > but if I follow your suggestion, it disappears:
> > ---
> > You can disable the PCIE lane changes by removing the
> > call to radeon_set_pcie_lanes() in rs600_pm_misc() in rs600.c.
> 
> Did removing that call only cause the NMI message to disappear, or also the
> freezes?
We'll see. I disabled that call and the NMI message has disappeared. I also tried 'xset dpms force standby' a few times and so far so good.

> Is it still the case with a current kernel?
---
$> uname -r
3.0.4
---

With 3.0.x kernels the freezes are different. Now I can sometimes still move the mouse or use the keyboard. In these cases I don't have to power off the PC and can reboot it instead: I press the halt button on my case and, after a few seconds, I can force a reboot with Ctrl+Alt+Del (I think this is once the X server has been killed, but I can't see anything).

Thanks for everything and best regards,
Emanuele
Comment 16 Emanuele 2011-10-11 16:34:45 UTC
(In reply to comment #13)
> Did removing that call only cause the NMI message to disappear, or also the
> freezes? Is it still the case with a current kernel?

Michel,
after a week we can say: "yes, the freezes have disappeared too". Now all is good!

But when I update the kernel with new incremental patches, patch obviously fails on this file.

Emanuele
Comment 17 Martin Peres 2019-11-19 08:19:50 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/199.
