Bug 35502

Summary: Regression: black screen with Radeon KMS in 2.6.38 (2.6.37.4 worked fine)
Product: DRI Reporter: John Lindgren <john.lindgren>
Component: DRM/Radeon Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: medium CC: bryce, hallbw, madbiologist2016, ta.bu.shi.da.yu
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments:
Description                  Flags
Kernel log                   none
Patch to fix infinite loop   none
fix record parsing           none
Updated patch (#3)           none

Description John Lindgren 2011-03-21 06:01:01 UTC
Created attachment 44669
Kernel log

With Linux kernel 2.6.38, loading the radeon module causes the console to go completely blank (laptop backlight on, but nothing displayed). The system still responds to Ctrl-Alt-Del and reboots. This is a regression from the previous kernel, 2.6.37.4, where kernel mode setting worked flawlessly.

With radeon.modeset=0, or with the module blacklisted, the system boots fine and I can use X, but without the power-saving features available through KMS.

System: Toshiba Satellite A305-S6916
Graphics: ATI Radeon Mobility HD 3650
Kernel: 2.6.38 (x86_64)

Kernel log when loading the module:

Mar 21 00:57:38 localhost kernel: [drm] Initialized drm 1.1.0 20060810
Mar 21 00:57:38 localhost kernel: [drm] radeon defaulting to kernel modesetting.
Mar 21 00:57:38 localhost kernel: [drm] radeon kernel modesetting enabled.
Mar 21 00:57:38 localhost kernel: radeon 0000:01:00.0: PCI INT A ->  GSI 16 (level, low) ->  IRQ 16
Mar 21 00:57:38 localhost kernel: radeon 0000:01:00.0: setting latency timer to 64
Mar 21 00:57:38 localhost kernel: [drm] initializing kernel modesetting (RV635 0x1002:0x9591).
Mar 21 00:57:38 localhost kernel: [drm] register mmio base: 0xD6300000
Mar 21 00:57:38 localhost kernel: [drm] register mmio size: 65536
Mar 21 00:57:38 localhost kernel: ATOM BIOS: Tosh_IEC_Potomac_M86_DDR2
Mar 21 00:57:38 localhost kernel: radeon 0000:01:00.0: VRAM: 512M 0x0000000000000000 - 0x000000001FFFFFFF (512M used)
Mar 21 00:57:38 localhost kernel: radeon 0000:01:00.0: GTT: 512M 0x0000000020000000 - 0x000000003FFFFFFF
Mar 21 00:57:38 localhost kernel: [drm] Detected VRAM RAM=512M, BAR=256M
Mar 21 00:57:38 localhost kernel: [drm] RAM width 128bits DDR
Mar 21 00:57:38 localhost kernel: [TTM] Zone  kernel: Available graphics memory: 2027544 kiB.
Mar 21 00:57:38 localhost kernel: [TTM] Initializing pool allocator.
Mar 21 00:57:38 localhost kernel: [drm] radeon: 512M of VRAM memory ready
Mar 21 00:57:38 localhost kernel: [drm] radeon: 512M of GTT memory ready.
Mar 21 00:57:38 localhost kernel: [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
Mar 21 00:57:38 localhost kernel: [drm] Driver supports precise vblank timestamp query.
Mar 21 00:57:38 localhost kernel: radeon 0000:01:00.0: irq 49 for MSI/MSI-X
Mar 21 00:57:38 localhost kernel: radeon 0000:01:00.0: radeon: using MSI.
Mar 21 00:57:38 localhost kernel: [drm] radeon: irq initialized.
Mar 21 00:57:38 localhost kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
Mar 21 00:57:38 localhost kernel: [drm] Loading RV635 Microcode
Mar 21 00:57:38 localhost kernel: radeon 0000:01:00.0: WB enabled
Mar 21 00:57:38 localhost kernel: [drm] ring test succeeded in 1 usecs
Mar 21 00:57:38 localhost kernel: [drm] radeon: ib pool ready.
Mar 21 00:57:38 localhost kernel: [drm] ib test succeeded in 0 usecs
Mar 21 00:57:38 localhost kernel: [drm] Enabling audio support
Mar 21 00:57:38 localhost kernel: HDMI hot plug event: Pin=3 Presence_Detect=0 ELD_Valid=0

If there is any more info I can provide, please ask.
Comment 1 Alex Deucher 2011-03-21 07:23:42 UTC
Can you bisect?
Comment 2 John Lindgren 2011-03-21 07:42:40 UTC
I will see if I can find the time.  Probably not before this weekend.
Comment 3 John Lindgren 2011-03-22 12:40:37 UTC
$ git bisect bad
36868bda88b92ce8a9aa8b3ee2e0d1e0de09cc19 is the first bad commit
commit 36868bda88b92ce8a9aa8b3ee2e0d1e0de09cc19
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Thu Jan 6 21:19:21 2011 -0500

    drm/radeon/kms: parse DCE5 encoder caps when setting up encoders
    
    Needed to tell which DIG encoders are HBR2 capable for DP 1.2.
    
    Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

:040000 040000 3e2fca7e36dd68f455757fab0d303c3ebe9e729f 81431d1b3a7c1ca3f217e4b670131f40aa836b90 M	drivers
Comment 4 John Lindgren 2011-03-22 12:57:59 UTC
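The problem is very simple. The loop added by that commit advances through the records by record->ucRecordSize, but there is no sanity check to ensure that this value is nonzero. When a record reports a size of zero, the pointer never moves, and the driver gets stuck in an infinite loop.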

diff --git a/drivers/gpu/drm/radeon/radeon_atombios.c b/drivers/gpu/drm/radeon/radeon_atombios.c
index a2dfe25..2525e86 100644
--- a/drivers/gpu/drm/radeon/radeon_atombios.c
+++ b/drivers/gpu/drm/radeon/radeon_atombios.c
@@ -679,7 +679,8 @@ bool radeon_get_atom_connector_info_from_object_table(struct drm_device *dev)
 							ATOM_ENCODER_CAP_RECORD *cap_record;
 							u16 caps = 0;
 
-							while (record->ucRecordType > 0 &&
+							while (record->ucRecordSize > 0 &&
+							       record->ucRecordType > 0 &&
 							       record->ucRecordType <= ATOM_MAX_OBJECT_RECORD_NUMBER) {
 								switch (record->ucRecordType) {
 								case ATOM_ENCODER_CAP_RECORD_TYPE:
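
For illustration only, here is a minimal standalone sketch of the same idea outside the driver. It is not the radeon code: the function name count_records, the two-byte header layout (type byte followed by size byte), and the MAX_RECORD_TYPE bound are all assumptions made for the example. It shows why a walk over size-prefixed records has to reject a zero size before advancing.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative bound only; not the real ATOM_MAX_OBJECT_RECORD_NUMBER. */
#define MAX_RECORD_TYPE 0x15

/* Assumed header layout for this sketch: byte 0 = type, byte 1 = size. */
#define RECORD_HEADER_SIZE 2

/*
 * Walk a packed list of size-prefixed records and return how many
 * valid records were visited. Every check that can stop the walk
 * happens before the cursor is advanced, so the loop always ends.
 */
static size_t count_records(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	size_t count = 0;

	while (len - off >= RECORD_HEADER_SIZE) {
		uint8_t type = buf[off];     /* plays the role of ucRecordType */
		uint8_t size = buf[off + 1]; /* plays the role of ucRecordSize */

		/* A zero (or too small) size would leave the cursor stuck. */
		if (size < RECORD_HEADER_SIZE)
			break;
		/* Type 0 or an out-of-range type ends the list. */
		if (type == 0 || type > MAX_RECORD_TYPE)
			break;
		/* Do not step past the end of the buffer. */
		if (size > len - off)
			break;

		count++;
		off += size;
	}
	return count;
}
```

Without the size check, a buffer whose first record has a valid type but a size of zero would keep the cursor at the same offset forever, which matches the hang described above.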
Comment 5 John Lindgren 2011-03-22 12:59:38 UTC
Created attachment 44725
Patch to fix infinite loop
Comment 6 Alex Deucher 2011-03-22 15:19:49 UTC
Created attachment 44731
fix record parsing

Thanks for tracking this down. I've attached the patch I plan to send upstream. I also fixed a similar potential case in the connector handling.
Comment 7 John Lindgren 2011-03-24 16:16:39 UTC
Created attachment 44795
Updated patch (#3)

For the sake of completeness, I updated the patch to also fix a third loop similar to the first two.
Comment 8 Alex Deucher 2011-03-24 16:29:17 UTC
Thanks, I've sent the updated version to Dave.
Comment 9 Michel Dänzer 2011-04-13 04:45:42 UTC
*** Bug 36007 has been marked as a duplicate of this bug. ***
Comment 10 Alex Deucher 2011-04-30 08:34:00 UTC
*** Bug 36713 has been marked as a duplicate of this bug. ***
Comment 11 Fabio Pedretti 2011-05-05 00:50:01 UTC
Fixed in 2.6.38.3.
Comment 12 Chris Sherlock 2011-05-14 05:32:09 UTC
*** Bug 32662 has been marked as a duplicate of this bug. ***
