Bug 36007

Summary: [natty] system freezes on boot without disabling KMS
Product: xorg
Reporter: Bryce Harrington <bryce>
Component: Driver/Radeon
Assignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: RESOLVED DUPLICATE
QA Contact: Xorg Project Team <xorg-team>
Severity: major
Priority: high
CC: spam
Version: 7.6 (2010.12)
Keywords: regression
Hardware: x86 (IA32)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
 netconkern.log (flags: none)
 netconkern-modeset-disabled.log (flags: none)

Description Bryce Harrington 2011-04-05 15:21:37 UTC
Forwarding this bug from Ubuntu reporter LGB [Gábor Lénárt]:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-ati/+bug/735126

[Problem]
With KMS enabled, the kernel fails to boot; the errors begin with the module loaded after the drm module (see sample kernel output below).  With KMS disabled, it boots properly.

This is a regression between the maverick kernel and the natty kernel.

[   19.142821] [drm] ring test succeeded in 0 usecs
[   19.143057] [drm] radeon: ib pool ready.
[   19.143156] [drm] ib test succeeded in 0 usecs
[   19.143163] [drm] Enabling audio support
[   28.864008] eth0: no IPv6 routers present
[   79.112037] ieee80211 phy0: Failed to initialize wep: -110
[   79.112066] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   79.112902] ath5k phy0: Atheros AR2425 chip found (MAC: 0xe2, PHY: 0x70)
[  243.716037] INFO: task jbd2/sda1-8:263 blocked for more than 120 seconds.
[  243.716050] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.716065] jbd2/sda1-8     D f6ee5860     0   263      2 0x00000000
[  243.716082]  f6a0be24 00000046 00000400 f6ee5860 00000400 00000008 f6ee5aec c1877680
[  243.716111]  14d02157 00000005 f6ee5ae8 c1877680 c1877680 f7887680 f6ee5860 f75f25e0
[  243.716141]  00000000 efaece80 00000800 00000000 00000000 f691a788 f6a0be1c c1081938
[  243.716175] Call Trace:
[  243.716191]  [<c1081938>] ? ktime_get_ts+0xf8/0x120
[  243.716205]  [<c152f84f>] io_schedule+0x5f/0xa0
[  243.716213]  [<c11598c8>] sync_buffer+0x38/0x40
[  243.716225]  [<c153004d>] __wait_on_bit+0x4d/0x70
[  243.716232]  [<c1159890>] ? sync_buffer+0x0/0x40
[  243.716244]  [<c1159890>] ? sync_buffer+0x0/0x40


[Original Description]
After upgrading from maverick (x86 32-bit system, but with a PAE kernel), the system does not boot: after some disk activity the upper half of the screen fills with garbage and there is no more disk activity. Booting an older kernel (from maverick) works. It also works with natty's kernel if I pass the radeon.modeset=0 parameter in grub, but then there is no hw acceleration (glxinfo reports software rendering; booting natty with maverick's kernel seems to keep it, and glxinfo then reports gallium as the render string).

This is a Toshiba Satellite A300 notebook with 4G of RAM and video hw:

01:00.0 VGA compatible controller: ATI Technologies Inc Mobility Radeon HD 3400 Series

xserver-xorg-video-radeon                            1:6.14.0-0ubuntu2

Linux orion 2.6.38-6-generic-pae #34-Ubuntu SMP Tue Mar 8 15:47:54 UTC 2011 i686 i686 i386 GNU/Linux

DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-radeon 1:6.14.0-0ubuntu2
ProcVersionSignature: Ubuntu 2.6.38-6.34-generic-pae 2.6.38-rc7
Uname: Linux 2.6.38-6-generic-pae i686
Architecture: i386
CompizPlugins: [core,bailer,detection,composite,opengl,compiztoolbox,decor,regex,mousepoll,resize,wall,grid,move,animation,place,snap,session,imgpng,workarounds,vpswitch,gnomecompat,expo,ezoom,staticswitcher,fade,scale]
CompositorRunning: None
Date: Mon Mar 14 22:03:35 2011
DistUpgraded: Log time: 2011-03-13 16:50:20.128908
DistroCodename: natty
DistroVariant: ubuntu
DkmsStatus:
 vboxhost, 4.0.0, 2.6.35-24-generic-pae, i686: installed 
 vboxhost, 4.0.0, 2.6.35-28-generic-pae, i686: installed 
 vboxhost, 4.0.0, 2.6.35-25-generic-pae, i686: installed 
 vboxhost, 4.0.0, 2.6.35-27-generic-pae, i686: installed 
 vboxhost, 4.0.0, 2.6.35-26-generic-pae, i686: installed
GraphicsCard:
 ATI Technologies Inc Mobility Radeon HD 3400 Series [1002:95c4] (prog-if 00 [VGA controller])
   Subsystem: Toshiba America Info Systems Device [1179:ff1e]
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release i386 (20101007)
MachineType: TOSHIBA Satellite A300
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, user)
 LANG=en_US.UTF-8
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-6-generic-pae root=UUID=14c446e2-e9e0-47a2-992e-f04d3f9b7bf3 ro quiet splash vt.handoff=7 radeon.modeset=0
Renderer: Software
SourcePackage: xserver-xorg-video-ati
UpgradeStatus: Upgraded to natty on 2011-03-13 (1 days ago)
dmi.bios.date: 03/20/2009
dmi.bios.vendor: INSYDE
dmi.bios.version: 1.90
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: Portable PC
dmi.board.vendor: TOSHIBA
dmi.board.version: Base Board Version
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnINSYDE:bvr1.90:bd03/20/2009:svnTOSHIBA:pnSatelliteA300:pvrPSAGCE-0KC00FHU:rvnTOSHIBA:rnPortablePC:rvrBaseBoardVersion:cvnChassisManufacturer:ct10:cvrChassisVersion:
dmi.product.name: Satellite A300
dmi.product.version: PSAGCE-0KC00FHU
dmi.sys.vendor: TOSHIBA
version.compiz: compiz 1:0.9.4-0ubuntu4
version.libdrm2: libdrm2 2.4.23-1ubuntu3
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10.1-0ubuntu1
version.xserver-xorg: xserver-xorg 1:7.6~3ubuntu11
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.0-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-4ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20110107+b795ca6e-0ubuntu5
Comment 1 Bryce Harrington 2011-04-05 15:22:43 UTC
Created attachment 45323 [details]
netconkern.log
Comment 2 Bryce Harrington 2011-04-05 15:25:37 UTC
Created attachment 45324 [details]
netconkern-modeset-disabled.log

Kernel log from having KMS disabled.  (No errors)
Comment 3 Alex Deucher 2011-04-05 15:32:39 UTC
The crash doesn't look related to radeon at all.  Can you blacklist radeon, boot to runlevel 3 and then manually load it from the console?
Comment 4 Alex Deucher 2011-04-05 15:35:49 UTC
If it is radeon, any chance you could bisect the kernel?
Comment 5 Bryce Harrington 2011-04-05 15:39:14 UTC
Yeah, the crashes appear in the networking layer (and after), however it's odd they occur only with radeon kms switched on.  I'll have him test blacklisting radeon and loading it post-boot.

Is it possible that the drm driver leaves the system in a state that could cause subsequent modules to fail loading?
Comment 6 Alex Deucher 2011-04-05 15:47:53 UTC
(In reply to comment #5)
> Is it possible that the drm driver leaves the system in a state that could
> cause subsequent modules to fail loading?

It shouldn't.  Also, the crashes are happening before the radeon driver has even finished loading.  You might want to check if there is a problem with the mtrrs or pat setup on that kernel.  I recall similar issues years ago with toshiba laptops caused by the bios setting up the mtrrs wrong.
Comment 7 Gabor Lenart 2011-04-06 03:36:19 UTC
> You might want to check if there is a problem with the
> mtrrs or pat setup on that kernel.  I recall similar issues years ago with
> toshiba laptops caused by the bios setting up the mtrrs wrong.

Hi, if it helps:

reg00: base=0x0fffe0000 ( 4095MB), size= 128KB, count=1: write-protect
reg01: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
reg02: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg03: base=0x0bff00000 ( 3071MB), size= 1MB, count=1: uncachable
reg04: base=0x100000000 ( 4096MB), size= 1024MB, count=1: write-back

This is with the kernel command line radeon.modeset=0; otherwise I can't boot to a usable system because of this issue. It's a system with 4 GB of RAM and a PAE kernel (to be able to use all of the RAM; I don't want a 64-bit system on this). I've also tried a non-PAE kernel, and the result seems to be the same.
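
The MTRR listing above can be tallied by memory type with a few lines of Python. This is only a rough sketch for reading the table: the function name `mtrr_totals_kb` and the regular expression are illustrative assumptions, and the code does not model how the CPU resolves overlapping ranges.

```python
import re

# Matches one register line of the MTRR listing as pasted in this comment.
MTRR_LINE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)(KB|MB), count=\d+: (\S+)"
)


def mtrr_totals_kb(text):
    """Return {memory type: total size in KB} for an MTRR listing."""
    totals = {}
    for line in text.splitlines():
        match = MTRR_LINE.search(line)
        if not match:
            continue  # skip anything that is not a register line
        size = int(match.group(3))
        if match.group(4) == "MB":
            size *= 1024  # normalise everything to KB
        mem_type = match.group(5)
        totals[mem_type] = totals.get(mem_type, 0) + size
    return totals


LISTING = """\
reg00: base=0x0fffe0000 ( 4095MB), size= 128KB, count=1: write-protect
reg01: base=0x000000000 ( 0MB), size= 2048MB, count=1: write-back
reg02: base=0x080000000 ( 2048MB), size= 1024MB, count=1: write-back
reg03: base=0x0bff00000 ( 3071MB), size= 1MB, count=1: uncachable
reg04: base=0x100000000 ( 4096MB), size= 1024MB, count=1: write-back
"""

totals = mtrr_totals_kb(LISTING)
for mem_type, kb in sorted(totals.items()):
    print(f"{mem_type}: {kb} KB")
```

For this listing the write-back entries add up to 4096 MB, and the 1 MB uncachable range at 3071 MB sits inside the second write-back region.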
Comment 8 Gabor Lenart 2011-04-07 00:25:01 UTC
Although I am not an expert at debugging kernel-level problems, I was keen to try various things, without much success. Please take a look at the LP bug, where I've added more details in the meantime. Any suggestions for what else I can try are very welcome. Thanks!
Comment 9 Alex Deucher 2011-04-07 07:25:35 UTC
You have to load the drm before X has started.  Try the following:
1. Add "radeon.modeset=0 single" to the end of your kernel command line in grub.  That should boot into single user mode without X running.
2. At the console (do not start X) enter the following commands:
modprobe -r radeon
modprobe fbcon
modprobe radeon modeset=1
Comment 10 Gabor Lenart 2011-04-08 04:20:42 UTC
(In reply to comment #9)
> You have to load the drm before X has started.  Try the following:
> 1. Add "radeon.modeset=0 single" to the end of your kernel command line in
> grub.  That should boot into single user mode without X running.
> 2. At the console (do not start X) enter the following commands:
> modprobe -r radeon
> modprobe fbcon
> modprobe radeon modeset=1

OK, I've tried this, with the same result: after the last line (inserting the radeon module with modeset set to 1) the screen goes black and nothing else happens. Pressing the Caps Lock key no longer toggles the Caps Lock LED either. If you'd like, I will set up netconsole before trying this, so I can see whether any kernel log is produced (though I suspect it will be similar to the netconkern.log I've already posted).
Comment 11 Alex Deucher 2011-04-08 07:40:16 UTC
Would it be possible to bisect (or at least try some intermediate kernels) to help narrow down when the problem started?  Can you also try a vanilla non-distro kernel and see if it exhibits the problem?
Comment 12 Gabor Lenart 2011-04-08 07:55:15 UTC
(In reply to comment #11)
> Would it be possible to bisect (or at least try some intermediate kernels) to
> help narrow down when the problem started?  Can you also try a vanilla
> non-distro kernel and see if it exhibits the problem?

Sure, I would like to help. However, I need some hints: what is the oldest kernel I should try first? I am also not sure that Ubuntu works with a vanilla kernel, but I guess that's fine if I only need single user mode and nothing distro-specific (I don't want to reinstall my notebook). The current natty kernel (at least the one on the notebook) is 2.6.38; the previous Ubuntu release used 2.6.35, which seems to work. So should I compile kernels between 2.6.35 and 2.6.38 to find at least the last "main" kernel version which still works? (I guess pinning down the exact problem would then require finding the exact patch that causes it.) It requires some kernel compilation, no doubt :) I will try this, but probably no sooner than next week. Thanks for your suggestion.
Comment 13 Gabor Lenart 2011-04-08 08:16:28 UTC
In the meantime, if it helps, I've diffed the kernel logs of the working maverick kernel and natty's kernel, after grepping only the drm/radeon-specific lines from both (for natty I used the log I could capture with netconsole). Here is the unified diff between them:

-[drm] Initialized drm 1.1.0 20060810
 [drm] radeon defaulting to kernel modesetting.
 [drm] radeon kernel modesetting enabled.
 radeon 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
 radeon 0000:01:00.0: setting latency timer to 64
 [drm] initializing kernel modesetting (RV620 0x1002:0x95C4).
 [drm] register mmio base: 0xD6300000
 [drm] register mmio size: 65536
-[drm] Clocks initialized !
-radeon 0000:01:00.0: VRAM: 256M 0x00000000 - 0x0FFFFFFF (256M used)
-radeon 0000:01:00.0: GTT: 512M 0x10000000 - 0x2FFFFFFF
+radeon 0000:01:00.0: VRAM: 256M 0x0000000000000000 - 0x000000000FFFFFFF (256M used)
+radeon 0000:01:00.0: GTT: 512M 0x0000000010000000 - 0x000000002FFFFFFF
 [drm] Detected VRAM RAM=256M, BAR=256M
 [drm] RAM width 64bits DDR
-[TTM] Zone  kernel: Available graphics memory: 418362 kiB.
-[TTM] Zone highmem: Available graphics memory: 2057984 kiB.
+[TTM] Zone  kernel: Available graphics memory: 418188 kiB.
+[TTM] Zone highmem: Available graphics memory: 2057810 kiB.
 [TTM] Initializing pool allocator.
 [drm] radeon: 256M of VRAM memory ready
 [drm] radeon: 512M of GTT memory ready.
-radeon 0000:01:00.0: irq 47 for MSI/MSI-X
+[drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
+[drm] Driver supports precise vblank timestamp query.
+radeon 0000:01:00.0: irq 46 for MSI/MSI-X
 radeon 0000:01:00.0: radeon: using MSI.
 [drm] radeon: irq initialized.
 [drm] GART: num cpu pages 131072, num gpu pages 131072
 [drm] Loading RV620 Microcode
+radeon 0000:01:00.0: WB enabled
 [drm] ring test succeeded in 0 usecs
 [drm] radeon: ib pool ready.
 [drm] ib test succeeded in 0 usecs
 [drm] Enabling audio support

I don't know if it's useful or not. What I notice is the different irq (46 and 47), but maybe that's normal. By the way, natty freezes just after "Enabling audio support".
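
The comparison described above (strip timestamps, keep only the graphics-related lines, then take a unified diff) can be sketched in Python as follows. This is a minimal illustration, not the commands the reporter actually ran: the keyword pattern, the function names and the two tiny sample logs are assumptions made for the example.

```python
import difflib
import re

# Leading kernel timestamp such as "[   19.143057] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# Assumed filter: keep lines mentioning drm, radeon or TTM.
KEYWORDS = re.compile(r"drm|radeon|TTM")


def graphics_lines(log_text):
    """Drop timestamps and keep only lines that match KEYWORDS."""
    kept = []
    for line in log_text.splitlines():
        line = TIMESTAMP.sub("", line)
        if KEYWORDS.search(line):
            kept.append(line)
    return kept


def diff_logs(old_log, new_log):
    """Unified diff of the filtered lines of two kernel logs."""
    return list(
        difflib.unified_diff(
            graphics_lines(old_log),
            graphics_lines(new_log),
            fromfile="maverick",
            tofile="natty",
            lineterm="",
        )
    )


# Hypothetical miniature logs, only to exercise the functions.
OLD_LOG = """\
[   19.100000] [drm] Loading RV620 Microcode
[   19.200000] eth0: link up
[   19.300000] [drm] ring test succeeded in 0 usecs
"""
NEW_LOG = """\
[   20.100000] [drm] Loading RV620 Microcode
[   20.200000] radeon 0000:01:00.0: WB enabled
[   20.300000] [drm] ring test succeeded in 0 usecs
"""

for line in diff_logs(OLD_LOG, NEW_LOG):
    print(line)
```

Removing the timestamps first matters: otherwise every line differs between two boots and the diff carries no information.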
Comment 14 Gabor Lenart 2011-04-09 01:54:40 UTC
(In reply to comment #11)
> Would it be possible to bisect (or at least try some intermediate kernels) to
> help narrow down when the problem started?  Can you also try a vanilla
> non-distro kernel and see if it exhibits the problem?

I've downloaded kernel 2.6.38.2 from ftp.kernel.org and compiled it. With KMS disabled it worked. With KMS enabled, however, I get the very same problem; I followed your advice to boot the kernel with KMS disabled, then remove and re-insert the radeon module with KMS enabled. I will try other vanilla kernels such as 2.6.37, so hopefully I can see where the problem began.
Comment 15 Gabor Lenart 2011-04-09 03:16:30 UTC
OK, so using only vanilla kernels from ftp.kernel.org: as I said, 2.6.38.2 does not work, but 2.6.37.6 seems to be OK! So I guess the problem was introduced between these versions. I tried to compile 2.6.38, but the compilation fails, so I could not test that kernel version:


  AS      arch/x86/kernel/entry_32.o
arch/x86/kernel/entry_32.S: Assembler messages:
arch/x86/kernel/entry_32.S:1422: Error: .size expression for apf_page_fault does not evaluate to a constant
make[2]: *** [arch/x86/kernel/entry_32.o] Error 1
make[1]: *** [arch/x86/kernel] Error 2
make: *** [arch/x86] Error 2
Comment 16 Alex Deucher 2011-04-09 08:42:19 UTC
At this point it would be easiest to bisect your kernel.  To do that, check out a git tree and then start bisecting.  With bisect, you tell git a known good and a known bad commit, and it then checks out a commit halfway between the two.  You compile and test that commit and then tell git whether it was good or bad ('git bisect good' or 'git bisect bad'), and it continues to bisect until the problematic commit is found.  See:
http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html

# clone the git tree
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
# change to the newly checked out tree
cd linux-2.6
# start bisecting
git bisect start
# specify the bad commit
git bisect bad v2.6.38
# specify the good commit
git bisect good v2.6.37
# ...build/test
# if the commit is good
git bisect good
# if it's bad
git bisect bad
# if there is another problem (won't build/won't boot, etc.)
git bisect skip
# at the end git will point you to the problematic commit.
# to reset your git tree after you are done
git bisect reset
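
The idea behind the procedure above is a binary search. The following Python sketch shows it on a simplified, strictly linear history; real git history is a graph with merges, so git's own commit selection is more involved, and the names `bisect_first_bad` and `is_bad` are made up for this illustration.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in a linear history.

    `commits` is ordered oldest to newest, commits[0] is known good and
    commits[-1] is known bad. `is_bad(commit)` stands for one manual
    build-and-boot test. Returns (first bad commit, number of tests).
    """
    lo, hi = 0, len(commits) - 1  # lo: newest known good, hi: oldest known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi], tests


# Hypothetical history of 100 commits where commit 37 introduced the bug.
history = list(range(100))
first_bad, tests = bisect_first_bad(history, lambda commit: commit >= 37)
print(first_bad, tests)
```

Each test halves the remaining range, so the number of kernels to build grows only with the logarithm of the number of commits between the good and the bad version. A single wrong answer sends the search into the wrong half, which is why each verdict has to be reliable.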
Comment 17 Gabor Lenart 2011-04-09 09:51:09 UTC
Thanks for the URL and mini tutorial. I've heard about git bisect but have never used it before. I am now testing the first commit provided by git bisect. However, I have a question: should I run make mrproper after each bisect step, or does the kernel build handle dependencies automatically, so that after a step I can simply re-issue make bzImage, for example, without any extra step?
Comment 18 Alex Deucher 2011-04-09 10:16:06 UTC
(In reply to comment #17)
> Thanks for the URL and mini tutorial, I've heard about git bisect but I have
> never used before. I am on my way to test the first provided commit by git
> bisect now. However I have a question: should I use make mrproper each time
> after a bisect step or dependency is automatic in linux kernel and after a step
> I can simply re-issue make bzImage for example without any step?

The kernel should take care of the rebuilding as necessary.  If you run into any problems, just run 'make clean'.
Comment 19 Gabor Lenart 2011-04-10 14:22:00 UTC
I was busy with this bisecting, but no luck :( For me, it's totally mysterious how this bisect works: it always provided commits which result in 2.6.37-rcX kernels with a "+" sign at the end. That's odd. Since 2.6.37 _IS_ good, kernels before it (its rc releases) should be OK and git should not even ask about them while bisecting... Anyway, despite my doubts I went through the whole process, and it came up with:

f70f5b9dc74ca7d0a64c4ead3fb28da09dc1b234 is the first bad commit

However, compiling and trying that kernel works, so it's not bad; I can't understand it. It also seems to be a 2.6.37-rc5+ kernel according to uname -a, so how can it be the first bad commit when I started the bisecting process with:

git bisect start
git bisect bad v2.6.38
git bisect good v2.6.37

Bisect log:

git bisect start
# bad: [521cb40b0c44418a4fd36dc633f575813d59a43d] Linux 2.6.38
git bisect bad 521cb40b0c44418a4fd36dc633f575813d59a43d
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect good 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
# bad: [5943a268002fce97885f2ca08827ff1b0312068c] Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 5943a268002fce97885f2ca08827ff1b0312068c
# bad: [9e9bc9736756f25d6c47b4eba0ebf25b20a6f153] Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
git bisect bad 9e9bc9736756f25d6c47b4eba0ebf25b20a6f153
# good: [25860c3bd5bd1db236d4fd5826d76127d677dc28] tipc: recode getsockopt error handling for better readability
git bisect good 25860c3bd5bd1db236d4fd5826d76127d677dc28
# bad: [f70f5b9dc74ca7d0a64c4ead3fb28da09dc1b234] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6
git bisect bad f70f5b9dc74ca7d0a64c4ead3fb28da09dc1b234
# good: [4f00b901d4233a78e6ca4d44c8c6fc5d38a3ee9e] Merge branch 'x86-security-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect good 4f00b901d4233a78e6ca4d44c8c6fc5d38a3ee9e
# good: [4e3dbdb1392a83bd21a6ff8f6bc785495058d37c] cassini: Use local-mac-address prom property for Cassini MAC address
git bisect good 4e3dbdb1392a83bd21a6ff8f6bc785495058d37c
# good: [1928e87bcf185f56008d0746f887b691c1cb8c4a] Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
git bisect good 1928e87bcf185f56008d0746f887b691c1cb8c4a
# good: [c6d92e9b84da2002ee7a75b784834970ddfd3bfd] Merge branch 'msm-usb' into for-next
git bisect good c6d92e9b84da2002ee7a75b784834970ddfd3bfd
# good: [9858a38ea3a940762ae3028cce88f686d0e0c28b] Merge branch 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
git bisect good 9858a38ea3a940762ae3028cce88f686d0e0c28b
# good: [d89ddf0da8f0a140d4dc2e2dbc594fb278e33db5] APBUART: added raw AMBA vendor/device number to match against.
git bisect good d89ddf0da8f0a140d4dc2e2dbc594fb278e33db5
# good: [a020bb17b7046cd97ea6924ca99325b6e516bc2d] sparc: add $BITS to piggyback arguments
git bisect good a020bb17b7046cd97ea6924ca99325b6e516bc2d
# good: [050855887236701c5e7ff803b42265824ce99885] sparc: update copyright in piggyback.c
git bisect good 050855887236701c5e7ff803b42265824ce99885
# good: [b69fc2efc9205d58c820eb2eb1caa6bf873b4b0d] Merge branch 'for-linus' of git://codeaurora.org/quic/kernel/davidb/linux-msm
git bisect good b69fc2efc9205d58c820eb2eb1caa6bf873b4b0d
# good: [09798eb9479da3413bdf96e7d22a84d8b21e05e1] atyfb: Fix bootup hangs on sparc64.
git bisect good 09798eb9479da3413bdf96e7d22a84d8b21e05e1
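
A bisect log like the one above can be summarised mechanically to see which verdicts were given. The sketch below is an illustration only: the helper names are invented, and the sample is a shortened excerpt of the log in this comment.

```python
import re

# One verdict line of a bisect log: "git bisect <verdict> <sha>".
MARK = re.compile(r"^git bisect (good|bad|skip) ([0-9a-f]{7,40})$")


def parse_bisect_log(text):
    """Return the (verdict, sha) pairs in the order they were recorded."""
    marks = []
    for line in text.splitlines():
        match = MARK.match(line.strip())
        if match:
            marks.append((match.group(1), match.group(2)))
    return marks


def last_marked_bad(marks):
    """Return the sha of the most recent 'bad' verdict, or None."""
    bad = [sha for verdict, sha in marks if verdict == "bad"]
    return bad[-1] if bad else None


# Shortened excerpt of the log above (comment lines are ignored).
SAMPLE = """\
git bisect start
# bad: [521cb40b0c44418a4fd36dc633f575813d59a43d] Linux 2.6.38
git bisect bad 521cb40b0c44418a4fd36dc633f575813d59a43d
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect good 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
git bisect bad f70f5b9dc74ca7d0a64c4ead3fb28da09dc1b234
git bisect good 09798eb9479da3413bdf96e7d22a84d8b21e05e1
"""

marks = parse_bisect_log(SAMPLE)
print(last_marked_bad(marks))
```

In the log above, the commit reported as "first bad" (f70f5b9d...) is also the last one that was marked bad, so if that kernel actually works, that verdict is the one to re-check.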
Comment 20 Rafał Miłecki 2011-04-10 14:32:39 UTC
(In reply to comment #19)
> That's odd. Since 2.6.37 _IS_ good, so
> kernels before that (its rc releases) should be ok and should not even asked by
> git on bisecting ....

This happens when a tree based on an older version of the kernel was merged. Judging by the kernel name it looks out of order, but this is fine.

Example: drm-fixes could be based on rc3, but Linus merged it during rc6 stage. In such case you can see rc3 while testing patches that went into rc6.
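
The effect can be reproduced on a toy commit graph. The Python sketch below is a simplified model, not how git stores history: the graph, the commit names `topic-fix` and `merge`, and the helper functions are all hypothetical. It treats the bisect candidates as the commits reachable from the bad version but not from the good one.

```python
def ancestors(parents, commit):
    """Return the commit itself plus everything reachable through parents."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def bisect_candidates(parents, good, bad):
    """Commits reachable from `bad` that are not reachable from `good`."""
    return ancestors(parents, bad) - ancestors(parents, good)


# Toy history: a topic branch forked from 2.6.37-rc5 is merged only
# after 2.6.37 was released.
PARENTS = {
    "v2.6.37-rc5": [],
    "v2.6.37": ["v2.6.37-rc5"],
    "topic-fix": ["v2.6.37-rc5"],      # based on the old rc
    "merge": ["v2.6.37", "topic-fix"],  # merged after the release
    "v2.6.38": ["merge"],
}

candidates = bisect_candidates(PARENTS, "v2.6.37", "v2.6.38")
print(sorted(candidates))
```

Here `topic-fix` is a candidate because it is not part of v2.6.37, even though a kernel built from it still carries the older rc version in its name.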
Comment 21 Michel Dänzer 2011-04-11 03:54:07 UTC
(In reply to comment #19)
> f70f5b9dc74ca7d0a64c4ead3fb28da09dc1b234 is the first bad commit
> 
> However compiling and trying that kernel works, so it's not bad: I can't
> understand.

[...]

> # bad: [f70f5b9dc74ca7d0a64c4ead3fb28da09dc1b234] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6
> git bisect bad f70f5b9dc74ca7d0a64c4ead3fb28da09dc1b234

As you can see, you marked the commit as bad during the bisection. There are a few possible explanations:

* You misclassified the commit and need to redo the bisection at least from that
  point.
* The commit really was bad at the time. This could be because the problem
  doesn't always occur even with kernels containing the bug (making the bisection
  difficult, at least requiring more testing before declaring a kernel good or
  bad), or due to an inconsistent incremental build (in which case running make
  clean before building each test kernel should help).
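
For the intermittent case, a rough feel for "more testing" can be had from a small calculation. The Python sketch below assumes that each boot of a buggy kernel fails independently with a fixed probability; that model and the function name `boots_needed` are assumptions for illustration, and nothing in this report measures such a probability.

```python
import math


def boots_needed(p_fail, confidence):
    """Clean boots needed before calling a kernel good.

    Assumes a buggy kernel fails each boot independently with
    probability `p_fail`. Returns the smallest n such that a buggy
    kernel would survive n boots with probability <= 1 - confidence.
    """
    if not 0 < p_fail < 1:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # (1 - p_fail) ** n <= 1 - confidence, solved for n
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_fail))


# Hypothetical numbers: a bug that shows up on half of the boots.
print(boots_needed(0.5, 0.95))
```

Under these assumptions, a bug that appears on only half of the boots needs five clean boots before a kernel can be called good with 95% confidence, whereas a freeze that happens on every boot needs only one test per kernel.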
Comment 22 Gabor Lenart 2011-04-11 04:40:42 UTC
OK, no choice: I will repeat the whole bisecting process. Unfortunately it's not so trivial: sometimes other problems occurred, and maybe I was too hesitant to use "skip" when I should have. I now always run "make mrproper" before each build, since I sometimes ran into problems even with "make clean". The interesting part is that I noticed odd problems that do not seem to be related to the radeon problem, such as other kernel oopses, after which I see tons of modprobe processes in process state "D". Using a more basic kernel config seems to make them disappear (disabling audio in particular seems to help); with them it's nearly impossible to check the KMS-specific problem because of the other noticeable problems, including strange messages about PnP BIOS, for example... Hopefully that's not related to my problem; otherwise it's hard to tell whether a kernel is good or bad when I am focusing on a single test case: inserting the radeon module with KMS enabled freezes the system (with a black screen).

Btw, this is the PnP BIOS problem I sometimes notice:

[    0.167336] PnPBIOS: Scanning system for PnP BIOS support...
[    0.167754] PnPBIOS: Found PnP BIOS installation structure at 0xc00fe0f0
[    0.167811] PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0xba2f, dseg 0x400
[    0.167869] PNPBIOS fault.. attempting recovery.
[    0.167920] PnPBIOS: Warning! Your PnP BIOS caused a fatal error. Attempting to continue
[    0.167983] PnPBIOS: You may need to reboot with the "pnpbios=off" option to operate stably
[    0.168019] PnPBIOS: Check with your vendor for an updated BIOS
[    0.168073] PnPBIOS: dev_node_info: unexpected status 0x3a
[    0.168124] PnPBIOS: Unable to get node info.  Aborting.

Anyway, I will report back when the bisecting is done (hopefully it will be OK this time). Sorry for disturbing you with my intermediate messages here all the time :)
Comment 24 Gabor Lenart 2011-04-11 10:57:16 UTC
Well, this result looks more sane :) And it's really about radeon, so it makes sense. This is the result of bisecting:

36868bda88b92ce8a9aa8b3ee2e0d1e0de09cc19 is the first bad commit
commit 36868bda88b92ce8a9aa8b3ee2e0d1e0de09cc19
Author: Alex Deucher <alexdeucher@gmail.com>
Date:   Thu Jan 6 21:19:21 2011 -0500

    drm/radeon/kms: parse DCE5 encoder caps when setting up encoders
    
    Needed to tell which DIG encoders are HBR2 capable for DP 1.2.
    
    Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

:040000 040000 3e2fca7e36dd68f455757fab0d303c3ebe9e729f 81431d1b3a7c1ca3f217e4b670131f40aa836b90 M	drivers
Comment 25 Alex Deucher 2011-04-12 18:48:49 UTC
(In reply to comment #24)
> Well, this result looks like more sane :) And it's really about radeon, so it
> makes sense. This is the result of bisecting:
> 
> 36868bda88b92ce8a9aa8b3ee2e0d1e0de09cc19 is the first bad commit
> commit 36868bda88b92ce8a9aa8b3ee2e0d1e0de09cc19
> Author: Alex Deucher <alexdeucher@gmail.com>
> Date:   Thu Jan 6 21:19:21 2011 -0500
> 
>     drm/radeon/kms: parse DCE5 encoder caps when setting up encoders
> 
>     Needed to tell which DIG encoders are HBR2 capable for DP 1.2.
> 
>     Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
>     Signed-off-by: Dave Airlie <airlied@redhat.com>
> 
> :040000 040000 3e2fca7e36dd68f455757fab0d303c3ebe9e729f
> 81431d1b3a7c1ca3f217e4b670131f40aa836b90 M    drivers

This is a duplicate of:
https://bugs.freedesktop.org/show_bug.cgi?id=35502
which is fixed by this patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=97ea530f6fac1f9632b0c4792a2a56411454adbe
Comment 26 Michel Dänzer 2011-04-13 04:45:42 UTC

*** This bug has been marked as a duplicate of bug 35502 ***
