Bug 29389 - [r300g] [bisected] hard locks with polling enabled in 2.6.35
Status: RESOLVED FIXED
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r300
Version: git
Hardware: Other All
Importance: medium normal
Assigned To: Default DRI bug account
Duplicates: 28474 30575
Reported: 2010-08-03 14:17 UTC by Giacomo Perale
Modified: 2010-11-21 05:46 UTC (History)

Attachments
git-bisect.log (2.27 KB, text/plain)
2010-08-03 14:17 UTC, Giacomo Perale
add parameter to disable polling (1.64 KB, patch)
2010-08-19 07:05 UTC, Giacomo Perale
patch to disable polling DACs (1001 bytes, patch)
2010-10-20 18:38 UTC, Dave Airlie
dmesg 2.6.36-rc7 with disabled polling DACs patch, rv380 (37.44 KB, text/plain)
2010-10-24 14:43 UTC, PJBrs

Description Giacomo Perale 2010-08-03 14:17:16 UTC
Created attachment 37557
git-bisect.log

Yesterday I upgraded to kernel 2.6.35 and had a lockup after playing OpenArena for a few seconds. The day before I had played for a few minutes without any issue (testing r300g), so I immediately suspected the kernel.

I'm using a Radeon X550 (rv370) with yesterday's git snapshots of mesa and xf86-video-ati, and xorg-server 1.8.2.

Neither r300c/DRI/UMS nor r300c/DRI2/KMS seems to suffer from the issue, only gallium. Nothing unusual in the logs.

I bisected from 2.6.34 (good) to 2.6.35 (bad) and git bisect pointed to commit eb1f8e4f3be898df808e2dfc131099f5831d491d

eb1f8e4f3be898df808e2dfc131099f5831d491d is the first bad commit
commit eb1f8e4f3be898df808e2dfc131099f5831d491d
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri May 7 06:42:51 2010 +0000

    drm/fbdev: rework output polling to be back in the core. (v4)
    
    After thinking it over a lot it made more sense for the core to deal with
    the output polling especially so it can notify X.
    
    v2: drop plans for fake connector - per Michel's comments - fix X patch sent to xorg-devel, add intel polled/hpd setting, add initial nouveau polled/hpd settings.
    
    v3: add config lock take inside polling, add intel/nouveau poll init/fini calls
    
    v4: config lock was a bit agressive, only needed around connector list reading.
    otherwise it could re-enter.
    
    glisse: discard drm_helper_hpd_irq_event
    
    v3: Reviewed-by: Michel Dänzer <michel@daenzer.net>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

:040000 040000 01a1bf1ae4e06bfd3ae9ae67b5b5059e964f5ae4 041231a5c060e531ce0d8127c6f7abc79c14ce76 M	drivers
:040000 040000 b67fd6698fa239d26ca9fa2296ea2403e1eaaf1c cadb905c6d8647313107790ce8b681f4611ee726 M	include


I had to skip a few revisions because OpenGL didn't work at all in them. For one revision I'm not sure the crash was caused by the same issue (the symptoms were slightly different: it crashed after maybe two minutes instead of 5-30 s), so I'm not certain that this commit is the real culprit.

The bisection run took me a few hours and a lot of reboots, so I'm not exactly eager to try it again...

The full git bisect log is attached.
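
For anyone re-checking a bisection like this one, a minimal sketch (not part of the original report) of how the verdicts in a `git bisect log` file could be tallied, so that skipped or doubtful steps are easy to spot. It assumes the usual log layout, in which each step is recorded as a `git bisect good|bad|skip <sha>` line; the function name and the sample hashes are hypothetical.

```python
import re
from collections import Counter

# Matches the command lines recorded in a bisect log, e.g. "git bisect bad <sha>".
# Comment lines ("# bad: [...]") and "git bisect start" are deliberately ignored.
VERDICT_RE = re.compile(r"^git bisect (good|bad|skip)\s+([0-9a-f]{7,40})\b")


def summarize_bisect_log(text):
    """Return (counts, verdicts) for the good/bad/skip steps found in the log text."""
    verdicts = []
    for line in text.splitlines():
        match = VERDICT_RE.match(line.strip())
        if match:
            verdicts.append((match.group(1), match.group(2)))
    counts = Counter(kind for kind, _ in verdicts)
    return counts, verdicts


# Hypothetical sample; the hashes are made up and are not taken from the attachment.
SAMPLE_LOG = """\
git bisect start
# bad: [aaaaaaa] hypothetical bad tip
git bisect bad aaaaaaa
# good: [bbbbbbb] hypothetical good base
git bisect good bbbbbbb
# skip: [ccccccc] hypothetical revision where OpenGL did not work
git bisect skip ccccccc
git bisect bad ddddddd
"""

counts, verdicts = summarize_bisect_log(SAMPLE_LOG)
print(dict(counts))
for kind, sha in verdicts:
    print(kind, sha)
```

Listing the steps in order makes it straightforward to find the one verdict that was a judgment call and to re-test only that revision instead of repeating the whole run.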
Comment 1 Giacomo Perale 2010-08-19 07:05:58 UTC
Created attachment 37981
add parameter to disable polling

I came back from vacation and found a patch from Chris Wilson on dri-devel that adds a parameter to disable polling. Since my bisection run pointed to the commit that enabled polling as the origin of my problem, I quickly adapted the patch to vanilla 2.6.35 to do some tests.

The patch is attached to this comment. To try it, patch the kernel and boot with drm_kms_helper.poll=0. Be careful: without this parameter X sometimes refused to start here. Since this was only a test I didn't look too hard; I think it's related to the slow work->workqueues conversion.

Anyway, with polling disabled I was able to play OpenArena for about 15 minutes, while with polling enabled (without the patch) I had a hard lock in 30-60 s.

This could be important: I have a VGA/DVI card and I'm using the DVI port, with nothing connected to the VGA port.
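
A minimal sketch (not part of the original comment) for confirming that the option really reached the kernel after a reboot. It only parses a command-line string such as the contents of `/proc/cmdline`; the function name is hypothetical, and it assumes that the last occurrence of the option is the one that takes effect.

```python
def poll_setting_from_cmdline(cmdline):
    """Return the value passed as drm_kms_helper.poll, or None if the option is absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == "drm_kms_helper.poll":
            # Assumption: if the option is repeated, the last occurrence wins.
            value = val
    return value


# Example with a hypothetical command line; on a running system the string
# could be read from /proc/cmdline instead.
example = "root=/dev/sda1 ro drm_kms_helper.poll=0 quiet"
setting = poll_setting_from_cmdline(example)
if setting == "0":
    print("output polling disabled on the command line")
elif setting is None:
    print("drm_kms_helper.poll not set; the kernel default applies")
else:
    print("drm_kms_helper.poll =", setting)
```

A result of None only means the option was not on the command line; it says nothing about whether the running kernel contains the patch.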
Comment 2 Niels Ole Salscheider 2010-08-26 09:09:06 UTC
I can confirm this problem. With polling disabled my computer does not lock up anymore.
Comment 3 Marek Olšák 2010-08-27 21:21:55 UTC
*** Bug 28474 has been marked as a duplicate of this bug. ***
Comment 4 Dave Airlie 2010-10-20 17:27:01 UTC
Can you attach a dmesg for the rv370 card?

Also, which connectors do you have monitors connected to?
Comment 5 Dave Airlie 2010-10-20 18:38:47 UTC
Created attachment 39604
patch to disable polling DACs

does this patch help when booting with polling enabled?
Comment 6 PJBrs 2010-10-24 14:40:13 UTC
I reported the same bug (I think) over here:

https://bugzilla.kernel.org/show_bug.cgi?id=20042

Your latest patch disabling polling DACs works for me. I applied it to 2.6.36-rc7 and then ran Penumbra Overture without any problem. I'm using rv380, not rv370 (according to dmesg). I've included dmesg output as an attachment.

Thanks very much for this patch!
Comment 7 PJBrs 2010-10-24 14:43:21 UTC
Created attachment 39745
dmesg 2.6.36-rc7 with disabled polling DACs patch, rv380
Comment 8 Tomasz Czapiewski 2010-10-25 01:51:53 UTC
I have radeon module compilation errors with this patch applied on 2.6.35:

make: Entering directory `/usr/src/linux-headers-2.6.35-22-generic'                                                                                  
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_drv.o                                                       
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_cp.o                                                        
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_state.o                                                     
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_mem.o                                                       
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_irq.o                                                       
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/r300_cmdbuf.o                                                      
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/r600_cp.o                                                          
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_device.o                                                    
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_asic.o                                                      
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_kms.o                                                       
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_atombios.o                                                  
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_atombios.c: In function ‘radeon_atom_get_hpd_info_from_gpio’:         
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_atombios.c:199: warning: ‘hpd.plugged_state’ is used uninitialized in 
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_agp.o                                                       
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/atombios_crtc.o                                                    
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_combios.o                                                   
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/atom.o                                                             
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_fence.o                                                     
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_ttm.o                                                       
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_object.o                                                    
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_gart.o                                                      
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_legacy_crtc.o                                               
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_legacy_encoders.o                                           
  CC [M]  /usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.o                                                
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.c: In function ‘radeon_vga_detect’:                        
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.c:626: error: ‘force’ undeclared (first use in this functio
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.c:626: error: (Each undeclared identifier is reported only 
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.c:626: error: for each function it appears in.)            
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.c: In function ‘radeon_dvi_detect’:                        
/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.c:814: error: ‘force’ undeclared (first use in this functio
make[1]: *** [/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon/radeon_connectors.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [_module_/usr/src/linux-source-2.6.35/linux-source-2.6.35/drivers/gpu/drm/radeon] Error 2                                                  
make: Leaving directory `/usr/src/linux-headers-2.6.35-22-generic'
Comment 9 Marek Olšák 2010-11-21 02:59:35 UTC
*** Bug 30575 has been marked as a duplicate of this bug. ***
Comment 10 Hicham HAOUARI 2010-11-21 04:17:13 UTC
This bug has been fixed in the upstream kernel (.37-rc1 IIRC), so I think it can be safely closed.

FWIW, it has also been fixed in the Fedora stock kernel 2.6.35-51.fc14.

Thank you Dave
Comment 11 Marek Olšák 2010-11-21 05:46:18 UTC
Thanks for the feedback, closing.