Bug 83616 - System crashes in xonotic with kernel 3.17-rc since linux commit 86302eeadebfab94530b00f5e53a23f911ff41e4
Status: RESOLVED MOVED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2014-09-08 15:34 UTC by Bug
Modified: 2019-11-19 08:55 UTC

Attachments
only enable me/pfp sync evergreen+ (1.21 KB, patch)
2014-09-08 17:18 UTC, Alex Deucher
dmesg-drm-next-3.18-wip.log (49.47 KB, text/plain)
2014-09-08 20:56 UTC, Dieter Nützel
Xorg.0.log-drm-next-3.18-wip (43.28 KB, text/plain)
2014-09-08 20:58 UTC, Dieter Nützel

Description Bug 2014-09-08 15:34:49 UTC
linux git 86302eeadebfab94530b00f5e53a23f911ff41e4 is the first bad commit

Launching xonotic hangs and resets the GPU, and the game's video output looks corrupted. Eventually the system either reboots or can only be recovered via SysRq.


mesa git current
xorg server 1.16
[   414.741] (--) RADEON(0): Chipset: "ATI RV730XT [Radeon HD 4670]" 
                 (ChipID = 0x9490)
Comment 1 Bug 2014-09-08 15:37:52 UTC
commit 86302eeadebfab94530b00f5e53a23f911ff41e4
Author: Christian König <christian.koenig@amd.com>
Date:   Mon Aug 18 16:30:12 2014 +0200

    drm/radeon: Sync ME and PFP after CP semaphore waits v4
    
    Fixes lockups due to CP read GPUVM faults when running piglit on Cape
    Verde.
    
    v2 (chk): apply the fix to R600+ as well, on CIK only the GFX CP has
          a PFP, add more comments to R600 code, enable flushing again
    v3: (agd5f): only apply to 7xx+.  r6xx does not have the packet.
    v4: (agd5f): split flush change into a separate patch, fix formatting
    
    Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Comment 2 Bug 2014-09-08 17:05:51 UTC
It is not fixed in rc4.
Comment 3 Alex Deucher 2014-09-08 17:18:48 UTC
Created attachment 105904
only enable me/pfp sync evergreen+
Comment 4 Bug 2014-09-08 18:05:48 UTC
It didn't solve the bug with my RV730 chip,
but when I commented out the whole if block it did work,
so maybe my chip is from before Cedar.

thanks
Comment 5 Alex Deucher 2014-09-08 18:21:22 UTC
(In reply to comment #4)
> It didn't solve the bug with my RV730 chip,
> but when I commented out the whole if block it did work,
> so maybe my chip is from before Cedar.

Your chip predates Cedar, so the patch prevents that block from executing on your chip, which has the same effect as commenting out the entire block.  Are you sure you tested the right kernel?
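
A minimal sketch of the gating described above, assuming the driver orders chip families in an enum so that a single comparison selects "this generation and newer". The enum is a hypothetical, heavily abbreviated stand-in, and the names chip_family, FAMILY_*, sync_before_patch and sync_after_patch are illustrative rather than the driver's own identifiers. The "before" condition is inferred from the v3 note in the commit message ("only apply to 7xx+"), and the "after" condition from the attachment title ("only enable me/pfp sync evergreen+").

```c
#include <stdbool.h>

/* Hypothetical, abbreviated family list. Only the relative order
 * (R6xx < R7xx < Evergreen) matters for this sketch. */
enum chip_family {
    FAMILY_R600,    /* R6xx: lacks the sync packet per the commit message */
    FAMILY_RV770,   /* first R7xx entry in this simplified list */
    FAMILY_RV730,   /* R7xx part, the reporter's chip */
    FAMILY_CEDAR,   /* first Evergreen part */
    FAMILY_CYPRESS  /* later Evergreen part */
};

/* Assumed gate before the patch: sync ME and PFP on 7xx and newer. */
bool sync_before_patch(enum chip_family family, bool emit_wait)
{
    return emit_wait && family >= FAMILY_RV770;
}

/* Assumed gate after the patch: sync only on Evergreen and newer. */
bool sync_after_patch(enum chip_family family, bool emit_wait)
{
    return emit_wait && family >= FAMILY_CEDAR;
}
```

Under these assumptions an RV730 emits the sync before the patch and skips it afterwards, which is why a correctly patched kernel should behave like one with the block commented out.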
Comment 6 Bug 2014-09-08 20:14:16 UTC
I made a mistake. I tested it again with your patch on a clean 3.17-rc4 and it works now, thanks.
Comment 7 Dieter Nützel 2014-09-08 20:55:07 UTC
This fixes ONE of the bugs Christian and I have been hunting for some days now.
It makes my RV730 AGP (!!!) work with 3.17-rcX/drm-next-3.18-wip again.

It crashed immediately with mplayer -vo vdpau XXX.

But 'allow UVD to use a second 256MB segment' gave no speedup, at least for me.

Maybe it is the 'second' bug (regression) introduced with the switch from 3.16 to 3.17 (drm-next).

[   11.218269] [drm] radeon kernel modesetting enabled.

[   11.218416] checking generic (c0000000 5b0000) vs hw (c0000000 10000000)

[   11.218420] fb: switching to radeondrmfb from VESA VGA
[   11.218475] Console: switching to colour dummy device 80x25
[   11.229222] [drm] initializing kernel modesetting (RV730 0x1002:0x9495 0x174B:0x0028).
[   11.229262] [drm] register mmio base: 0xDFDF0000
[   11.229265] [drm] register mmio size: 65536
[   11.229954] ATOM BIOS: 113
[   11.234817] [drm] AGP mode requested: 8
[   11.234836] agpgart-via 0000:00:00.0: AGP 3.5 bridge
[   11.234866] agpgart-via 0000:00:00.0: putting AGP V3 device into 8x mode
[   11.234946] radeon 0000:01:00.0: putting AGP V3 device into 8x mode
[   11.234951] radeon 0000:01:00.0: GTT: 256M 0xE0000000 - 0xEFFFFFFF
[   11.234959] radeon 0000:01:00.0: VRAM: 1024M 0xA0000000 - 0xDFFFFFFF (1024M used)
[   11.234964] [drm] Detected VRAM RAM=1024M, BAR=256M
[   11.234966] [drm] RAM width 128bits DDR
[   11.238704] [TTM] Zone  kernel: Available graphics memory: 441678 kiB
[   11.238712] [TTM] Zone highmem: Available graphics memory: 1033522 kiB
[   11.238715] [TTM] Initializing pool allocator
[   11.238783] [drm] radeon: 1024M of VRAM memory ready
[   11.238787] [drm] radeon: 256M of GTT memory ready.
[   11.238828] [drm] Loading RV730 Microcode
[   11.239204] [drm] Internal thermal controller with fan control
[   11.261230] [drm] radeon: dpm initialized
[   11.262463] [drm] GART: num cpu pages 65536, num gpu pages 65536

[   11.316570] radeon 0000:01:00.0: (-1) pin WB bo failed
[   11.316582] radeon 0000:01:00.0: f2fb0c00 unpin not necessary
[   11.316601] radeon 0000:01:00.0: disabling GPU acceleration
[   11.369220] radeon 0000:01:00.0: f6064000 unpin not necessary

[   11.433188] [TTM] Finalizing pool allocator
[   11.433309] [TTM] Zone  kernel: Used memory at exit: 0 kiB
[   11.433317] [TTM] Zone highmem: Used memory at exit: 0 kiB
[   11.433320] [drm] radeon: ttm finalized

[   11.433325] [drm] Forcing AGP to PCIE mode

[   11.436109] ATOM BIOS: 113
[   11.436236] radeon 0000:01:00.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
[   11.436241] radeon 0000:01:00.0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
[   11.436245] [drm] Detected VRAM RAM=1024M, BAR=256M
[   11.436248] [drm] RAM width 128bits DDR
[   11.439116] [TTM] Zone  kernel: Available graphics memory: 441678 kiB
[   11.439124] [TTM] Zone highmem: Available graphics memory: 1033522 kiB
[   11.439126] [TTM] Initializing pool allocator
[   11.439204] [drm] radeon: 1024M of VRAM memory ready
[   11.439207] [drm] radeon: 1024M of GTT memory ready.
[   11.439229] [drm] Internal thermal controller with fan control
[   11.446639] [drm] radeon: dpm initialized
[   11.451486] [drm] GART: num cpu pages 262144, num gpu pages 262144

[   11.478175] [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000).
[   11.478272] radeon 0000:01:00.0: WB enabled
[   11.478280] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xf2fc0c00
[   11.478285] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xf2fc0c0c
[   11.522522] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000000005c598 and cpu addr 0xf929c598
[   11.522536] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   11.522538] [drm] Driver supports precise vblank timestamp query.
[   11.522611] [drm] radeon: irq initialized.
[   11.603504] [drm] ring test on 0 succeeded in 1 usecs
[   11.603574] [drm] ring test on 3 succeeded in 1 usecs
[   11.794270] [drm] ring test on 5 succeeded in 1 usecs
[   11.794284] [drm] UVD initialized successfully.
[   11.794561] [drm] ib test on ring 0 succeeded in 0 usecs
[   11.794595] [drm] ib test on ring 3 succeeded in 0 usecs
[   12.445034] [drm] ib test on ring 5 succeeded
[   12.448099] [drm] Radeon Display Connectors
[   12.448111] [drm] Connector 0:
[   12.448113] [drm]   DVI-I-1
[   12.448115] [drm]   HPD2
[   12.448119] [drm]   DDC: 0x7f10 0x7f10 0x7f14 0x7f14 0x7f18 0x7f18 0x7f1c 0x7f1c
[   12.448121] [drm]   Encoders:
[   12.448124] [drm]     CRT2: INTERNAL_KLDSCP_DAC2
[   12.448126] [drm]     DFP2: INTERNAL_UNIPHY1
[   12.448128] [drm] Connector 1:
[   12.448130] [drm]   DVI-I-2
[   12.448132] [drm]   HPD1
[   12.448135] [drm]   DDC: 0x7e20 0x7e20 0x7e24 0x7e24 0x7e28 0x7e28 0x7e2c 0x7e2c
[   12.448137] [drm]   Encoders:
[   12.448138] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[   12.448140] [drm]     DFP1: INTERNAL_UNIPHY
[   12.448142] [drm] Connector 2:
[   12.448143] [drm]   DIN-1
[   12.448145] [drm]   Encoders:
[   12.448147] [drm]     TV1: INTERNAL_KLDSCP_DAC2
[   12.548120] [drm] fb mappable at 0xC045F000
[   12.548129] [drm] vram apper at 0xC0000000
[   12.548131] [drm] size 8294400
[   12.548133] [drm] fb depth is 24
[   12.548135] [drm]    pitch is 7680
[   12.548539] fbcon: radeondrmfb (fb0) is primary device
[   12.549685] Console: switching to colour frame buffer device 240x67
[   12.600307] radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device
[   12.600311] radeon 0000:01:00.0: registered panic notifier
[   12.603112] [drm] Initialized radeon 2.40.0 20080528 for 0000:01:00.0 on minor 0

After bisection, I tried reverting:
drm-radeon-fix-display-handling-in-radeon_gpu_reset.patch

Any ideas, Alex/Christian?
Comment 8 Dieter Nützel 2014-09-08 20:56:41 UTC
Created attachment 105922
dmesg-drm-next-3.18-wip.log
Comment 9 Dieter Nützel 2014-09-08 20:58:08 UTC
Created attachment 105923
Xorg.0.log-drm-next-3.18-wip
Comment 10 Alex Deucher 2014-09-08 21:34:42 UTC
(In reply to comment #7)
> This fixes ONE of the bugs Christian and I have been hunting for some days now.
> It makes my RV730 AGP (!!!) work with 3.17-rcX/drm-next-3.18-wip again.
> 
> It crashed immediately with mplayer -vo vdpau XXX.
> 
> But 'allow UVD to use a second 256MB segment' gave no speedup, at least for
> me.
> 
> Maybe it is the 'second' bug (regression) introduced with the switch from
> 3.16 to 3.17 (drm-next).
> 
[snip]
> 
> After bisection, I tried reverting:
> drm-radeon-fix-display-handling-in-radeon_gpu_reset.patch
> 
> Any ideas, Alex/Christian?

I'm not sure what you are asking.  What issue are you having?
Comment 11 Dieter Nützel 2014-09-08 21:46:41 UTC
(In reply to comment #10)
> (In reply to comment #7)
> > This fixes ONE of the bugs Christian and I have been hunting for some days now.
> > It makes my RV730 AGP (!!!) work with 3.17-rcX/drm-next-3.18-wip again.
> > 
> > It crashed immediately with mplayer -vo vdpau XXX.
> > 
> > But 'allow UVD to use a second 256MB segment' gave no speedup, at least for
> > me.
> > 
> > Maybe it is the 'second' bug (regression) introduced with the switch from
> > 3.16 to 3.17 (drm-next).
> > 
> [snip]
> > 
> > After bisection, I tried reverting:
> > drm-radeon-fix-display-handling-in-radeon_gpu_reset.patch
> > 
> > Any ideas, Alex/Christian?
> 
> I'm not sure what you are asking.  What issue are you having?

Sorry Alex, but is this intended (it has been in the logs since 3.16+, too)?

[   11.316570] radeon 0000:01:00.0: (-1) pin WB bo failed
[   11.316582] radeon 0000:01:00.0: f2fb0c00 unpin not necessary
[   11.316601] radeon 0000:01:00.0: disabling GPU acceleration
[   11.369220] radeon 0000:01:00.0: f6064000 unpin not necessary

[   11.433188] [TTM] Finalizing pool allocator
[   11.433309] [TTM] Zone  kernel: Used memory at exit: 0 kiB
[   11.433317] [TTM] Zone highmem: Used memory at exit: 0 kiB
[   11.433320] [drm] radeon: ttm finalized

[   11.433325] [drm] Forcing AGP to PCIE mode

So, maybe GTT/GART grows too big?
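
For reference, the GTT sizes reported in the two initialization passes of the log follow from the GART page counts, assuming 4 KiB pages. The page size is an assumption (the log does not state it), and gart_pages_to_mib is a hypothetical helper name used only for this sketch.

```c
#include <stdint.h>

/* Assumed GART page size in bytes; the log does not state it. */
#define ASSUMED_PAGE_SIZE 4096u

/* Convert a GART page count to MiB (hypothetical helper). */
uint64_t gart_pages_to_mib(uint64_t num_pages)
{
    return num_pages * ASSUMED_PAGE_SIZE / (1024u * 1024u);
}
```

With these assumptions, gart_pages_to_mib(65536) gives 256 for the AGP pass and gart_pages_to_mib(262144) gives 1024 after the fallback to PCIE mode, matching the "256M of GTT memory ready" and "1024M of GTT memory ready" lines. This only shows that the numbers are consistent; it does not show that the GTT size causes the pin failure.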
Comment 12 Alex Deucher 2014-09-08 21:56:00 UTC
(In reply to comment #11)
> 
> Sorry Alex, but is this intended (it has been in the logs since 3.16+, too)?
> 
> [   11.316570] radeon 0000:01:00.0: (-1) pin WB bo failed
> [   11.316582] radeon 0000:01:00.0: f2fb0c00 unpin not necessary
> [   11.316601] radeon 0000:01:00.0: disabling GPU acceleration
> [   11.369220] radeon 0000:01:00.0: f6064000 unpin not necessary
> 
> [   11.433188] [TTM] Finalizing pool allocator
> [   11.433309] [TTM] Zone  kernel: Used memory at exit: 0 kiB
> [   11.433317] [TTM] Zone highmem: Used memory at exit: 0 kiB
> [   11.433320] [drm] radeon: ttm finalized
> 
> [   11.433325] [drm] Forcing AGP to PCIE mode
> 
> So, maybe GTT/GART grows too big?

AGP should work.  You'll have to bisect.  Anyway, this is not related to this bug.  Please open a new one.
Comment 13 Dieter Nützel 2014-09-19 11:12:49 UTC
Comment on attachment 105922
dmesg-drm-next-3.18-wip.log

My second bug is fixed with:

3840a65 drm/radeon: fix AGP userptr handling
Comment 14 Dieter Nützel 2014-09-19 11:13:14 UTC
Comment on attachment 105923
Xorg.0.log-drm-next-3.18-wip

My second bug is fixed with:

3840a65 drm/radeon: fix AGP userptr handling
Comment 15 Martin Peres 2019-11-19 08:55:16 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/523.

