Bug 82455

Summary: [SPARC] Failed to allocate virtual address for buffer
Product: Mesa Reporter: charlie <407883775>
Component: Drivers/Gallium/radeonsi Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED WONTFIX QA Contact: charlie <407883775>
Severity: normal    
Priority: medium    
Version: git   
Hardware: SPARC   
OS: Linux (All)   
Whiteboard:
Attachments: detail
detail
Convert resource descriptors to little endian on big endian hosts
bigend
Use CPU page size instead of hardcoding 4096
GPU CP LOOKUP

Description charlie 2014-08-11 09:16:57 UTC
I updated the kernel to 3.10.0; the Mesa version is 9.2.5 and our video card is an AMD HD7750.
The system runs very well in text mode, but when I use the startx command to start the graphical desktop, X does not work.
Our team analyzed the problem and read the Mesa code. I think the problem is in Mesa.
Our computer architecture is "sparc".
I saw your commit for this bug on freedesktop.org: http://cgit.freedesktop.org/mesa/mesa/commit/?id=a37835c8eda017f0c955e0927e7418e7f3ba3b73
but Mesa 9.2.5 still has this bug. What can I do? Please give me some pointers. Thanks.
The attachment is the system dmesg.

The debug messages follow:
[dmesg]
 [  355.852288] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  355.853024] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  359.114368] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  359.114399] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  359.433678] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  359.433711] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  359.433891] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  359.433915] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  519.314418] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  519.314453] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  521.768715] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  521.768746] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  522.077890] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  522.077924] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  522.078105] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  522.078129] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  565.368145] radeon 0000:01:00.0: [radeon_gem_va_ioctl, 525]:args->offset: 0x00800000,  argx->flags:22 
[  565.368180] radeon 0000:01:00.0: soffset:0x00800000, eoffset: 0x00802000 
[  565.368201] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00002000
[  565.368221] radeon 0000:01:00.0: tmp->soffset:0x00100000, tmp->eoffset: 0x00200000, size:0x00002000
[  763.251113] radeon 0000:01:00.0: [radeon_gem_va_ioctl, 525]:args->offset: 0x00801000,  argx->flags:22 
[  763.251148] radeon 0000:01:00.0: soffset:0x00801000, eoffset: 0x00803000 
[  763.251169] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00002000
[  763.251189] radeon 0000:01:00.0: tmp->soffset:0x00100000, tmp->eoffset: 0x00200000, size:0x00002000
[  763.251210] radeon 0000:01:00.0: tmp->soffset:0x00800000, tmp->eoffset: 0x00802000, size:0x00002000
[  763.251234] radeon 0000:01:00.0: bo fffff801f3ee2c00 va 0x00000000 conflict with (bo fffff801f3ee3000 0x00800000 0x00802000)
[  763.251855] radeon 0000:01:00.0: [radeon_gem_va_ioctl, 525]:args->offset: 0x00801000,  argx->flags:22 
[  763.251878] radeon 0000:01:00.0: soffset:0x00801000, eoffset: 0x00803000 
[  763.251899] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00002000
[  763.251919] radeon 0000:01:00.0: tmp->soffset:0x00100000, tmp->eoffset: 0x00200000, size:0x00002000
[  763.251939] radeon 0000:01:00.0: tmp->soffset:0x00800000, tmp->eoffset: 0x00802000, size:0x00002000
[  763.251962] radeon 0000:01:00.0: bo fffff801f3ee2c00 va 0x00000000 conflict with (bo fffff801f3ee3000 0x00800000 0x00802000)
[  884.626830] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  884.626864] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  887.207173] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  887.207204] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  887.564451] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  887.564488] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  887.564672] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  887.564696] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  985.395298] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  985.395332] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  987.861353] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  987.861387] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  988.175840] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  988.175873] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
[  988.176054] radeon 0000:01:00.0: soffset:0x00100000, eoffset: 0x00200000 
[  988.176078] radeon 0000:01:00.0: tmp->soffset:0x00000000, tmp->eoffset: 0x00000000, size:0x00100000
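
The conflict in the log above can be checked by hand. In the log, the second request starts at 0x00801000, only 0x1000 after the first mapping at 0x00800000, while each buffer is 0x2000 long. The sketch below is an illustration only: `ranges_overlap` is a hypothetical helper, not the kernel's code, and it assumes the ranges are half-open [soffset, eoffset) intervals.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the radeon kernel driver):
 * returns 1 if the half-open ranges [s1, e1) and [s2, e2) share
 * at least one address, 0 otherwise.
 */
static int ranges_overlap(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
    assert(s1 <= e1 && s2 <= e2);
    return s1 < e2 && s2 < e1;
}
```

With the values from the log, `ranges_overlap(0x00801000, 0x00803000, 0x00800000, 0x00802000)` is true, which matches the "conflict with" message. A request starting at 0x00802000 would not overlap the existing mapping.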
Comment 1 Michel Dänzer 2014-08-11 09:36:54 UTC
> radeon 0000:01:00.0: bo fffff801f3ee2c00 va 0x00000000 conflict with (bo fffff801f3ee3000 0x00800000 0x00802000)

Looks like the Mesa radeon winsys is not assigning GPU virtual memory addresses correctly (0x00000000 is not a valid virtual address).

You might have better luck with that with current Mesa Git master, but note that the Mesa radeonsi driver needs a lot of work before it has any chance of working correctly on big endian hosts.
Comment 2 charlie 2014-08-11 11:47:57 UTC
Michel Dänzer, thanks for your answer. Do you mean that the Mesa radeonsi driver cannot work well on big endian hosts? I am now reading the Mesa code and analyzing the kernel code. The buffer object can be created, but no virtual address space is allocated for it in userspace. I am not clear on GEM/TTM, and the documentation is lacking, so I have to analyze the code.
Comment 3 Christian König 2014-08-11 11:55:55 UTC
(In reply to comment #2)
>            Michel Dänzer thk for your answer ,you means that the Mesa
> radeonsi drivers cannot work well on big endian hosts?

The radeonsi driver currently isn't very well tested on big endian hosts. In theory it should work, but that doesn't mean there aren't a lot of bugs.
Comment 4 charlie 2014-08-12 01:20:37 UTC
>>The radeonsi driver currently isn't very well tested on big endian hosts.
>>In theory it should work, but that doesn't mean there might be a lot of bugs.

I tested the newest Mesa (10.2.5). The problem still exists.
It isn't tested very well on big-endian hosts; does that mean it can work? I have never analyzed the Mesa code. There is a lot of code and no documentation. Could you give me some docs or some ideas? Please give me a direction on how to change this code.
Thanks.

message:
Xorg.log
[  2831.801] (II) LoadModule: "fbdevhw"
[  2831.801] (II) Loading /usr/lib64/xorg/modules/libfbdevhw.so
[  2831.802] (II) Module fbdevhw: vendor="X.Org Foundation"
[  2831.802]    compiled for 1.15.0, module version = 0.0.2
[  2831.803]    ABI class: X.Org Video Driver, version 15.0
[  2831.803] (II) RADEON(0): Creating default Display subsection in Screen section
        "Default Screen Section" for depth/fbbpp 24/32
[  2831.803] (==) RADEON(0): Depth 24, (--) framebuffer bpp 32
[  2831.803] (II) RADEON(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps)
[  2831.803] (==) RADEON(0): Default visual is TrueColor
[  2831.804] (==) RADEON(0): RGB weight 888
[  2831.804] (II) RADEON(0): Using 8 bits per RGB (8 bit DAC)
[  2831.804] (--) RADEON(0): Chipset: "VERDE" (ChipID = 0x683f)
[  2831.805] (II) Loading sub module "dri2"
[  2831.805] (II) LoadModule: "dri2"
[  2831.805] (II) Module "dri2" already built-in
[  2831.805] (II) Loading sub module "glamoregl"
[  2831.805] (II) LoadModule: "glamoregl"
[  2831.806] (II) Loading /usr/lib64/xorg/modules/libglamoregl.so
[  2831.806] (II) Module glamoregl: vendor="X.Org Foundation"
[  2831.806]    compiled for 1.15.0, module version = 0.5.1
[  2831.806]    ABI class: X.Org ANSI C Emulation, version 0.4
[  2831.806] (II) glamor: OpenGL accelerated X.org driver based.
[  2831.915] (EE)
[  2831.915] (EE) Backtrace:
[  2831.916] (EE) 0: /usr/bin/X (0x100000+0x1d0704) [0x2d0704]
[  2831.917] (EE) 1: /lib64/libc.so.6 (0xfffff80100ff0000+0x39d58) [0xfffff80101029d58]
[  2831.917] (EE) 2: /usr/lib64/dri/radeonsi_dri.so (0xfffff80103e4c000+0x3c8778) [0xfffff80104214778]
[  2831.917] (EE) 3: /usr/lib64/dri/radeonsi_dri.so (0xfffff80103e4c000+0x3c9298) [0xfffff80104215298]
[  2831.917] (EE) 4: /usr/lib64/dri/radeonsi_dri.so (0xfffff80103e4c000+0x3c9670) [0xfffff80104215670]
[  2831.918] (EE) 5: /usr/lib64/dri/radeonsi_dri.so (0xfffff80103e4c000+0x45f88) [0xfffff80103e91f88]
[  2831.918] (EE) 6: /usr/lib64/dri/radeonsi_dri.so (0xfffff80103e4c000+0x3e630c) [0xfffff8010423230c]
[  2831.918] (EE) 7: /usr/lib64/dri/radeonsi_dri.so (0xfffff80103e4c000+0x4729c) [0xfffff80103e9329c]
[  2831.918] (EE) 8: /lib64/libgbm.so.1 (0xfffff80102028000+0x4594) [0xfffff8010202c594]
[  2831.919] (EE) 9: /lib64/libgbm.so.1 (_gbm_create_device+0xc4) [0xfffff8010202a964]
[  2831.919] (EE) 10: /lib64/libgbm.so.1 (gbm_create_device+0x6c) [0xfffff8010202a18c]
[  2831.919] (EE) 11: /usr/lib64/xorg/modules/libglamoregl.so (glamor_egl_init+0x88) [0xfffff80101dfe988]
[  2831.919] (EE) 12: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0xfffff801038a0000+0x61ef4) [0xfffff80103901ef4]
[  2831.920] (EE) 13: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0xfffff801038a0000+0x5b2a4) [0xfffff801038fb2a4]
[  2831.920] (EE) 14: /usr/bin/X (InitOutput+0xd50) [0x189b30]
[  2831.921] (EE) 15: /usr/bin/X (0x100000+0x3f880) [0x13f880]
[  2831.921] (EE) 16: /lib64/libc.so.6 (__libc_start_main+0x100) [0xfffff80101013c20]
[  2831.922] (EE) 17: /usr/bin/X (_start+0x2c) [0x128d0c]
[  2831.922] (EE)
[  2831.922] (EE) Segmentation fault at address 0x48
[  2831.923] (EE)
Fatal server error: 

kernel
[ 2865.746929] radeon 0000:0a:00.0: bo fffff801f546ec00 va 0x00000000 conflict with (bo fffff801f546c000 0x00802000 0x00804000)
[ 2865.747488] radeon 0000:0a:00.0: bo fffff801f546ec00 va 0x00000000 conflict with (bo fffff801f546c000 0x00802000 0x00804000)
Comment 5 Michel Dänzer 2014-08-12 01:22:49 UTC
(In reply to comment #3)
> In theory it should work, but that doesn't mean there might be a lot of bugs.

Actually, it most definitely can't work right now. Just for one thing, every memory access by shaders (for constants, resource descriptors, ...) currently can't work correctly on big endian hosts.
Comment 6 Michel Dänzer 2014-08-12 01:26:43 UTC
FWIW, the failure to assign a valid virtual address is probably an endianness bug in the virtual memory related code in src/gallium/winsys/radeon/drm/radeon_drm_bo.c, e.g. in radeon_bomgr_find_va(), or maybe in the corresponding kernel code.
Comment 7 charlie 2014-08-12 01:33:41 UTC
To all:
   Thanks for your help. ^_^
Comment 8 charlie 2014-08-14 09:07:05 UTC
Our system PAGE_SIZE is 8k, but the GPU PAGE_SIZE is 4k. The mesa radeon
Comment 9 charlie 2014-08-15 02:29:53 UTC
Created attachment 104647 [details]
detail

dmesg
Comment 10 charlie 2014-08-15 07:17:46 UTC
Created attachment 104657 [details]
detail

Our system PAGE_SIZE is 8k, but the GPU PAGE_SIZE is 4k. I changed the memory alignment in the Mesa radeonsi driver to 8K, so the virtual address allocation conflict has disappeared. But a new problem has appeared.
The log messages are:
[ 3574.344456] radeon 0000:01:00.0: GPU fault detected: 147 0x01424802
[ 3574.344481] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0002020A
[ 3574.344499] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02048002
[ 3574.344518] VM fault (0x02, vmid 1) at page 131594, read from TC (72)
[ 3574.344543] radeon 0000:01:00.0: GPU fault detected: 147 0x01424802
[ 3574.344563] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[ 3574.344583] radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 3574.344601] VM fault (0x00, vmid 0) at page 0, read from unknown (0)

What is this fault? Can the GPU not be detected?
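
For reference, the fault address register in the log holds a page number, which the kernel prints in decimal: 0x0002020A is page 131594. A minimal sketch of turning that into a byte address follows, assuming the 4K GPU page size mentioned above. `fault_page_to_addr` is a hypothetical helper for illustration, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define GPU_PAGE_SHIFT 12 /* assumption: 4K GPU pages, as stated in this comment */

/*
 * Hypothetical helper: convert the page number reported in
 * VM_CONTEXT1_PROTECTION_FAULT_ADDR into a GPU virtual byte address.
 */
static uint64_t fault_page_to_addr(uint32_t page, unsigned page_shift)
{
    assert(page_shift < 32);
    return (uint64_t)page << page_shift;
}
```

Under that assumption, `fault_page_to_addr(0x0002020A, GPU_PAGE_SHIFT)` gives 0x2020A000 as the faulting GPU virtual address.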
Comment 11 Michel Dänzer 2014-08-19 09:23:57 UTC
Created attachment 104877 [details] [review]
Convert resource descriptors to little endian on big endian hosts

(In reply to comment #10)
> Our system PAGE_SIZE is 8k, but the GPU PAGE_SIZE is 4k. I have changed the
> mesa radeonsi driver memory 8K alignment . So, the virtual address allocate
> conflict disappear .

Please submit the fix to the mesa-dev mailing list.


> But the new problem is come out .

Does this patch help?
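
As an illustration of the idea named in the patch title (this is not the attached patch), the sketch below stores 32-bit descriptor dwords in little endian byte order regardless of the host byte order. `store_le32` and `store_descriptor_le` are hypothetical names, and the descriptor layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store one 32-bit value as little endian bytes on any host. */
static void store_le32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)((v >> 8) & 0xff);
    out[2] = (uint8_t)((v >> 16) & 0xff);
    out[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Hypothetical helper: write a descriptor made of `count` dwords
 * into `out` (4 * count bytes) in little endian order.
 */
static void store_descriptor_le(uint8_t *out, const uint32_t *desc, size_t count)
{
    for (size_t i = 0; i < count; i++)
        store_le32(out + 4 * i, desc[i]);
}
```

Writing byte by byte avoids any dependence on host endianness, so the same code produces identical memory contents on SPARC and on x86.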
Comment 12 Marek Olšák 2014-08-19 09:59:21 UTC
Don't we need to endian-swap the whole IB for big endian?
Comment 13 Christian König 2014-08-19 10:01:52 UTC
(In reply to comment #12)
> Don't we need to endian-swap the whole IB for big endian?

That depends on which engine you use. IIRC the CP, DMA and UVD till SI can do endian swap in hardware, but the support got removed for DMA with CIK.

VCE never could do it.
Comment 14 Alex Deucher 2014-08-19 13:37:39 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > Don't we need to endian-swap the whole IB for big endian?
> 
> That depends on which engine you use. IIRC the CP, DMA and UVD till SI can
> do endian swap in hardware, but the support got removed for DMA with CIK.

SDMA on CIK can still do endian swaps for IBs.
Comment 15 Christian König 2014-08-19 14:16:38 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > (In reply to comment #12)
> > > Don't we need to endian-swap the whole IB for big endian?
> > 
> > That depends on which engine you use. IIRC the CP, DMA and UVD till SI can
> > do endian swap in hardware, but the support got removed for DMA with CIK.
> 
> SDMA on CIK can still do endian swaps for IBs.

Then that was another engine. Anyway, you guys should test without the DMA initially to rule out any problems.

Basically first try to get the most simple use case working, e.g. X with just glxgears.
Comment 16 Michel Dänzer 2014-08-20 01:54:11 UTC
(In reply to comment #15)
> Basically first try to get the most simple use case working, e.g. X with
> just glxgears.

The simplest app I used for the initial radeonsi driver bringup was the Mesa demo egltri_screen.
Comment 17 charlie 2014-08-20 02:43:47 UTC
Created attachment 104930 [details]
bigend

This patch is about the virtual address space conflict. That issue was already fixed in mesa-20140607 (version 10.1.5), but the new Mesa version has a new problem on our machine.
I changed this function:
struct pipe_resource *r600_resource_create_common(struct pipe_screen *screen,
                                                  const struct pipe_resource *templ)
{
        if (templ->target == PIPE_BUFFER) {
                return r600_buffer_create(screen, templ, 8192); // original value is 4096  chen changed 2014-8-19
        } else {
                return r600_texture_create(screen, templ);
        }
}

With this change, the virtual address space conflict disappears.

I am looking at your patch:
https://bugs.freedesktop.org/attachment.cgi?id=104877&action=edit

but I cannot find the "si_update_vertex_buffers" function in our Mesa version.
Thanks for your help. I will run many tests on our machine.
Comment 18 charlie 2014-08-20 03:38:43 UTC
>>anyway you guys should test without the DMA initially to rule out any problems.

How do I do that?

Do I remove the DMA options when I build the kernel?
Comment 19 Marek Olšák 2014-08-20 13:23:23 UTC
(In reply to comment #18)
> >>anyway you guys should test without the DMA initially to rule out any problems.
> 
> How do it ? 
> 
> Is is delete DMA options when i make kernel?

He was talking about the DMA engine on the GPU, which can be disabled by setting this environment variable prior to running any apps:

R600_DEBUG=nodma

It can also be used to disable DMA for a specific app.
Comment 20 Michel Dänzer 2014-08-21 09:35:27 UTC
Created attachment 105024 [details] [review]
Use CPU page size instead of hardcoding 4096

Does this patch instead of your changes fix the virtual address allocation?
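
The idea in the patch title can be sketched as follows. This is not the attached patch: `buffer_alignment` and `align_up` are hypothetical helpers, and the choice to use the larger of the CPU and GPU page sizes is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

#define GPU_PAGE_SIZE 4096u /* GPU page size as stated in this report */

/* Round `value` up to a multiple of `alignment` (must be a power of two). */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/*
 * Hypothetical helper: query the CPU page size at run time instead of
 * hardcoding 4096, and never go below the GPU page size. If sysconf()
 * fails (returns -1), this falls back to the GPU page size.
 */
static uint64_t buffer_alignment(void)
{
    long cpu_page = sysconf(_SC_PAGESIZE);
    if (cpu_page < (long)GPU_PAGE_SIZE)
        return GPU_PAGE_SIZE;
    return (uint64_t)cpu_page;
}
```

On a host with 8K pages, `align_up(0x00801000, 8192)` yields 0x00802000, so consecutive 8K buffers no longer land 4K apart.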

(In reply to comment #17)
> I am seeing your patch .
> https://bugs.freedesktop.org/attachment.cgi?id=104877&action=edit
> 
> but i cannot find the "si_update_vertex_buffers" function in our mesa
> version. 

Please use Mesa Git master. Once we have a set of fixes, we can consider backporting them to stable branches, but they have to be on master first.
Comment 21 charlie 2014-08-27 03:36:32 UTC
Created attachment 105313 [details]
GPU CP LOOKUP

    I pulled the newest Mesa code from git and changed the PAGE_SIZE alignment.
    I disabled all GPU DMA engines with R600_DMA=nodma,
    but the GPU CP locked up.
    This time the ring index is 3. It is a DMA_INDEX_RING.
    
  [  434.621770] radeon 0000:0a:00.0: GPU lockup CP stall for more than 5000msec
[  434.621805] radeon 0000:0a:00.0: GPU lockup (waiting for 0x0000000000000017 last fence id 0x0000000000000016)
[  434.621834] radeon 0000:0a:00.0: scheduling IB failed (-78).
Comment 22 Michel Dänzer 2014-08-27 06:26:45 UTC
(In reply to comment #21)
>     I am git newest mesa code, and change PAGE_SIZE alignment.

So my patch fixes the virtual address allocation failure?

>     I am closed all GPU DMA engine. R600_DMA=nodma.

That should be R600_DEBUG=nodma.

>     but the GPU CP is locked 

What did you try running when this happened?

>     This time the ring index is 3. it isa DMA_INDEX_RING 

Please attach the full dmesg from after the lockup.
Comment 23 charlie 2014-08-27 08:12:01 UTC
>>So my patch fixes the virtual address allocation failure?
  I still had to change the code. Your patch does not work.
>>That should be R600_DEBUG=nodma.
   Sorry, that was a typo. I am actually using R600_DEBUG=nodma.
>>What did you try running when this happened?
   I start X with X:0& and then run xterm. The GPU still locks up; the ring index is 0.

  When I run startxfce4, the GPU locks up and the ring index is 3.
Comment 24 charlie 2014-08-27 08:19:01 UTC
I am only running X, with no other applications. Xorg.0.log does not have any error messages.
Comment 25 Michel Dänzer 2014-08-27 09:13:37 UTC
(In reply to comment #23)
> >>So my patch fixes the virtual address allocation failure?
>   I am must change the code. you patch isn't work .

My 'Use CPU page size instead of hardcoding 4096' patch should encompass both changes you mentioned before. What other changes do you need for that?

> >>What did you try running when this happened?
>    i am using X:0& starting X, and I am runing xterm  The GPU is lockup
> still . ring index is 0

As discussed before, can you try running a simpler app which doesn't need X, e.g. egltri_screen from the mesa/demos repository?
Comment 26 charlie 2014-08-28 02:46:52 UTC
 Some places in the Mesa code still use 4096.
 I set the environment variables EGL_LOG_LEVEL=debug EGL_PLATFORM=drm R600_DEBUG=nodma and ran egltri_screen. No triangle is displayed in the center of the screen. The messages are as follows:

libEGL debug: Native platform type: drm (environment overwrite)
libEGL debug: EGL search path is /usr/lib64/egl
libEGL debug: added egl_dri2 to module array
libEGL debug: the best driver is DRI2
EGL_VERSION = 1.4 (DRI2)
libEGL debug: attribute 0x3033 has an invalid value 0x8
libEGL debug: EGL user error 0x3004 (EGL_BAD_ATTRIBUTE) in eglChooseConfig

EGLUT: failed to choose a config
Comment 27 Michel Dänzer 2014-08-28 03:31:40 UTC
(In reply to comment #26)
>  Some place in mesa code you are still useing 4096 . 

Please be more specific. Do you still need the change you described in comment 17, or something else?


> libEGL debug: attribute 0x3033 has an invalid value 0x8
> libEGL debug: EGL user error 0x3004 (EGL_BAD_ATTRIBUTE) in eglChooseConfig

Looks like maybe your Mesa build picked up a non-Mesa EGL/eglext.h which doesn't pull in eglmesaext.h, so EGL_MESA_screen_surface isn't defined in src/egl/main/eglconfig.c .
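
To illustrate why a missing extension define would produce this error: attribute 0x3033 is EGL_SURFACE_TYPE, and 0x8 is the screen surface bit from eglmesaext.h. The sketch below is a simplified assumption about how such a bitmask check behaves, not the code in eglconfig.c. The `SURF_*` names and `surface_type_is_valid` are hypothetical, and the real mask contains more bits.

```c
#include <assert.h>

/* Local stand-ins for EGL_PBUFFER_BIT, EGL_PIXMAP_BIT, EGL_WINDOW_BIT
 * and EGL_SCREEN_BIT_MESA (values as in the EGL headers). */
#define SURF_PBUFFER_BIT     0x0001
#define SURF_PIXMAP_BIT      0x0002
#define SURF_WINDOW_BIT      0x0004
#define SURF_SCREEN_BIT_MESA 0x0008

/*
 * Hypothetical, simplified validity check for a surface type bitmask.
 * `have_screen_surface` models whether the screen surface extension
 * was compiled in; without it, the screen bit is rejected.
 */
static int surface_type_is_valid(unsigned value, int have_screen_surface)
{
    unsigned mask = SURF_PBUFFER_BIT | SURF_PIXMAP_BIT | SURF_WINDOW_BIT;

    assert(have_screen_surface == 0 || have_screen_surface == 1);
    if (have_screen_surface)
        mask |= SURF_SCREEN_BIT_MESA;
    return (value & ~mask) == 0;
}
```

In this model, the value 0x8 is accepted only when the extension is compiled in, which is consistent with the "invalid value 0x8" message in the log.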
Comment 28 charlie 2014-08-28 13:58:46 UTC
>>Looks like maybe your Mesa build picked up a non-Mesa EGL/eglext.h which doesn't pull in eglmesaext.h, so EGL_MESA_screen_surface isn't defined in src/egl/main/eglconfig.c .

  Sorry, I am not clear about what you are saying. Do you mean that eglmesaext.h (path: include/EGL/eglmesaext.h) needs to include the EGL/eglext.h file?

  I changed that, but it did not help.

  Alternatively, I added a define for EGL_MESA_screen_surface in src/egl/main/eglconfig.c:

  #define  EGL_MESA_screen_surface 1

  The problem was still not solved.
Comment 29 charlie 2014-08-29 01:18:43 UTC
 I am so sorry, I made a mistake. I had configured Mesa without the --enable-gallium-egl option, so I rebuilt Mesa. The egltri_screen program can now run on my system, but the colored triangle is not shown on the screen. There is only a white screen for a few minutes.

The program prints "radeon: The kernel rejected CS, see dmesg for more information", but no message is displayed in dmesg.


The messages follow:
libEGL debug: Native platform type: drm (environment overwrite)
libEGL debug: EGL search path is /usr/lib64/egl
libEGL debug: added /usr/lib64/egl/egl_gallium.so to module array
libEGL debug: added egl_dri2 to module array
libEGL debug: dlopen(/usr/lib64/egl/egl_gallium.so)
libEGL info: use DRM for display (nil)
libEGL debug: the best driver is Gallium
EGL_VERSION = 1.4 (Gallium)
Found 16 modes:
  0: 1920 x 1080
  1: 1680 x 1050
  2: 1280 x 1024
  3: 1280 x 1024
  4: 1440 x 900
  5: 1024 x 768
  6: 1024 x 768
  7: 1024 x 768
  8: 832 x 624
  9: 800 x 600
 10: 800 x 600
 11: 800 x 600
 12: 640 x 480
 13: 640 x 480
 14: 640 x 480
 15: 640 x 480
Will use screen size: 1920 x 1080
radeon: The kernel rejected CS, see dmesg for more information.

My system is 64-bit and big-endian. Does this mean that the radeonsi driver sends a wrong command through the ring? Does it need a big-endian to little-endian conversion?


The eglinfo message:
libEGL debug: Native platform type: drm (environment overwrite)
libEGL debug: EGL search path is /usr/lib64/egl
libEGL debug: added /usr/lib64/egl/egl_gallium.so to module array
libEGL debug: added egl_dri2 to module array
libEGL debug: dlopen(/usr/lib64/egl/egl_gallium.so)
libEGL info: use DRM for display (nil)
libEGL debug: the best driver is Gallium
EGL API version: 1.4
EGL vendor string: Mesa Project
EGL version string: 1.4 (Gallium)
EGL client APIs: OpenGL OpenGL_ES2 
EGL extensions string:
    EGL_MESA_screen_surface EGL_MESA_drm_display EGL_MESA_drm_image
    EGL_WL_bind_wayland_display EGL_KHR_image_base EGL_KHR_reusable_sync
    EGL_KHR_fence_sync EGL_KHR_surfaceless_context
Configurations:
     bf lv colorbuffer dp st  ms    vis   cav bi  renderable  supported
  id sz  l  r  g  b  a th cl ns b    id   eat nd gl es es2 vg surfaces 
---------------------------------------------------------------------
0x01 32  0  8  8  8  8  0  0  0 0 0x00--      a  y     y     pb,scrn
0x02 32  0  8  8  8  8 16  0  0 0 0x00--      a  y     y     pb,scrn
0x03 32  0  8  8  8  8 24  8  0 0 0x00--      a  y     y     pb,scrn
0x04 32  0  8  8  8  8 24  0  0 0 0x00--      a  y     y     pb,scrn
Number of Screens: 1

Screen 0 Modes:
  id  width height refresh  name
-----------------------------------------
0x01  1920   1080   60.000  1920x1080
0x02  1680   1050   60.000  1680x1050
0x03  1280   1024   75.000  1280x1024
0x04  1280   1024   60.000  1280x1024
0x05  1440    900   60.000  1440x900
0x06  1024    768   75.000  1024x768
0x07  1024    768   70.000  1024x768
0x08  1024    768   60.000  1024x768
0x09   832    624   75.000  832x624
0x0a   800    600   75.000  800x600
0x0b   800    600   72.000  800x600
0x0c   800    600   60.000  800x600
0x0d   640    480   75.000  640x480
0x0e   640    480   73.000  640x480
0x0f   640    480   67.000  640x480
0x10   640    480   60.000  640x480

When I run the eglkms program, the GPU locks up:

libEGL debug: Native platform type: drm (environment overwrite)
libEGL debug: EGL search path is /usr/lib64/egl
libEGL debug: added /usr/lib64/egl/egl_gallium.so to module array
libEGL debug: added egl_dri2 to module array
libEGL debug: dlopen(/usr/lib64/egl/egl_gallium.so)
libEGL info: use DRM for display 0x206010
libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize(no usable display)

libEGL debug: the best driver is DRI2
EGL_VERSION = 1.4 (DRI2)
radeon: The kernel rejected CS, see dmesg for more information.
radeon: The kernel rejected CS, see dmesg for more information.
handle=43, stride=7680

dmesg:


[  543.035612] radeon 0000:0a:00.0: GPU lockup CP stall for more than 2000msec
[  543.063462] radeon 0000:0a:00.0: GPU lockup (waiting for 0x0000000000000009 last fence id 0x0000000000000008)
[  543.063513] [drm] Disabling audio 0 support
[  543.063526] [drm] Disabling audio 1 support
[  543.063538] [drm] Disabling audio 2 support
[  543.063549] [drm] Disabling audio 3 support
[  543.063560] [drm] Disabling audio 4 support
[  543.063572] [drm] Disabling audio 5 support
[  543.063746] radeon 0000:0a:00.0: sa_manager is not empty, clearing anyway
[  543.790849] radeon 0000:0a:00.0: Saved 29 dwords of commands on ring 0.
[  543.791009] radeon 0000:0a:00.0: GPU softreset: 0x00000049
[  543.791030] radeon 0000:0a:00.0:   GRBM_STATUS               = 0xB3523028
[  543.791052] radeon 0000:0a:00.0:   GRBM_STATUS_SE0           = 0x2D800006
[  543.791074] radeon 0000:0a:00.0:   GRBM_STATUS_SE1           = 0x2D000006
[  543.791092] radeon 0000:0a:00.0:   SRBM_STATUS               = 0x200000C0
[  543.791220] radeon 0000:0a:00.0:   SRBM_STATUS2              = 0x00000000
[  543.791238] radeon 0000:0a:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  543.791256] radeon 0000:0a:00.0:   R_008678_CP_STALLED_STAT2 = 0x40000000
[  543.791275] radeon 0000:0a:00.0:   R_00867C_CP_BUSY_STAT     = 0x00008000
[  543.791293] radeon 0000:0a:00.0:   R_008680_CP_STAT          = 0x80228647
[  543.791312] radeon 0000:0a:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[  543.791331] radeon 0000:0a:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[  543.791350] radeon 0000:0a:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
[  543.791369] radeon 0000:0a:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[  544.484393] radeon 0000:0a:00.0: GRBM_SOFT_RESET=0x0000DDFF
[  544.484468] radeon 0000:0a:00.0: SRBM_SOFT_RESET=0x00000100
[  544.485647] radeon 0000:0a:00.0:   GRBM_STATUS               = 0x00003028
[  544.485668] radeon 0000:0a:00.0:   GRBM_STATUS_SE0           = 0x00000006
[  544.485687] radeon 0000:0a:00.0:   GRBM_STATUS_SE1           = 0x00000006
[  544.485705] radeon 0000:0a:00.0:   SRBM_STATUS               = 0x200000C0
[  544.485832] radeon 0000:0a:00.0:   SRBM_STATUS2              = 0x00000000
[  544.485851] radeon 0000:0a:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
[  544.485869] radeon 0000:0a:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
[  544.485887] radeon 0000:0a:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
[  544.485906] radeon 0000:0a:00.0:   R_008680_CP_STAT          = 0x00000000
[  544.485924] radeon 0000:0a:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
[  544.485943] radeon 0000:0a:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
[  544.486089] radeon 0000:0a:00.0: GPU reset succeeded, trying to resume
[  544.499083] [drm] probing gen 2 caps for device 10b5:8648 = 838cd02/0
[  544.499102] [drm] PCIE gen 2 link speeds already enabled
[  544.517700] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
[  544.517913] radeon 0000:0a:00.0: WB enabled
[  544.517939] radeon 0000:0a:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xfffff80008880c00
[  544.517961] radeon 0000:0a:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xfffff80008880c04
[  544.517984] radeon 0000:0a:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xfffff80008880c08
[  544.518005] radeon 0000:0a:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xfffff80008880c0c
[  544.518027] radeon 0000:0a:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xfffff80008880c10
[  544.532531] radeon 0000:0a:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0x000000ca10075a18
[  544.767646] [drm] ring test on 0 succeeded in 1 usecs
[  544.767669] [drm] ring test on 1 succeeded in 1 usecs
[  544.767689] [drm] ring test on 2 succeeded in 1 usecs
[  544.767773] [drm] ring test on 3 succeeded in 2 usecs
[  544.767803] [drm] ring test on 4 succeeded in 2 usecs
[  544.945163] [drm] ring test on 5 succeeded in 2 usecs
[  544.945186] [drm] UVD initialized successfully.
[  544.945245] [drm] Enabling audio 0 support
[  544.945258] [drm] Enabling audio 1 support
[  544.945270] [drm] Enabling audio 2 support
[  544.945281] [drm] Enabling audio 3 support
[  544.945292] [drm] Enabling audio 4 support
[  544.945304] [drm] Enabling audio 5 support
[  544.945388] [drm] ib test on ring 0 succeeded in 0 usecs
[  544.945487] [drm] ib test on ring 1 succeeded in 0 usecs
[  544.945573] [drm] ib test on ring 2 succeeded in 0 usecs
[  544.945637] [drm] ib test on ring 3 succeeded in 0 usecs
[  544.945699] [drm] ib test on ring 4 succeeded in 0 usecs
Comment 30 charlie 2014-08-29 01:19:37 UTC
>>It is only a white screen for a few minutes.

Correction: it is only a white screen for a few seconds.
Comment 31 Michel Dänzer 2014-09-01 06:43:09 UTC
(In reply to comment #29)
> radeon: The kernel rejected CS, see dmesg for more information.
> 
> My system is 64bits and big-endian. It is means that the radeinsi driver
> send a error commond using ring?  It need big-endian to little-endian?

That shouldn't be necessary, the kernel sets up the hardware to byte-swap the ring/indirect buffer commands on the fly.

Basically, you need to find out why the radeon_cs_ioctl() function in the kernel returns a non-0 value. Enabling DRM debugging output via /sys/module/drm/parameters/debug may help, or you may have to add more debugging printks to narrow down where it's coming from.
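
A minimal sketch of enabling that debugging output from C is shown below. `write_uint_param` is a hypothetical helper; the value to write is treated by the kernel as a bitmask whose bits depend on the kernel version, so any specific value is an assumption. Writing to the file requires root.

```c
#include <assert.h>
#include <stdio.h>

/* Path named in the comment above. */
#define DRM_DEBUG_PARAM "/sys/module/drm/parameters/debug"

/*
 * Hypothetical helper: write an unsigned value to a parameter file.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
static int write_uint_param(const char *path, unsigned value)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fprintf(f, "%u\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, `write_uint_param(DRM_DEBUG_PARAM, 0x2)` would request one category of debug messages under that assumption; the resulting output appears in dmesg.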
Comment 32 charlie 2014-09-15 08:25:01 UTC
>>radeon: The kernel rejected CS, see dmesg for more information
This error shows up in the drm_ioctl function. The relevant part of the code is as follows:

                if (cmd & IOC_OUT) { // error location
                        printk(KERN_EMERG "cmd & IOC_OUT\n");
                        if (copy_to_user((void __user *)arg, kdata,
                                         usize) != 0){
                                printk(KERN_EMERG "cmd & IOC_OUT error\n");
                                retcode = -EFAULT;
                        }
                }

copy_from_user does not fail, so the error is caused by a parameter.
Or does PAGE_SIZE cause it?
Comment 33 Michel Dänzer 2014-09-17 06:57:15 UTC
(In reply to comment #32)
> This error is showed in the drm_ioctl function.

No, as I said in comment 31, it comes from somewhere inside radeon_cs_ioctl() AFAICT.
