Bug 110897 - HyperZ is broken for r300 (bad z for some micro and macrotiles?)
Summary: HyperZ is broken for r300 (bad z for some micro and macrotiles?)
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/r300
Version: git
Hardware: Other other
Importance: medium normal
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
 
Reported: 2019-06-11 18:03 UTC by Richard Thier
Modified: 2019-09-18 18:55 UTC
CC: 3 users



Attachments
first screenshot (still not completely ruined zbuffer) (61.06 KB, image/png)
2019-06-11 18:03 UTC, Richard Thier
Second screenshot (visible tile boundary - zbuffer is completely wasted below) (37.93 KB, image/png)
2019-06-11 18:07 UTC, Richard Thier
Error gone patch - but it is slow (1.41 KB, patch)
2019-06-11 18:18 UTC, Richard Thier
Added logging - magically works but slow from logging! (3.64 KB, patch)
2019-06-12 00:43 UTC, Richard Thier
Log output for "added logging" patch (132.82 KB, text/plain)
2019-06-12 00:54 UTC, Richard Thier
good HyperZ glxgears (114.30 KB, image/png)
2019-06-12 17:51 UTC, cosiekvfj
bigger glxgears window (159.50 KB, image/png)
2019-06-12 21:04 UTC, cosiekvfj
should not affect anything patch (1.56 KB, patch)
2019-06-13 15:35 UTC, Richard Thier
Really no cmask ram it seems (309.05 KB, image/png)
2019-06-14 15:11 UTC, Richard Thier
Working hack / quickfix (1.29 KB, patch)
2019-06-14 17:20 UTC, Richard Thier
Proof screenshot that it works now (113.38 KB, image/png)
2019-06-14 17:26 UTC, Richard Thier
grepping around init functions in kernel / drm (9.86 KB, text/plain)
2019-06-15 15:15 UTC, Richard Thier
dmesg log (44.32 KB, text/plain)
2019-06-16 09:05 UTC, cosiekvfj
Fix variant 1 (delegate to r300 init) (2.43 KB, patch)
2019-06-16 17:22 UTC, Richard Thier
Fix variant 2 (special case pipe number for 0x5a62) (1.21 KB, patch)
2019-06-16 17:41 UTC, Richard Thier
Added back rs400_mc_wait_for_idle - maybe final patch? (2.46 KB, patch)
2019-06-17 19:40 UTC, Richard Thier
Sent patch (2.52 KB, patch)
2019-06-17 21:48 UTC, Richard Thier

Description Richard Thier 2019-06-11 18:03:29 UTC
Created attachment 144505 [details]
first screenshot (still not completely ruined zbuffer)

Hello everyone!

I went on and tried RADEON_HYPERZ=1 with my r300 card and I see bad glitches - while at the same time getting elevated performance. See the attached screenshot(s).

This affects every application, even the simplest ones like glxgears.

The top of the screen always renders properly, but around 25% of the way down the screen it starts to break down and I can see tiles where things seem to have a really bad z-value.

What is also interesting is that [b]I feel the z-clear is the operation that is going wrong[/b]! I am pretty sure of this because in the first few frames of glxgears I can see nearly all the gears, and as the gears turn I see less and less of them - it feels as if whenever some pixel gets rendered, its place cannot be used anymore - or likely cannot be used! If I turn the gears a few times around the Y axis, the bottom 2/3 of the screen just becomes completely dark after a while.

If I exit glxgears - or whatever app is being tested - and restart it from the terminal, however, everything is immediately wrong! So [b]the problem persists between multiple runs of the same program with the same sized window[/b], which also hints that the z-buffer is never properly (or at all) cleared!

BUT [b]resizing the window immediately fixes the current frame[/b] with seemingly proper Z-values, and if I keep resizing I see constant flickering - but a much clearer image. I think the resize operation triggers some resize in the buffers that cleans them properly, but within the very first second things go wrong again.

Also while resizing the window I saw that [b]there is no straight horizontal cut above which things are good and below which things are bad - instead the boundary follows the (macro?)tiles, counted from the top-left corner![/b] So basically I can see the side of one of the macrotiles. I tried to capture this with a screenshot, but it is not easy to resize that way. See the second screenshot, which has nothing at the bottom, but you can see the cut and the left side of the tile where things first went wrong.

Also the order in which the tiles go wrong is not always linear, but the first ones always work - proceeding from the top-left just like pixels.

I am trying to use documentation that I have found here:
http://renderingpipeline.com/graphics-literature/low-level-gpu-documentation/

Of course the r300 register docs should be good I hope, but I started reading through the r500_acceleration docs as it seems much of it applies to all r300 cards. Am I right that these are the best sources so far?

To be honest I think the fast z-clear may be the problem, badly configured to only clear the top few tiles of the screen or something similar. The tiles are approximately 32x32 or 16x16, but surely not just 1-2 pixels as they are clearly visible to the naked eye (see the second attached screenshot).

I have just barely started my analysis, so I still have a lot of directions to take and the docs (if they are good) are really helpful at least! I did not know about them so far!!!

Currently playing around the code to see if I can help the problem disappear.

This likely never worked. I do not know of any version where this worked on my machine, but of course I cannot completely rule it out.
Comment 1 Richard Thier 2019-06-11 18:07:16 UTC
Created attachment 144510 [details]
Second screenshot (visible tile boundary - zbuffer is completely wasted below)

Added the second screenshot. You can see here the tile where the blue gear is. That tile starts at the visible line, and above that line the z-buffer is always good.

Here I am a few seconds in, so the whole z-buffer has become unusable at this point, and that is why things are black. In games that have a skybox I tend to see the skybox or far-away things instead of black!

Resolution is 1024x768 so you can even measure the size of the tile. I think it is 32x32 or similar.
Comment 2 Richard Thier 2019-06-11 18:18:53 UTC
Created attachment 144512 [details] [review]
Error gone patch - but it is slow

If I understand it correctly, HyperZ can be owned by only one process / owner, and the ownership is transferred in the r300_blit.c file when clearing the buffers.

I guess this is why closing an app that has HyperZ can transfer it to another one without a restart. I have also figured out that this is the place where buffer clearing takes place, so I played around a bit here.

Other relevant files I have found are these:

[code]
src/gallium/drivers/r300/r300_context.c
src/gallium/drivers/r300/r300_emit.c
 Update is in here - this sends stuff to card according to how docs say it: 
src/gallium/drivers/r300/r300_hyperz.c
 This is where hyperZ gets first activated:
src/gallium/drivers/r300/r300_blit.c
src/gallium/drivers/r300/r300_context.h
 Register naming (closely resembles r300 and 500 docs):
src/gallium/drivers/r300/r300_reg.h
[/code]

See the attachment for what I am trying. With this attachment I do not see the issue anymore. I made this change just by looking at what the code does and how it sets the zmask_clear and hiz_clear variables. As you can see, first I tried setting them myself, but later I just commented out the part you see.

Interestingly there is a performance drop now - I mean the changed build WITH hiz enabled via the environment variable is slower than unchanged mesa without HYPERZ enabled. Does the fast clear or something still run even if the flag is not added?

I am not sure what I am actually changing by commenting out that part, but I will look into the HyperZ update function to learn what is being sent to the card registers. I can see the registers and find them in the docs, but I am sometimes puzzled. Does the API between the kernel and user level send the enable-HyperZ bit in one var and other parts of the same register in a different var, for example? I can figure it out on my own I guess, but I need to compare the kernel and mesa sides of things, and maybe someone just knows.

This is how far I have gotten so far - but I have still only just started.
Comment 3 Richard Thier 2019-06-11 23:21:26 UTC
Hmmmm... I must have made a measurement error, as looking at the code my small patch cannot be slower than the unpatched code when not using HYPERZ...
Comment 4 cosiekvfj 2019-06-11 23:26:32 UTC
Hello you all! A couple of years ago I did some HyperZ tests:
https://bugs.freedesktop.org/show_bug.cgi?id=37724
Maybe you will find this link useful!
Comment 5 Richard Thier 2019-06-11 23:56:22 UTC
Hi!

Was it looking similar? Was it solved for your case?

Btw I am just starting to get insights into what the hardware does, but I might need to work a bit in the vineyard because of the good weather, so if I am away for a day or two it does not mean I am stuck on the problem.

I think I will add extra logs soon to have a grasp about what is going on.
Comment 6 Richard Thier 2019-06-12 00:43:25 UTC
Created attachment 144514 [details] [review]
Added logging - magically works but slow from logging!

This is what I am running with now. There are no other changes: I put back the earlier commented-out code part, and things work anyway, even though all I added was logging....

pffff, maybe it is working because of some weird waiting that the logging introduces...

I will attach the generated log, because things are not going in the direction I was hoping for: I expected other code paths to be taken when RADEON_HYPERZ=1 is on.
Comment 7 Richard Thier 2019-06-12 00:54:26 UTC
Created attachment 144515 [details]
Log output for "added logging" patch

As you can see, the "KUL-D" and "KULAKVA-n" log messages never get printed, but KUL-A does, and now even "KUL-C" for some reason.

I expected us to emit the flush, zmask clear and/or hiz clear, but it seems we never reach that code path.

I am wondering what the value of the "buffer" variable might be here; all I know is that it is not empty, but I think the code is written expecting it to be, and only does the emit operations if it is empty. Maybe a new kind of buffer was added and the code was not changed, or who knows... then it could actually be a regression... but then I have no idea why it would work for anyone...

But look at the log and the patch. For me it is weird what is happening.
Comment 8 Richard Thier 2019-06-12 01:13:10 UTC
I think I see a bug happening here and we are not emitting the relevant things.

A bit tired now for the night...
Comment 9 Richard Thier 2019-06-12 01:22:39 UTC
So these never seem to happen at any time:

        /* Emit clear packets. */
        r300_emit_gpu_flush(r300, r300->gpu_flush.size, r300->gpu_flush.state);
        r300->gpu_flush.dirty = FALSE;

        if (r300->zmask_clear.dirty) {
            fprintf(stderr, "KUL-AKVA\n");
            r300_emit_zmask_clear(r300, r300->zmask_clear.size,
                                  r300->zmask_clear.state);
            r300->zmask_clear.dirty = FALSE;
        }   
        if (r300->hiz_clear.dirty) {
            fprintf(stderr, "KUL-AKVA2\n");
            r300_emit_hiz_clear(r300, r300->hiz_clear.size,
                                r300->hiz_clear.state);
            r300->hiz_clear.dirty = FALSE;
        }   
        if (r300->cmask_clear.dirty) {
            r300_emit_cmask_clear(r300, r300->cmask_clear.size,
                                  r300->cmask_clear.state);
            r300->cmask_clear.dirty = FALSE;
        }

At least not with glxgears. But I get the impression that the intention of the code is to end up there, as earlier we mark things dirty:

            /* Setup Hyper-Z clears. */
            if (r300->hyperz_enabled) {
                if (zmask_clear) {
                    hyperz_dcv = hyperz->zb_depthclearvalue =
                        r300_depth_clear_value(fb->zsbuf->format, depth, stencil);
                    r300_mark_atom_dirty(r300, &r300->zmask_clear);
                    r300_mark_atom_dirty(r300, &r300->gpu_flush);
                    buffers &= ~PIPE_CLEAR_DEPTHSTENCIL;
/* FIXME: REMOVE KUL* LOGS: */
                    fprintf(stderr, "KUL-A\n");
                }

                if (hiz_clear) {
                    r300->hiz_clear_value = r300_hiz_clear_value(depth);
                    r300_mark_atom_dirty(r300, &r300->hiz_clear);
                    r300_mark_atom_dirty(r300, &r300->gpu_flush);
                }
                r300->num_z_clears++;
            }

Looking at mark_atom_dirty, it seems to set the flag. I do not know who clears it, but it got cleared. Also, I printed the value of "buffers" where we enter the "KUL-C" code path (according to my log scheme) and the value is buffers=4, which is the first (0th) color buffer. It is not some new kind of buffer; above this check I see only one place where this could get zeroed, and maybe that is not happening for this card while it does for others.

What bugs me is that the dirtiness of zmask_clear and hiz_clear also goes away somewhere, but I am too tired to see where. I just wanted to write down everything so far and at least provide the logs and insights.
Comment 10 cosiekvfj 2019-06-12 17:33:25 UTC
>Was it looking similar? Was it solved for your case?

I didn't report that bug. Someone just wrote in that thread that HyperZ was not enabled due to lack of testing, so I ran some piglit tests. :)

I just want to warn you that there may be some more bugs in the r300 driver. For example: https://bugs.freedesktop.org/show_bug.cgi?id=98869 A workaround was put in place to resolve this bug (so no proper fix for the underlying issue; see this mailing list conversation: https://lists.freedesktop.org/archives/mesa-dev/2017-February/143980.html). But later I found out that after some upgrades the game started crashing. I never really figured out why, and didn't test it after that as I had already finished that game :)
https://bugs.freedesktop.org/show_bug.cgi?id=101382

I'm glad for your r300 work! :)
Comment 11 cosiekvfj 2019-06-12 17:51:47 UTC
Created attachment 144523 [details]
good HyperZ glxgears

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: X.Org R300 Project (0x1002)
    Device: ATI RC410 (0x5a62)
    Version: 19.0.6
    Accelerated: yes
    Video memory: 128MB
    Unified memory: no
    Preferred profile: compat (0x2)
    Max core profile version: 0.0
    Max compat profile version: 2.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 2.0
OpenGL vendor string: X.Org R300 Project
OpenGL renderer string: ATI RC410
OpenGL version string: 2.1 Mesa 19.0.6
OpenGL shading language version string: 1.20
Comment 12 Richard Thier 2019-06-12 20:18:30 UTC
Hi Cosiek!

What card is that (lspci output maybe)? Is HyperZ just good without any changes to stock mesa? Your card also seems to be reported as an RC410 like mine, but you get much, much more FPS for some reason. Is this also a laptop?

> I just want to warn you that there may be some more bugs in r300 driver.

I also know about one more that specifically affects the "Total War" games - both Rome: Total War and Medieval: Total War 2. I see very bad "tiling", but from the size of the bad tiles I suspect it is some kind of texture format or something similar going wrong. I haven't even started analyzing it, because it is better to do one thing at a time... It is good that you point me towards this issue too - who knows, maybe it is the one affecting the game?

Also, I have only now found out how "easy" it is to gdb the userland driver! All I need is the proper function names, and I can set breakpoints on them easily after answering yes when gdb asks whether it should set the breakpoint once the shared object is loaded (since it does not find the function yet)... Before this I just used printfs and the like... Both have their use cases of course.
Comment 13 cosiekvfj 2019-06-12 20:55:42 UTC
01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RC410M [Mobility Radeon Xpress 200M] (prog-if 00 [VGA controller])
	Subsystem: Packard Bell B.V. RC410M [Mobility Radeon Xpress 200M]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 66 (2000ns min), Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 17
	Region 0: Memory at d0000000 (32-bit, prefetchable) [size=256M]
	Region 1: I/O ports at 9000 [size=256]
	Region 2: Memory at c0000000 (32-bit, non-prefetchable) [size=64K]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Kernel driver in use: radeon
	Kernel modules: radeon
Comment 14 cosiekvfj 2019-06-12 21:04:30 UTC
Created attachment 144524 [details]
bigger glxgears window

>Is HyperZ just good without any changes to stock mesa?
yes, mesa is from manjaro repo, and I think that it's the same build as in arch repo.

>Your card seems to be also reported as RC410 like mine, but you have much-much more FPS for some reason.
It's because window size of glxgears ;) We could test that if you want. But we need to agree on window size ;)

>Is this also a laptop?
Yes.

I also tried resizing the window to try to get some artifacts. Apart from some flickering during resizing, the image was good. The flickering was also present without HyperZ.
Comment 15 Richard Thier 2019-06-13 08:56:02 UTC
(In reply to cosiekvfj from comment #14)
> Created attachment 144524 [details]
> bigger glxgears window
> 
> >Is HyperZ just good without any changes to stock mesa?
> yes, mesa is from manjaro repo, and I think that it's the same build as in
> arch repo.
> 
> >Your card seems to be also reported as RC410 like mine, but you have much-much more FPS for some reason.
> It's because window size of glxgears ;) We could test that if you want. But
> we need to agree on window size ;)
> 
> >Is this also a laptop?
> Yes.

Can you share lspci and lshw output?

> I also tried to resize window to try to get some artifacts. Apart from some
> flickering during resizing, image was good. Flickering was also present
> without HyperZ.

Resizing only helped against the problem in my case, so I would not expect flickering when resizing anyway; in my case it was "at least" flickering instead of not rendering properly at all.
Comment 16 Richard Thier 2019-06-13 09:00:57 UTC
Btw I now have no problems at all... This is weird... All I did was remove my printfs, so the code is actually stock mesa, and I both see the speed gain and there are no problems whatsoever...

But I rebooted and started glxgears right away with HYPERZ enabled and no other programs. After that I could start anything: browser, tuxracer, ...

I have tried shutting down my machine, then starting glxgears without HYPERZ and then with it, and it still worked...

But now I am compiling with "-O0", as I thought that was a good idea if I plan to use gdb.

I am still not sure what fixes it: is it just the "-O0", or me not starting the browser first but glxgears...

TL;DR: If you see this, there might be a quick hack: completely turn the machine off and on again, and do not start anything that uses the GPU first - the 3D app that you want to have HYPERZ should maybe start first. Or the workaround may be to compile without any optimization... These are the two things I did and now magically "it just works"... I have no idea...
Comment 17 cosiekvfj 2019-06-13 10:16:18 UTC
           *-display
                description: VGA compatible controller
                product: RC410M [Mobility Radeon Xpress 200M]
                vendor: Advanced Micro Devices, Inc. [AMD/ATI]
                physical id: 5
                bus info: pci@0000:01:05.0
                version: 00
                width: 32 bits
                clock: 66MHz
                capabilities: pm msi vga_controller bus_master cap_list rom
                configuration: driver=radeon latency=66 mingnt=8
                resources: irq:17 memory:d0000000-dfffffff ioport:9000(size=256) memory:c0000000-c000ffff memory:c0000-dffff
Comment 18 Richard Thier 2019-06-13 11:20:08 UTC
Weird. That is exactly the same card as mine... I have less and less of an idea of when this thing happens. I will try building with various optimization flags, but it might also be that the card can get stuck in a state where hiz is not working anymore and a real turn-off-and-on helped...

I am less and less sure what triggers the bad behaviour. Now it is good on my machine too. I will see if I can build with -O2 and fast math and still see it working....
Comment 19 Richard Thier 2019-06-13 11:21:21 UTC
Btw the code paths for my logs are always the same - both when it is wrong and when it is good.... I have no idea...
Comment 20 cosiekvfj 2019-06-13 12:32:05 UTC
>I have less and less of an idea when this thing happens.

My first thought was maybe compositor or 64 vs 32 bit os. I'm using xfwm with compositing turned off. I also changed the CPU in this laptop ;) and I'm using 64 bit os.

     *-cpu
          description: CPU
          product: Intel(R) Core(TM)2 CPU         T5500  @ 1.66GHz
          vendor: Intel Corp.
          physical id: 4
          bus info: cpu@0
          slot: U23
          size: 1666MHz
          capacity: 1667MHz
          width: 64 bits
          clock: 100MHz
          capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti dtherm cpufreq


But after you told us that glxgears "magically" works sometimes for you then I have no idea.
Comment 21 cosiekvfj 2019-06-13 12:35:20 UTC
No wait. That reminds me of this weird behavior I had in https://bugs.freedesktop.org/show_bug.cgi?id=101382

17.1.2-1 crash(different)
17.1.1-1 crash(different)
17.1.0-1 crash(different)
17.0.5-1 working
17.0.4-2 working
17.0.4-1 crash
17.0.3-2 working
17.0.3-1 crash
17.0.2-2 working
17.0.2-1 crash
17.0.1-2 crash
17.0.1-1 crash
17.0.0-3 crash
17.0.0-2 crash
17.0.0-1 crash
13.0.4-2 crash

Seemingly without any changes to r300 code…
Comment 22 Richard Thier 2019-06-13 15:35:23 UTC
Created attachment 144532 [details] [review]
should not affect anything patch

Interestingly, I am now running with this patch, but I never saw this code path taken, so it should not matter.

Also I have compiled with this setup and it still works:

meson configure build/ -Dc_args="-O2 -Ofast" -Dcpp_args="-O2 -Ofast" -Dc_link_args="-O2 -Ofast" -Dcpp_link_args="-O2 -Ofast"

I have yet to try with -O3, but it looks more and more like this is just some kind of rare heisenbug. When it is not happening, I cannot even make it happen..

Also, now that the bug only seems to appear very rarely, it is much harder to debug, and I think its impact is much smaller too...
Comment 23 Richard Thier 2019-06-13 15:38:32 UTC
Or could it be that I literally had a short-term hardware error in my card from running for very long periods in this hot summer? It could be RAM corruption too that just got resolved for some reason. Also I can imagine the zmask and hiz rams are maybe separate small static rams that were not used for years, so they needed some warmup time - quite the opposite of overuse?

In any case: the performance is sometimes a bit better, sometimes measurably better, and now I do not see any glitches. Weird. I think I will make this bug a "minor" one, as maybe it is not really a real bug?
Comment 24 Richard Thier 2019-06-13 20:07:46 UTC
Hmmm...

With -O3 I see the same kind of issue. It seems to be there when the highest optimization is on, maybe?

I will build -O2 once again without a restart just to see.
Comment 25 Richard Thier 2019-06-13 23:18:46 UTC
Pfff, it is bad once again now and I have no idea why... maybe a real heisenbug because of some state :-(

Also, turning off the machine or rebooting does not seem to help, even though I went back to the "-O2" build...
Comment 26 Richard Thier 2019-06-14 12:06:34 UTC
Okay, it is still bad now even though I have actually compiled with "-O0" flags...
To be honest, I have no idea why it started working last time.
Also, now that I step through with the debugger, I see that a lot of code paths
that I imagine should be taken are never taken.

Also, while going randomly through the code with gdb, I found the following things. I just want to document my findings so far with gdb breakpoints in the driver. These are maybe not general for everyone, but this is what happens on my very machine and card when debugging r300 code with gdb...

WHAT HAPPENS IN r300_blit.c
===========================

You can roughly follow this stuff here:

https://github.com/anholt/mesa/blob/master/src/gallium/drivers/r300/r300_blit.c

0.) Setting zmask_clear=1 and hiz_clear=0 always happens
----------------------------------------------------------

    /* Use fast Z clear.
     * The zbuffer must be in micro-tiled mode, otherwise it locks up. */
    if (buffers & PIPE_CLEAR_DEPTHSTENCIL) {
        boolean zmask_clear, hiz_clear;

        /* If both depth and stencil are present, they must be cleared together. */
        if (fb->zsbuf->texture->format == PIPE_FORMAT_S8_UINT_Z24_UNORM &&
            (buffers & PIPE_CLEAR_DEPTHSTENCIL) != PIPE_CLEAR_DEPTHSTENCIL) {
            zmask_clear = FALSE;
            hiz_clear = FALSE;
        } else { /* ALWAYS HAPPENS: */
            zmask_clear = r300_fast_zclear_allowed(r300, buffers);
            hiz_clear = r300_hiz_clear_allowed(r300);
            /* FIXME: only for testing! */
            /*zmask_clear = FALSE;*/
            /*zmask_clear = TRUE; // this alone looks bad; both false look good; only zmask_clear false with hiz_clear untouched is good */
            /*hiz_clear = FALSE;*/
            /*hiz_clear = TRUE; // enabling this and setting zmask_clear to false shows a picture but FPS is lower in glxgears */

We always go where I marked it with "/* ALWAYS HAPPENS */".

  r300_fast_zclear_allowed(r300, buffers) returns 1
  r300_hiz_clear_allowed(r300) returns 0

So:
zmask_clear == 1
hiz_clear == 0

ALWAYS. As I understand it, hiz_clear is the thing that clears the "hierarchical Z" buffer, and I suspect it should be returning true instead. The other, zmask_clear, is the one that clears the compressed z-buffer. I had a bit of a hard time figuring out what these mean, but using the r5xx docs (even though this card is r300) helped a lot. So the zmask is a losslessly compressed z-buffer ram, and the hiz ram seems to be at the higher hierarchy level.

I have checked the body of the latter:

  return r300_resource(fb->zsbuf->texture)->tex.hiz_dwords[fb->zsbuf->u.tex.level] != 0;

^^ and this body never returns true for some reason. Shouldn't hiz_dwords be nonzero when HyperZ is properly going on? I am just running glxgears and nothing fancy.

1.) "Enabling" HyperZ seems to happen properly
---------------------------------------------

        /* If we need Hyper-Z. */
        if (zmask_clear || hiz_clear) {
            /* Try to obtain the access to Hyper-Z buffers if we don't have one. */
            if (!r300->hyperz_enabled &&
                (r300->screen->caps.is_r500 || debug_get_option_hyperz())) {
                r300->hyperz_enabled =
                    r300->rws->cs_request_feature(r300->cs,
                                                  RADEON_FID_R300_HYPERZ_ACCESS,
                                                  TRUE);
                if (r300->hyperz_enabled) {
                   /* Need to emit HyperZ buffer regs for the first time. */
                   r300_mark_fb_state_dirty(r300, R300_CHANGED_HYPERZ_FLAG);
                }
            }

On the first run we get into the cs_request_feature call and then the r300_mark_fb_state_dirty function properly. hiz_clear is never true here, but the zmask_clear boolean is - see above. debug_get_option_hyperz() is why we get here at all, as that is what checks the environment variable.

            /* Setup Hyper-Z clears. */
            if (r300->hyperz_enabled) {
                if (zmask_clear) {
                    hyperz_dcv = hyperz->zb_depthclearvalue =
                        r300_depth_clear_value(fb->zsbuf->format, depth, stencil);

                    r300_mark_atom_dirty(r300, &r300->zmask_clear);
                    r300_mark_atom_dirty(r300, &r300->gpu_flush);
                    buffers &= ~PIPE_CLEAR_DEPTHSTENCIL;
                }

                if (hiz_clear) {
                    r300->hiz_clear_value = r300_hiz_clear_value(depth);
                    r300_mark_atom_dirty(r300, &r300->hiz_clear);
                    r300_mark_atom_dirty(r300, &r300->gpu_flush);
                }
                r300->num_z_clears++;
            }

The zmask_clear part here also runs every frame. I do not really know what it means for the r300->zmask_clear atom to be dirty. I might need to check whether this actually means the clear commands will be sent out to the command stream or somewhere. The depth clear value was FF..FF00 - a lot of Fs and two zeroes at the end. Sorry, it is not in front of me while writing this, so I do not remember the exact number of Fs, just that the upper bits are all ones and the lowest byte was not.

2.) The CMASK path in r300_clear(..) never seems to happen
-------------------------------------------------------------

  /* src/gallium/drivers/r300/r300_blit.c */
    /* Use fast color clear for an AA colorbuffer.
     * The CMASK is shared between all colorbuffers, so we use it
     * if there is only one colorbuffer bound. */
    if ((buffers & PIPE_CLEAR_COLOR) && fb->nr_cbufs == 1 && fb->cbufs[0] &&
        r300_resource(fb->cbufs[0]->texture)->tex.cmask_dwords) { ...

This is always false, even if I run glxgears with "-samples 4" and add RADEON_DEBUG=msaa, which shows that MSAA is indeed working. Actually I can even see it.

I do not know if CMASK should be used even without MSAA turned on, but it seems it is never used on my machine. I suspect that if zmask is for the compressed z-buffer, then cmask is for the compressed color buffer?

This is also hinted at in this change:

https://gitlab.freedesktop.org/mesa/mesa/commit/ca2c28859eca83f8fbf1f43616f5ef861e95e8d6

Anyways, this is the part of the if that is always false:

      r300_resource(fb->cbufs[0]->texture)->tex.cmask_dwords

Actually this might be an issue on its own, as in my understanding this code path would enable a faster path when doing MSAA on this card - but I rarely ever use MSAA nowadays anyway...

If this happened, it could have led to the code later in the body of the above if un-flagging the color clear as already handled:

  347             if (r300->screen->cmask_resource == fb->cbufs[0]->texture) {
  348                 r300_set_clear_color(r300, color);
  349                 r300_mark_atom_dirty(r300, &r300->cmask_clear);
  350                 r300_mark_atom_dirty(r300, &r300->gpu_flush);
  351                 buffers &= ~PIPE_CLEAR_COLOR;
  352             }

3.) CBZB clear code path is always taken
----------------------------------------

  355     /* Enable CBZB clear. */
  356     else if (r300_cbzb_clear_allowed(r300, buffers)) {
  357         struct r300_surface *surf = r300_surface(fb->cbufs[0]);
  358 
  359         hyperz->zb_depthclearvalue =
  360                 r300_depth_clear_cb_value(surf->base.format, color->f);
  361 
  362         width = surf->cbzb_width;
  363         height = surf->cbzb_height;
  364 
  365         r300->cbzb_clear = TRUE;
  366         r300_mark_fb_state_dirty(r300, R300_CHANGED_HYPERZ_FLAG);
  367     }

As far as I understand, this just configures the Z-buffer unit to clear the color buffer as well, with the two units each clearing half of it. I will of course try forcing this off just to see what happens, but it should not have a side effect on my issue. I am just documenting what happens.

4.) Clearing using the blitter always happens
--------------------------------------------

  369     /* Clear. */
  370     if (buffers) {
  371         /* Clear using the blitter. */
  372         /* FIXME: HACKZ TO SAVE FLAGS THAT blitter_clear clears */
  373         // bool zdirty = //TODO save vars???
  374         r300_blitter_begin(r300, R300_CLEAR);
  375         util_blitter_clear(r300->blitter, width, height, 1,
  376                            buffers, color, depth, stencil);
  377         r300_blitter_end(r300);

Sorry for randomly writing into the code here, but I am in the middle of experimenting. The point is that this always happens, and I think it means the blitter unit always clears the Z-buffer too, and the color buffer is also cleared with the blitter.

Also, the util_blitter_clear call goes through all the atoms that were marked dirty and un-dirties them by sending their state to the kernel part of the driver.

r300_emit_zmask_clear is never called, it seems
--------------------------------------------------

After this point we just finish the clear function. Originally there was an "else if" construct here, so after the blitter did its work it surely could not go further (except when the CMASK kind of color clearing was in operation!), but even when I comment out the "else" keyword I do not get into anything past this:

  378 /* TIPP: here maybe not else if, just an if should be present??? */
  379     } /*else*/ if (r300->zmask_clear.dirty ||
  380                r300->hiz_clear.dirty ||
  381                r300->cmask_clear.dirty) {
  382         /* Just clear zmask and hiz now, this does not use the standard draw
  383          * procedure. */
  384         /* Calculate zmask_clear and hiz_clear atom sizes. */

^^ because the flags are no longer dirty: the blitter had the side effect of un-dirtying every dirty atom.

Now I am trying to force this part to happen by saving the dirtiness of the zmask and hiz atoms and restoring the saved values after the blitter clear. I wonder what will happen.

If the CMASK kind of color clearing worked, I guess we would end up here and then call these:

  r300_emit_zmask_clear
  r300_emit_hiz_clear

I am wondering
==============

- Why does the CMASK-related code path not seem to work, even when I use MSAA?
- Why does r300_hiz_clear_allowed(r300) always return false in my case?

Further directions
==================

1.) I have found this to be interesting:

src/gallium/drivers/r300/r300_texture_desc.c / r300_setup_hyperz_properties(..)

2.) I will try "forcing" the above idea by saving and restoring the flags before and after the blitter kind of clear.

3.) Might be a good idea to put a breakpoint in the r300_texture_desc.c here:

402             /* Check whether we have enough HIZ memory. */
403             if (hiz_numdw <= screen->caps.hiz_ram * pipes) {
404                 tex->tex.hiz_dwords[i] = hiz_numdw;
405                 tex->tex.hiz_stride_in_pixels[i] = stride;
406             } else {
407                 tex->tex.hiz_dwords[i] = 0;
408                 tex->tex.hiz_stride_in_pixels[i] = 0;
409             }

It seems as if the HIZ memory is "not enough"? Weird. I see that many of the values depend on how many pipes the card has and such. I have no idea how to get proper info about how many pixel and Z pipes a specific card has, and I did not look through these latter code paths properly.
Comment 27 Richard Thier 2019-06-14 12:28:30 UTC
This is also always happening:

  434     /* Enable fastfill and/or hiz.
  435      *
  436      * If we cleared zmask/hiz, it's in use now. The Hyper-Z state update
  437      * looks if zmask/hiz is in use and programs hardware accordingly. */
  438     if (r300->zmask_in_use || r300->hiz_in_use) {
  439         r300_mark_atom_dirty(r300, &r300->hyperz_state);
  440     }

So we do get into the body of the if, as zmask_in_use is true here.
Comment 28 Richard Thier 2019-06-14 12:52:13 UTC
I still see the glitches with the following changes:

@@ -364,11 +369,21 @@ static void r300_clear(struct pipe_context* pipe,
     /* Clear. */
     if (buffers) {
         /* Clear using the blitter. */
+
+        /* FIXME: Twoline HACKZ TO SAVE FLAGS THAT blitter_clear clears */
+        bool tmp_zmask_clear_dirty = r300->zmask_clear.dirty;
+       bool tmp_hiz_clear_dirty = r300->hiz_clear.dirty;
+
         r300_blitter_begin(r300, R300_CLEAR);
         util_blitter_clear(r300->blitter, width, height, 1,
                            buffers, color, depth, stencil);
+       /* FIXME: two lines of hack! */
+       r300->zmask_clear.dirty = tmp_zmask_clear_dirty;
+       r300->hiz_clear.dirty = tmp_hiz_clear_dirty;
+
         r300_blitter_end(r300);
-    } else if (r300->zmask_clear.dirty ||
+/* TIPP: here maybe not else if, just an if should be present??? */
+    } /*else*/ if (r300->zmask_clear.dirty ||
                    r300->hiz_clear.dirty ||
                    r300->cmask_clear.dirty) {


Even though we now enter the code after this if and reach the zmask clear emit, the glitches remain. Maybe this is not the problem...
Comment 29 Richard Thier 2019-06-14 13:20:58 UTC
Okay this is weird for me:

  /* src/gallium/drivers/r300/r300_texture_desc.c */
  399             /* Get the HIZ buffer size in dwords. */
  400             hiz_numdw = (stride * height) / (8*8 * pipes);
  401 
  402             /* Check whether we have enough HIZ memory. */
  403             if (hiz_numdw <= screen->caps.hiz_ram * pipes) {
  404                 tex->tex.hiz_dwords[i] = hiz_numdw;
  405                 tex->tex.hiz_stride_in_pixels[i] = stride;
  406             } else {
  407                 tex->tex.hiz_dwords[i] = 0;
  408                 tex->tex.hiz_stride_in_pixels[i] = 0;
  409             }
  
  (gdb) p hiz_numdw
  $35 = 1128
  (gdb) p screen->caps.hiz_ram
  $36 = 0
  (gdb)

(!) Is it normal to have zero hiz_ram on this card?

Btw:

info.r300_num_z_pipes == 1
info.r300_num_gb_pipes == 3 (this is in the pipes var)
Comment 30 Richard Thier 2019-06-14 15:00:17 UTC
It seems this can be normal, but then my understanding of HyperZ was a bit off.

I will try adding hiz ram for it just to see what happens :-)
Comment 31 Richard Thier 2019-06-14 15:11:52 UTC
Created attachment 144545 [details]
Really no cmask ram it seems

I faked having HIZ RAM and CMASK RAM just to see what happens.
When I turn on MSAA now, I get the glitches in the attached screenshot, so I guess it is right that my card simply does not have CMASK RAM.

  122     case CHIP_RS480:
  123         caps->zmask_ram = RV3xx_ZMASK_SIZE;
  124         caps->has_cmask = TRUE; /* guessed because there is also HiZ */
  125         caps->hiz_ram = R300_HIZ_LIMIT;
  126         break;

Faking hiz_ram seems to have no effect whatsoever.

I also tried this hack, but it did not help:

diff --git a/src/gallium/drivers/r300/r300_emit.c b/src/gallium/drivers/r300/r300_emit.c
index 80c959b95d0..48fafecfdda 100644
--- a/src/gallium/drivers/r300/r300_emit.c
+++ b/src/gallium/drivers/r300/r300_emit.c
@@ -1224,6 +1224,10 @@ void r300_emit_hiz_clear(struct r300_context *r300, unsigned size, void *state)
     tex = r300_resource(fb->zsbuf->texture);
 
     BEGIN_CS(size);
+    // FIXME: Remove this hack!
+    OUT_CS_REG(R300_ZB_ZCACHE_CTLSTAT,
+        R300_ZB_ZCACHE_CTLSTAT_ZC_FLUSH_FLUSH_AND_FREE |
+        R300_ZB_ZCACHE_CTLSTAT_ZC_FREE_FREE);
     OUT_CS_PKT3(R300_PACKET3_3D_CLEAR_HIZ, 2);
     OUT_CS(0);
     OUT_CS(tex->tex.hiz_dwords[fb->zsbuf->u.tex.level]);
@@ -1246,6 +1250,9 @@ void r300_emit_zmask_clear(struct r300_context *r300, unsigned size, void *state
     tex = r300_resource(fb->zsbuf->texture);
 
     BEGIN_CS(size);
+    OUT_CS_REG(R300_ZB_ZCACHE_CTLSTAT,
+        R300_ZB_ZCACHE_CTLSTAT_ZC_FLUSH_FLUSH_AND_FREE |
+        R300_ZB_ZCACHE_CTLSTAT_ZC_FREE_FREE);
     OUT_CS_PKT3(R300_PACKET3_3D_CLEAR_ZMASK, 2);
     OUT_CS(0);
     OUT_CS(tex->tex.zmask_dwords[fb->zsbuf->u.tex.level]);
Comment 32 Richard Thier 2019-06-14 15:30:05 UTC
If my card has no "hiz_ram", what does that mean? Can anyone tell me if I am right?

This is what I suspect so far:

- Z compression I still get, because the card has zmask_ram
- No hierarchical Z (HiZ) at all, because there is no hiz_ram

I do not see from the docs whether there are special cases when enabling zmask_ram only, without enabling HiZ.
Comment 33 Richard Thier 2019-06-14 17:20:13 UTC
Created attachment 144548 [details] [review]
Working hack / quickfix

Pretty interesting: I just tried changing the "most logical" part of r300_texture_desc.c and now it works; see the quickfix patch.

Let me tell you my approach. I just ignored all my previous attempts and went for a more "mágus" approach. That is like a "bogus" approach, but the "mágus" approach involves faith and magic, haha.

The goal was to find a way to make the still non-erroneous area of the image grow, just by trying things around the code and possibly breaking the rules ("Breaking the Law" by Judas Priest should play in the background while reading this, for best effect).

So this file seemed like a good candidate for "breaking the law" and some "mágus"-approach experimentation: r300_texture_desc.c

Then there was this part:

376             /* Get the ZMASK buffer size in dwords. */
377             zcomp_numdw = r300_pixels_to_dwords(stride, height,
378                     zmask_blocks_x_per_dw[pipes-1] * zcompsize,
379                     zmask_blocks_y_per_dw[pipes-1] * zcompsize);
380 
381             /* Check whether we have enough ZMASK memory. */
382             if (util_format_get_blocksizebits(tex->b.b.format) == 32 &&
383                 zcomp_numdw <= screen->caps.zmask_ram * pipes) {
384                 tex->tex.zmask_dwords[i] = zcomp_numdw;
385                 tex->tex.zcomp8x8[i] = zcompsize == 8;
386 
387                 tex->tex.zmask_stride_in_pixels[i] =
388                    util_align_npot(stride, zmask_blocks_x_per_dw[pipes-1] * zcompsize);

The "zcomp_numdw" seems to be "the number of dwords" in the zmask_ram, and the zmask RAM is the compressed Z-buffer itself. "tex->tex.zmask_dwords[i]" has multiple elements, one per mip level. I think only the first element ever exists for a Z-buffer, as I have never seen a mipmapped Z-buffer; I am just explaining why there is indexing.

In short, I saw earlier that this "zmask_dwords" gets used later on, in the emit code but also in other places such as r300_blit.c. It is used in quite a few places, so this seemed to be the thing to change in my planned "mágus" way ;-)

First I doubled the value like this (please mind the "*2"):

376             /* Get the ZMASK buffer size in dwords. */
377             zcomp_numdw = r300_pixels_to_dwords(stride, height,
378                  zmask_blocks_x_per_dw[pipes-1] * zcompsize*2,
379                  zmask_blocks_y_per_dw[pipes-1] * zcompsize*2);

After that change, the still-unaffected area of the screen shrank to roughly a quarter of its size, so I figured that instead of multiplying the values I should divide them by two (just guessing):

376             /* Get the ZMASK buffer size in dwords. */
377             zcomp_numdw = r300_pixels_to_dwords(stride, height,
378                  zmask_blocks_x_per_dw[pipes-1] * zcompsize/2,
379                  zmask_blocks_y_per_dw[pipes-1] * zcompsize/2);

And indeed, now the whole screen is good and performance is fast too.
It is a hack however, pure magic so far, but it seems this value might be the one that can fix the problem.

My idea is that maybe the value of "pipes" is wrong?

This is where that variable is set:

357         if (screen->caps.family == CHIP_RV530) {
358             pipes = screen->info.r300_num_z_pipes;
359         } else {
360             pipes = screen->info.r300_num_gb_pipes;
361         }

Also, because my card is not CHIP_RV530, the value used is screen->info.r300_num_gb_pipes.

This is set in: radeon_drm_winsys.c / do_winsys_init(...)

385     if (ws->gen == DRV_R300) {
386         if (!radeon_get_drm_value(ws->fd, RADEON_INFO_NUM_GB_PIPES,
387                                   "GB pipe count",
388                                   &ws->info.r300_num_gb_pipes))
389             return false;

I have no idea where this value comes from. I would need to look at the other side, in the kernel modules or elsewhere, I guess. If it is wrong, it might cause other problems that I am not aware of, while things still generally work most of the time.

Also, just a note for myself: the RC410 seems to be an R300-class card, not an R400, but I saw places where is_rv350 returns true, so those code paths are enabled in the driver for me at least:

205     case CHIP_R300:
206     case CHIP_R350:
207     case CHIP_RV350:
208     case CHIP_RV370:
209     case CHIP_RV380:
210     case CHIP_RS400:
211     case CHIP_RC410:
212     case CHIP_RS480:
213         ws->info.chip_class = R300;

So gb_pipes is 3 in my case, but maybe it should be a different value and it is set badly for some reason. That would explain the problem better.

One more thing: I build Mesa into my own prefix, and after this hacky quickfix patch things immediately started working, even when I run glxgears with HYPERZ from another terminal that still uses the unchanged Mesa! This is a bit weird, but it might explain why yesterday the whole thing was working without me touching the relevant parts of the source: it seems that once any app starts up with proper HyperZ, the card can easily "get stuck" in the good state, for reasons unknown to me.
Comment 34 Richard Thier 2019-06-14 17:26:39 UTC
Created attachment 144549 [details]
Proof screenshot that it works now

I attached a screenshot of HyperZ "just working" with the quickfix. I think it only applies to my machine so far, but this might be the right direction.

I first started with RADEON_HYPERZ=0 and got 350 FPS, then started with the flag and still have a correct picture while getting 440 FPS.

So I have both a correct picture and a speedup. What is weird is why it was working yesterday without me touching these parts of the code... I think something got lucky and got stuck in the good state for some reason...
Comment 35 Dieter Nützel 2019-06-14 21:44:11 UTC
Hello Richard,

very NICE progress!

Maybe you can run 'glmark2' with/without HyperZ.
Comment 36 Richard Thier 2019-06-14 22:01:40 UTC
Okay, it seems pipes=1 works on my machine.

diff --git a/src/gallium/drivers/r300/r300_texture_desc.c b/src/gallium/drivers/r300/r300_texture_desc.c
index 77d272bfb6b..029b28570d7 100644
--- a/src/gallium/drivers/r300/r300_texture_desc.c
+++ b/src/gallium/drivers/r300/r300_texture_desc.c
@@ -358,6 +358,9 @@ static void r300_setup_hyperz_properties(struct r300_screen *screen,
             pipes = screen->info.r300_num_z_pipes;
         } else {
             pipes = screen->info.r300_num_gb_pipes;
+           /* FIXME: Quickfix only for Mobility Radeon Xpress 200M in asus laptop! */
+            pipes = 2; // Half the screen is bad for me
+            pipes = 1; // Whole screen is ok for me
         }
 
         for (i = 0; i <= tex->b.b.last_level; i++) {

I do not even dare upload this as a patch, as it likely only works on my specific machine! The know-how seems worth sharing though, so in case anyone sees something like this, they can try something similar until there is a proper fix.

317     /* The tile size of 1 DWORD in ZMASK RAM is:
318      *
319      * GPU    Pipes    4x4 mode   8x8 mode
320      * ------------------------------------------
321      * R580   4P/1Z    32x32      64x64
322      * RV570  3P/1Z    48x16      96x32
323      * RV530  1P/2Z    32x16      64x32
324      *        1P/1Z    16x16      32x32
325      */
326     static unsigned zmask_blocks_x_per_dw[4] = {4, 8, 12, 8};
327     static unsigned zmask_blocks_y_per_dw[4] = {4, 4,  4, 8};

I should have guessed that pipes=1 is right for me. As you can see here, there are hardcoded values for the X and Y block counts. DRM originally reports pipes=3 for my card, so I ended up using the third column of this table: 12*4 blocks.

Remembering that I had to halve both of them earlier with the hacky patch (6*2), it was clear that "pipes=2" would still not work, because 4*8 = 32 is still much more than the 6*2 = 12 I provided. Of course 4*4 = 16, so now I see my earlier hack was a bit miscalculated.

Also, now I see exactly why only 1/3 of the screen was "working": because 12/4 = 3 and 4/4 = 1. You can clearly see this from the table! Wow!

I see that "r300_num_gb_pipes" is used at some of the other places:

src/gallium/drivers/r300/r300_query.c
src/gallium/drivers/r300/r300_emit.c (also for some queries)
src/gallium/drivers/r300/r300_context.c (only fprintf-ing for debugging)
src/gallium/winsys/radeon/drm/radeon_drm_winsys.c (this where the drm query is)

I do not really know what kind of "queries" these are, but I might change the code so that winsys itself returns gb_pipes=1, without hacks at other places, and see whether there are other glitches (a somewhat prolonged test).

Who knows, maybe things actually get less glitchy if this query stuff is really used and the value was bad before!

Then, if I confirm that I really only have one pipeline, maybe I should look at the other side of this DRM call to see why it returns this value.

PS.: One thing I do not know is whether pipes can exist but be turned off or something? I really have no idea about that.

PS.: I am also starting to understand why the smaller values here actually make more of the screen render properly! If there are two or three pipes, for example, you clear things in a pattern similar to:

01012323... etc. (I saw them in docs or source comments). So if there were two pipes, you could z-clear two blocks at the same time, etc.; it is simple maths to see why the smaller values are better here.
Comment 37 Richard Thier 2019-06-14 22:11:07 UTC
(In reply to Dieter Nützel from comment #35)
> Hello Richard,
> 
> very NICE progress!
> 
> Maybe you can run 'glmark2' with/without HyperZ.

Good idea.

Can you test whether HyperZ works for you without any changes? The progress I made basically only works on my machine, but cosiekvfj above seems to have no issues despite having the same card.

Actually, if the gb_pipes number is wrong, then the error is not even in the HyperZ code but in the code that returns the wrong value from DRM; the HyperZ code is just using it.

Oh, and keep in mind that I have no HiZ RAM! So if I measure speed gains, others with HiZ RAM might measure higher gains. I think this way I have no hierarchical Z-buffer at all (where bigger tiles store their min or max Z values and those are compared first, before pixels), but I do have the compressed Z-buffer, the zmask_ram, which is a lossless compression of the Z-buffer. I read that they use tricks like storing the directions of the one or two triangles covering a whole tile to save bandwidth, and/or indicating whether a tile is compressed at all.

The latter seems to help memory bandwidth when the triangles are bigger than the tiles (typically: walls in a game, maybe?).
Comment 38 Dieter Nützel 2019-06-14 23:05:54 UTC
(In reply to Richard Thier from comment #37)
> (In reply to Dieter Nützel from comment #35)
> > Hello Richard,
> > 
> > very NICE progress!
> > 
> > Maybe you can run 'glmark2' with/without HyperZ.
> 
> Good idea.
> 
> Can you test if HyperZ works for you without any changes?

Sorry,

I haven't any system for r300 (PCI/AGP) handy.
Latest here HD 4650, RV730 AGP (1 GB !), r600 (see older bug reports...;-)
But not booted for nearly 2 years...
Maybe I have an older r300 one (yep, a 9550), but I would have to dig in the basement for it, if you need it.

> The progress I
> made basically only works on my machine but above cosiekvfj seems to have no
> issues despite having the same card.

You made GREAT progress!

We have to ping Michel Dänzer and Marek Olšák for your open questions.
(see CC list)
 
> Actually if the gb_pipes number is wrong then the error is not even in the
> HyperZ code, but in the code that returns the wrong value from drm - that
> HyperZ code is just using.
> 
> Oh and keep in mind that I have no HiZ RAM! So if I measure speed gains
> others might measure a higher gain if they have HiZ RAM too as I think this
> way I have no hierarchical Z-buffer at all - when bigger tiles store min or
> max z values of theirs and first they are compared not pixels - but I have
> this compressed Z-buffer or zmask_ram - latter which is a lossless
> compression of the zbuffer. I read that they use tricks like storing the
> one-two triangles directions basically for whole tiles to save some bandwith
> and/or indicate if a tile is compressed or not at all.
> 
> This latter seems to help memory bandwith in case the triangles are bigger
> than the tiles (typically: walls in a game maybe?).
Comment 39 Dieter Nützel 2019-06-14 23:09:32 UTC
Oh and have a look, here:

Feature Matrix for Free Radeon Drivers
https://www.x.org/wiki/RadeonFeature/

r300 have some points open...;-)
https://www.x.org/wiki/RadeonFeature/#note_14
Comment 40 Marek Olšák 2019-06-14 23:10:20 UTC
I'm afraid nobody remembers anymore how HyperZ works on r300. I can answer basic questions if you have any.
Comment 41 Dieter Nützel 2019-06-14 23:20:20 UTC
(In reply to Marek Olšák from comment #40)
> I'm afraid nobody remembers anymore how HyperZ works on r300. I can answer
> basic questions if you have any.

Hello Marek!

Thanks for your offer, I know you were around...

I've found some hints for Richard under #note_14
Your Mesa git commit #12dcbd5954676ee32604d82cacbf9a4259967e13
r300g: enable Hyper-Z by default on r500
Comment 42 Richard Thier 2019-06-14 23:30:39 UTC
(In reply to Marek Olšák from comment #40)
> I'm afraid nobody remembers anymore how HyperZ works on r300. I can answer
> basic questions if you have any.

Hi!

Currently I have added this to radeon_drm_winsys.c:

385     if (ws->gen == DRV_R300) {
386         if (!radeon_get_drm_value(ws->fd, RADEON_INFO_NUM_GB_PIPES,
387                                   "GB pipe count",
388                                   &ws->info.r300_num_gb_pipes))
389             return false;
390 +       // FIXME: only works for my own setup (prenex):
391 +       ws->info.r300_num_gb_pipes=1;

Now I have no problems so far. It may be that the HyperZ code is just fine as-is, but for some reason radeon_get_drm_value returns a bad gb_pipes number.

I am currently testing with this a bit more thoroughly before moving further, but everything seems to work so far; this is just not a proper fix.

Some questions I can have:

1.) Is there any way to verify how many pipes a card has? One pipeline seems very few for a GPU, but this is an integrated mobile card.

2.) Could the other indicated pipes exist on my card but be turned off for some reason?

3.) radeon_get_drm_value: is the other side of this in the kernel source tree? I will have a look at the code behind it later.
Comment 43 Marek Olšák 2019-06-15 01:23:14 UTC
You can try to compare your num_gb_pipes with somebody else who has the same GPU.
Comment 44 Marek Olšák 2019-06-15 01:27:40 UTC
RC410 most likely has only 1 pipe. 3 pipes would be for a discrete GPU that isn't small.
Comment 45 Marek Olšák 2019-06-15 02:31:51 UTC
The problem might be in the kernel. See function rs400_gpu_init. I think it should call r300_gpu_init instead of r420_pipes_init.
Comment 46 Richard Thier 2019-06-15 13:27:11 UTC
(In reply to Marek Olšák from comment #45)
> The problem might be in the kernel. See function rs400_gpu_init. I think it
> should call r300_gpu_init instead of r420_pipes_init.

I will put some logging there to be sure, because looking at the Mesa side of the codebase I got the feeling that "rs400" is basically r420+, while even RC410 and the like belong to r300; if the kernel code is consistent with that, it can be fine too. But if rs400_gpu_init is really what gets called on my machine, then that might really be a mistake.

I will look at it later. I am still testing first whether everything works properly this way. I also have less time now, because the weather is good for working in the vineyard; later there will be rainy weather, when I can work endlessly on these. ;-)

> RC410 most likely has only 1 pipe.
> 3 pipes would be for a discrete GPU that isn't small.

Thanks for this guess! I really had no experience to even guess this. Everything points in this direction though, so I think this is the case.
Comment 47 Richard Thier 2019-06-15 15:15:23 UTC
Created attachment 144552 [details]
grepping around init functions in kernel / drm

Hmmm... I see that this got printed out in dmesg:

[   17.669902] [drm] radeon: 3 quad pipes, 1 z pipes initialized.

However here is an excerpt from the kernel function you mention:

static void r300_gpu_init(struct radeon_device *rdev)
{
	uint32_t gb_tile_config, tmp;

	if ((rdev->family == CHIP_R300 && rdev->pdev->device != 0x4144) ||
	    (rdev->family == CHIP_R350 && rdev->pdev->device != 0x4148)) {
		/* r300,r350 */
		rdev->num_gb_pipes = 2;
	} else {
		/* rv350,rv370,rv380,r300 AD, r350 AH */
		rdev->num_gb_pipes = 1;
	}

...

	DRM_INFO("radeon: %d quad pipes, %d Z pipes initialized\n",
		 rdev->num_gb_pipes, rdev->num_z_pipes);
}

And it is pretty clear that there is nothing that could change the number between the setup code and the log statement! This kind of tells me that this function is not called.

I see multiple "rXX_gpu_init" functions, so presumably the corresponding one should be called, but in my case it is not the r300 one; the rs400 variant is called instead? See the attachment where I grep around in the kernel sources.

I mean... maybe calling "rs400_gpu_init" is already wrong, and it should instead be r300_gpu_init here!

I can also see there are multiple call sites of r420_pipes_init, mostly for later cards, so something really seems off. Maybe someone just thought "oh, RC410? That is a higher number than 400, so it is an rs400 card" and completely misconfigured this card for some reason? I mean, I can still sometimes grow unsure about which family this specific card "properly" belongs to, so I can imagine someone thinking the above.

[prenex@prenex-laptop zen-kernel-5.0.17-lqx1]$ grep -R r420_pipes_init drivers/gpu/*
drivers/gpu/drm/radeon/rs400.c: r420_pipes_init(rdev);
Binary file drivers/gpu/drm/radeon/rv515.o matches
Binary file drivers/gpu/drm/radeon/rs600.o matches
drivers/gpu/drm/radeon/r520.c:  r420_pipes_init(rdev);
Binary file drivers/gpu/drm/radeon/radeon.o matches
drivers/gpu/drm/radeon/rs690.c: r420_pipes_init(rdev);
drivers/gpu/drm/radeon/r420.c:void r420_pipes_init(struct radeon_device *rdev)
drivers/gpu/drm/radeon/r420.c:  r420_pipes_init(rdev);
Binary file drivers/gpu/drm/radeon/rs690.o matches
Binary file drivers/gpu/drm/radeon/r520.o matches
drivers/gpu/drm/radeon/radeon_asic.h:extern void r420_pipes_init(struct radeon_device *rdev);
drivers/gpu/drm/radeon/rs600.c: r420_pipes_init(rdev);
Binary file drivers/gpu/drm/radeon/r420.o matches
Binary file drivers/gpu/drm/radeon/radeon.ko matches
Binary file drivers/gpu/drm/radeon/rs400.o matches
drivers/gpu/drm/radeon/rv515.c: r420_pipes_init(rdev);

I don't yet see where the radeon driver decides which r*_gpu_init function to call, but I am already grepping for it. If there is an easy-to-spot error there, I might even be able to patch it, or at least get further :-)

Oh, I see now that I am grepping for the wrong function: it is likely "r300_startup" that calls r300_gpu_init (and the others likely follow the same pattern), so I must search for this startup function instead.
Comment 48 Richard Thier 2019-06-15 15:20:59 UTC
Btw, I saw that the Mesa code treats my RC410 as an RV350-like chip (is_rv350 is true), not as an R400-family chip. I have no idea whether that is right, but if the kernel part does it differently from the userland part, that is clearly wrong, that is sure!

I kind of have a feeling this card is closer to the R300 family than to R400, but these are the questions I have trouble answering myself :-(

In any case, I think the wrong rXX startup function is called, and the r300 one should have been called instead in my case. Now I am looking where to fix this, as it seems like an easy patch once the issue source is properly found.

Actually, if this is the case, I am lucky that it only manifests in such subtle details, because it means code for a whole different minor family is running for my card, and it is only luck that it works...
Comment 49 Richard Thier 2019-06-15 17:39:08 UTC
Hmmm. There are a lot of FIXMEs in this part of the kernel saying things are not entirely certain. I think I am hitting a code path other than the one I should be taking, but if it was always like this, then maybe "fixing" it will just cause a whole lot of other errors...

RC410 is said to use the "RAGE 8" architecture:

https://www.techpowerup.com/gpu-specs/ati-rc410.g757.

Looking up Rage8, it seems to be RV350, so the Mesa side seems more right about which category my card belongs in.

In this place I see a lot of asics (for example r300_asic and r420_asic):

    drivers/gpu/drm/radeon/radeon_asic.c

All of these structs have their rXX_init functions set (basically a vtable built from structs and function pointers). I think maybe the wrong one is used for my card.

For example, RV410 is said here to be an r400-architecture card instead of a "Rage8" (RV350) one:

https://www.techpowerup.com/gpu-specs/ati-rv410.g9

It could very likely have happened that, because of the number itself, they classified my Mobility Radeon Xpress 200M as r400 while it is basically an RV350, a victim of the marketing-driven naming-scheme confusion...

Look at this function:

2306 /**
2307  * radeon_asic_init - register asic specific callbacks
2308  *
2309  * @rdev: radeon device pointer
2310  *
2311  * Registers the appropriate asic specific callbacks for each
2312  * chip family.  Also sets other asics specific info like the number
2313  * of crtcs and the register aperture accessors (all asics).
2314  * Returns 0 for success.
2315  */
2316 int radeon_asic_init(struct radeon_device *rdev)
2317 { ...

This is the function that sets the asic; there is a big switch-case that selects the "proper" one based on "rdev->family".

My RC410 is not directly named here:

	case CHIP_R300:
	case CHIP_R350:
	case CHIP_RV350:
	case CHIP_RV380:
		if (rdev->flags & RADEON_IS_PCIE)
			rdev->asic = &r300_asic_pcie;
		else
			rdev->asic = &r300_asic;
		break;
	case CHIP_R420:
	case CHIP_R423:
	case CHIP_RV410:
		rdev->asic = &r420_asic;
		/* handle macs */
		if (rdev->bios == NULL) {
			rdev->asic->pm.get_engine_clock = &radeon_legacy_get_engine_clock;
			rdev->asic->pm.set_engine_clock = &radeon_legacy_set_engine_clock;
			rdev->asic->pm.get_memory_clock = &radeon_legacy_get_memory_clock;
			rdev->asic->pm.set_memory_clock = NULL;
			rdev->asic->display.set_backlight_level = &radeon_legacy_set_backlight_level;
		}
		break;
	case CHIP_RS400:
	case CHIP_RS480:
		rdev->asic = &rs400_asic;
		break;
	case CHIP_RS600:

(there are other families mentioned here as well, but these are the ones that might affect my case I think)

According to techpowerup.com, which says the Xpress 200M is Rage8 architecture, I should also get the r300 asic, the same as the RV350:

https://www.techpowerup.com/gpu-specs/ati-rv350.g13

Also there is this "RS400" asic listed in the above switch case and it is still an other candidate maybe but surely not the rs400_asic.
Comment 50 Richard Thier 2019-06-15 17:40:35 UTC
"
Also there is this "RS400" asic listed in the above switch case and it is still an other candidate maybe but surely not the rs400_asic.
"

I mean my card should surely not use rs420_asic. I haven't looked through rs400_asic yet. Now I am looking at where the device family is set, as that might indicate why I get a wrong value...
Comment 51 Richard Thier 2019-06-15 17:45:35 UTC
(In reply to cosiekvfj from comment #17)
>            *-display
>                 description: VGA compatible controller
>                 product: RC410M [Mobility Radeon Xpress 200M]
>                 vendor: Advanced Micro Devices, Inc. [AMD/ATI]
>                 physical id: 5
>                 bus info: pci@0000:01:05.0
>                 version: 00
>                 width: 32 bits
>                 clock: 66MHz
>                 capabilities: pm msi vga_controller bus_master cap_list rom
>                 configuration: driver=radeon latency=66 mingnt=8
>                 resources: irq:17 memory:d0000000-dfffffff
> ioport:9000(size=256) memory:c0000000-c000ffff memory:c0000-dffff

Hi cosiekvfj!

Can you attach what your dmesg output is?

The whole output, but especially this part:

radeon: 3 quad pipes, 1 z pipes initialized.

Do you also get 3 quad pipes and 1 z pipes?
Comment 52 Richard Thier 2019-06-15 21:48:52 UTC
(In reply to Marek Olšák from comment #45)
> The problem might be in the kernel. See function rs400_gpu_init. I think it
> should call r300_gpu_init instead of r420_pipes_init.

Aaaahhhhh.... Am I this blind? Now I see exactly this.

Also in the dmesg log I see RS400 being written out, so yes, I end up there, and according to my own links this seems to be the problem: rs400_gpu_init should not call r420_pipes_init, because it is not an r420 yet....
Now after investigating I think I should indeed end up with the rs400_asic, but yes, the r420 pipe setup is wrong - at least for my card.

I will run with this patched and see what happens. The machine can compile the kernel through the night for testing this...

This seems to be the real core of the issue.
Comment 53 Dieter Nützel 2019-06-15 22:10:53 UTC
(In reply to Richard Thier from comment #52)
> (In reply to Marek Olšák from comment #45)
> > The problem might be in the kernel. See function rs400_gpu_init. I think it
> > should call r300_gpu_init instead of r420_pipes_init.
> 
> Aaaahhhhh.... Am I this blind? Now I see exactly this.
> 
> Also in dmesg log I see RS400 being written out so yes I end up here and
> according to my own links this seem to be the problem that the
> rs400_gpu_init should not do the r420_pipes_init, because it is not r420
> yet....
> Now after investigating I think I should indeed end up here with the
> rs400_asic, but yes the 420 pipe setup is wrong - at least for my card.
> 
> I will run with this patched and see what happens. The machine can compile
> the kernel through the night for testing this...

If you do NOT 'make mrproper' and change only radeon/amdgpu stuff it should only take some minutes (seconds) on your system/CPU.

> This seems to be the real core of the issue.

We'll hear from you. ;-)
Comment 54 cosiekvfj 2019-06-16 09:05:52 UTC
Created attachment 144558 [details]
dmesg log

[   14.303343] [drm] radeon: 1 quad pipes, 1 z pipes initialized.
Comment 55 Richard Thier 2019-06-16 11:21:24 UTC
(In reply to Dieter Nützel from comment #53)

> If you do NOT 'make mrproper' and change only radeon/amdgpu stuff it should
> only take some minutes (seconds) on your system/CPU.

I was dumb enough to clean my kernel build dir last time to save some space, haha. This way it took around 50 minutes.

> > This seems to be the real core of the issue.
> 
> We hear from you. ;-)

I am testing right now; at least the system and X have started, but I will try some games and look at what the logs say. I have also added some extra logging to gain more insight, as I actually have two ideas for fixing this.
Comment 56 Richard Thier 2019-06-16 11:23:11 UTC
(In reply to cosiekvfj from comment #54)
> Created attachment 144558 [details]
> dmesg log
> 
> [   14.303343] [drm] radeon: 1 quad pipes, 1 z pipes initialized.

This is really interesting! It tells us why it works in your case (the 1 quad pipe is reported "properly"), but I have no idea why it is reported properly.

I see you are using a 4.x kernel, so there might be changes - or for some reason your card returns proper values from the registers the driver reads and mine does not. Thank you for the information!
Comment 57 cosiekvfj 2019-06-16 14:17:39 UTC
4.14.124: 1 quad pipes, 1 z pipes initialized.
4.19.49: 1 quad pipes, 1 z pipes initialized.
4.20.17: 1 quad pipes, 1 z pipes initialized.
5.0.21: 1 quad pipes, 1 z pipes initialized.
5.1.8: 1 quad pipes, 1 z pipes initialized.

Richard Thier, if we have the same graphic card then why do you sometimes get "3 quad pipes, 1 z pipes initialized."? Can you test on different kernel versions? Do you always have this 3 quad pipes or it's random? I'm happy to help further :)
Comment 58 Richard Thier 2019-06-16 15:44:14 UTC
I still have some dmesg logs around from my other debugging sessions, and in every one of those I get 3 pipes, despite kernel versions ranging from 4.4 to the latest 5.x.

Also, looking at git blame I can see that the relevant code paths were last touched 10 years ago in a big drm-related commit. So nothing has changed in the code, it seems. Ok... I didn't check whether the register number in the header changed or not, I just hoped not, but the code itself has been doing the same things for 10 years. Even the FIXME questioning whether this is right (when calling r420_pipes_init in rs400_init) is that old.

[drm] initializing kernel modesetting (RS400 0x1002:0x5A62 0x1043:0x1392)

^^ You also seem to be 5A62 here. In the code of r420_pipes_init there are some hardcoded cases for other similar values that force a single pipe instead of reading from the register, but you do not fall into those. The last two numbers are different in your case, but that does not matter for the code path that decides the pipe number, unless the small difference in the last two means that your card gives back proper values in the register from which the pipe number is read.

Btw running with the r300_init call instead of the r420 pipe initialization is going well so far: performance is the same, the glitches are the same (I already know about some), everything works that worked before...

One possible solution might be to just use r300_init for r400, but I have no idea if that breaks some r400 cards. I only have my RC410 and no other... Another approach is to add one more special case to r420_pipes_init that specifically checks for 5A62 and forces a single pipe despite what is read from the register.

Also, there was one time when things were working for me before my changes, but I do not know what the pipe value was back then. So yes, there is SOME chance that this register can also return 1 in my case, but most of the time (99% at least, if not 100%) it returns 3.

I will prepare both patches and then let others decide which direction to take. I think the r300_init direction is maybe better because it is hopefully more backwards compatible in general.
Comment 59 Richard Thier 2019-06-16 15:47:11 UTC
> the glitches are the same (I know about some already),
> everything works that worked before...

I was kind of hoping that some glitches I know about might go away with the init code doing r300_init, but sadly that is not the case. The system is quite the same overall, but now 1 pipe is reported in dmesg.
Comment 60 Richard Thier 2019-06-16 17:22:09 UTC
Created attachment 144559 [details] [review]
Fix variant 1 (delegate to r300 init)

Added a possible drm patch for kernel source tree. This version is the one that delegates the rs400_gpu_init call to the r300_gpu_init directly.

Tested on my machine only so far. Tried to comment things as well as possible.

I will make another variant soon!
Comment 61 Richard Thier 2019-06-16 17:41:38 UTC
Created attachment 144560 [details] [review]
Fix variant 2 (special case pipe number for 0x5a62)

Added the second fix variant I had in mind. I am still compiling this, so it is not tested yet, but it should work just as well.

After thinking about it, I actually tend to favor this latter one as cleaner, because I have no idea what the other rs400 cards would do if I just forwarded to the r300 init code. My card works with that variant, but this one is maybe more sane?

The bad thing is that this literally only fixes cards that have 0x5a62 as their rdev->pdev->device value.

So "Fix variant 1" might fix more cards if they have a similar issue, while "Fix variant 2" fixes only this one card and should not break unrelated ones.
Comment 62 Richard Thier 2019-06-16 20:16:25 UTC
Tested both versions and both work for me. The performance gain when using HyperZ is around 5-10% for me (sometimes just 2-4 FPS, sometimes more).

gb_pipes is now reported as 1, as it should be.
Comment 63 cosiekvfj 2019-06-16 20:49:25 UTC
(In reply to Richard Thier from comment #58)
> Also there was one time where things were working for me before changes but
> I do not know what the pipe value was back then. So yes, there is SOME
> chance that this register can also return 1 in my case too, but it seems
> most of the time (99% at least if not 100%) it returns 3.

It would be lovely if you could replicate that…

Also this is consistent with your earlier mesa testing: sometimes when you compiled mesa with your patches and stock mesa, they were both working/not working. But that means that this weird behaviour:
17.1.2-1 crash(different)
17.1.1-1 crash(different)
17.1.0-1 crash(different)
17.0.5-1 working
17.0.4-2 working
17.0.4-1 crash
17.0.3-2 working
17.0.3-1 crash
17.0.2-2 working
17.0.2-1 crash
17.0.1-2 crash
17.0.1-1 crash
17.0.0-3 crash
17.0.0-2 crash
17.0.0-1 crash
13.0.4-2 crash
is something different from this bug. But I don't believe there were any changes between, for example, 17.0.3-1 and 17.0.3-2…

Sooo, reboot, cold start, suspend don't change anything with pipe numbers?

Also if you take a look at my old piglit tests, there were definitely more bugs with HyperZ on.

Also great work so far!
Comment 64 Marek Olšák 2019-06-16 23:24:52 UTC
rs400, which includes rc400, is not r400. It's r300.
Comment 65 Richard Thier 2019-06-17 06:19:11 UTC
Okay. Then variant 1 is the way to go I think.
Comment 66 Michel Dänzer 2019-06-17 09:22:45 UTC
Removing myself from the CC list again, as I get updates via the mailing list anyway.
Comment 67 Marek Olšák 2019-06-17 18:30:29 UTC
Keep the rs400_mc_wait_for_idle call.
Comment 68 Richard Thier 2019-06-17 19:40:06 UTC
Created attachment 144572 [details] [review]
Added back rs400_mc_wait_for_idle - maybe final patch?

I was actually thinking about keeping that part, but it was a do-or-don't decision. Added a patch that is basically variant 1 but keeps the mentioned rs400_mc_wait_for_idle call untouched. I have also obsoleted the earlier two variants.

Is this maybe the final patch for this issue? I do not know if anyone still has glitches with HyperZ on r300 (I do not have any at all), but this kind of heavy glitch should at least now go away completely for everyone.
Comment 69 Marek Olšák 2019-06-17 20:49:33 UTC
Can you send it to amd-gfx@lists.freedesktop.org adding the "drm/radeon: " commit prefix and your Signed-off-by?

Thanks.
Comment 70 Richard Thier 2019-06-17 21:48:54 UTC
Created attachment 144573 [details] [review]
Sent patch

I have tried; let me know if it didn't succeed or if anything needs to be changed for next time.

For reference, this was the command I used (with the attached file):

git send-email 0001-drm-radeon-Fix-rs400_gpu_init-for-ATI-mobility-radeo.patch --to='amd-gfx@lists.freedesktop.org'
Comment 71 Richard Thier 2019-06-19 03:07:37 UTC
Made a similar writeup on this topic too:

http://ballmerpeak.web.elte.hu/devblog/debugging-hyperz-and-fixing-a-radeon-drm-linux-kernel-module.html

Also tried to document all relevant information in case it ever helps anyone.
Comment 72 GitLab Migration User 2019-09-18 18:55:21 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/388.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.