Summary: | nouveau causes graphic corruption where you can't do anything | |
---|---|---|---
Product: | xorg | Reporter: | zeruke <oninekoze>
Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau>
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team>
Severity: | blocker | |
Priority: | medium | CC: | andyrtr, jwilk, kbloom, svenjoac
Version: | 7.6 (2010.12) | |
Hardware: | x86 (IA32) | |
OS: | Linux (All) | |

Description
zeruke
2011-02-03 19:15:58 UTC
I have the same display corruption with:

nVidia Corporation C73 [GeForce 7100 / nForce 630i] (rev a2)
Linux arch64 2.6.37-ARCH #1 SMP PREEMPT Sat Jan 29 20:00:33 CET 2011 x86_64 Pentium(R) Dual-Core CPU E5200 @ 2.50GHz GenuineIntel GNU/Linux

In the Arch Linux forum thread https://bbs.archlinux.org/viewtopic.php?id=112758 other people mentioned the same problem. It seems to concern NV3x and NV4x chipsets.

It happens with the upgrades:

libdrm 2.4.22 -> 2.4.23
libgl 7.9.0.git20101207 -> 7.10
libva 1.0.6 -> 1.0.8
mesa 7.9.0.git20101207 -> 7.10
xf86-video-nouveau 0.0.16_git20100819 -> 0.0.16_git20101217

See also the bug report in the Arch Linux flyspray: https://bugs.archlinux.org/task/22700?project=1

Everything seems normal in the log files, and the X server and the WM processes start normally. Only the mouse cursor works; the display is totally unusable, as shown in the screenshots. Downgrading to the previous release restores the display to normal.

Please look into this horrible regression and fix it, because the recent release of the nouveau driver is totally unusable.

This is the same corruption I am having on GeForce 6100/nForce 430 (bug #33688).

Try using the NoAccel or ShadowFB option in xorg.conf and check if that helps...

(In reply to comment #2)
> This is the same corruption I am having on GeForce 6100/nForce 430 (bug #33688).
>
> Try using the NoAccel or ShadowFB option in xorg.conf and check if that helps...

I do not know why Bugzilla refers to the wrong bug (I was writing about https://bugs.freedesktop.org/show_bug.cgi?id=33668)...

@Tomasz Wasiak

Neither of those options works, but I did find this on tty1:

[ 8.520202] [drm] nouveau 0000:00:12.0: ======= misaligned reg 0x001020FB =======
[ 8.520217] [drm] nouveau 0000:00:12.0: ======= misaligned reg 0x001020FB =======

My guess is that something is wrong with tiled scanout.
This was enabled with commit http://cgit.freedesktop.org/nouveau/xf86-video-nouveau/commit/?id=c88f13e25b0040c1dd0f93e0ac40f62a6005ce59

You're not the first to complain, and it's maybe good to gather the "real" names of the problem cards. In my log I have:

[drm] nouveau 0000:01:00.0: Detected an NV50 generation card (0x096000c1)

You should have something with an NV40 or NV44 generation card; the 96 in my case means that I have an NV96.

List of cards reported so far; I looked up the codenames on http://nouveau.freedesktop.org/wiki/CodeNames

GeForce 7150m / nForce 630M - nv67
GeForce 7100 / nForce 630i - nv67?
GeForce 6100 / nForce 430 - nv4e
GeForce 6150SE nForce 430 - nv4c
GeForce FX Go5700 - nv36

For the GeForce 7100 / nForce 630i I have in the log:

[drm] nouveau 0000:00:10.0: Detected an NV40 generation card (0x063000a2)

From http://en.wikipedia.org/wiki/GeForce_7_Series: "The 7100 series was introduced on August 30, 2006 and is based on GeForce 6200 Series architecture." and "it is little more than a revamped version of the GeForce 6200TC". I presume that's why it is considered an NV40 chipset.

Sorry, I just noticed in the list from http://nouveau.freedesktop.org/wiki/CodeNames:

NV63 GeForce 7100 / nForce 630i

In dmesg I have:

[drm] nouveau 0000:00:10.0: Detected an NV40 generation card (0x063000a2)

But in Xorg.0.log I do indeed have:

[ 133.128] (--) NOUVEAU(0): Chipset: "NVIDIA NV63"

and also:

[ 133.472] (II) NOUVEAU(0): [XvMC] Associated with NV40 texture adapter.

So it is not clear to me what chipset it is.

NV6X is just because there were no numbers left in NV4X :)

I think an mmio trace (http://nouveau.freedesktop.org/wiki/MmioTrace) of all the problematic cards running the closed source driver should shed some light on what is wrong with the tiling code on these cards, because I suspect the blob uses a tiled frontbuffer too. You can send them to the email address mentioned at the bottom of the wiki page.
Even though I'm not the best person to look at this (I don't use that generation of hardware anymore, for example), I'll do what I can if no one steps up.

@ Maarten Maathuis

I would do the mmio trace if I could, but until nvidia updates the beta driver to support the new xorg stuff I won't be able to use the closed source drivers unless I downgrade, which at the moment I don't really want to do.

(In reply to comment #11)
> @ Maarten Maathuis
>
> I would do the mmio trace if I could, but until nvidia updates the beta
> driver to support the new xorg stuff I won't be able to use the closed
> source drivers unless I downgrade, which at the moment I don't really want to do.

Which xorg version and which nvidia version are you using?
http://nouveau.freedesktop.org/wiki/BlobVersions

(In reply to comment #12)
> Which xorg version and which nvidia version are you using?
> http://nouveau.freedesktop.org/wiki/BlobVersions

Right now, because I'm using Ubuntu 11.04 alpha2, the xserver is 1.9.99.901+git20110131.be3be768-0ubuntu3, which is seen as xserver 1.10. nvidia only has preliminary support for it in 270.18, and 270.18 currently has a problem with the ABI; if I tell it to ignore the ABI I get segfaults, which is a known problem that should be fixed in the next release. So right now I'm not running the nvidia drivers. I'm using the basic xorg graphics, because I have to pass modeset=0 to nouveau so I can see things correctly.

Isn't it possible to compare the previous release with the latest one, list the changes made, and see which patches or changes could have caused the regression? Isn't it possible to revert some changes to their previous state?
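
The dmesg lines quoted above carry an ID in parentheses (0x096000c1, 0x063000a2), and one commenter reads the "96" out of the first one as NV96. A minimal Python sketch of that reading follows. It assumes the chipset number sits in bits 20-27 of the ID, which is inferred from the examples in this thread and not checked against the driver source; the function names are made up for illustration.

```python
import re

# Matches the kernel log line quoted in this thread, e.g.
# "[drm] nouveau 0000:00:10.0: Detected an NV40 generation card (0x063000a2)"
DETECT_RE = re.compile(
    r"nouveau (?P<pci>[0-9a-f:.]+): Detected an (?P<gen>NV[0-9A-F]+) "
    r"generation card \((?P<boot0>0x[0-9a-fA-F]{8})\)"
)


def chipset_from_boot0(boot0: int) -> str:
    """Assumption: bits 20-27 of the logged ID hold the chipset number."""
    return "NV%02X" % ((boot0 >> 20) & 0xFF)


def parse_detect_line(line: str):
    """Return (pci address, generation, chipset) or None if the line does not match."""
    m = DETECT_RE.search(line)
    if m is None:
        return None
    boot0 = int(m.group("boot0"), 16)
    return m.group("pci"), m.group("gen"), chipset_from_boot0(boot0)


print(parse_detect_line(
    "[drm] nouveau 0000:00:10.0: Detected an NV40 generation card (0x063000a2)"
))
# prints ('0000:00:10.0', 'NV40', 'NV63')
```

Under that assumption the GeForce 7100 / nForce 630i line decodes to NV63, which agrees with the `Chipset: "NVIDIA NV63"` line from Xorg.0.log, while the kernel reports the broader NV40 generation.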
You are not listening; we already know which commit broke it: http://cgit.freedesktop.org/nouveau/xf86-video-nouveau/commit/?id=c88f13e25b0040c1dd0f93e0ac40f62a6005ce59 Now we want to know why tiled scanout does not work with these nforce boards, and we need an mmiotrace for that.

Unfortunately the mentioned commit is not the only issue. Using revision 38e8809bb415bae5c182fc79c8fc62992c5e4ed0 patched not to use tiled scanout helps only a bit when using the current master branch of mesa... You need to switch to the mesa-7.9 branch in order to have X working normally without major screen corruption (unfortunately there are still some minor corruptions here and there, but you can live with them...). Still, I only got 2D acceleration working (I know 3D is not supported :-D): the screen (or window) is totally messed up even when launching the glxgears demo using the Gallium3D nouveau driver.

I have tried nearly all revisions of xf86-video-nouveau (from 4063616938f76af8028491276039d422c0782b1b, dated April 9th 2010, up to the current one) built on top of the current master branch of mesa, with the same major screen corruption! Of course most of them need some patches so as not to lock up the GPU when built on top of current versions of libdrm/mesa/xorg-server, but I have carefully checked whether those patches could be the source of the screen corruption issues.

I read the mmiotrace.txt file on how to use the kernel functionality. Can you tell me exactly which actions would be useful to trace after the WM is started?

It is written:
"During tracing you can place comments (markers) into the trace by
$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
This makes it easier to see which part of the (huge) trace corresponds to which action. It is recommended to place descriptive markers about what you do."
But which actions exactly should I perform during the trace?

And:
"Please, pack into a compressed archive the trace file and a free description about what you do during the trace."
Again, what is useful to do?
Can you also specify the format of the archive file name?
"The name of the archive file should contain the PCI id and GPU family, or the commercial name of your card."
Can you give an example name, please?

Again:
"If you are doing a trace for a driver project, e.g. Nouveau, you should also do the following before sending your results:
$ lspci -vvv > lspci.txt
$ dmesg > dmesg.txt
$ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt
and then send the .tar.gz file. The trace compresses considerably. Replace "pciid" and "nick" with the PCI ID or model name of your piece of hardware under investigation and your nickname."
I would like an example name for the tarball file.

(In reply to comment #17)
> I read the mmiotrace.txt file on how to use the kernel functionality.
> Can you tell me exactly which actions would be useful to trace after the WM
> is started?
>
> It is written:
> "During tracing you can place comments (markers) into the trace by
> $ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
> This makes it easier to see which part of the (huge) trace corresponds to
> which action. It is recommended to place descriptive markers about what you
> do."
> But which actions exactly should I perform during the trace?
>
> And:
> "Please, pack into a compressed archive the trace file and a free description
> about what you do during the trace."
> Again, what is useful to do?

AFAIK with nouveau, you get corruption just by starting X. So I think you just need to start X with the blob, mark "X is up" in the trace, and stop.

> Can you also specify the format of the archive file name?
> "The name of the archive file should contain the PCI id and GPU family, or the
> commercial name of your card."
> Can you give an example name, please?
>
> Again:
> "If you are doing a trace for a driver project, e.g. Nouveau, you should also
> do the following before sending your results:
> $ lspci -vvv > lspci.txt
> $ dmesg > dmesg.txt
> $ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt
> and then send the .tar.gz file. The trace compresses considerably. Replace
> "pciid" and "nick" with the PCI ID or model name of your piece of hardware
> under investigation and your nickname."
> I would like an example name for the tarball file.

$ lspci -n -d 10de:
01:00.0 0300: 10de:0407 (rev a1)
-> the PCI id of my card is 0407 (10de is the vendor id, nvidia)

$ dmesg | grep generation
[11562.063550] [drm] nouveau 0000:01:00.0: Detected an NV50 generation card (0x084700a2)
-> generation is nv50, codename nv84.

So in my case I would just call it nv84-0407-shining-mmiotrace.tar.gz

Created attachment 43011 [details] [review]
possible fix for nv4x/nv6x chipsets

I don't know these cards as well as curro, but we do this wrong on at least NV67, quite possibly some others too. Can anyone on nv4x experiencing this give this patch a shot?

(In reply to comment #19)
> Created an attachment (id=43011) [details]
> possible fix for nv4x/nv6x chipsets
>
> I don't know these cards as well as curro, but we do this wrong on at least
> NV67, quite possibly some others too. Can anyone on nv4x experiencing this
> give this patch a shot?

I tried, but I'm guessing I'm doing something wrong, because I get this when trying to patch:

patching file nv40_graph.c
Hunk #1 FAILED at 223.
Hunk #2 FAILED at 230.
Hunk #3 FAILED at 239.
3 out of 3 hunks FAILED -- saving rejects to file nv40_graph.c.rej

(In reply to comment #20)
> I tried, but I'm guessing I'm doing something wrong, because I get this when
> trying to patch:
>
> patching file nv40_graph.c
> Hunk #1 FAILED at 223.
> Hunk #2 FAILED at 230.
> Hunk #3 FAILED at 239.
> 3 out of 3 hunks FAILED -- saving rejects to file nv40_graph.c.rej

I found out what I did wrong and am about to go through the whole set-up in a bit.

(In reply to comment #21)
> I found out what I did wrong and am about to go through the whole set-up in a bit.

And I can't seem to get anything to work how it should, no matter where I get the instructions from.

I have the same problems!
My video card is:

00:0d.0 VGA compatible controller: nVidia Corporation C61 [GeForce 6100 nForce 405] (rev a2) (prog-if 00 [VGA controller])
  Subsystem: ASRock Incorporation Device 03d1
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
  Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Latency: 0
  Interrupt: pin A routed to IRQ 20
  Region 0: Memory at de000000 (32-bit, non-prefetchable) [size=16M]
  Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
  Region 3: Memory at dd000000 (64-bit, non-prefetchable) [size=16M]
  Expansion ROM at dfcc0000 [disabled] [size=128K]
  Capabilities: <access denied>
  Kernel driver in use: nouveau
  Kernel modules: nouveau, nvidiafb

As I understand it, the developers need an mmiotrace, but do I have to make it with the nvidia module? We can't use it right now because of the broken ABI?

Nvidia only supports released xservers. zeruke is using an alpha Ubuntu with a prerelease xserver. So if you have a normal release you should be fine.

Comment 19 provides a patch, so forget about the mmiotrace and just try the patch. But you need to be able to build a kernel from source, probably from git, and apply the patch there. http://nouveau.freedesktop.org/wiki/InstallDRM

dmesg | grep generation
[drm] nouveau 0000:01:00.0: Detected an NV50 generation card (0x0a3180a2)

I boot with nouveau.noaccel=1 as mentioned in $INTERNET, but I still sometimes see some corruption when moving windows. Moving the window off the screen and back helps.

kernels 35, 37, 38

(In reply to comment #26)
> dmesg | grep generation
> [drm] nouveau 0000:01:00.0: Detected an NV50 generation card (0x0a3180a2)
>
> I boot with nouveau.noaccel=1 as mentioned in $INTERNET, but I still sometimes
> see some corruption when moving windows. Moving the window off the screen and
> back helps.
>
> kernels 35, 37, 38

I *highly* doubt the bug you're seeing is the same bug. Plus, if you're seeing corruption with noaccel, it's likely not nouveau's fault at all either.

Aha, okay. Just checked: with the distro kernel it's gone. With 38-rc4 I will attach 2 photos; tell me whether I should open a new bug or attach them to an already open one.

Created attachment 43162 [details]
kernel-38-rc4 with nouveau.noaccel=1
kernel-38-rc4 with nouveau.noaccel=1
Created attachment 43163 [details]
kernel-38-rc4 without nouveau.noaccel=1
kernel-38-rc4 *without* nouveau.noaccel=1
same with kernel 37
only hard reset works
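
Earlier in the thread, an archive name for an mmiotrace was assembled by hand from the `lspci -n` output and the ID in dmesg: nv84-0407-shining-mmiotrace.tar.gz. A small Python sketch of that naming follows. It assumes the `<codename>-<pci id>-<nick>-mmiotrace.tar.gz` pattern of that one example and the same bits-20-27 reading of the dmesg ID; the helper names are hypothetical.

```python
import re


def pci_device_id(lspci_n_line: str, vendor: str = "10de") -> str:
    """Pull the device ID out of one `lspci -n` line such as
    '01:00.0 0300: 10de:0407 (rev a1)'. 10de is the NVIDIA vendor ID."""
    m = re.search(r"\b%s:([0-9a-f]{4})\b" % re.escape(vendor), lspci_n_line)
    if m is None:
        raise ValueError("no %s device on this line" % vendor)
    return m.group(1)


def archive_name(lspci_n_line: str, boot0: int, nick: str) -> str:
    """Build '<codename>-<pci id>-<nick>-mmiotrace.tar.gz'."""
    # Assumption: bits 20-27 of the ID printed in dmesg give the codename number.
    codename = "nv%02x" % ((boot0 >> 20) & 0xFF)
    return "%s-%s-%s-mmiotrace.tar.gz" % (codename, pci_device_id(lspci_n_line), nick)


print(archive_name("01:00.0 0300: 10de:0407 (rev a1)", 0x084700A2, "shining"))
# prints nv84-0407-shining-mmiotrace.tar.gz
```

The resulting string would be used as the output file of the `tar zcf` command quoted from mmiotrace.txt.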
Well, I still can't check the patch because for some reason I can't build it. I get errors saying the kernel tree is wrong, and if not that, I get something about files being unexpected or expected somewhere. It might be because I'm using Ubuntu, but I'm not sure; or maybe I'm just missing a step, but I am using the instructions at http://nouveau.freedesktop.org/wiki/InstallDRM. Maybe I can get one already compiled by someone? I'm using the latest rc of kernel 2.6.38 with Ubuntu's patch on it.

Created attachment 43234 [details]
after patch with res at 1280x800(16:10)
The patch fixes it to the point where I can now see for the most part, but it doesn't fix what was going on before the complete blocking of the screen.
There is still a small tiling of about 3 or so when the resolution is at 1280x800 (16:10), which I believe is my screen's native resolution. That tiling is fixed by lowering the resolution, which I now have at 1024x768 (4:3).
It seems to have the small tiling when the aspect ratio is 16:10 or 9:5; all resolutions using the 4:3 aspect ratio show things perfectly when using the patch.
Note that Ubuntu 2.6.38-rc4 based kernels with the "possible fix for nv4x/nv6x chipsets" patch applied are available; see the downstream bug for details: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/711591/comments/24

I was experiencing a similar problem with a GeForce 6150SE nForce 430. After rebuilding my kernel with https://bugs.freedesktop.org/attachment.cgi?id=43011 applied, the problem went away. More details in the downstream bug report: http://bugs.debian.org/613078

We have a couple of reports back on the downstream bug with the "possible fix for nv4x/nv6x chipsets" patch applied: one report of complete mitigation, and another which sounds like there is a second issue, but it is improved by the patch. See comments #25 and #26 at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/711591

That patch was never meant to be tested by everyone. It was a hack to test for people who have problems.

Ben committed this 5 days ago. It seems to be in Linus' tree too. http://cgit.freedesktop.org/nouveau/linux-2.6/commit/?id=aaa3d08c357dcfbe13ec23786c294759183a4d8d