Summary: | NV43 GeForce 6600 Nouveau is not stable on legacy hardware | ||
---|---|---|---|
Product: | xorg | Reporter: | Vasili Pupkin <diggest> |
Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau> |
Status: | RESOLVED MOVED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | not set | ||
Priority: | not set | CC: | diggest, igettaja |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | All | ||
Description
Vasili Pupkin
2019-09-11 06:43:56 UTC
Those messages imply you've run out of vram. Are you using something like gnome or kde? Those won't work well on this hardware.

Ubuntu 16.04 works just fine with nvidia-304 drivers. If the lack of ram is the problem it would be helpful to have better diagnostic messages.

(In reply to Vasili Pupkin from comment #2)
> Ubuntu 16.04 works just fine with nvidia-304 drivers. If the lack of ram is
> the problem it would be helpful to have better diagnostic messages.

We have them.

[ 199.658774] nouveau 0000:04:00.0: systemd-logind[1352]: validate: -22
[ 199.658902] nouveau 0000:04:00.0: systemd-logind[1352]: fail set_domain
[ 199.658905] nouveau 0000:04:00.0: systemd-logind[1352]: validating bo list

This indicates a lack of ability to place all the buffers needed into vram/gart as requested by the submitter. The NVIDIA drivers will work much better for you if you're trying to make heavy use of GL, like modern systems like to do.

These diagnostics only make sense to developers of nouveau. The nvidia-304 drivers have been dropped, and nouveau seems to be the only out-of-the-box option for Ubuntu 18.04. But why don't the nvidia drivers experience this lack of vram problem?

(In reply to Vasili Pupkin from comment #4)
> These diagnostics only make sense to developers of nouveau. The nvidia-304
> drivers have been dropped, and nouveau seems to be the only out-of-the-box
> option for Ubuntu 18.04. But why don't the nvidia drivers experience this
> lack of vram problem?

Because they've had man-decades invested into their development to ensure that they handle these types of situations well. Nouveau GL drivers have not.

The lack of human resources is sad, but it is still a bug. Mark it as wontfix if support for this legacy hardware is outside the project's goals. I am happy to test a patch if it is not architecturally impossible in nouveau to fix this and back vram with main memory.

In the meantime, I suspect if you add LIBGL_ALWAYS_SOFTWARE=1 into your /etc/environment, you will be much happier.
You can then still enable GL for certain programs that you actually want to use it for, but not for random GTK/Qt programs that want to draw a button and think it's a great idea to start using GL for that.

LIBGL_ALWAYS_SOFTWARE=1 didn't help at all, same messages in dmesg and syslog

(In reply to Vasili Pupkin from comment #8)
> LIBGL_ALWAYS_SOFTWARE=1 didn't help at all, same messages in dmesg and syslog

Must not have gotten picked up =/ Just remove nouveau_dri.so from ... /usr/lib/dri/ or something along those lines.

Removing nouveau_dri.so didn't help either

(In reply to Vasili Pupkin from comment #10)
> Removing nouveau_dri.so didn't help either

Erm ... that implies that the error is not from what I think it is. Just to super-triple-check ... run "glxinfo" - does that say you're using LLVMpipe or nouveau?

It shows

OpenGL renderer string: llvmpipe (LLVM 8.0, 256 bits)

(In reply to Vasili Pupkin from comment #12)
> It shows
>
> OpenGL renderer string: llvmpipe (LLVM 8.0, 256 bits)

OK. So you've successfully killed off nouveau GL impl, but you're still seeing those errors? That's very very very surprising. Can you tell me more about your environment? (Have you tried restarting your desktop / window manager? Otherwise this is all moot.)

The system has two identical cards installed (I had completely forgotten about this because no monitor is connected to the second adapter). I've connected a monitor to the second one and it only shows the cursor; no window is shown when I move one to the second monitor.

Ok. I removed the second card and nouveau stopped flooding dmesg with those messages. The freezes remain, and it feels less stable with one card than it was with two cards; it rarely survives more than five minutes. So there are two questions: Are those messages a bug, or does the two-adapter setup require some additional configuration to work properly? How do I debug the freezes? Ctrl+Alt+F2 doesn't work.
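
The glxinfo check requested earlier in this thread can be scripted. The following is only a sketch: the helper names `renderer_from_glxinfo` and `is_software_rendering` are made up for illustration, and the test for software rendering is a simple heuristic that looks for the names of Mesa's software rasterizers (llvmpipe, softpipe) in the renderer string. The sample input is the line quoted above.

```python
import re

# Matches the "OpenGL renderer string" line printed by glxinfo.
RENDERER_RE = re.compile(r"^OpenGL renderer string:\s*(.+)$", re.MULTILINE)


def renderer_from_glxinfo(output):
    """Return the renderer string found in glxinfo output, or None."""
    match = RENDERER_RE.search(output)
    return match.group(1).strip() if match else None


def is_software_rendering(renderer):
    """Heuristic (assumption): llvmpipe/softpipe mean software rendering."""
    name = renderer.lower()
    return "llvmpipe" in name or "softpipe" in name


# Sample line taken from this thread.
sample = "OpenGL renderer string: llvmpipe (LLVM 8.0, 256 bits)"
renderer = renderer_from_glxinfo(sample)
print(renderer, is_software_rendering(renderer))
```

To run it against a live session, the sample text could be replaced with the captured output of `subprocess.run(["glxinfo"], capture_output=True, text=True).stdout`, assuming glxinfo is installed.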
I've found this stacktrace in syslog at the end, not sure if it is the last stack before crash

------------[ cut here ]------------
nouveau 0000:01:00.0: timeout
....
....
Call Trace:
nvkm_vmm_iter.constprop.12+0x2e5/0x880 [nouveau]
? nv41_vmm_pgt_sgl+0x140/0x140 [nouveau]
? nvkm_vmm_free_insert+0x80/0x80 [nouveau]
? nvkm_vmm_put_region+0xd0/0x160 [nouveau]
nvkm_vmm_ptes_unmap_put+0x32/0x50 [nouveau]
? nv41_vmm_pgt_sgl+0x140/0x140 [nouveau]
nvkm_vmm_put_locked+0x103/0x220 [nouveau]
nvkm_uvmm_mthd+0x7eb/0x850 [nouveau]
nvkm_object_mthd+0x1a/0x30 [nouveau]
nvkm_ioctl_mthd+0x5d/0xb0 [nouveau]
nvkm_ioctl+0x11d/0x280 [nouveau]
nvkm_client_ioctl+0x12/0x20 [nouveau]
nvif_object_ioctl+0x47/0x50 [nouveau]
nvif_object_mthd+0x129/0x150 [nouveau]
? __ttm_dma_free_page.isra.5+0x32/0x40 [ttm]
? isolate_huge_page+0x30/0xa0
? __ttm_dma_free_page.isra.5+0x32/0x40 [ttm]
? ttm_dma_page_put+0x53/0x90 [ttm]
nvif_vmm_put+0x5f/0x80 [nouveau]
nouveau_mem_fini+0x3b/0x70 [nouveau]
nv04_sgdma_unbind+0x12/0x20 [nouveau]
ttm_tt_unbind+0x21/0x40 [ttm]
ttm_tt_destroy.part.12+0x12/0x60 [ttm]
ttm_tt_destroy+0x13/0x20 [ttm]
ttm_bo_cleanup_memtype_use+0x32/0x70 [ttm]
ttm_bo_cleanup_refs+0x1c0/0x200 [ttm]
? ttm_mem_global_free+0x13/0x20 [ttm]
ttm_bo_delayed_delete+0x1cd/0x1e0 [ttm]
ttm_bo_delayed_workqueue+0x1b/0x40 [ttm]
process_one_work+0x1fd/0x400
worker_thread+0x34/0x410
kthread+0x121/0x140
? process_one_work+0x400/0x400
? kthread_park+0x90/0x90
ret_from_fork+0x35/0x40

Your desktop environment could be trying to add the second GPU's outputs and/or trying to use it for render offload. Neither will work well with nv4x generation GPUs. I can believe that the set_domain stuff is failing because of that. I hadn't considered that option.

The timeout is bad - that means something hung. Probably the messages before that would be more interesting than after. The first error tends to be the most useful one.

What desktop environment are you using, if any?
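
Since the first message tends to be the most useful one, a small script can pull the earliest nouveau line out of a saved kernel log and decode a trailing negative errno such as the -22 seen earlier in this thread. This is a rough sketch, not part of nouveau or any existing tool: the function names are hypothetical, the regular expression assumes the dmesg layout quoted in this report (timestamp in brackets, then "nouveau", then the PCI address), and it does not distinguish errors from informational messages.

```python
import errno
import re

# Assumed layout, as quoted in this thread:
#   [ 199.658774] nouveau 0000:04:00.0: systemd-logind[1352]: validate: -22
NOUVEAU_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+nouveau\s+(?P<pci>[0-9a-f:.]+):\s+(?P<msg>.*)"
)
# A trailing ": -<number>" is treated as a negative errno value.
ERRNO_RE = re.compile(r":\s*-(\d+)\s*$")


def first_nouveau_message(log_text):
    """Return (timestamp, pci_address, message) of the first nouveau line."""
    for line in log_text.splitlines():
        match = NOUVEAU_RE.search(line)
        if match:
            return float(match["time"]), match["pci"], match["msg"]
    return None


def decode_errno(message):
    """Return the symbolic errno name for a trailing '-N', or None."""
    match = ERRNO_RE.search(message)
    if not match:
        return None
    return errno.errorcode.get(int(match.group(1)))


# Log lines copied from this bug report.
sample_log = """\
[ 199.658774] nouveau 0000:04:00.0: systemd-logind[1352]: validate: -22
[ 199.658902] nouveau 0000:04:00.0: systemd-logind[1352]: fail set_domain
[ 199.658905] nouveau 0000:04:00.0: systemd-logind[1352]: validating bo list
"""

first = first_nouveau_message(sample_log)
print(first, decode_errno(first[2]))
```

On this sample the -22 decodes to EINVAL, the generic "invalid argument" error code. The name alone does not say which buffer placement failed, so it narrows the search rather than explaining the failure.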
I am using gnome 3.28.2 on xorg 1.20.4, wayland disappeared from the list on gdm login screen after recent upgrades

It seems that this timeout error in nv41_vmm_flush may not be the cause of the problem but a consequence of the bug. nouveau starts issuing this timeout exception log after a freeze and there can be quite a few such timeouts until I restart the system.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/502.