Bug 58615 - [NV43] hangs with direct rendering since 3.7 rework
Status: RESOLVED WORKSFORME
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2012-12-21 12:18 UTC by Andrew Randrianasulu
Modified: 2017-01-29 07:12 UTC



Attachments
nv43 bisection log [partial] (5.65 KB, text/plain)
2012-12-21 12:20 UTC, Andrew Randrianasulu
possible bad commits (805 bytes, text/plain)
2012-12-21 12:22 UTC, Andrew Randrianasulu
Kernel .config (74.48 KB, text/plain)
2012-12-21 12:28 UTC, Andrew Randrianasulu
working X log (41.39 KB, text/plain)
2012-12-21 12:32 UTC, Andrew Randrianasulu
X log for affected kernel (37.49 KB, text/plain)
2012-12-21 13:14 UTC, Andrew Randrianasulu
dmesg after X start and launching gears (233.61 KB, text/plain)
2012-12-21 13:16 UTC, Andrew Randrianasulu
Kernel config [working glxgears with direct rendering] (75.54 KB, text/plain)
2013-02-15 05:47 UTC, Andrew Randrianasulu
Nouveau git kernel + experimental pmpeg path also hangs (138.03 KB, text/plain)
2013-08-02 08:23 UTC, Andrew Randrianasulu
dmesg from nouveau/master (274.13 KB, text/plain)
2013-09-04 01:52 UTC, Andrew Randrianasulu

Description Andrew Randrianasulu 2012-12-21 12:18:43 UTC
I have my nv43 back in action, and unfortunately new bugs to report.

When I compiled a new 3.7-rc kernel I noticed that anything 3D-related, even the trivial/tri example from mesa/demos, started to hang my X server. Usually I was able to switch consoles and reboot cleanly, but not always. Strangely, _indirect_ rendering worked OK! Kernels 3.4.6, 3.5.7, and 3.6.x are all OK.

I will add the kernel and X logs after rebooting into a freshly compiled nouveau kernel; for now I want to attach the partial bisection logs, done with the mainline kernel.
Comment 1 Andrew Randrianasulu 2012-12-21 12:20:30 UTC
Created attachment 71920
nv43 bisection log [partial]
Comment 2 Andrew Randrianasulu 2012-12-21 12:22:53 UTC
Created attachment 71921
possible bad commits

Unfortunately, all those commits resulted in a BUG or simply a black screen during boot, with completely non-working DRM. So I can't test them.
Comment 3 Andrew Randrianasulu 2012-12-21 12:28:47 UTC
Created attachment 71922
Kernel .config

This is my minimal kernel config, but the bug also happens with a bigger one (SMP, SLUB, tons of modules, etc.). What I haven't checked is whether changing ARCH from i486 to something more modern will do any good; I will try this, too.
Comment 4 Andrew Randrianasulu 2012-12-21 12:32:21 UTC
Created attachment 71923
working X log

This is on kernel 3.6.11.

bash-4.2$ glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
809 frames in 5.0 seconds = 161.695 FPS
1080 frames in 5.0 seconds = 215.978 FPS
997 frames in 5.0 seconds = 199.132 FPS
951 frames in 5.0 seconds = 189.950 FPS
988 frames in 5.0 seconds = 197.346 FPS
XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
      after 16815 requests (16815 known processed) with 0 events remaining.
bash-4.2$
Comment 5 Andrew Randrianasulu 2012-12-21 13:14:03 UTC
Created attachment 71927
X log for affected kernel
Comment 6 Andrew Randrianasulu 2012-12-21 13:16:29 UTC
Created attachment 71928
dmesg after X start and launching gears

Note: I switched away from X's VT shortly after the black glxgears window appeared, and captured this dmesg from another VT.
Comment 7 Andrew Randrianasulu 2013-02-15 05:42:47 UTC
Interesting: if I disable ACPI completely, at least glxgears starts to work! (I need to disable it in the kernel config, because simply booting with acpi=off stops the ACPI-aware nouveau.ko from loading :( )
Comment 8 Andrew Randrianasulu 2013-02-15 05:47:27 UTC
Created attachment 74851
Kernel config [working glxgears with direct rendering]

Kernel was from the nouveau tree at commit:

commit f253235ed48aca9ebf008952d8484e59e64bebae
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Thu Feb 14 13:43:21 2013 +1000

    drm/nv84-/fence: prepare for emit/sync support of sysram sequences
---------------

Sadly, launching seamonkey still caused a GPU lockup, but I was able to switch back to the console and reboot. That was a different bug, though: this one doesn't show any 'GPU lockup' messages.
Comment 9 Andrew Randrianasulu 2013-08-02 08:23:49 UTC
Created attachment 83509
Nouveau git kernel + experimental pmpeg path also hangs

I also tried many newer kernels, up to "drm/nouveau/vm: make vm refcount into a kref" (3.11-rc3 based); they all hang with ACPI enabled. This dmesg was captured after applying the mesa patch from this thread [for pmpeg testing]: http://lists.freedesktop.org/archives/mesa-dev/2013-July/042473.html

As you can hopefully see, XvMC also hangs like DRI clients.
Comment 10 Ilia Mirkin 2013-09-02 05:46:21 UTC
The PMPEG hangs are expected... pre-NV44 doesn't have context switching. I have a patch (which I sent to the ML) but it needs some work. So don't worry about that not working. Although it shouldn't hang, it just shouldn't do any actual decoding (and generate all those errors).

Some of the warnings in your dmesg should have been fixed as of the latest nouveau/master or 3.11-rc7.

Are you saying that simply running trivial/tri will hang X? I don't see that on an NV42 (PCIe), which is rather similar to your NV43 (AGP). Could you show a dmesg log of trivial/tri hanging things? (Are you using mesa 9.2 or mesa-git?)
Comment 11 Andrew Randrianasulu 2013-09-04 01:52:13 UTC
Created attachment 85161
dmesg from nouveau/master

Using mesa git (9.3.0-6b5c802), trivial/tri also hangs the X server. (I switched away from it to a working VT and captured this dmesg.)
Comment 12 Andrew Travneff 2013-09-15 18:57:22 UTC
I have lockups on NV43 too. Affected kernels are at least 3.10 and 3.9. More info: https://bugzilla.redhat.com/show_bug.cgi?id=979537
Comment 13 Andrew Randrianasulu 2015-01-19 20:26:09 UTC
It seems I can work around this bug by simply changing CONFIG_PREEMPT_NONE=y to CONFIG_PREEMPT_VOLUNTARY=y in my .config. I made this discovery while trying the official Slackware kernel from Slackware-current. Maybe this bug has something in common with the bug described in https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.18.3 as "mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed". The bug seems to hit only the nv40 (and older?) codepath, because my nv50 works fine without preemption.
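
For anyone who wants to check which preemption model a given .config selects before testing, here is a minimal Python sketch. It is only an illustration and not part of any kernel tooling: the helper names (enabled_options, preemption_model, switch_to_voluntary) and the sample text are hypothetical, and it assumes the three preemption choices offered by 3.x-era kernels.

```python
import re

# Preemption models offered by 3.x-era kernels (assumption: newer kernels
# may offer additional choices that this sketch does not know about).
PREEMPT_MODELS = ("PREEMPT_NONE", "PREEMPT_VOLUNTARY", "PREEMPT")


def enabled_options(config_text):
    """Return the set of option names set to 'y' in .config-style text."""
    return set(re.findall(r"^CONFIG_(\w+)=y$", config_text, flags=re.MULTILINE))


def preemption_model(config_text):
    """Return the name of the enabled preemption model, or None."""
    enabled = enabled_options(config_text)
    for model in PREEMPT_MODELS:
        if model in enabled:
            return model
    return None


def switch_to_voluntary(config_text):
    """Hypothetical helper: replace PREEMPT_NONE=y with PREEMPT_VOLUNTARY=y."""
    lines = []
    for line in config_text.splitlines():
        if line == "CONFIG_PREEMPT_NONE=y":
            lines.append("# CONFIG_PREEMPT_NONE is not set")
        elif line == "# CONFIG_PREEMPT_VOLUNTARY is not set":
            lines.append("CONFIG_PREEMPT_VOLUNTARY=y")
        else:
            lines.append(line)
    return "\n".join(lines) + "\n"


# Made-up sample standing in for the relevant part of a real .config.
sample = """\
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
"""

print(preemption_model(sample))
print(preemption_model(switch_to_voluntary(sample)))
```

A hand-edited .config would normally still be passed through `make oldconfig` before building, so that dependent options are resolved by the kernel's own tooling.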

Should I leave this bug report open, or resolve it as WORKSFORME? (I'm currently compiling an additional test kernel based on the linux-3.19 branch of nouveau/linux-2.6, just to test my workaround one more time.)

Reclocking still seems to work only partially (lost display and another kind of hang after just a few seconds of glxgears on the highest perf level; lost display after I switch clocks back to lower frequencies), but this is another issue.
Comment 14 Andrew Randrianasulu 2017-01-29 07:12:22 UTC
I think I can close this one. I am currently running 4.10.0-rc5-i486 #2 and it seems stable (after I tweaked BIOS settings: enabled "PnP OS installed" and disabled APIC).

