Bug 58776 - [NV44] Some cards require NvPCIE=0
Status: RESOLVED INVALID
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
 
Reported: 2012-12-26 10:16 UTC by gabriele balducci
Modified: 2013-08-20 14:15 UTC

Attachments
lspci output (16.73 KB, text/plain)
2012-12-26 10:16 UTC, gabriele balducci
kernel config (61.91 KB, text/plain)
2012-12-26 10:18 UTC, gabriele balducci
dmesg output (part 01) (1.29 MB, text/plain)
2012-12-26 10:24 UTC, gabriele balducci
dmesg output (part 02) (1.95 MB, text/plain)
2012-12-26 10:25 UTC, gabriele balducci
dmesg for kernel-3.6.10 (32.11 KB, text/plain)
2013-01-03 14:02 UTC, gabriele balducci
dmesg for kernel-3.7.1 with iommu=soft nouveau.vram_pushbuf=1 (31.56 KB, text/plain)
2013-01-03 14:19 UTC, gabriele balducci
xorg.conf (2.57 KB, text/plain)
2013-01-03 14:24 UTC, gabriele balducci
Xorg.0.log for kernel-3.6.10 (41.07 KB, text/plain)
2013-01-03 14:27 UTC, gabriele balducci
Xorg.0.log for kernel-3.7.1 without acceleration (40.11 KB, text/plain)
2013-01-03 14:30 UTC, gabriele balducci
Xorg.0.log with acceleration (41.46 KB, text/plain)
2013-01-03 14:32 UTC, gabriele balducci

Description gabriele balducci 2012-12-26 10:16:32 UTC
Created attachment 72129 [details]
lspci output

hello everybody,

following advice from the dri-devel list, I am opening a bug report here.

Starting with kernel-3.7 I can no longer boot with KMS: as soon as
the screen resolution changes in the very early stage of the boot,
the machine freezes and the only option I'm left with is to press
the physical reset button.

I am on an athlon64 dual core machine and using DRM/NOUVEAU with an
NVIDIA GeForce 6200 LE GPU.

Here are some facts:

=> while kernel-3.7.x shows the problem described above,
   kernel-3.6.{8-10} boots just fine (and X11 works nicely)
=> in case it is of any use: if I reboot from a running 3.6.x kernel
   into a 3.7.x one, the frozen console screen shows the (somewhat
   misplaced) lines which were printed during the shutdown phase of
   the reboot process (this is similar to what is reported in the
   Description of bug#46557)
=> kernel-3.7.x boots just fine on another box (another athlon64 dual core with
   a GeForce 6150SE nForce 430 (rev a2) GPU); the kernel is 32bit there,
   in case it matters

I am attaching lspci output, kernel config and dmesg output w/
NOUVEAU_DEBUG_DEFAULT=6

I will be happy to send in any other information which might be useful.

thank you very much in advance

ciao
gabriele
Comment 1 gabriele balducci 2012-12-26 10:18:28 UTC
Created attachment 72130 [details]
kernel config
Comment 2 gabriele balducci 2012-12-26 10:24:20 UTC
Created attachment 72131 [details]
dmesg output (part 01)

NOTE: dmesg broken in 2 files to be able to upload
Comment 3 gabriele balducci 2012-12-26 10:25:38 UTC
Created attachment 72132 [details]
dmesg output (part 02)

NOTE: dmesg output broken in 2 parts to be able to upload
Comment 4 gabriele balducci 2012-12-27 19:14:58 UTC
I have done some more experiments.

Following discussions in bug#46557 and bug#54988, I have tried both
https://bugs.freedesktop.org/attachment.cgi?id=63137 and the mem=2G
kernel opt.

The former does not have any effect, while the latter works (with
mem=2G I can boot fine and also X11 runs nicely)

I doubt that the hardware is buggy, since the GPU works without any
problem with kernel 3.6.x (i.e., no need to append any mem=2G
option).

I would rather suspect a change between 3.6 and 3.7, but, as I
said, my knowledge of the whole matter is less than poor...
	
thank you very much in advance for any help

ciao
gabriele
Comment 5 gabriele balducci 2013-01-02 10:55:39 UTC
For the record:

just for the fun of it, I have also tried this one:

   https://bugs.freedesktop.org/attachment.cgi?id=72201&action=edit

without any success, though (behavior exactly the same)

So: up to now, the only workaround is the mem=2G kernel opt; but that
means running with half of the total available memory...


ciao
gabriele
Comment 6 Marcin Slusarz 2013-01-02 13:54:28 UTC
Please attach dmesg from 3.6 kernel.

2 things to try:
1) Dealing with this:
[    0.000000] No AGP bridge found
[    0.000000] Node 0: aperture @ c4000000 size 32 MB
[    0.000000] Aperture pointing to e820 RAM. Ignoring.
[    0.000000] Your BIOS doesn't leave a aperture memory hole
[    0.000000] Please enable the IOMMU option in the BIOS setup
[    0.000000] This costs you 64 MB of RAM
[    0.000000] Mapping aperture over 65536 KB of RAM @ c4000000

2) Booting with nouveau.vram_pushbuf=1
Comment 7 gabriele balducci 2013-01-03 14:00:17 UTC
Thanks a lot.

1) I cannot find (or I am not able to recognize) any option related to
   AGP/IOMMU in my BIOS menu.
   Following the advice in
   http://www.kernel.org/doc/Documentation/x86/x86_64/boot-options.txt,
   I have booted with iommu=soft, which apparently makes the IOMMU
   complaint go away from dmesg, but I don't know if this is really a
   solution...
   In any case, booting with (only) iommu=soft does not make any
   difference: i.e. the machine freezes as above

2) The nouveau.vram_pushbuf=1 option allows me to boot fine (with or
   without the iommu=soft option above)!
   However, now X11 has problems.
   After starting X11, text in menus (e.g. emacs, firefox) and in
   window borders is rendered as unreadable black rectangles; text in
   windows (e.g. xterm text lines, text in emacs buffers etc) is fine, though.
   This problem goes away if I boot into X11 with
                    Option  "Accel"   "Off"
   i.e.: if I switch acceleration off, everything works nicely (but,
   of course, I have no acceleration)

I have attached dmesg for kernel 3.6.10 (which works perfectly), the
new dmesg for kernel 3.7.1 with iommu=soft and nouveau.vram_pushbuf=1,
my xorg.conf and Xorg.0.log for kernel 3.6.10 and 3.7.1 (with/without
accel).

I'll be happy to send any other information which might help to
clarify this problem

thank you very much again

ciao
gabriele
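
Since several boot options are being combined here, it may help to confirm which ones the running kernel actually received. The following is only a minimal sketch, assuming a POSIX shell: the helper name `has_param` is made up for illustration, while `/proc/cmdline` is the standard place where Linux exposes the boot command line.

```shell
# has_param PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word in CMDLINE.
# CMDLINE defaults to the running kernel's command line (/proc/cmdline).
# The function name is hypothetical; it is not part of any existing tool.
has_param() {
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report the state of the two options tried in this comment.
for p in iommu=soft nouveau.vram_pushbuf=1; do
    if has_param "$p"; then
        echo "$p: set"
    else
        echo "$p: not set"
    fi
done
```

Matching on whole words avoids a false positive when one option name is a suffix of another.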
Comment 8 gabriele balducci 2013-01-03 14:02:28 UTC
Created attachment 72452 [details]
dmesg for kernel-3.6.10
Comment 9 gabriele balducci 2013-01-03 14:19:50 UTC
Created attachment 72453 [details]
dmesg for kernel-3.7.1 with iommu=soft nouveau.vram_pushbuf=1
Comment 10 gabriele balducci 2013-01-03 14:24:48 UTC
Created attachment 72454 [details]
xorg.conf
Comment 11 gabriele balducci 2013-01-03 14:27:26 UTC
Created attachment 72455 [details]
Xorg.0.log for kernel-3.6.10
Comment 12 gabriele balducci 2013-01-03 14:30:19 UTC
Created attachment 72456 [details]
Xorg.0.log for kernel-3.7.1 without acceleration
Comment 13 gabriele balducci 2013-01-03 14:32:12 UTC
Created attachment 72457 [details]
Xorg.0.log with acceleration
Comment 14 Marcin Slusarz 2013-01-03 15:23:16 UTC
(For some reason the GPU reads garbage from GART. With vram_pushbuf=1 we move the main push buffer from GART to VRAM, so it at least starts. But we really need GART.)

Can you bisect it? To speed it up you can use drivers/gpu/drm/nouveau/ as bisect path.
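
A path-limited bisect between the two releases could be set up roughly as follows. This is only a sketch: the tags `v3.6` and `v3.7` are assumed as the good and bad endpoints based on the report above, and the `run` and `bisect_nouveau` helpers are hypothetical names. By default the script only prints the commands; set `DRY_RUN=0` inside a kernel git checkout to actually execute them.

```shell
# run CMD...: print the command; execute it only when DRY_RUN=0.
# Hypothetical helper so the sketch is safe to run outside a kernel tree.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = 1 ] || "$@"
}

# bisect_nouveau GOOD BAD: start a bisect restricted to the nouveau driver.
# Limiting the path lets git skip commits that do not touch that directory.
bisect_nouveau() {
    run git bisect start -- drivers/gpu/drm/nouveau/
    run git bisect bad "$2"
    run git bisect good "$1"
}

# Assumed endpoints: 3.6 boots fine, 3.7 freezes.
bisect_nouveau v3.6 v3.7
```

After each step, build and boot the kernel git checked out, then mark it with `git bisect good` or `git bisect bad` until git names the first bad commit; `git bisect reset` returns to the original checkout.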
Comment 15 gabriele balducci 2013-01-07 11:57:58 UTC
hi there,

sorry for the delay (busy time at work and first time doing a git
bisect).

Here is the final result of the bisection:

---8<-------8<-------8<----

e5f186c4f9812eccbc291da6dfe8b15da546f961 is the first bad commit
commit e5f186c4f9812eccbc291da6dfe8b15da546f961
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Thu Sep 27 08:55:29 2012 +1000

    drm/nv44/vm: fix and enable use of "real" pciegart
    
    Something seems to be missing in regards to flushing specific ranges of
    the TLB.  For the moment, flushing the entire thing seems to make it
    work alright.
    
    Should give 39-bit DMA addressing on the relevant chipsets.
    
    v2: allocate contig 16KiB for dummy pages, reported by mwk on irc
    
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

:040000 040000 7dd13fec39439ac29561ada8b22f6d2666f46ff5 cf817ecb40b80ebb8446457ea4420005438ebd8e M      drivers

---8<-------8<-------8<----

Don't hesitate to ask me for any other information that might be
useful.

I thank you very much for your assistance

ciao
gabriele
Comment 16 Marcin Slusarz 2013-01-07 17:24:04 UTC
Lovely, the fix for bug 46557 breaks your card.
I guess booting with nouveau.config=NvPCIE=0 works?

CC'ing Ben.
Comment 17 gabriele balducci 2013-01-07 18:03:33 UTC
(In reply to comment #16)
> Lovely, the fix for bug 46557 breaks your card.
> I guess booting with nouveau.config=NvPCIE=0 works?
> 

yep!

With nouveau.config=NvPCIE=0 I can boot 3.7.1 without problems, and X11 also works fine (with acceleration, as far as I can tell).

thanks a lot

I'll be happy to run any other tests and/or gather any other information if needed

ciao
gabriele
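
If nouveau is built as a module, the same option can presumably be made persistent through a modprobe configuration file instead of the boot loader. This is a sketch, assuming a POSIX shell; the helper name and the target path are illustrative, and the line only takes effect for a modular driver (an initramfs that contains nouveau may need regenerating).

```shell
# write_nouveau_conf FILE: write a modprobe "options" line that passes
# config=NvPCIE=0 to the nouveau module (the module-parameter spelling of
# nouveau.config=NvPCIE=0 on the kernel command line).
# Hypothetical helper; FILE would typically be /etc/modprobe.d/nouveau.conf.
write_nouveau_conf() {
    printf 'options nouveau config=NvPCIE=0\n' > "$1"
}

# Demonstrate on a scratch file rather than touching /etc.
tmp=$(mktemp)
write_nouveau_conf "$tmp"
cat "$tmp"
rm -f "$tmp"
```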
Comment 18 Ilia Mirkin 2013-08-20 01:44:33 UTC
I'd be rather curious to know if this issue has fixed itself, although I would guess not. Would you mind checking 3.10 or 3.11-rc6?
Comment 19 gabriele balducci 2013-08-20 06:09:57 UTC
Thanks for your interest

Unfortunately, that machine has died since my report and I am now using different hardware...

thanks again
ciao
gabriele
Comment 20 Ilia Mirkin 2013-08-20 14:15:26 UTC
OK, closing as invalid since you can no longer test patches.

