Summary: | Kernel Panic on Linux Kernel 4.4 when loading KDE/KDM on Nvidia GeForce 7025 / nForce 630a | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | xorg | Reporter: | xgfwtvdh | ||||||
Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau> | ||||||
Status: | RESOLVED MOVED | QA Contact: | Xorg Project Team <xorg-team> | ||||||
Severity: | blocker | ||||||||
Priority: | medium | CC: | mirh | ||||||
Version: | unspecified | ||||||||
Hardware: | x86-64 (AMD64) | ||||||||
OS: | Linux (All) | ||||||||
Description
xgfwtvdh
2016-01-02 00:00:36 UTC
Reading symbols from /lib/modules/4.4.0-rc6-amd64/kernel/drivers/gpu/drm/nouveau/nouveau.ko...(no debugging symbols found)...done.
(gdb) disassemble nv40_gr_intr
Dump of assembler code for function nv40_gr_intr:
0x00000000000712c0 <+0>: callq 0x712c5 <nv40_gr_intr+5>
0x00000000000712c5 <+5>: push %r15
0x00000000000712c7 <+7>: push %r14
0x00000000000712c9 <+9>: push %r13
0x00000000000712cb <+11>: push %r12
0x00000000000712cd <+13>: push %rbp
0x00000000000712ce <+14>: push %rbx
0x00000000000712cf <+15>: mov %rdi,%rbx
0x00000000000712d2 <+18>: lea 0x58(%rbx),%r13
0x00000000000712d6 <+22>: sub $0x208,%rsp
0x00000000000712dd <+29>: mov 0x18(%rdi),%r14
0x00000000000712e1 <+33>: mov %gs:0x28,%rax
0x00000000000712ea <+42>: mov %rax,0x200(%rsp)
0x00000000000712f2 <+50>: xor %eax,%eax
0x00000000000712f4 <+52>: mov 0x80(%r14),%rax
0x00000000000712fb <+59>: lea 0x400100(%rax),%rdi
0x0000000000071302 <+66>: callq 0x71307 <nv40_gr_intr+71>
0x0000000000071307 <+71>: mov %eax,%r12d
0x000000000007130a <+74>: mov 0x80(%r14),%rax
0x0000000000071311 <+81>: lea 0x400108(%rax),%rdi
0x0000000000071318 <+88>: callq 0x7131d <nv40_gr_intr+93>
0x000000000007131d <+93>: mov %eax,0x58(%rsp)
0x0000000000071321 <+97>: mov 0x80(%r14),%rax
0x0000000000071328 <+104>: lea 0x400104(%rax),%rdi
0x000000000007132f <+111>: callq 0x71334 <nv40_gr_intr+116>
0x0000000000071334 <+116>: mov %eax,0x5c(%rsp)
0x0000000000071338 <+120>: mov 0x80(%r14),%rax
0x000000000007133f <+127>: lea 0x40032c(%rax),%rdi
0x0000000000071346 <+134>: callq 0x7134b <nv40_gr_intr+139>
0x000000000007134b <+139>: and $0xfffff,%eax
0x0000000000071350 <+144>: mov %eax,%r15d
0x0000000000071353 <+147>: mov 0x80(%r14),%rax
0x000000000007135a <+154>: lea 0x400704(%rax),%rdi
0x0000000000071361 <+161>: callq 0x71366 <nv40_gr_intr+166>
0x0000000000071366 <+166>: mov %eax,0x60(%rsp)
0x000000000007136a <+170>: mov %eax,%ebp
0x000000000007136c <+172>: mov 0x80(%r14),%rax
0x0000000000071373 <+179>: and $0x70000,%ebp
0x0000000000071379 <+185>: shr $0x10,%ebp
0x000000000007137c <+188>: lea 0x400708(%rax),%rdi
0x0000000000071383 <+195>: mov %ebp,0x64(%rsp)
0x0000000000071387 <+199>: callq 0x7138c <nv40_gr_intr+204>
0x000000000007138c <+204>: lea 0x400160(,%rbp,4),%edi
0x0000000000071393 <+211>: add 0x80(%r14),%rdi
0x000000000007139a <+218>: mov %eax,0x68(%rsp)
0x000000000007139e <+222>: callq 0x713a3 <nv40_gr_intr+227>
0x00000000000713a3 <+227>: mov %r13,%rdi
0x00000000000713a6 <+230>: mov %eax,0x6c(%rsp)
0x00000000000713aa <+234>: callq 0x713af <nv40_gr_intr+239>
0x00000000000713af <+239>: mov %rax,0x50(%rsp)
0x00000000000713b4 <+244>: mov 0x68(%rbx),%rax
0x00000000000713b8 <+248>: lea 0x68(%rbx),%rsi
0x00000000000713bc <+252>: cmp %rsi,%rax
0x00000000000713bf <+255>: je 0x713fd <nv40_gr_intr+317>
0x00000000000713c1 <+257>: mov -0x8(%rax),%edx
0x00000000000713c4 <+260>: lea -0x88(%rax),%rbp
0x00000000000713cb <+267>: shr $0x4,%edx
0x00000000000713ce <+270>: cmp %edx,%r15d
0x00000000000713d1 <+273>: jne 0x713e7 <nv40_gr_intr+295>
0x00000000000713d3 <+275>: jmpq 0x715c2 <nv40_gr_intr+770>
0x00000000000713d8 <+280>: mov -0x8(%rdx),%edx
0x00000000000713db <+283>: shr $0x4,%edx
0x00000000000713de <+286>: cmp %edx,%r15d
0x00000000000713e1 <+289>: je 0x715c2 <nv40_gr_intr+770>
0x00000000000713e7 <+295>: mov 0x88(%rbp),%rdx
0x00000000000713ee <+302>: cmp %rsi,%rdx
0x00000000000713f1 <+305>: lea -0x88(%rdx),%rbp
0x00000000000713f8 <+312>: mov %rdx,%rax
0x00000000000713fb <+315>: jne 0x713d8 <nv40_gr_intr+280>
0x00000000000713fd <+317>: xor %ebp,%ebp
0x00000000000713ff <+319>: test $0x100000,%r12d
0x0000000000071406 <+326>: je 0x7158a <nv40_gr_intr+714>
0x000000000007140c <+332>: testl $0x10000,0x58(%rsp)
0x0000000000071414 <+340>: jne 0x715ef <nv40_gr_intr+815>
0x000000000007141a <+346>: mov 0x80(%r14),%rax
0x0000000000071421 <+353>: mov %r12d,%edi
0x0000000000071424 <+356>: lea 0x400100(%rax),%rsi
0x000000000007142b <+363>: callq 0x71430 <nv40_gr_intr+368>
0x0000000000071430 <+368>: mov 0x80(%r14),%rsi
0x0000000000071437 <+375>: mov $0x1,%edi
0x000000000007143c <+380>: add $0x400720,%rsi
0x0000000000071443 <+387>: callq 0x71448 <nv40_gr_intr+392>
0x0000000000071448 <+392>: lea 0x80(%rsp),%rdi
0x0000000000071450 <+400>: mov %r12d,%ecx
0x0000000000071453 <+403>: mov $0x0,%rdx
0x000000000007145a <+410>: mov $0x80,%esi
0x000000000007145f <+415>: callq 0x71464 <nv40_gr_intr+420>
0x0000000000071464 <+420>: mov 0x58(%rsp),%ecx
0x0000000000071468 <+424>: lea 0x100(%rsp),%rdi
0x0000000000071470 <+432>: mov $0x0,%rdx
0x0000000000071477 <+439>: mov $0x80,%esi
0x000000000007147c <+444>: callq 0x71481 <nv40_gr_intr+449>
0x0000000000071481 <+449>: mov 0x5c(%rsp),%ecx
0x0000000000071485 <+453>: lea 0x180(%rsp),%rdi
0x000000000007148d <+461>: mov $0x0,%rdx
0x0000000000071494 <+468>: mov $0x80,%esi
0x0000000000071499 <+473>: callq 0x7149e <nv40_gr_intr+478>
0x000000000007149e <+478>: mov 0x50(%rbx),%eax
0x00000000000714a1 <+481>: test %eax,%eax
0x00000000000714a3 <+483>: je 0x71554 <nv40_gr_intr+660>
0x00000000000714a9 <+489>: test %rbp,%rbp
0x00000000000714ac <+492>: je 0x7161c <nv40_gr_intr+860>
0x00000000000714b2 <+498>: mov 0x78(%rbp),%rdx
0x00000000000714b6 <+502>: mov %r15d,%r9d
0x00000000000714b9 <+505>: shl $0x4,%r9d
0x00000000000714bd <+509>: mov 0x20(%rdx),%rax
0x00000000000714c1 <+513>: movzwl 0x98(%rdx),%esi
0x00000000000714c8 <+520>: lea 0x70(%rax),%rcx
0x00000000000714cc <+524>: mov 0x20(%rbx),%eax
0x00000000000714cf <+527>: mov 0x60(%rsp),%r14d
0x00000000000714d4 <+532>: lea 0x80(%rsp),%r8
0x00000000000714dc <+540>: and $0x1ffc,%r14d
0x00000000000714e3 <+547>: mov 0x0(,%rax,8),%rdx
0x00000000000714eb <+555>: mov 0x18(%rbx),%rax
0x00000000000714ef <+559>: mov 0x10(%rax),%rdi
0x00000000000714f3 <+563>: mov 0x68(%rsp),%eax
0x00000000000714f7 <+567>: mov %r9d,0x20(%rsp)
0x00000000000714fc <+572>: mov 0x58(%rsp),%r9d
0x0000000000071501 <+577>: mov %rcx,0x28(%rsp)
0x0000000000071506 <+582>: mov %esi,0x18(%rsp)
0x000000000007150a <+586>: mov %r12d,%ecx
0x000000000007150d <+589>: mov %eax,0x48(%rsp)
0x0000000000071511 <+593>: movzwl 0x6c(%rsp),%eax
0x0000000000071516 <+598>: mov $0x0,%rsi
0x000000000007151d <+605>: mov %r14d,0x40(%rsp)
0x0000000000071522 <+610>: mov %eax,0x38(%rsp)
0x0000000000071526 <+614>: mov 0x64(%rsp),%eax
0x000000000007152a <+618>: mov %eax,0x30(%rsp)
0x000000000007152e <+622>: lea 0x180(%rsp),%rax
0x0000000000071536 <+630>: mov %rax,0x10(%rsp)
0x000000000007153b <+635>: mov 0x5c(%rsp),%eax
0x000000000007153f <+639>: mov %eax,0x8(%rsp)
0x0000000000071543 <+643>: lea 0x100(%rsp),%rax
0x000000000007154b <+651>: mov %rax,(%rsp)
0x000000000007154f <+655>: callq 0x71554 <nv40_gr_intr+660>
0x0000000000071554 <+660>: mov 0x50(%rsp),%rsi
0x0000000000071559 <+665>: mov %r13,%rdi
0x000000000007155c <+668>: callq 0x71561 <nv40_gr_intr+673>
0x0000000000071561 <+673>: mov 0x200(%rsp),%rax
0x0000000000071569 <+681>: xor %gs:0x28,%rax
0x0000000000071572 <+690>: jne 0x71634 <nv40_gr_intr+884>
0x0000000000071578 <+696>: add $0x208,%rsp
0x000000000007157f <+703>: pop %rbx
0x0000000000071580 <+704>: pop %rbp
0x0000000000071581 <+705>: pop %r12
0x0000000000071583 <+707>: pop %r13
0x0000000000071585 <+709>: pop %r14
0x0000000000071587 <+711>: pop %r15
0x0000000000071589 <+713>: retq
0x000000000007158a <+714>: mov 0x80(%r14),%rax
0x0000000000071591 <+721>: mov %r12d,%edi
0x0000000000071594 <+724>: lea 0x400100(%rax),%rsi
0x000000000007159b <+731>: callq 0x715a0 <nv40_gr_intr+736>
0x00000000000715a0 <+736>: mov 0x80(%r14),%rsi
0x00000000000715a7 <+743>: mov $0x1,%edi
0x00000000000715ac <+748>: add $0x400720,%rsi
0x00000000000715b3 <+755>: callq 0x715b8 <nv40_gr_intr+760>
0x00000000000715b8 <+760>: test %r12d,%r12d
0x00000000000715bb <+763>: je 0x71554 <nv40_gr_intr+660>
0x00000000000715bd <+765>: jmpq 0x71448 <nv40_gr_intr+392>
0x00000000000715c2 <+770>: mov %rax,%rdi
0x00000000000715c5 <+773>: mov %rsi,0x78(%rsp)
0x00000000000715ca <+778>: mov %rax,0x70(%rsp)
0x00000000000715cf <+783>: callq 0x715d4 <nv40_gr_intr+788>
0x00000000000715d4 <+788>: mov 0x70(%rsp),%rax
0x00000000000715d9 <+793>: mov 0x68(%rbx),%rdx
0x00000000000715dd <+797>: mov 0x78(%rsp),%rsi
0x00000000000715e2 <+802>: mov %rax,%rdi
0x00000000000715e5 <+805>: callq 0x715ea <nv40_gr_intr+810>
0x00000000000715ea <+810>: jmpq 0x713ff <nv40_gr_intr+319>
0x00000000000715ef <+815>: mov 0x80(%r14),%rax
0x00000000000715f6 <+822>: lea 0x402000(%rax),%rdi
0x00000000000715fd <+829>: callq 0x71602 <nv40_gr_intr+834>
0x0000000000071602 <+834>: mov 0x80(%r14),%rcx
0x0000000000071609 <+841>: mov %eax,%edi
0x000000000007160b <+843>: lea 0x402000(%rcx),%rsi
0x0000000000071612 <+850>: callq 0x71617 <nv40_gr_intr+855>
0x0000000000071617 <+855>: jmpq 0x7141a <nv40_gr_intr+346>
0x000000000007161c <+860>: mov %r15d,%r9d
0x000000000007161f <+863>: mov $0x0,%rcx
0x0000000000071626 <+870>: mov $0xffffffff,%esi
0x000000000007162b <+875>: shl $0x4,%r9d
0x000000000007162f <+879>: jmpq 0x714cc <nv40_gr_intr+524>
0x0000000000071634 <+884>: callq 0x71639
End of assembler dump.

This commit should fix the oops: https://github.com/skeggsb/nouveau/commit/b09b9c5b0f84a84d6ef99d999428042cbab93473

The bug is half solved!
I compiled a new kernel image. I no longer get the kernel panic, but dmesg is still full of errors like:

[ 1059.265197] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b020
[ 1059.265516] nouveau 0000:00:0d.0: bus: MMIO write of 010d0001 FAULT at 00b010
[ 1060.724984] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
[ 1060.847845] nouveau 0000:00:0d.0: bus: MMIO write of 010a0001 FAULT at 00b010
[ 1060.848340] nouveau 0000:00:0d.0: bus: MMIO write of 010c0001 FAULT at 00b020
[ 1061.073132] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
[ 1061.073350] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b020
[ 1061.265858] nouveau 0000:00:0d.0: bus: MMIO write of 01890001 FAULT at 00b010
[ 1061.266331] nouveau 0000:00:0d.0: bus: MMIO write of 018c0001 FAULT at 00b020
[ 1061.266954] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
[ 1068.679841] nouveau 0000:00:0d.0: bus: MMIO write of 01890001 FAULT at 00b010
[ 1068.691605] nouveau 0000:00:0d.0: bus: MMIO write of 018b0001 FAULT at 00b030
[ 1069.007856] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
[ 1069.008076] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b030
[ 1069.009128] nouveau 0000:00:0d.0: bus: MMIO write of 04200001 FAULT at 00b010
[ 1071.267098] nouveau 0000:00:0d.0: bus: MMIO write of 01890001 FAULT at 00b030
[ 1071.267686] nouveau 0000:00:0d.0: bus: MMIO write of 018b0001 FAULT at 00b040
[ 1075.811130] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b030
[ 1075.811472] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b040

Could you quickly push the existing fix for the kernel panic into kernel 4.4 before it is released as stable?

(In reply to xgfwtvdh from comment #3)
> The bug is half solved!
> I compiled a new kernel image. I no longer get the kernel panic, but dmesg is still full of errors like:
>
> [ 1059.265197] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b020
> [ 1059.265516] nouveau 0000:00:0d.0: bus: MMIO write of 010d0001 FAULT at 00b010
> [ 1060.724984] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
> [ 1060.847845] nouveau 0000:00:0d.0: bus: MMIO write of 010a0001 FAULT at 00b010
> [ 1060.848340] nouveau 0000:00:0d.0: bus: MMIO write of 010c0001 FAULT at 00b020
> [ 1061.073132] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
> [ 1061.073350] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b020
> [ 1061.265858] nouveau 0000:00:0d.0: bus: MMIO write of 01890001 FAULT at 00b010
> [ 1061.266331] nouveau 0000:00:0d.0: bus: MMIO write of 018c0001 FAULT at 00b020
> [ 1061.266954] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
> [ 1068.679841] nouveau 0000:00:0d.0: bus: MMIO write of 01890001 FAULT at 00b010
> [ 1068.691605] nouveau 0000:00:0d.0: bus: MMIO write of 018b0001 FAULT at 00b030
> [ 1069.007856] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b010
> [ 1069.008076] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b030
> [ 1069.009128] nouveau 0000:00:0d.0: bus: MMIO write of 04200001 FAULT at 00b010
> [ 1071.267098] nouveau 0000:00:0d.0: bus: MMIO write of 01890001 FAULT at 00b030
> [ 1071.267686] nouveau 0000:00:0d.0: bus: MMIO write of 018b0001 FAULT at 00b040
> [ 1075.811130] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b030
> [ 1075.811472] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b040

These are unrelated and should be nothing to worry about.

They are REDLINED in dmesg. A bigger warning to the user is not possible.
It is like a big red flashing "WARNING, MOTOR FAILURE" on the screen of your car, and then someone saying "nah, that warning is just fine, ignore it".

I would like to help fix them by testing whatever you need tested.

(In reply to xgfwtvdh from comment #5)
> They are REDLINED in dmesg. A bigger warning to the user is not possible.
> It is like a big red flashing "WARNING, MOTOR FAILURE" on the screen of your car, and then someone saying "nah, that warning is just fine, ignore it".
>
> I would like to help fix them by testing whatever you need tested.

If you can get a mmiotrace[1] of the binary driver, that might be useful.

Thanks,
Ben.

[1] http://nouveau.freedesktop.org/wiki/MmioTrace/

Created attachment 121076
mmiotrace nvidia 304.131 driver on ubuntu 14.04.3
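
For anyone comparing logs across kernels, the `MMIO write of ... FAULT at ...` lines quoted in the comments above can be tallied per register address. The following is a minimal sketch, not part of the original report: the regular expression is modelled only on the lines quoted here (the `read` alternative is an assumption about how read faults would be worded), and `tally_mmio_faults` and `SAMPLE` are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the dmesg lines quoted in this report. Other kernel
# versions may word the message differently (assumption), and the "read"
# alternative is a guess at how read faults would be reported.
FAULT_RE = re.compile(
    r"nouveau (?P<dev>\S+): bus: MMIO (?P<op>read|write) of "
    r"(?P<value>[0-9a-fA-F]{8}) FAULT at (?P<addr>[0-9a-fA-F]+)"
)


def tally_mmio_faults(log_text):
    """Count MMIO faults per (operation, register address) pair."""
    counts = Counter()
    for match in FAULT_RE.finditer(log_text):
        counts[(match["op"], match["addr"].lower())] += 1
    return counts


# A few lines copied from the report, used as sample input.
SAMPLE = """\
[ 1059.265197] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b020
[ 1059.265516] nouveau 0000:00:0d.0: bus: MMIO write of 010d0001 FAULT at 00b010
[ 1068.691605] nouveau 0000:00:0d.0: bus: MMIO write of 018b0001 FAULT at 00b030
[ 1075.811472] nouveau 0000:00:0d.0: bus: MMIO write of 00000000 FAULT at 00b040
"""

if __name__ == "__main__":
    for (op, addr), count in sorted(tally_mmio_faults(SAMPLE).items()):
        print(f"{op} 0x{addr}: {count}")
```

Feeding it the full output of `dmesg` instead of `SAMPLE` would show which addresses fault and how often. In the excerpt quoted in this report, every fault is a write to 00b010, 00b020, 00b030, or 00b040.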
System: Arch Linux with linux 4.8.4 and xf86-video-nouveau 1.0.13. Same problem as above. The kernel crashes after spamming MMIO write fault errors. Interestingly, the problem goes away with the kernel parameter "nouveau.noaccel=1". Of course, this means no 2D acceleration. It took me one day to find out how to create the MMIO trace. It would be great to hear whether it was useful.

Created attachment 140859
One taken with steps on the ubuntu page, the other with those from kernel docs
My traces, from Manjaro with kernel 4.14 and nvidia 304.137.
Mesa 18.1 indeed isn't causing kernel panics, but that MMIO fault error is everywhere.
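
An earlier comment reports that the crash goes away with the kernel parameter "nouveau.noaccel=1". When comparing reports, it can help to confirm whether a given boot actually had that parameter set. The sketch below checks a kernel command line string for it. The helper names are hypothetical, and it inspects only the boot command line (readable on Linux from `/proc/cmdline`), so a parameter set through module options instead would not be detected.

```python
def nouveau_accel_disabled(cmdline):
    """Return True if the kernel command line contains nouveau.noaccel=1."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "nouveau.noaccel" and value == "1":
            return True
    return False


def read_kernel_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        state = nouveau_accel_disabled(read_kernel_cmdline())
    except OSError:
        print("could not read the kernel command line")
    else:
        print("nouveau.noaccel=1 set on the command line:", state)
```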
Can some developer please finally help with fixing this?

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/246.