Summary: | segfault in nouveau_fence_update | ||
---|---|---|---|
Product: | Mesa | Reporter: | Brian J. Murrell <brian> |
Component: | Drivers/DRI/nouveau | Assignee: | Nouveau Project <nouveau> |
Status: | RESOLVED INVALID | QA Contact: | |
Severity: | blocker | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Attachments: | glxinfo output |
Description
Brian J. Murrell
2012-03-14 06:03:28 UTC
Which Mesa version are you using? (The OpenGL renderer string in glxinfo's output would tell.) screen being NULL in fence_update shouldn't be possible: there's an explicit NULL check in flush_notify, at least in recent versions.

Created attachment 58435
glxinfo output

I have attached the full output of glxinfo to answer the previous question.
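
The guard mentioned in the first reply can be pictured roughly as follows. This is only a sketch under stated assumptions: the `sketch_*` types and functions are hypothetical stand-ins, not Mesa's actual definitions, and the real flush_notify code may look quite different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's screen object (not Mesa's struct). */
struct sketch_screen {
    unsigned updates; /* counts how often the fence list was processed */
};

/* Hypothetical stand-in for the channel; user_private points back at the
 * owning screen and is assumed to be cleared when the screen goes away. */
struct sketch_channel {
    void *user_private;
};

/* Stand-in for the fence update: dereferences screen unconditionally,
 * so a NULL screen would crash here. */
static void sketch_fence_update(struct sketch_screen *screen, bool flushed)
{
    (void)flushed;
    screen->updates++;
}

/* Flush callback with the guard: skip the fence update when the channel
 * no longer has a screen attached. Returns true if fences were updated. */
static bool sketch_flush_notify(struct sketch_channel *chan)
{
    struct sketch_screen *screen = chan->user_private;

    if (!screen)
        return false;

    sketch_fence_update(screen, true);
    return true;
}
```

A guard like this only catches a NULL pointer. A non-NULL pointer that no longer refers to valid memory passes the check and still faults inside the update function.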

FYI: I'm getting a similar kernel oops under openSUSE 12.1 with kernel 3.1.9, but not with kernel 3.3.0-rc6:

[ 3563.874043] BUG: unable to handle kernel NULL pointer dereference at 00000001
[ 3563.874073] IP: [<f80f38b9>] nouveau_fence_update+0x9/0xd0 [nouveau]
[ 3563.874112] *pdpt = 000000002e22c001 *pde = 0000000000000000
[ 3563.874122] Oops: 0000 [#1] PREEMPT SMP
[ 3563.874133] Modules linked in: nls_utf8 loop ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit af_packet ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables cpufreq_conservative cpufreq_userspace cpufreq_powersave powernow_k8 mperf fuse arc4 tuner_simple tuner_types tda9887 snd_hda_codec_analog tda8290 tuner usblp rtl8187 mac80211 ir_lirc_codec lirc_dev snd_usb_audio snd_usbmidi_lib ir_mce_kbd_decoder ir_sony_decoder hdj_mod msp3400 snd_rawmidi snd_seq_device ir_jvc_decoder gspca_sn9c20x gspca_main ir_rc6_decoder cfg80211 bttv snd_hda_intel snd_hda_codec ir_rc5_decoder ir_nec_decoder snd_hwdep rfkill snd_pcm eeprom_93cx6 videobuf_dma_sg videobuf_core btcx_risc rc_core tveeprom snd_timer snd i2c_nforce2 v4l2_common sg sr_mod cdrom firewire_ohci firewire_core forcedeth videodev soundcore ppdev floppy crc_itu_t pcspkr snd_page_alloc k8temp parport_pc parport asus_atk0110 autofs4 nouveau ttm drm_kms_helper drm i2c_algo_bit mxm_wmi wmi video fan button thermal processor thermal_sys ata_generic pata_amd pata_jmicron sata_nv
[ 3563.874386] Pid: 2357, comm: kwin Not tainted 3.1.9-1.4-desktop #1 System manufacturer System Product Name/M2N-VM DH
[ 3563.874403] EIP: 0060:[<f80f38b9>] EFLAGS: 00210282 CPU: 0
[ 3563.874419] EIP is at nouveau_fence_update+0x9/0xd0 [nouveau]
[ 3563.874427] EAX: 00000001 EBX: d8b4acc0 ECX: f2e02ac0 EDX: 00000001
[ 3563.874434] ESI: 000f4240 EDI: 00000001 EBP: 00000001 ESP: ee277d64
[ 3563.874441] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 3563.874448] Process kwin (pid: 2357, ti=ee276000 task=f6f74cf0 task.ti=ee276000)
[ 3563.874456] Stack:
[ 3563.874460] 00000000 00000000 d8b4acc0 000f4240 f6f74cf0 00000001 f80f3c9f 00000000
[ 3563.874479] f80f3cf4 d8b4acc0 00000000 0031d92a f6f74cf0 000f4240 00000000 d8b4acc0
[ 3563.874498] d9a8abe0 ee38b400 00000002 f80f3e0e 00000000 f051b400 dc8d8e00 ebcdb680
[ 3563.874517] Call Trace:
[ 3563.874552] [<f80f3c9f>] __nouveau_fence_signalled+0x1f/0x30 [nouveau]
[ 3563.874584] [<f80f3cf4>] __nouveau_fence_wait+0x44/0xd0 [nouveau]
[ 3563.874616] [<f80f3e0e>] nouveau_fence_sync+0x8e/0xf0 [nouveau]
[ 3563.874648] [<f80f4416>] validate_list+0xd6/0x310 [nouveau]
[ 3563.874680] [<f80f4ba3>] nouveau_gem_pushbuf_validate+0xf3/0x220 [nouveau]
[ 3563.874714] [<f80f55b5>] nouveau_gem_ioctl_pushbuf+0x195/0x870 [nouveau]
[ 3563.874753] [<f7fe1d5c>] drm_ioctl+0x2dc/0x380 [drm]
[ 3563.874770] [<c033a3ee>] do_vfs_ioctl+0x7e/0x2a0
[ 3563.874781] [<c033a68e>] sys_ioctl+0x7e/0x90
[ 3563.874790] [<c070a66d>] syscall_call+0x7/0xb
[ 3563.874801] [<b756c114>] 0xb756c113
[ 3563.874806] Code: 04 89 bb f4 01 00 00 8b 1c 24 8b 7c 24 08 83 c4 0c e9 3c ec ff ff 8d b6 00 00 00 00 8d bf 00 00 00 00 55 57 89 c7 56 53 83 ec 08 <8b> 18 8d 40 40 89 44 24 04 e8 f9 68 61 c8 8d 47 44 3b 47 44 89
[ 3563.875011] EIP: [<f80f38b9>] nouveau_fence_update+0x9/0xd0 [nouveau] SS:ESP 0068:ee277d64
[ 3563.875011] CR2: 0000000000000001
[ 3563.885383] ---[ end trace 0be1ac3389c89176 ]---

Hardware is GeForce 6150SE nForce 430, Xorg version is 7.6, Mesa version is 7.11.

(In reply to comment #3)
> FYI: I'm getting a similar kernel oops

I'm not getting an oops in the kernel. I'm getting a segfault in userspace.

> under openSUSE 12.1 with kernel 3.1.9,
> but not with kernel 3.3.0-rc6:

I'm no expert but I don't think your issue is the same as mine.
In fact, searching for my issue I did run into lots of oopses in the kernel nouveau_fence_update() function but AFAIU, what I am running into is a segfault in a userspace nouveau_fence_update() function.

Hello Gents

Brian
I believe your issue was addressed previously in bug 43428. Can you please confirm with your distribution whether its mesa has the patches mentioned?

Frank
Can you take a look at bug 38931? If you believe it's that "fix" for the issue, it should be possible to "backport"/cherry-pick it.

Regards
Emil

(In reply to comment #5)
> Hello Gents

Hi Emil,

> I believe your issue was addressed previously in bug 43428, can you please
> confirm with your distribution if mesa has the patches mentioned?

That patch was not part of Ubuntu (Oneiric)'s distribution so I have applied it and built the packages and installed them. I got a similar segfault though, this time with screen not being null:

Thread 1 (Thread 0xb1688720 (LWP 20539)):
#0 nouveau_fence_update (screen=0x4071d000, flushed=1 '\001') at nouveau_fence.c:141
        fence = <optimized out>
        next = 0x0
        sequence = <optimized out>
#1 0xaae1dc77 in nv50_default_flush_notify (chan=0x8c62a38) at nv50_context.c:68
        nv50 = 0x8d11158
#2 0xad196f50 in nouveau_pushbuf_flush (chan=0x8c62a38, min=0) at ../../nouveau/nouveau_pushbuf.c:276
        nvdev = <optimized out>
        nvchan = 0x8c62a38
        nvpb = 0x8c62b00
        req = {channel = 6, nr_buffers = 11, buffers = 147217232, nr_relocs = 17, nr_push = 1, relocs = 147258200, push = 147204932, suffix0 = 0, suffix1 = 0, vram_available = 243978240, gart_available = 536383488}
        i = <optimized out>
        ret = 0
        __PRETTY_FUNCTION__ = "nouveau_pushbuf_flush"
#3 0xaae1dc0e in FIRE_RING (chan=<optimized out>) at /usr/include/nouveau/nouveau_pushbuf.h:101
No locals.
#4 nv50_flush (pipe=0xb014c500, fence=0x0) at nv50_context.c:46
        screen = 0x8c62758
#5 0xaa973951 in st_flush (st=0xb01fb780, fence=0x0) at state_tracker/st_cb_flush.c:92
No locals.
#6 0xaa973990 in st_glFlush (ctx=0xb01b9950) at state_tracker/st_cb_flush.c:126
        st = 0xb01fb780
#7 0xaabb24f0 in _mesa_flush (ctx=0xb01b9950) at main/context.c:1656
No locals.
#8 0xaabb2c0e in _mesa_Flush () at main/context.c:1688
        ctx = 0xb01b9950
#9 0xb6aeb899 in MythRenderOpenGL::Flush(bool) () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#10 0xb6af3d1a in MythRenderOpenGL::CreateTexture(QSize, bool, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int) () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#11 0xb6ae98f3 in MythOpenGLPainter::GetTextureFromCache(MythImage*) () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#12 0xb6ae9da8 in MythOpenGLPainter::DrawImage(QRect const&, MythImage*, QRect const&, int) () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#13 0xb69f5a6f in MythUIImage::DrawSelf(MythPainter*, int, int, int, QRect) () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#14 0xb69ef0e4 in MythUIType::Draw(MythPainter*, int, int, int, QRect) () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#15 0xb69ef180 in MythUIType::Draw(MythPainter*, int, int, int, QRect) () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#16 0xb69ad068 in MythMainWindow::draw() () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#17 0xb69ad6f8 in MythMainWindow::drawScreen() () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#18 0xb69ad8bb in ?? () from /usr/lib/libmythui-0.25.so.0
No symbol table info available.
#19 0xb5b0ef6e in QWidget::event(QEvent*) () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#20 0xb234ffe2 in QGLWidget::event(QEvent*) () from /usr/lib/i386-linux-gnu/libQtOpenGL.so.4
No symbol table info available.
#21 0xb5ab4d84 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#22 0xb5aba1d8 in QApplication::notify(QObject*, QEvent*) () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#23 0xb56e519e in QCoreApplication::notifyInternal(QObject*, QEvent*) () from /usr/lib/i386-linux-gnu/libQtCore.so.4
No symbol table info available.
#24 0xb5b0be1b in QWidgetPrivate::drawWidget(QPaintDevice*, QRegion const&, QPoint const&, int, QPainter*, QWidgetBackingStore*) () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#25 0xb5cf039e in QWidgetPrivate::repaint_sys(QRegion const&) () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#26 0xb5b01644 in QWidgetPrivate::syncBackingStore(QRegion const&) () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#27 0xb5b402e4 in ?? () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#28 0xb5b41488 in QApplication::x11ProcessEvent(_XEvent*) () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#29 0xb5b6d28c in ?? () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#30 0xb218925f in g_main_dispatch (context=0x8b17e70) at /build/buildd/glib2.0-2.30.0/./glib/gmain.c:2441
        dispatch = 0xb5b6d0a0
        was_in_call = 0
        user_data = 0x0
        callback = 0
        cb_funcs = 0x0
        cb_data = 0x0
        current_source_link = {data = 0x8b18e68, next = 0x0}
        need_destroy = <optimized out>
        source = 0x8b18e68
        current = 0x8b17410
        i = <optimized out>
#31 g_main_context_dispatch (context=0x8b17e70) at /build/buildd/glib2.0-2.30.0/./glib/gmain.c:3011
No locals.
#32 0xb2189990 in g_main_context_iterate (context=0x8b17e70, block=-1306950880, dispatch=1, self=<optimized out>) at /build/buildd/glib2.0-2.30.0/./glib/gmain.c:3089
        max_priority = 0
        timeout = 0
        some_ready = 1
        nfds = <optimized out>
        allocated_nfds = <optimized out>
        fds = 0xa7e9d708
#33 0xb2189c2a in g_main_context_iteration (context=0x8b17e70, may_block=1) at /build/buildd/glib2.0-2.30.0/./glib/gmain.c:3152
        retval = <optimized out>
#34 0xb5713ada in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/i386-linux-gnu/libQtCore.so.4
No symbol table info available.
#35 0xb5b6ce7a in ?? () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#36 0xb56e41dd in QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/i386-linux-gnu/libQtCore.so.4
No symbol table info available.
#37 0xb56e4421 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/i386-linux-gnu/libQtCore.so.4
No symbol table info available.
#38 0xb56e919d in QCoreApplication::exec() () from /usr/lib/i386-linux-gnu/libQtCore.so.4
No symbol table info available.
#39 0xb5ab2924 in QApplication::exec() () from /usr/lib/i386-linux-gnu/libQtGui.so.4
No symbol table info available.
#40 0x0806d97e in ?? ()
No symbol table info available.
#41 0xb5311113 in __libc_start_main (main=0x806c160, argc=1, ubp_av=0xbfc52ba4, init=0x829c060, fini=0x829c0d0, rtld_fini=0xb77d0ba0, stack_end=0xbfc52b9c) at libc-start.c:226
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {-1253629964, 0, 0, 0, -1459841158, 1082856303}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x1, 0x806fe74}, data = {prev = 0x0, cleanup = 0x0, canceltype = 1}}}
        not_first_call = <optimized out>
#42 0x0806fe95 in ?? ()
No symbol table info available.
Backtrace stopped: Not enough registers or memory available to unwind further

Here's what gdb has to say at the moment of the crash:

#0 nouveau_fence_update (screen=0x4071d000, flushed=1 '\001') at nouveau_fence.c:141
141 u32 sequence = screen->fence.update(&screen->base);
(gdb) print screen
$1 = (struct nouveau_screen *) 0x4071d000
(gdb) print *screen
Cannot access memory at address 0x4071d000

So, while screen is not null, it does not seem to be pointing at valid memory either.

(In reply to comment #5)
>
> I believe your issue was addressed previously in bug 43428, can you please
> confirm with your distribution if mesa has the patches mentioned?

Hrm. This patch might actually be making it worse. Before it, I could start this OpenGL app roughly 1 in 5-10 times, but now it seems to be crashing every single time. Tried nearly a dozen times now and no go.

I've now rolled back to the distribution's mesa and am still getting the same results from gdb:

nouveau_fence_update (screen=0x4071d000, flushed=1 '\001') at nouveau_fence.c:141
141 nouveau_fence.c: No such file or directory.
in nouveau_fence.c
(gdb) print screen
$1 = (struct nouveau_screen *) 0x4071d000
(gdb) print *screen
Cannot access memory at address 0x4071d000

Looks like memory corruption to me now, and comically 0x4071d000 could even be a valid address, just not on the CPU (on nv50, VRAM virtual addresses start at 0x40000000 and are at least 4 KiB aligned).

If you want to debug it, I'd either put a watchpoint on the memory location with gdb once it's available, i.e. when chan->user_private = screen is set in nv50_screen_create, or add lots of debug prints in the nv50 code to bisect the point where it gets modified (usually it shouldn't be until screen destruction).

Or you could try mesa-8.0 and hope it's gone there.

Btw. I hope you're not building with clang; it has issues with the linked-list implementation in gallium, and the fence code uses it ...

(In reply to comment #5)
> ...
> Frank
>
> Can you take a look at bug 38931, if you believe it's that "fix" for the issue
> it should be possible to "backport"/cherry-pick it

I'm not sure, looks similar. You're the expert ;-)

Does this still occur with recent software versions? (Like Mesa 9.1.6 or Mesa 9.2)

No response to retest request in a month. Closing as invalid.

As mentioned, closing the bug as invalid. Feel free to reopen if you still experience the problem with recent software.
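
The memory-corruption comment above rests on a small piece of address arithmetic: the bad screen value is at or above 0x40000000 and is 4 KiB aligned, so it looks like an nv50 VRAM virtual address rather than a CPU heap pointer. A minimal sketch of that check follows. The constant and function names are hypothetical, and the two thresholds are taken from the comment itself rather than from any hardware reference.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption from the thread: nv50 VRAM virtual addresses start here. */
#define SKETCH_NV50_VRAM_VA_BASE 0x40000000u
/* Assumption from the thread: such addresses are at least 4 KiB aligned. */
#define SKETCH_PAGE_SIZE 0x1000u

/* Returns true when a 32-bit value has the shape the comment describes:
 * at or above the VRAM base and aligned to a 4 KiB page boundary. This is
 * a plausibility heuristic for reading a backtrace, not proof of anything. */
static bool sketch_looks_like_nv50_vram_va(uint32_t value)
{
    bool above_base = value >= SKETCH_NV50_VRAM_VA_BASE;
    bool page_aligned = (value & (SKETCH_PAGE_SIZE - 1u)) == 0;

    return above_base && page_aligned;
}
```

Applied to the values in the backtrace, 0x4071d000 (screen in frame #0) matches the pattern, while 0x8c62758 (screen in frame #4, nv50_flush) does not. That is consistent with the pointer stored on the channel having been overwritten with a GPU address, though it does not show where the overwrite happens.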