Summary: | Xorg freeze after using xrandr, drm debug error, with bare server | ||
---|---|---|---|
Product: | DRI | Reporter: | peter garrone <pgarrone> |
Component: | DRM/other | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED INVALID | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | mmokrejs |
Version: | XOrg git | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
Description
peter garrone
2008-12-08 18:32:49 UTC
When I loaded the i915 module with modeset set to 1, the problem disappeared. The kernel configuration option CONFIG_DRM_I915_KMS is not set. So as long as I load i915 with modeset set to 1, the problem is resolved.
Created attachment 20969 [details]
dmesg output with warnings
Created attachment 20970 [details]
lspci output
Created attachment 20973 [details]
gdb log output illustrating the resetting of variable causing loop.
The kernel's lock.hw_lock variable, whose reset causes the infinite loop, is reset when the screen is closed during an xrandr operation. An error message "error setting MTRR (base = 0x20000000, size = 0x10000000, type = 1) Invalid argument (22)" is emitted by both kernel/dmesg and by libpciaccess. Following this, there is a close, and in that close operation the flag is reset. The associated gdb log illustrates this activity. The 3rd pci region has the failing base address/size.
Created attachment 21036 [details]
Code from drm_bufs.c showing DRM_SHM branch.
This is the code in drm/drm_bufs.c (kernel module) for the ioctl DRM_IOCTL_ADD_MAP, type DRM_SHM. This mapping must be invoked from user space before the opened dri device can be locked. It normally sets the master->lock.hw_lock variable. However, if the function drm_find_matching_map returns an existing map, then the ioctl returns 0 and master->lock.hw_lock is never set, resulting in the infinite loop when locking the dri channel is attempted later from user space.
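The early-return behaviour described above can be modelled in user space. The following is a minimal sketch, not the kernel code from the attachment: every `model_` name is hypothetical, the structures are reduced to the fields the description mentions, and the matching rule (compare the stored master pointer) is an assumption based on this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct model_master {
    void *hw_lock;                    /* NULL until a DRM_SHM map sets it */
};

struct model_map {
    const struct model_master *owner; /* master active when the map was made */
    void *handle;                     /* shared memory that holds the lock */
    struct model_map *next;
};

/* Return an existing map recorded against this master pointer, or NULL. */
static struct model_map *model_find_matching_map(struct model_map *list,
                                                 const struct model_master *m)
{
    for (; list != NULL; list = list->next)
        if (list->owner == m)
            return list;
    return NULL;
}

/* Model of the DRM_SHM branch: both paths report success (0). */
static int model_addmap_shm(struct model_map **list, struct model_map *fresh,
                            struct model_master *m, void *shm)
{
    assert(list != NULL && fresh != NULL && m != NULL);

    if (model_find_matching_map(*list, m) != NULL)
        return 0;                     /* early return: m->hw_lock stays unset */

    fresh->owner = m;
    fresh->handle = shm;
    fresh->next = *list;
    *list = fresh;
    m->hw_lock = shm;                 /* the normal path sets the lock */
    return 0;
}
```

In this model, a second call for a master that sits at the same address as an earlier one returns 0 without touching `hw_lock`, which mirrors the condition that leaves user space spinning on the lock.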
After a comment by Dave Airlie that if a map has been created already, then that lock should be the primary, I analysed the addition of maps to the "dev->maplist" list in the drm code. Each time randr is run, all drm file descriptors are closed and reopened, and the list accumulates map elements. However these elements are never removed, because, although there is code to delete elements from the list, that code is never executed. It could be if a user space ioctl were invoked, but that does not happen.
The old elements in the list are generally not reused, because they have a "master" field that identifies them with the "master" structure active in the device when the map was created. These master structures are allocated and freed on each xrandr close/reopen cycle. However if, by chance, an old "master" structure is returned by the dynamic memory allocation function, then one of the elements is reused, and the branch is taken in the code that does not set the master hwlock. So the infinite loop is entered when a later attempt is made to lock the drm.
It seems problematic to free dynamic master structures, without also freeing any elements that rely on a reference to that structure for correct operation. Actually it seems problematic having all these map structures hanging round anyway, because I cannot find where they are ever freed, so at least they represent a memory leak.
Are you running a bare server (no other clients running) so that the server regenerates after each xrandr call?
(In reply to comment #8)
> Are you running a bare server (no other clients running) so that the server
> regenerates after each xrandr call?
>
Yes. I am targeting an embedded system. Usually I like to work by remotely logging in using ssh and running xterms on a remote computer, while running a rudimentary display on the target system. However there is a requirement to run a full desktop on the target, for developers.
When I debug the X server with gdb, I also run only Xorg, under gdb.
Created attachment 21101 [details]
Patch with printks, and dmesg output hopefully illustrating the error.
This is the printk output from dmesg with my added printks in a patch. At module removal, there are 25 entries in the maplist.
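The accumulation seen in the printk output, and the chance reuse of an old master address, can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: the `toy_` names are hypothetical, and the round-robin "allocator" is a deliberately deterministic stand-in for the kernel's real allocator, whose reuse pattern is not predictable.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MAPS 64

struct toy_master {
    int placeholder;   /* gives each pool slot a distinct address */
};

/*
 * Simulate close/reopen cycles. Each cycle "allocates" a master and, if no
 * map already refers to that address, records a new map that is never
 * removed. Returns the number of maps accumulated when a recycled master
 * address first matches a stale map, or 0 if that never happens within
 * TOY_MAX_MAPS cycles.
 */
static size_t toy_maps_before_stale_hit(const struct toy_master *pool,
                                        size_t pool_slots)
{
    const struct toy_master *owners[TOY_MAX_MAPS]; /* the never-pruned list */
    size_t nmaps = 0;

    assert(pool != NULL && pool_slots > 0);

    for (size_t cycle = 0; cycle < TOY_MAX_MAPS; cycle++) {
        /* Toy allocator: slots are handed out round-robin (an assumption). */
        const struct toy_master *m = &pool[cycle % pool_slots];

        for (size_t i = 0; i < nmaps; i++)
            if (owners[i] == m)
                return nmaps;      /* stale map matches the recycled address */

        owners[nmaps++] = m;       /* new map recorded, old ones left behind */
    }
    return 0;
}
```

With a three-slot pool the model records three maps before the fourth cycle lands on an address that an old map still refers to; the real system reaches the same state only when the allocator happens to return a previously used address.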
Created attachment 21105 [details]
My shell script for launching problem. Command is sudo stx -min -gdb
To cause the error, I have compiled and installed X11 at prefix /usr/local/x11prefix, and I run this script with
$ sudo stx -min -gdb
On another terminal, I run xrandr, similar to what is in the script, but really just xrandr -q is necessary.
Generally no user environment involves server regens, so it may be in your best interest to avoid running a testing environment involving that code path. (Basically, run an xlogo or xterm or something before playing with xrandr.)
Still a bug.
I can confirm that the error does not occur while xlogo is also running, because the "master" is not deallocated. The freeze also does not occur if the /dev/dri/cardN file descriptor is held open by a paused process. However, during xrandr operations the i915 heap allocation is called again, and since no heap deallocation has been invoked, errors of the following nature appear in the dmesg output:
[drm:i915_mem_heap_init] *ERROR* heap already initialised?
(This error currently has no trailing newline, so it runs onto the next dmesg message on output.)
Created attachment 21216 [details]
kernel patch that addresses error
This kernel patch to drm_stubs.c removes all maps in dev->maplist that reference the master when the master structure is being freed, after invocation of the device destroy callback. It doesn't appear to introduce any new quirks.
Use at your own risk.
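The idea behind the patch, as described above, can be sketched in user space. This is not the attached patch: the `sketch_` names and the plain singly linked list are hypothetical, and the only behaviour taken from the description is that every map referring to a master is removed at the point where that master is freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry in the map list. */
struct sketch_map {
    const void *owner;           /* master the map was created under */
    struct sketch_map *next;
};

/* Push a new map for the given owner; returns 0 on success, -1 on failure. */
static int sketch_add_map(struct sketch_map **head, const void *owner)
{
    struct sketch_map *node;

    assert(head != NULL);
    node = malloc(sizeof(*node));
    if (node == NULL)
        return -1;
    node->owner = owner;
    node->next = *head;
    *head = node;
    return 0;
}

/* Remove and free every map that refers to owner; returns how many went. */
static size_t sketch_drop_maps_for(struct sketch_map **head, const void *owner)
{
    size_t removed = 0;

    assert(head != NULL);
    while (*head != NULL) {
        struct sketch_map *cur = *head;

        if (cur->owner == owner) {
            *head = cur->next;   /* unlink before freeing */
            free(cur);
            removed++;
        } else {
            head = &cur->next;   /* keep this entry, move on */
        }
    }
    return removed;
}

/* Count the entries still on the list. */
static size_t sketch_count(const struct sketch_map *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Calling `sketch_drop_maps_for` just before the owner is freed means no list entry can later match a recycled address, which is the property the report says was missing.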
Comment on attachment 21216 [details]
kernel patch that addresses error
What kernel version is the patch against?
I tried it vs 2.6.17.10 and 2.6.18-pre9, no joy, 1 of 1 hunks rejected.
Thank you all for working on this. I suspect this may affect more than just the Intel servers.
Hi,
Freedesktop's Bugzilla instance is EOLed and open bugs are about to be migrated to http://gitlab.freedesktop.org. To avoid migrating out of date bugs, I am now closing all the bugs that did not see any activity in the past year. If the issue is still happening, please create a new bug in the relevant project at https://gitlab.freedesktop.org/drm (use misc by default).
Sorry about the noise!