Bug 41266

Summary: [SNB] Hang on showing copy dashed frame on a large cells interval in LibreOffice Calc [rc6-related]
Product: DRI
Component: DRM/Intel
Status: CLOSED DUPLICATE
Severity: major
Priority: medium
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Mau <mavoga>
Assignee: Chris Wilson <chris>
QA Contact:
CC: ben, chris, daniel, eugeni, jbarnes
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description                                Flags
The relevant syslog part after the hang    none
i915_error_state from debugfs              none
X.Org log of a normal session              none
dmesg from a normal session                none
Complete X.Org log after the hang          none
dmesg with intel_iommu=off                 none
dmidecode output                           none
lspci -vvv output                          none
config.log of my test                      none

Description Mau 2011-09-27 09:35:06 UTC
Created attachment 51664
The relevant syslog part after the hang

Overview:

The X session systematically hangs in LibreOffice when the rotating dashed frame is shown around a large interval of copied cells (e.g. 1x700).

I couldn't get any useful information with the default i915.semaphores setting (=0): unless that parameter is set to 1, the system hangs hard (SysRq not working) and no trace of the issue can be found in the logs.

I could reproduce this issue with KDE compositing both enabled and disabled.

Steps to reproduce:

- open LibreOffice Calc
- select a cell interval, say 1C x 700R
- Edit/Copy
- the rotating dashed frame appears around copied cells and after some time X hangs.

System info:

- Debian Wheezy/Sid
- Aptosid's 3.0-4.slh.6-aptosid-amd64 kernel [3.0-26]
  (the same happens with 3.0.0-1-amd64 [3.0.0-3] Debian kernel)
- X.Org X Server 1.10.4 - Release Date: 2011-08-19
- xserver-xorg-video-intel [2:2.15.0-3]
- libdrm-intel1 + libdrm2 [2.4.26-1]
- libgl*-mesa* [7.11-5]
- Sandy Bridge 2720QM
Comment 1 Mau 2011-09-27 09:36:03 UTC
Created attachment 51665
i915_error_state from debugfs
Comment 2 Mau 2011-09-27 09:42:37 UTC
Forgot to say that I could reproduce the issue with any LibreOffice version I could test (1:3.3.3-4~bpo60+1, 1:3.4.3-1, 1:3.4.3-2).

Thanks
Comment 3 Mau 2011-09-27 09:48:06 UTC
Created attachment 51666
X.Org log of a normal session
Comment 4 Mau 2011-09-27 09:52:29 UTC
Created attachment 51667
dmesg from a normal session
Comment 5 Mau 2011-09-27 10:08:41 UTC
Created attachment 51668
Complete X.Org log after the hang
Comment 6 Daniel Vetter 2011-09-28 01:10:45 UTC
Quick check: Can you try to hang your machine with DMAR/VT-d disabled? Disable it either in the BIOS or boot with intel_iommu=off (and check the dmesg). And please attach the lspci -nn line for your graphics card.

Thanks, Daniel
Comment 7 Mau 2011-09-28 08:52:54 UTC
intel_iommu=off doesn't change things (please see dmesg_2).

I have a new clue: I usually keep rc6 and fbc enabled in order to save battery power. During my tests I had ruled out those parameters as a factor, because I ran the tests after disabling them with

# echo -n 0 > /sys/module/i915/parameters/i915_enable_fbc
# echo -n 0 > /sys/module/i915/parameters/i915_enable_rc6

Today I noticed that after disabling rc6 in that way the CPU package temperature didn't increase (the difference should be about 10°C); I therefore disabled rc6 already at boot and found that the GPU doesn't hang anymore.

Shouldn't those parameters be read-only if they are not expected to have an effect when changed from userspace?
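
A minimal sketch of how the values seen by the driver could be inspected, using the sysfs path and parameter names quoted above. The helper name `i915_param` and the `PARAM_DIR` override are illustrative assumptions, not part of any existing tool, and reading a value back does not prove that a runtime change took effect in the hardware.

```shell
#!/bin/sh
# Print the current value of an i915 module parameter, or "unavailable"
# when the file cannot be read (module not loaded, parameter not present).
# PARAM_DIR is a hypothetical override, present only to make the helper testable.
PARAM_DIR="${PARAM_DIR:-/sys/module/i915/parameters}"

i915_param() {
    if [ -r "$PARAM_DIR/$1" ]; then
        cat "$PARAM_DIR/$1"
    else
        echo "unavailable"
    fi
}

echo "i915_enable_rc6 = $(i915_param i915_enable_rc6)"
echo "i915_enable_fbc = $(i915_param i915_enable_fbc)"

# To disable rc6 from boot rather than at runtime, the parameter goes on
# the kernel command line:
#   i915.i915_enable_rc6=0
```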
Comment 8 Mau 2011-09-28 08:53:57 UTC
Created attachment 51720
dmesg with intel_iommu=off
Comment 9 Mau 2011-09-28 16:40:03 UTC
I forgot to say that this laptop's BIOS doesn't have any option regarding DMAR/VT-d/virtualization.

This is the lspci -nn line for my graphics card:

00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09)

Thanks, Maurizio
Comment 10 Eugeni Dodonov 2011-10-03 13:01:25 UTC
Hi Maurizio,

could you provide the hardware specifications of your machine? E.g., vendor, model, BIOS version, and so on.

We were unable to reproduce the issue on any of our machines, so I strongly suspect that it could be something hardware- or maybe configuration-specific.

Also, could you try running this test with a freshly-installed system, or from a live-cd to see if you can reproduce the problem?

Thanks!
Comment 11 Mau 2011-10-05 08:53:11 UTC
Hi Eugeni,

it's a Clevo W150HRM laptop, AMI BIOS 4.6.4 v1.07, 2x4GB 1600MHz RAM, i7-2720QM CPU (please see the outputs of dmidecode and lspci -vvv, or ask if you need further details).

I tried running a Gentoo 11.2 LiveDVD (kernel 3.0, intel_drv.so 2.15.0-r1, xorg-server 1.10.3, mesa 7.11, libdrm 2.4.26) with i915.i915_enable_rc6=1 and I confirm it doesn't hang where my Debian wheezy/sid does.

In the meantime, X.Org v1.11.1 entered Debian testing and intel_drv.so was updated to v2.16.0, but the hang is still there...

Thanks

Maurizio
Comment 12 Mau 2011-10-05 08:53:56 UTC
Created attachment 52012
dmidecode output
Comment 13 Mau 2011-10-05 08:54:31 UTC
Created attachment 52013
lspci -vvv output
Comment 14 Daniel Vetter 2011-10-06 07:17:40 UTC
Shot in the dark: Can you please try whether booting with

memmap=2M#512M memmap=2M#1024M

appended to your kernel cmdline helps?
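
Before drawing conclusions from such a test, it may be worth confirming that the running kernel actually received both parameters. A minimal sketch, assuming the command line is read from /proc/cmdline; the helper name `has_cmdline_param` and its optional second argument are illustrative assumptions.

```shell
#!/bin/sh
# Report whether a given parameter is present on a kernel command line.
# The command line is read from /proc/cmdline unless a string is passed
# as the second argument (a hypothetical hook, used for testing).
has_cmdline_param() {
    cmdline="${2:-$(cat /proc/cmdline 2>/dev/null)}"
    # Pad with spaces so only whole, space-separated words match.
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

for p in 'memmap=2M#512M' 'memmap=2M#1024M'; do
    if has_cmdline_param "$p"; then
        echo "$p: present"
    else
        echo "$p: missing"
    fi
done
```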
Comment 15 Mau 2011-10-07 09:05:10 UTC
(In reply to comment #14)
> Shot in the dark: Can you please try whether booting with
> 
> memmap=2M#512M memmap=2M#1024M
> 
> appended to your kernel cmdline helps?

No, unfortunately it doesn't help.

Booting without the pcie_aspm=force and enable_mtrr_cleanup parameters didn't change anything either.

I played some more with the Gentoo 11.2 LiveDVD and discovered that I only had to wait longer while doing something else (scrolling the spreadsheet, opening konsole, switching desktop, enabling further desktop effects, etc.): sooner or later the system hangs, and this happens only when I have that damned rotating dashed frame on the screen. Please note that Gentoo's LiveDVD has compositing enabled by default.

I realize that reproducing this bug can be tricky: sometimes the system hangs as soon as I press CTRL+C, sometimes I have to mess around some time before it happens.

Thanks

Maurizio
Comment 16 Eugeni Dodonov 2011-10-07 10:36:20 UTC
Just to rule out one possibility - is it possible to disable the nvidia board within bios, and see if the problem happens when only Intel VGA card is detected?
Comment 17 Mau 2011-10-07 18:01:56 UTC
(In reply to comment #16)
> Just to rule out one possibility - is it possible to disable the nvidia board
> within bios, and see if the problem happens when only Intel VGA card is
> detected?

Unfortunately this laptop doesn't offer any BIOS option to disable the discrete card. I managed to disable the dGPU by using the acpi_call module and echoing "\_SB.PCI0.PEG0.PEGP._OFF" to /proc/acpi/call; the dGPU keeps being listed in lspci though.

May I ask you why you feel this issue could be ACPI related?
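
The acpi_call step described above can be sketched as follows. The ACPI method path is the one quoted in this comment and is specific to this laptop; the wrapper name `acpi_call` and the `ACPI_CALL` path override are illustrative assumptions, and reading the file back after the write is assumed to return the module's reply.

```shell
#!/bin/sh
# Send an ACPI method path to the acpi_call interface and print the reply.
# ACPI_CALL is a hypothetical override of the interface path, present only
# so the wrapper can be exercised without the module loaded.
ACPI_CALL="${ACPI_CALL:-/proc/acpi/call}"

acpi_call() {
    if [ ! -w "$ACPI_CALL" ]; then
        echo "acpi_call interface not writable: $ACPI_CALL" >&2
        return 1
    fi
    # Write the method path, then read the interface back for the result.
    printf '%s\n' "$1" > "$ACPI_CALL" && cat "$ACPI_CALL"
}

# Requires root and a loaded acpi_call module:
#   acpi_call '\_SB.PCI0.PEG0.PEGP._OFF'
```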
Comment 18 Eugeni Dodonov 2011-10-08 13:48:21 UTC
> May I ask you why you feel this issue could be ACPI related?

This is more of a guess. We were unable to reproduce this issue on any other machine, and the biggest difference between ours and yours is that yours has 2 graphics adapters. So I just wanted to rule out the possibility that this could influence the problem somehow.
Comment 19 Daniel Vetter 2011-10-17 09:50:44 UTC
Hi Matthew, it seems that although we still haven't got a clue as to why it dies, we
have a workaround that doesn't penalise too much. Can you please retry with the
current master of xf86-video-intel?

commit 46f97127c22ea42bc8fdae59d2a133e4b8b6c997
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Oct 16 21:40:15 2011 +0100

snb,ivb: Workaround unknown blitter death

The first workaround was a performance killing MI_FLUSH_DW after every
op. This workaround appears to be a stable compromise instead, only
requiring a redundant command after every BLT command with little
impact on throughput.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=27892
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39524
Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 20 Daniel Vetter 2011-10-17 09:52:35 UTC
Oops, copy&pasted too much. Mau, can you please try the latest xf86-video-intel (SNA disabled) and see whether that prevents your machine from hanging in LibreOffice?

Thanks, Daniel
Comment 21 Mau 2011-10-18 03:03:05 UTC
(In reply to comment #20)
> Oops, copy&pasted too much. Mau, can you please try latest xf86-video-intel
> (SNA disabled) and see whether that prevents your machine from hanging in
> LibreOffice?
> 
> Thanks, Daniel

Yes, current master seems to prevent the hangs. I hope I did everything correctly; I'm attaching config.log in case you want to check it.

Thanks

Mau
Comment 22 Mau 2011-10-18 03:05:26 UTC
Created attachment 52464
config.log of my test
Comment 23 Chris Wilson 2011-10-18 06:29:09 UTC

*** This bug has been marked as a duplicate of bug 39524 ***
