Bug 69784

Summary: [HSW]igt/gem_tiled_swapping randomly causes OOM killer
Product: DRI
Reporter: lu hua <huax.lu>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED WORKSFORME
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: unspecified
Hardware: All
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description Flags
dmesg none

Description lu hua 2013-09-25 03:48:42 UTC
Created attachment 86497
dmesg

System Environment:
--------------------------
Platform:  Haswell
Kernel:  (drm-intel-next-queued) 1b068ee25776533074251f1c6276c5f720c0284b

Bug detailed description:
-----------------------------
The test triggers the OOM killer on Haswell with the -queued and -nightly kernels. It works well on the -fixes kernel.
It happens in 3 out of 5 runs.

[   43.206658] Call Trace:
[   43.206671]  [<ffffffff816eea1b>] ? dump_stack+0x41/0x51
[   43.206690]  [<ffffffff816eb716>] ? dump_header.isra.7+0x69/0x194
[   43.206712]  [<ffffffff81065605>] ? ktime_get_ts+0x49/0xab
[   43.206732]  [<ffffffff812c3d3e>] ? ___ratelimit+0xae/0xc8
[   43.206751]  [<ffffffff810a1575>] ? oom_kill_process+0x7c/0x313
[   43.206772]  [<ffffffff810a1337>] ? oom_unkillable_task.isra.4+0x6d/0x7e
[   43.206793]  [<ffffffff810a1d20>] ? out_of_memory+0x3b2/0x3e5
[   43.206814]  [<ffffffff810a52e2>] ? __alloc_pages_nodemask+0x5e7/0x6f8
[   43.206839]  [<ffffffff810cd1a1>] ? alloc_pages_current+0xc5/0xe2
[   43.206860]  [<ffffffff8102cced>] ? pte_alloc_one+0x11/0x38
[   43.206881]  [<ffffffff810b6427>] ? __pte_alloc+0x12/0x88
[   43.206899]  [<ffffffff810b9633>] ? handle_mm_fault+0x181/0x19c
[   43.206919]  [<ffffffff816f6e96>] ? __do_page_fault+0x401/0x449
[   43.206940]  [<ffffffff810be3da>] ? do_mmap_pgoff+0x2b2/0x341
[   43.206958]  [<ffffffff810b07fe>] ? vm_mmap_pgoff+0x82/0xab
[   43.206977]  [<ffffffff810e5194>] ? do_vfs_ioctl+0x3ad/0x3ef
[   43.206998]  [<ffffffff816f42b2>] ? page_fault+0x22/0x30


[   43.237426] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[   43.238548] [ 2427]     0  2427    74842       97     120        0             0 systemd-journal
[   43.239706] [ 2908]     0  2908    10366      268      23        0         -1000 systemd-udevd
[   43.240835] [ 3319]     0  3319    24467       48      25        0             0 lvmetad
[   43.241958] [ 3337]     0  3337    12233       91      24        0         -1000 auditd
[   43.243145] [ 3359]     0  3359    20054       34       8        0             0 audispd
[   43.244258] [ 3364]     0  3364     5996       45      24        0             0 sedispatch
[   43.245368] [ 3382]     0  3382     1076       20       9        0             0 rngd
[   43.246480] [ 3383]     0  3383     4779       54      21        0             0 irqbalance
[   43.247665] [ 3384]     0  3384    35148       68      38        0             0 abrtd
[   43.248772] [ 3386]     0  3386    34619       64      37        0             0 abrt-watch-log
[   43.249874] [ 3391]     0  3391     6070      152      16        0             0 smartd
[   43.250975] [ 3394]     0  3394    92453      338      63        0             0 NetworkManager
[   43.252128] [ 3396]     0  3396    34619       66      37        0             0 abrt-watch-log
[   43.253230] [ 3401]     0  3401     8250       79      21        0             0 systemd-logind
[   43.254355] [ 3402]     0  3402    66810      105      34        0             0 rsyslogd
[   43.255461] [ 3405]    70  3405     6986       58      29        0             0 avahi-daemon
[   43.256567] [ 3411]    81  3411     6122      108      16        0          -900 dbus-daemon
[   43.257745] [ 3417]     0  3417     1619       30      10        0             0 mcelog
[   43.258850] [ 3422]   993  3422     7224       69      18        0             0 chronyd
[   43.259952] [ 3423]     0  3423     5933       47      17        0             0 atd
[   43.261057] [ 3425]     0  3425    31022      153      14        0             0 crond
[   43.262228] [ 3426]    70  3426     6986       53      26        0             0 avahi-daemon
[   43.263328] [ 3453]   999  3453   132032      835      52        0             0 polkitd
[   43.264425] [ 3456]     0  3456    27499       32      11        0             0 agetty
[   43.265533] [ 3467]     0  3467    40411      197      64        0          -900 modem-manager
[   43.266697] [ 3470]     0  3470   132712     1154     141        0             0 libvirtd
[   43.267824] [ 3483]     0  3483    20107      200      40        0         -1000 sshd
[   43.268970] [ 3492]     0  3492    21900      294      45        0             0 sendmail
[   43.270096] [ 3499]    32  3499     9423       93      22        0             0 rpcbind
[   43.271274] [ 3601]     0  3601    25513     3114      51        0             0 dhclient
[   43.272432] [ 3607]     0  3607    32804      288      67        0             0 sshd
[   43.273569] [ 3611]     0  3611    29206      448      18        0             0 bash
[   43.274702] [ 3697]     0  3697    32769      287      67        0             0 sshd
[   43.275837] [ 3701]     0  3701    29234      481      18        0             0 bash
[   43.277060] [ 3794]     0  3794    15467      125      44        0             0 gem_tiled_swapp
[   43.278208] [ 3804]     0  3804    39590      189      32        0             0 crond
[   43.279359] Out of memory: Kill process 3453 (polkitd) score 0 or sacrifice child
[   43.280516] Killed process 3453 (polkitd) total-vm:528128kB, anon-rss:3340kB, file-rss:0kB
[   43.284048] gem_tiled_swapp invoked oom-killer: gfp_mask=0x2084d0, order=0, oom_score_adj=0
[   43.285271] gem_tiled_swapp cpuset=/ mems_allowed=0
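
When reading a dump like the one above, it can help to total the rss column of the process table and compare it with the machine's RAM. The following is a minimal Python sketch, not part of the original report; the names `parse_oom_table` and `total_rss_kib` are hypothetical, and the 4 kB page size is an assumption. It does agree with this log: polkitd's rss of 835 pages times 4 kB gives the 3340kB reported in the kill message.

```python
import re

# One row of the OOM killer's process table, for example:
# [   43.263328] [ 3453]   999  3453   132032      835      52        0             0 polkitd
# The header row ("[ pid ]   uid ...") does not match because "pid" is not numeric.
ROW = re.compile(
    r"^\[\s*\d+\.\d+\]\s+\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)\s*$"
)


def parse_oom_table(dmesg_text):
    """Return one dict per process row found in the dmesg text."""
    rows = []
    for line in dmesg_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            row = {k: int(v) for k, v in match.groupdict().items() if k != "name"}
            row["name"] = match.group("name")
            rows.append(row)
    return rows


def total_rss_kib(rows, page_kib=4):
    """Sum the rss column (counted in pages) and convert it to kB.

    page_kib=4 is an assumption; adjust it for other page sizes.
    """
    return sum(row["rss"] for row in rows) * page_kib
```

Feeding the attached dmesg to `parse_oom_table` and then `total_rss_kib` gives the memory held by the listed processes, which can then be compared against total system memory.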

Reproduce steps:
----------------------------
1. ./gem_tiled_swapping
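
Because the failure is intermittent, the reproduce step may need to be repeated and each run checked for an OOM event. The following is a rough Python sketch of that loop, not the reporter's actual procedure. The function names are hypothetical, the marker string is taken from the "invoked oom-killer" line in the log above, and `dmesg --clear` is assumed to be available and run as root.

```python
import subprocess

# Text the kernel prints when the OOM killer is invoked (see the log above).
OOM_MARKER = "invoked oom-killer"


def count_oom_invocations(dmesg_text):
    """Count the kernel log lines that report an OOM killer invocation."""
    return sum(1 for line in dmesg_text.splitlines() if OOM_MARKER in line)


def run_cycles(test_path="./gem_tiled_swapping", cycles=5):
    """Run the test `cycles` times; return how many runs hit the OOM killer.

    Requires root, because the kernel log is cleared before every run.
    """
    failures = 0
    for _ in range(cycles):
        subprocess.run(["dmesg", "--clear"], check=True)
        # The exit status of the test is ignored; only the kernel log is checked.
        subprocess.run([test_path], check=False)
        log = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=True
        ).stdout
        if count_oom_invocations(log) > 0:
            failures += 1
    return failures
```

Calling `run_cycles()` from the directory that contains the test binary returns the number of failing runs out of five, which is the figure quoted in this report and in the later comments.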
Comment 1 Daniel Vetter 2013-09-25 10:30:06 UTC
I've just rebased my trees around and both -fixes and dinq are now based on -rc2. Can you please retest that -fixes still works and if so try to bisect where the regression has been introduced in dinq?
Comment 2 Daniel Vetter 2013-09-25 12:22:43 UTC
Please also test the latest patch attached to bug 69247
Comment 3 lu hua 2013-09-26 07:56:28 UTC
(In reply to comment #2)
> Please also test the latest patch attached to bug 69247

Ran 5 cycles with this patch; it works well.
Comment 4 Daniel Vetter 2013-09-26 08:12:42 UTC
Just to check: Does latest -nightly based on 3.12-rc2 still blow up with OOM?
Comment 5 lu hua 2013-09-27 05:46:15 UTC
Ran 5 cycles on the -nightly kernel (commit: 24c8329416b54b79655afe45370cf3d46f41e283); it works well.
Comment 6 Chris Wilson 2013-09-30 08:16:34 UTC
OK, let's mark this instance as working, as we still have plenty of evidence for bad OOM behaviour in other bugs.
