Created attachment 109787 [details]
dmesg

==System Environment==
--------------------------
Regression: Yes. Good commit on -next-queued: 77c1aa84de0096792de673aa1c64c36b38553cf5 (2014_11_19)
Non-working platforms: HSW

==kernel==
--------------------------
origin/drm-intel-nightly: 18748be7c96accc27327423c384f86a8fae99c35 (fails)
    drm-intel-nightly: 2014y-11m-20d-21h-58m-44s UTC integration manifest
origin/drm-intel-next-queued: 89a35ecdc6aa5a88165313ca5cfd52b8e8e7fbbd (fails)
    drm/i915/g4x: fix g4x infoframe readout
origin/drm-intel-fixes: 0485c9dc24ec0939b42ca5104c0373297506b555 (another bug, 80517)
    drm/i915: Kick fbdev before vgacon

==Bug detailed description==
igt/drv_module_reload causes "WARNING: CPU: 5 PID: 4025 at drivers/gpu/drm/drm_mm.c:765 i915_global_gtt_cleanup+0x3a/0x80 [i915]()"

Output:
[root@x-hsw24 tests]# ./drv_module_reload
unbinding /sys/class/vtconsole/vtcon0/: (M) frame buffer device
module successfully unloaded
module successfully loaded again
[root@x-hsw24 tests]# echo $?
0
[root@x-hsw24 tests]# dmesg -r|egrep "<[1-4]>"|grep drm
<4>[  198.502172] WARNING: CPU: 5 PID: 4025 at drivers/gpu/drm/drm_mm.c:765 i915_global_gtt_cleanup+0x3a/0x80 [i915]()
<4>[  198.502175] Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc ipv6 dm_mod iTCO_wdt iTCO_vendor_support dcdbas snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi serio_raw pcspkr i2c_i801 snd_hda_controller snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm shpchp snd_timer snd soundcore battery acpi_cpufreq i915(-) button video drm_kms_helper drm cfbfillrect cfbimgblt cfbcopyarea [last unloaded: snd_hda_intel]
<4>[  198.502199] CPU: 5 PID: 4025 Comm: rmmod Not tainted 3.18.0-rc5_drm-intel-nightly_18748b_20141121+ #1774
<4>[  198.502249] [<ffffffffa0023959>] ? drm_modeset_unlock_all+0x41/0x50 [drm]
<4>[  198.502254] [<ffffffffa001308d>] ? drm_dev_unregister+0x1e/0x8b [drm]
<4>[  198.502259] [<ffffffffa001394f>] ? drm_put_dev+0x3e/0x47 [drm]
<4>[  198.502280] [<ffffffffa00151a1>] ? drm_pci_exit+0x39/0x9c [drm]
<4>[  198.502337] CPU: 5 PID: 4025 Comm: rmmod Tainted: G B W 3.18.0-rc5_drm-intel-nightly_18748b_20141121+ #1774
<4>[  198.502381] [<ffffffffa0023959>] ? drm_modeset_unlock_all+0x41/0x50 [drm]
<4>[  198.502385] [<ffffffffa001308d>] ? drm_dev_unregister+0x1e/0x8b [drm]
<4>[  198.502390] [<ffffffffa001394f>] ? drm_put_dev+0x3e/0x47 [drm]
<4>[  198.502408] [<ffffffffa00151a1>] ? drm_pci_exit+0x39/0x9c [drm]
<4>[  198.502422] CPU: 5 PID: 4025 Comm: rmmod Tainted: G B W 3.18.0-rc5_drm-intel-nightly_18748b_20141121+ #1774
<4>[  198.502447] [<ffffffffa0023959>] ? drm_modeset_unlock_all+0x41/0x50 [drm]
<4>[  198.502452] [<ffffffffa001308d>] ? drm_dev_unregister+0x1e/0x8b [drm]
<4>[  198.502456] [<ffffffffa001394f>] ? drm_put_dev+0x3e/0x47 [drm]
<4>[  198.502474] [<ffffffffa00151a1>] ? drm_pci_exit+0x39/0x9c [drm]

==Reproduce steps==
----------------------------
1. ./drv_module_reload
Hm, I've noticed similar "Memory manager not clean" issues when reloading i915.ko on my snb just this week. Do you see this on other platforms, too? Btw for WARNING there's often a 2nd line with some explanation, that should be the bug summary according to bug filing BKM. I've fixed it. Bisect should definitely help here since it looks like a leak somewhere.
dcb4c12a687710ab745c2cdee8298c3e97f6f707 is the first bad commit

Author:     Oscar Mateo <oscar.mateo@intel.com>
AuthorDate: Thu Nov 13 10:28:10 2014 +0000
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Wed Nov 19 19:32:58 2014 +0100

    drm/i915/bdw: Pin the context backing objects to GGTT on-demand

    Up until now, we have pinned every logical ring context backing object
    during creation, and left it pinned until destruction. This made my life
    easier, but it's a harmful thing to do, because we cause fragmentation
    of the GGTT (and, eventually, we would run out of space).

    This patch makes the pinning on-demand: the backing objects of the two
    contexts that are written to the ELSP are pinned right before submission
    and unpinned once the hardware is done with them. The only context that
    is still pinned regardless is the global default one, so that the HWS can
    still be accessed in the same way (ring->status_page).

    v2: In the early version of this patch, we were pinning the context as
    we put it into the ELSP: on the one hand, this is very efficient because
    only a maximum two contexts are pinned at any given time, but on the
    other hand, we cannot really pin in interrupt time :(

    v3: Use a mutex rather than atomic_t to protect pin count to avoid
    races. Do not unpin default context in free_request.

    v4: Break out pin and unpin into functions. Fix style problems reported
    by checkpatch.

    v5: Remove unpin_lock as all pinning and unpinning is done with the
    struct mutex already locked. Add WARN_ONs to make sure this is the case
    in future.
    Issue: VIZ-4277
    Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
    Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
    Reviewed-by: Akash Goel <akash.goels@gmail.com>
    Reviewed-by: Deepak S <deepak.s@linux.intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

:040000 040000 b99fffd2ff94e9c66c4797886726deb6cdf9d502 d3501260bc640ac87a2a95d13ecfc379caaf4d41 M	drivers

On its parent commit (c86ee3a9f8cddcf2e637da19d6e7c05bdea11a96), another dmesg warning reproduced:

[root@x-hsw24 tests]# ./drv_module_reload
unbinding /sys/class/vtconsole/vtcon0/: (M) frame buffer device
module successfully unloaded
[root@x-hsw24 tests]# dmesg -r|egrep "<[1-4]>"|grep drm
<4>[   48.255113] WARNING: CPU: 5 PID: 3981 at drivers/gpu/drm/i915/intel_pm.c:6207 intel_disable_gt_powersave+0x33/0x37a [i915]()
<4>[   48.255117] dm_mod snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi dcdbas serio_raw pcspkr i2c_i801 snd_hda_controller snd_hda_codec snd_hwdep lpc_ich shpchp mfd_core snd_pcm snd_timer snd soundcore battery acpi_cpufreq i915(-) button video drm_kms_helper drm [last unloaded: snd_hda_intel]
<4>[   48.255165] [<ffffffffa0004dd3>] ? vblank_disable_and_save+0x170/0x17f [drm]
<4>[   48.255197] [<ffffffffa0007527>] ? drm_dev_unregister+0x1e/0x8b [drm]
<4>[   48.255202] [<ffffffffa0007757>] ? drm_put_dev+0x3e/0x47 [drm]
<4>[   48.255222] [<ffffffffa000917a>] ? drm_pci_exit+0x38/0x98 [drm]
(In reply to Daniel Vetter from comment #1)
> Hm, I've noticed similar "Memory manager not clean" issues when reloading
> i915.ko on my snb just this week. Do you see this on other platforms, too?
>
> Btw for WARNING there's often a 2nd line with some explanation, that should
> be the bug summary according to bug filing BKM. I've fixed it.
>
> Bisect should definitely help here since it looks like a leak somewhere.

I checked on the ILK, SNB and BYT platforms, and only reproduced this on BYT.
(In reply to Guo Jinxian from comment #3)
> (In reply to Daniel Vetter from comment #1)
> > Hm, I've noticed similar "Memory manager not clean" issues when reloading
> > i915.ko on my snb just this week. Do you see this on other platforms, too?
> >
> > Btw for WARNING there's often a 2nd line with some explanation, that should
> > be the bug summary according to bug filing BKM. I've fixed it.
> >
> > Bisect should definitely help here since it looks like a leak somewhere.
>
> I checked on ILK SNB and BYT platforms, and only reproduce this on BYT
> platform.

Hm, BYT is likely a different bug since the bisected commit is for gen8+ (bdw/bsw) only. Can you please file a new bug report for BYT? Also please check whether it's a regression and bisect if so.
commit 958f8cd96f979b20c45c55cba14bf8d8fbeca64f
Author: Thomas Daniel <thomas.daniel@intel.com>
Date:   Tue Nov 25 10:39:25 2014 +0000

    drm/i915: Fix context object leak for legacy contexts
Verified on latest -nightly (0db9cf7742874ee2c09a35b640c1bb04cb379eb6):

[root@x-hsw24 tests]# ./drv_module_reload
unbinding /sys/class/vtconsole/vtcon0/: (M) frame buffer device
module successfully unloaded
[root@x-hsw24 tests]# dmesg -r|egrep "<[1-4]>"|grep drm
Closing old verified.