Created attachment 103430 [details] dmesg ==System Environment== -------------------------- Regression: No The fail rate about 1 out of 5 Non-working platforms: BSW ==kernel== -------------------------- origin/drm-intel-nightly: 089bea2664988de48fccc81407ebe12115156731(fails) drm-intel-nightly: 2014y-07m-24d-09h-55m-38s integration manifest origin/drm-intel-next-queued: 2b8ee5710915a244b807fbe3427299d007d19e89(fails) drm/i915: Allow overlapping userptr objects origin/drm-intel-fixes: eedd10f45bdcb2a5b2afa35f845e080c3bc984f2(fails) drm/i915: Simplify i915_gem_release_all_mmaps() ==Bug detailed description== ----------------------------- igt/gem_exec_lut_handle cost long time to execute sporadically Output: [root@x-bsw01 tests]# time ./gem_exec_lut_handle IGT-Version: 1.7-g4d4f4b2 (x86_64) (Linux: 3.16.0-rc6_drm-intel-nightly_089bea_20140725_debug+ x86_64) relocation: buffers= 1: old= 105904 + 180.9*reloc, lut= 102425 + 173.7*reloc (ns) relocation: buffers= 2: old= 104746 + 183.0*reloc, lut= 104779 + 173.9*reloc (ns) relocation: buffers= 4: old= 107092 + 182.0*reloc, lut= 107067 + 173.6*reloc (ns) relocation: buffers= 8: old= 113138 + 181.9*reloc, lut= 111362 + 173.7*reloc (ns) relocation: buffers= 16: old= 122958 + 182.4*reloc, lut= 122497 + 173.9*reloc (ns) relocation: buffers= 32: old= 144296 + 182.9*reloc, lut= 143878 + 174.1*reloc (ns) relocation: buffers= 64: old= 188723 + 185.5*reloc, lut= 187134 + 176.3*reloc (ns) relocation: buffers= 128: old= 277146 + 190.2*reloc, lut= 275321 + 179.9*reloc (ns) relocation: buffers= 256: old= 458424 + 199.5*reloc, lut= 454968 + 186.3*reloc (ns) relocation: buffers= 512: old= 803739 + 226.0*reloc, lut= 798157 + 195.7*reloc (ns) relocation: buffers=1024: old=1865623 + 323.2*reloc, lut=1904592 + 260.9*reloc (ns) relocation: buffers=2048: old=7074389 + 436.8*reloc, lut=7076460 + 315.6*reloc (ns) skip-relocs: buffers= 1: old= 100994 + 91.6*reloc, lut= 100750 + 85.5*reloc (ns) skip-relocs: buffers= 2: old= 102906 + 91.5*reloc, lut= 101958 + 85.5*reloc (ns) skip-relocs: buffers= 4: old= 105166 + 91.3*reloc, lut= 104126 + 85.5*reloc (ns) skip-relocs: buffers= 8: old= 111004 + 91.3*reloc, lut= 109197 + 85.3*reloc (ns) skip-relocs: buffers= 16: old= 121325 + 93.1*reloc, lut= 121063 + 85.2*reloc (ns) skip-relocs: buffers= 32: old= 142916 + 91.7*reloc, lut= 142387 + 85.2*reloc (ns) skip-relocs: buffers= 64: old= 186514 + 93.1*reloc, lut= 185571 + 85.6*reloc (ns) skip-relocs: buffers= 128: old= 274873 + 97.1*reloc, lut= 273076 + 88.7*reloc (ns) skip-relocs: buffers= 256: old= 456788 + 106.4*reloc, lut= 453750 + 94.1*reloc (ns) skip-relocs: buffers= 512: old= 806428 + 130.0*reloc, lut= 802482 + 103.1*reloc (ns) skip-relocs: buffers=1024: old=1895140 + 212.7*reloc, lut=1929344 + 159.2*reloc (ns) skip-relocs: buffers=2048: old=7058420 + 321.6*reloc, lut=7064467 + 214.6*reloc (ns) no-relocs: buffers= 1: old= 100038 + 30.8*reloc, lut= 100159 + 23.7*reloc (ns) no-relocs: buffers= 2: old= 102166 + 30.7*reloc, lut= 101549 + 23.9*reloc (ns) no-relocs: buffers= 4: old= 104208 + 30.7*reloc, lut= 104186 + 23.9*reloc (ns) no-relocs: buffers= 8: old= 109841 + 30.8*reloc, lut= 108595 + 23.9*reloc (ns) no-relocs: buffers= 16: old= 121203 + 30.4*reloc, lut= 119996 + 23.8*reloc (ns) no-relocs: buffers= 32: old= 141550 + 30.6*reloc, lut= 140940 + 23.7*reloc (ns) no-relocs: buffers= 64: old= 184514 + 30.6*reloc, lut= 183211 + 23.6*reloc (ns) no-relocs: buffers= 128: old= 270394 + 30.4*reloc, lut= 268552 + 23.4*reloc (ns) no-relocs: buffers= 256: old= 448667 + 30.4*reloc, lut= 445651 + 22.6*reloc (ns) no-relocs: buffers= 512: old= 785921 + 33.1*reloc, lut= 779181 + 24.3*reloc (ns) no-relocs: buffers=1024: old=1826164 + 59.9*reloc, lut=1847018 + 49.0*reloc (ns) no-relocs: buffers=2048: old=6972076 + 41.9*reloc, lut=6975994 + 30.0*reloc (ns) real 15m53.222s user 0m20.729s sys 15m29.287s Reproduce steps: ------------------------- 1. ./gem_exec_lut_handle
How does the pass compare? (i.e. please attach the good output and dmesg)
(In reply to comment #1) > How does the pass compare? (i.e. please attach the good output and dmesg) Here is the output on latest -next-queued which passed. [root@x-bsw01 tests]# time ./gem_exec_lut_handle IGT-Version: 1.7-g4d4f4b2 (x86_64) (Linux: 3.16.0-rc4_drm-intel-next-queued_2b8ee5_20140725+ x86_64) relocation: buffers= 1: old= 20213 + 160.6*reloc, lut= 21170 + 150.9*reloc (ns) relocation: buffers= 2: old= 23548 + 160.3*reloc, lut= 26650 + 149.6*reloc (ns) relocation: buffers= 4: old= 23450 + 160.0*reloc, lut= 24112 + 150.5*reloc (ns) relocation: buffers= 8: old= 28316 + 159.0*reloc, lut= 30131 + 149.1*reloc (ns) relocation: buffers= 16: old= 29774 + 160.0*reloc, lut= 31269 + 150.0*reloc (ns) relocation: buffers= 32: old= 35678 + 161.4*reloc, lut= 35133 + 151.7*reloc (ns) relocation: buffers= 64: old= 60752 + 162.4*reloc, lut= 62951 + 151.1*reloc (ns) relocation: buffers= 128: old= 95711 + 172.6*reloc, lut= 94049 + 160.7*reloc (ns) relocation: buffers= 256: old= 188186 + 178.5*reloc, lut= 184888 + 165.0*reloc (ns) relocation: buffers= 512: old= 362717 + 195.3*reloc, lut= 356609 + 168.5*reloc (ns) relocation: buffers=1024: old= 752565 + 258.9*reloc, lut= 737432 + 204.8*reloc (ns) relocation: buffers=2048: old=4130880 + 375.5*reloc, lut=4117357 + 279.1*reloc (ns) skip-relocs: buffers= 1: old= 7075 + 87.0*reloc, lut= 6909 + 80.3*reloc (ns) skip-relocs: buffers= 2: old= 7721 + 87.0*reloc, lut= 7554 + 80.3*reloc (ns) skip-relocs: buffers= 4: old= 8532 + 87.0*reloc, lut= 8321 + 80.3*reloc (ns) skip-relocs: buffers= 8: old= 10332 + 87.2*reloc, lut= 10062 + 80.5*reloc (ns) skip-relocs: buffers= 16: old= 14190 + 87.4*reloc, lut= 13790 + 80.7*reloc (ns) skip-relocs: buffers= 32: old= 22911 + 87.4*reloc, lut= 22027 + 80.8*reloc (ns) skip-relocs: buffers= 64: old= 43343 + 89.4*reloc, lut= 41712 + 82.4*reloc (ns) skip-relocs: buffers= 128: old= 94919 + 92.8*reloc, lut= 93175 + 83.8*reloc (ns) skip-relocs: buffers= 256: old= 186820 + 99.9*reloc, lut= 183693 + 88.6*reloc (ns) skip-relocs: buffers= 512: old= 361709 + 115.8*reloc, lut= 355697 + 92.4*reloc (ns) skip-relocs: buffers=1024: old= 743773 + 174.1*reloc, lut= 735623 + 126.8*reloc (ns) skip-relocs: buffers=2048: old=4132826 + 283.8*reloc, lut=4114821 + 194.7*reloc (ns) no-relocs: buffers= 1: old= 6812 + 30.2*reloc, lut= 6704 + 23.5*reloc (ns) no-relocs: buffers= 2: old= 7248 + 30.3*reloc, lut= 7071 + 23.6*reloc (ns) no-relocs: buffers= 4: old= 8168 + 30.3*reloc, lut= 7906 + 23.7*reloc (ns) no-relocs: buffers= 8: old= 9970 + 30.3*reloc, lut= 9437 + 23.7*reloc (ns) no-relocs: buffers= 16: old= 13394 + 30.6*reloc, lut= 13017 + 23.9*reloc (ns) no-relocs: buffers= 32: old= 21729 + 30.6*reloc, lut= 21105 + 24.0*reloc (ns) no-relocs: buffers= 64: old= 40860 + 31.5*reloc, lut= 39069 + 25.0*reloc (ns) no-relocs: buffers= 128: old= 90798 + 29.8*reloc, lut= 88936 + 22.9*reloc (ns) no-relocs: buffers= 256: old= 178076 + 29.7*reloc, lut= 174850 + 22.4*reloc (ns) no-relocs: buffers= 512: old= 343969 + 31.0*reloc, lut= 337410 + 22.4*reloc (ns) no-relocs: buffers=1024: old= 714549 + 46.5*reloc, lut= 704009 + 37.8*reloc (ns) no-relocs: buffers=2048: old=4040396 + 49.3*reloc, lut=4017948 + 37.6*reloc (ns) real 8m19.540s user 0m18.975s sys 7m59.965s
Created attachment 103431 [details] dmesg on -next-queued branch (In reply to comment #2) > (In reply to comment #1) > > How does the pass compare? (i.e. please attach the good output and dmesg) > > Here is the output on latest -next-queued which passed. > > [root@x-bsw01 tests]# time ./gem_exec_lut_handle > IGT-Version: 1.7-g4d4f4b2 (x86_64) (Linux: > 3.16.0-rc4_drm-intel-next-queued_2b8ee5_20140725+ x86_64) > relocation: buffers= 1: old= 20213 + 160.6*reloc, lut= 21170 + > 150.9*reloc (ns) > relocation: buffers= 2: old= 23548 + 160.3*reloc, lut= 26650 + > 149.6*reloc (ns) > relocation: buffers= 4: old= 23450 + 160.0*reloc, lut= 24112 + > 150.5*reloc (ns) > relocation: buffers= 8: old= 28316 + 159.0*reloc, lut= 30131 + > 149.1*reloc (ns) > relocation: buffers= 16: old= 29774 + 160.0*reloc, lut= 31269 + > 150.0*reloc (ns) > relocation: buffers= 32: old= 35678 + 161.4*reloc, lut= 35133 + > 151.7*reloc (ns) > relocation: buffers= 64: old= 60752 + 162.4*reloc, lut= 62951 + > 151.1*reloc (ns) > relocation: buffers= 128: old= 95711 + 172.6*reloc, lut= 94049 + > 160.7*reloc (ns) > relocation: buffers= 256: old= 188186 + 178.5*reloc, lut= 184888 + > 165.0*reloc (ns) > relocation: buffers= 512: old= 362717 + 195.3*reloc, lut= 356609 + > 168.5*reloc (ns) > relocation: buffers=1024: old= 752565 + 258.9*reloc, lut= 737432 + > 204.8*reloc (ns) > > relocation: buffers=2048: old=4130880 + 375.5*reloc, lut=4117357 + > 279.1*reloc (ns) > skip-relocs: buffers= 1: old= 7075 + 87.0*reloc, lut= 6909 + > 80.3*reloc (ns) > skip-relocs: buffers= 2: old= 7721 + 87.0*reloc, lut= 7554 + > 80.3*reloc (ns) > skip-relocs: buffers= 4: old= 8532 + 87.0*reloc, lut= 8321 + > 80.3*reloc (ns) > skip-relocs: buffers= 8: old= 10332 + 87.2*reloc, lut= 10062 + > 80.5*reloc (ns) > skip-relocs: buffers= 16: old= 14190 + 87.4*reloc, lut= 13790 + > 80.7*reloc (ns) > skip-relocs: buffers= 32: old= 22911 + 87.4*reloc, lut= 22027 + > 80.8*reloc (ns) > skip-relocs: buffers= 64: old= 43343 + 89.4*reloc, lut= 41712 + > 82.4*reloc (ns) > skip-relocs: buffers= 128: old= 94919 + 92.8*reloc, lut= 93175 + > 83.8*reloc (ns) > skip-relocs: buffers= 256: old= 186820 + 99.9*reloc, lut= 183693 + > 88.6*reloc (ns) > skip-relocs: buffers= 512: old= 361709 + 115.8*reloc, lut= 355697 + > 92.4*reloc (ns) > skip-relocs: buffers=1024: old= 743773 + 174.1*reloc, lut= 735623 + > 126.8*reloc (ns) > skip-relocs: buffers=2048: old=4132826 + 283.8*reloc, lut=4114821 + > 194.7*reloc (ns) > no-relocs: buffers= 1: old= 6812 + 30.2*reloc, lut= 6704 + 23.5*reloc > (ns) > no-relocs: buffers= 2: old= 7248 + 30.3*reloc, lut= 7071 + 23.6*reloc > (ns) > no-relocs: buffers= 4: old= 8168 + 30.3*reloc, lut= 7906 + 23.7*reloc > (ns) > no-relocs: buffers= 8: old= 9970 + 30.3*reloc, lut= 9437 + 23.7*reloc > (ns) > no-relocs: buffers= 16: old= 13394 + 30.6*reloc, lut= 13017 + 23.9*reloc > (ns) > no-relocs: buffers= 32: old= 21729 + 30.6*reloc, lut= 21105 + 24.0*reloc > (ns) > no-relocs: buffers= 64: old= 40860 + 31.5*reloc, lut= 39069 + 25.0*reloc > (ns) > no-relocs: buffers= 128: old= 90798 + 29.8*reloc, lut= 88936 + 22.9*reloc > (ns) > no-relocs: buffers= 256: old= 178076 + 29.7*reloc, lut= 174850 + 22.4*reloc > (ns) > no-relocs: buffers= 512: old= 343969 + 31.0*reloc, lut= 337410 + 22.4*reloc > (ns) > no-relocs: buffers=1024: old= 714549 + 46.5*reloc, lut= 704009 + 37.8*reloc > (ns) > no-relocs: buffers=2048: old=4040396 + 49.3*reloc, lut=4017948 + 37.6*reloc > (ns) > > real 8m19.540s > user 0m18.975s > sys 7m59.965s
So looks like there is fixed overhead for each ioctl in the failure case (from 6us to 100us). Could this just be related to kernel debug options? But then it would not be so sporadic. If you can find a way to reproduce it consistently, the overhead should show up in a perf profile.
Jinxian, can you collect a perf profile when this happens? That should narrow things down pretty quickly.
(In reply to Jesse Barnes from comment #5) > Jinxian, can you collect a perf profile when this happens? That should > narrow things down pretty quickly. The failure unable to reproduce on latest -nightly(bfdd01aa1825aa0068f9236b21362b550f6d630f) root@x-bsw01:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# time ./gem_exec_lut_handle IGT-Version: 1.8-g819e68f (x86_64) (Linux: 3.18.0-rc7_drm-intel-nightly_bfdd01_20141208+ x86_64) real 7m57.688s user 0m19.124s sys 7m37.680s real 7m44.798s user 0m18.992s sys 7m24.948s real 7m58.919s user 0m19.353s sys 7m38.629s real 7m55.662s user 0m18.616s sys 7m36.181s real 7m49.988s user 0m19.228s sys 7m29.939s
Closing old verified.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.