A change in drm-intel-nightly causes a ~10% performance drop in benchmarks:
- 10-12% in GpuTest v0.7 "pixmark" (piano, volplosion, Julia32) tests
- 9-11% in GfxBench 4.x ALU, ALU2 and tessellation tests, both onscreen & offscreen
- 9-12% in SynMark batch, geometry, pixel, vertex and compute shader tests
- 11% in GLB 2.7 Fill tests, 8% in T-Rex & Egypt onscreen, 2% in Egypt & T-Rex offscreen
- 5-6% in Unigine Heaven & Valley

GpuTest and GLB 2.7 tests are run windowed; the rest are run fullscreen at the monitor's native resolution.

Basically everything except CPU-bound 3D tests dropped, regardless of whether it's:
- mostly GPU ALU or memory bandwidth limited
- fullscreen or windowed
- onscreen or offscreen

The drop happens only on SKL GT3e; there's no drop e.g. on SKL GT2 or HSW GT3e.

The drop happened between the following 13th and 15th of July drm-intel-nightly commits:
- Good: 5cd2699dfcf12a399553c3186b718667523a19fc
- Bad: 30eabcaa6dcea5ca21f7e6c00da7ab4b1910396c

It's fully reproducible just by changing the kernel, but it cannot be automatically bisected by Jenkins because a large number of the commits in between are in a non-buildable state.
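For reference, the percentages above are relative drops between a score on the good kernel and a score on the bad kernel. A minimal sketch of that arithmetic follows. It assumes higher scores are better, and the function names and the 2% noise threshold are hypothetical choices for illustration, not taken from the actual test setup.

```python
def relative_drop(good_score, bad_score):
    """Fractional drop from the good-kernel score to the bad-kernel score.

    Assumes higher scores are better (e.g. FPS).
    """
    return (good_score - bad_score) / good_score


def regressed(good_score, bad_score, noise=0.02):
    """Flag a regression when the drop exceeds the assumed run-to-run noise.

    The 2% default is a hypothetical threshold, not a measured value.
    """
    return relative_drop(good_score, bad_score) > noise
```

With made-up scores of 100.0 on the good kernel and 90.0 on the bad one, `relative_drop` gives 0.10, i.e. a 10% drop, which `regressed` would flag.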
Good: 5cd2699dfcf12a399553c3186b718667523a19fc
2016-07-13_14-43-55 drm-intel-nightly: 2016y-07m-13d-14h-43m-27s UTC integration manifest

Doesn't compile: 2d854c67e3af36b190e8499a3bfad7cdccde0f67
2016-07-14_14-26-01 drm-intel-nightly: 2016y-07m-14d-14h-25m-35s UTC integration manifest

Bad: 30eabcaa6dcea5ca21f7e6c00da7ab4b1910396c
2016-07-15_14-53-45 drm-intel-nightly: 2016y-07m-15d-14h-53m-25s UTC integration manifest
(In reply to Tomi Sarvela from comment #1)
> Good: 5cd2699dfcf12a399553c3186b718667523a19fc
> 2016-07-13_14-43-55 drm-intel-nightly: 2016y-07m-13d-14h-43m-27s UTC
> integration manifest
>
> Doesn't compile: 2d854c67e3af36b190e8499a3bfad7cdccde0f67
> 2016-07-14_14-26-01 drm-intel-nightly: 2016y-07m-14d-14h-25m-35s UTC
> integration manifest

Seriously? What's the compilation error?

> Bad: 30eabcaa6dcea5ca21f7e6c00da7ab4b1910396c
> 2016-07-15_14-53-45 drm-intel-nightly: 2016y-07m-15d-14h-53m-25s UTC
> integration manifest

I don't have that commit. What is the shortlog between good/bad?
As requested: $ git shortlog 5cd2699dfcf12a399553c3186b718667523a19fc..30eabcaa6dcea5ca21f7e6c00da7ab4b1910396c Aaron Campbell (1): iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas Al Viro (2): Use the right predicate in ->atomic_open() instances nfs_atomic_open(): prevent parallel nfs_lookup() on a negative hashed Alan Stern (1): SCSI: fix new bug in scsi_dev_info_list string matching Alex Deucher (49): drm/amdgpu: load different smc firmware on some CI variants drm/radeon: load different smc firmware on some SI variants drm/radeon: load different smc firmware on some CI variants drm/amdgpu/gfx7: expand cp jt size to handle GDS as well drm/radeon/gfx7: expand cp jt size to handle GDS as well drm/amdgpu/gfx8: add state setup for CZ/ST GFX power gating drm/amdgpu/gfx8: rename some pg functions drm/amdgpu: add new GFX powergating types drm/amdgpu/gfx8: add powergating support for CZ/ST drm/amdgpu/gfx8: clean up polaris11 PG enable drm/amdgpu: disable power control on hybrid laptops drm/amdgpu: clean up atpx power control handling drm/amdgpu: add a delay after ATPX dGPU power off drm/amdgpu/atpx: add a query for ATPX dGPU power control drm/amdgpu: use PCI_D3hot for PX systems without dGPU power control drm/amdgpu/atpx: drop forcing of dGPU power control drm/radeon: disable power control on hybrid laptops drm/radeon: clean up atpx power control handling drm/radeon: add a delay after ATPX dGPU power off drm/radeon/atpx: add a query for ATPX dGPU power control drm/radeon: use PCI_D3hot for PX systems without dGPU power control drm/radeon/atpx: drop forcing of dGPU power control drm/amdgpu/atpx: track whether if this is a hybrid graphics platform drm/amdgpu/atpx: hybrid platforms use d3cold drm/amdgpu: drop explicit pci D3/D0 setting for ATPX power control drm/radeon/atpx: track whether if this is a hybrid graphics platform drm/radeon/atpx: hybrid platforms use d3cold drm/radeon: drop explicit pci D3/D0 setting for ATPX power control drm/amdgpu: work around lack 
of upstream ACPI support for D3cold drm/radeon: work around lack of upstream ACPI support for D3cold drm/amdgpu: properly clean up runtime pm drm/amdgpu/gfx8: fix CP jump table size drm/amdgpu/gfx7: fix CP jump table size drm/radeon/cik: fix CP jump table size drm/amdgpu: disable compute pipeline sync workaround when using fixed fw drm/amdgpu/gmc: make some functions static drm/amdgpu: drop wait_for_mc_idle asic callback drm/amdgpu: move get_gpu_clock_counter into the gfx struct drm/amdgpu: move select_se_sh into the gfx struct drm/amdgpu/gfx7: switch to using the existing rlc callbacks drm/amdgpu/gfx7: make gfx_v7_0_rlc_stop static drm/amdgpu/dce11: update async flip update time drm/amdgpu/powerplay/cz: add missing call to powergate VCE drm/amdgpu: add IP helpers for wait_for_idle and is_idle drm/amdgpu: add missing breaks drm/amdgpu: skip invalid ip blocks in ip helpers drm/amdgpu/gmc8: remove duplicate wait_for_idle functions drm/amdgpu/gmc7: remove duplicate wait_for_idle functions drm/amdgpu: remove more of the ring backup code Alex Xie (3): drm/amdgpu: Change some variable names to make code easier understood drm/amdgpu: Add comment to describe the purpose of one difficult if statement drm/amdgpu: Initialize the variables in a straight-forward way Alexandre Courbot (21): drm/nouveau/tegra: fetch gpu_speedo_id drm/nouveau/volt/gk20a: make unused public functions static drm/nouveau/volt/gk20a: constify and name v_scale drm/nouveau/volt/gk20a: rename constructor drm/nouveau/volt/gm20b: add support for vmin parameter drm/nouveau/clk/gk20a: properly protect macro argument drm/nouveau/clk/gk20a: setup slide once during init drm/nouveau/clk/gk20a: reorganize MNP calculation a bit drm/nouveau/clk/gk20a: use nvkm_ functions in slide() drm/nouveau/clk/gk20a: add and use MNP programming functions drm/nouveau/clk/gk20a: parameterize PLL settings drm/nouveau/clk/gk20a: factorize n_lo computation code drm/nouveau/clk/gk20a: improve MNP programming drm/nouveau/clk/gk20a: 
rename constructor drm/nouveau/clk/gm20b: add glitchless and DFS support drm/nouveau/secboot: fix kerneldoc for secure boot structures drm/nouveau/gr/gf100: handle secure boot errors drm/nouveau/secboot/gm200: make firmware loading re-callable drm/nouveau/secboot: lazy-load firmware and be more resilient drm/nouveau/ttm: remove special handling of coherent objects drm/nouveau/bus: remove cpu_coherent flag Alexandre Demers (2): drm/amd/powerplay: fix trivial typo and tidy comment drm/amd/powerplay: fix typos in comment in polaris' hwmgr Alexey Dobriyan (1): posix_cpu_timer: Exit early when process has been reaped Arindam Nath (2): drm/amd/amdgpu: make sure VCE is disabled by default drm/amd/powerplay: make sure VCE is disabled by default Arnd Bergmann (1): amdgpu: use NULL instead of 0 for pointer Aviv Heller (1): bonding: fix enslavement slave link notifications Axel Lin (1): regulator: qcom_smd: Remove list_voltage callback for rpm_smps_ldo_ops_fixed Ben Skeggs (71): drm/nouveau/top: take nvkm_device as argument to public functions drm/nouveau/top: add function to lookup interrupt mask for a given device drm/nouveau/mc: allow construction of subclassed device drm/nouveau/mc: take nvkm_device as argument to public functions drm/nouveau/mc: expose device enable/disable separately, as well as reset drm/nouveau/mc: s/intr_mask/intr_stat/ drm/nouveau/mc: support for temporarily masking interrupts from a specific device drm/nouveau/mc/gt215: support for masking interrupts drm/nouveau/mc/gf100-: support for masking interrupts drm/nouveau/mc/gk104-: add pmu reset mask drm/nouveau/secboot: use nvkm_mc_intr_mask/unmask() drm/nouveau/secboot: use nvkm_mc_enable/disable() drm/nouveau/ltc/gm107-: decode interrupt status to human-readable strings drm/nouveau/disp/nv50-: fix lookup of udisp table under certain circumstances drm/nouveau/fifo/gk104-: translate engidx into human-readable name in debug output drm/nouveau/bios: guard against out-of-bounds accesses to image 
drm/nouveau/bios: pointers beyond end of first image need special handling drm/nouveau/disp/g94: implement workaround for dvi issue on fx380 drm/nouveau: prevent oops if no mmu subdev present drm/nouveau/fb/gf100-: allow selection of an alternate big page size drm/nouveau/core: increase maximum ce instances to 6 drm/nouveau/core: increase maximum nvenc instances to 3 drm/nouveau/core: recognise GP100 chipset drm/nouveau/top/gp100: initial support drm/nouveau/mc/gp100: initial support drm/nouveau/pci/gp100: initial support drm/nouveau/tmr/gp100: initial support drm/nouveau/bios/gp100: initial support drm/nouveau/bios/dp: initial support for 4.2 drm/nouveau/bios/pll: initial support for BIT 'C' version 2 drm/nouveau/bios/rammap: 32-bit bios pointers drm/nouveau/devinit/gp100: initial support drm/nouveau/imem/gp100: initial implementation drm/nouveau/fb/gp100: initial support drm/nouveau/mmu/gp100: initial support drm/nouveau/bar/gp100: initial support drm/nouveau/bus/gp100: initial support drm/nouveau/fuse/gp100: initial support drm/nouveau/gpio/gp100: initial support drm/nouveau/i2c/gm204: initial support drm/nouveau/ibus/gp100: initial support drm/nouveau/ltc/gp100: initial support drm/nouveau/secboot/gm200: initial support drm/nouveau/dma/gp100: initial implementation drm/nouveau/disp/gp100: initial support drm/nouveau/fifo/gp100: initial support drm/nouveau/ce/gp100: initial support drm/nouveau/gr/gp100: initial support drm/nouveau/sw/gp100: initial support drm/nouveau/core: recognise GP104 chipset drm/nouveau/top/gp104: initial support drm/nouveau/mc/gp104: initial support drm/nouveau/pci/gp104: initial support drm/nouveau/tmr/gp104: initial support drm/nouveau/bios/gp104: initial support drm/nouveau/devinit/gp104: initial support drm/nouveau/imem/gp104: initial support drm/nouveau/fb/gp104: initial support drm/nouveau/mmu/gp104: initial support drm/nouveau/bar/gp104: initial support drm/nouveau/bus/gp104: initial support drm/nouveau/fuse/gp104: initial support 
drm/nouveau/gpio/gp104: initial support drm/nouveau/i2c/gp104: initial support drm/nouveau/ibus/gp104: initial support drm/nouveau/ltc/gp104: initial support drm/nouveau/dma/gp104: initial support drm/nouveau/disp/gp104: initial support drm/nouveau/fifo/gp104: initial support drm/nouveau/ce/gp104: initial support drm/nouveau: check for supported chipset before booting fbdev off the hw Bhaktipriya Shridhar (1): drm/amdkfd: Remove create_workqueue() Bjørn Mork (1): cdc_ncm: workaround for EM7455 "silent" data interface Bob Liu (1): xen-blkfront: save uncompleted reqs in blkfront_resume() Borislav Petkov (1): x86/amd_nb: Fix boot crash on non-AMD systems Brian King (1): ipr: Clear interrupt on croc/crocodile when running with LSI Bruno Prémont (1): qla2xxx: Fix NULL pointer deref in QLA interrupt Chris J Arges (1): ecryptfs: fix spelling mistakes Chris Wilson (13): drm: Don't overwrite user ioctl arg unless requested drm/i915: Update ifdeffery for mutex->owner drm/i915/breadcrumbs: Queue hangcheck before sleeping drm/i915: Flush GT idle status upon reset drm/i915: Preserve current RPS frequency across init drm/i915: Perform static RPS frequency setup before userspace drm/i915: Move overclocking detection to alongside RPS frequency detection drm/i915: Define a separate variable and control for RPS waitboost frequency drm/i915: Remove superfluous powersave work flushing drm/i915: Defer enabling rc6 til after we submit the first batch/context drm/i915: Hide gen6_update_ring_freq() drm/i915/fbdev: Drain the suspend worker on retiring drm/i915/fbdev: Check for the framebuffer before use Christian König (44): drm/amdgpu: fix coding style in the scheduler v2 drm/amdgpu: remove begin_job/finish_job drm/amdgpu: remove duplicated timeout callback drm/amdgpu: fix coding style in amdgpu_job_free drm/amdgpu: remove use_shed hack in job cleanup drm/amdgpu: properly abstract scheduler timeout handling drm/amdgpu: move locking into the functions who need it drm/amdgpu: fix and 
cleanup job destruction drm/amdgpu: document amdgpu_sync_get_fence drm/amdgpu: generalize the scheduler fence drm/amdgpu: remove amdgpu_sync_wait drm/amdgpu: add optional ring to amdgpu_sync_is_idle drm/amdgpu: prefer VMIDs idle on the current ring drm/amdgpu: reuse VMIDs assigned to a VM only if there is also a free one drm/amdgpu: use a fence array for VMID management drm/amdgpu: remove now unnecessary checks drm/amdgpu: stop trying to schedule() with a spin held drm/ttm: cleanup ttm_tt_(unbind|destroy) drm/ttm: remove NULL checks when calling ttm_tt_destroy drm/ttm: remove dummy bo_move implementations drm/ttm: add wait for idle in all drivers bo_move functions drm/ttm: wait for BO idle in ttm_bo_move_memcpy drm/ttm: drop wait for idle in ttm_bo_move_buffer drm/ttm: drop waiting for idle in ttm_bo_evict. drm/ttm: wait for BO idle after the move in ttm_bo_swapout drm/amdgpu: sync to buffer moves before VM updates drm/amdgpu: remove pre move wait drm/ttm: remove no_gpu_wait param from ttm_bo_move_accel_cleanup drm/ttm: remove TTM_BO_PRIV_FLAG_MOVING drm/ttm: simplify ttm_bo_wait drm/ttm: add the infrastructure for pipelined evictions drm/amdgpu: save the PD addr before scheduling the job drm/amdgpu: pipeline evictions as well drm/amdgpu: add eviction counter drm/amdgpu: validate VM PTs only on eviction drm/amdgpu: implement HDP functions for UVD v2 drm/amdgpu: don't update page tables for VM emulation drm/ttm: wait for eviction in ttm_bo_force_list_clean drm/ttm: fix stupid parameter inversion in the pipeline code drm/amdgpu: stop disabling irqs when it isn't neccessary drm/amdgpu: fix user fence handling once more drm/amdgpu: shorten amdgpu_job_free_resources drm/amdgpu: earlier free SA resources drm/amdgpu: remove fence parameter from amd_sched_job_init Christophe Jaillet (1): fsl/fman: fix error handling Chunming Zhou (22): drm/amdgpu: add gpu reset to timeout handler drm/amdgpu: add return value for pci config reset drm/amdgpu: enable BUS master after pci 
reset drm/amdgpu: block scheduler when gpu reset drm/amdgpu: evict vram when gpu reset drm/amdgpu: add amdgpu_irq_gpu_reset_resume_helper drm/amdgpu: must update page table after gpu reset drm/amdgpu: save/restore bios scratch when gpu reset drm/amdgpu: must update page table after gpu reset drm/amdgpu: stop/resume fb access when gpu reset V3 drm/amdgpu: put old hw fence of job if gpu reset drm/amdgpu: remove evict vram drm/amd: add parent for sched fence drm/amd: add amd_sched_hw_job_reset drm/amdgpu: block ttm first before parking scheduler drm/amdgpu: force completion for gpu reset drm/amdgpu: add amd_sched_job_recovery drm/amdgpu: add a bool to specify if needing vm flush V2 drm/amdgpu: abstract amdgpu_vm_is_gpu_reset drm/amdgpu: recovery hw jobs when gpu reset V3 drm/amdgpu: ib test first after gpu reset drm/amdgpu: clean up ring_backup code, no need more Colin Ian King (2): drm/vc4: clean up error exit path on failed dpi_connector allocation drm/vc4: remove redundant ret status check Colin Pitrat (1): gpio: sch: Fix Oops on module load on Asus Eee PC 1201 Dan Carpenter (1): platform/chrome: cros_ec_dev - double fetch bug in ioctl Daniel Borkmann (1): macsec: set actual real device for xmit when !protect_frames Daniel Jurgens (5): net/mlx5: Fix incorrect page count when in internal error net/mlx5: Fix wait_vital for VFs and remove fixed sleep net/mlx5e: Timeout if SQ doesn't flush during close net/mlx5e: Implement ndo_tx_timeout callback net/mlx5e: Handle RQ flush in error cases Daniel Vetter (9): Revert "drm: Resurrect atomic rmfb code" Merge remote-tracking branch 'origin/drm-intel-next-fixes' into drm-intel-nightly Merge remote-tracking branch 'origin/drm-intel-next-queued' into drm-intel-nightly Merge remote-tracking branch 'drm-upstream/drm-next' into drm-intel-nightly Merge remote-tracking branch 'sound-upstream/for-next' into drm-intel-nightly Merge remote-tracking branch 'sound-upstream/for-linus' into drm-intel-nightly Merge remote-tracking branch 
'origin/topic/drm-misc' into drm-intel-nightly Merge remote-tracking branch 'origin/topic/core-for-CI' into drm-intel-nightly drm-intel-nightly: 2016y-07m-15d-14h-53m-25s UTC integration manifest Dave Airlie (12): Merge tag 'drm-amdkfd-next-2016-07-03' of git://people.freedesktop.org/~gabbayo/linux into drm-next Merge branch 'drm-etnaviv-next' of git://git.pengutronix.de/git/lst/linux into drm-next Merge tag 'drm-hisilicon-next-2016-07-04' of github.com:xin3liang/linux into drm-next Merge branch 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-next Merge branch 'linux-4.8' of git://github.com/skeggsb/linux into drm-next Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes Merge tag 'drm-intel-fixes-2016-07-14' of git://anongit.freedesktop.org/drm-intel into drm-fixes Merge tag 'topic/drm-misc-2016-07-14' of git://anongit.freedesktop.org/drm-intel into drm-next Merge tag 'drm-intel-next-2016-07-11' of git://anongit.freedesktop.org/drm-intel into drm-next Merge branch 'drm-vmwgfx-fixes' of git://people.freedesktop.org/~syeh/repos_linux into drm-fixes Merge tag 'drm-vc4-next-2016-07-12' of https://github.com/anholt/linux into drm-next Merge branch 'exynos-drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next Dave Gordon (1): drm/i915: unify first-stage engine struct setup Dave Hansen (1): x86/cpu: Fix duplicated X86_BUG(9) macro David Daney (1): MIPS: Fix page table corruption on THP permission changes. David Mao (2): drm/amd/amdgpu : Refine tracepoints to track more information drm/amd/amdgpu : adding new tracepoints to track memory information. David S. Miller (4): Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Merge branch 'mlx5-fixes' packet: Use symmetric hash for PACKET_FANOUT_HASH. 
Revert "fsl/fman: fix error handling" Edmondo Tommasina (1): drm/radeon: allow PACKET3_PFP_SYNC_ME on evergreen Eric Anholt (2): Merge tag 'drm-vc4-fixes-2016-06-06' into drm-vc4-next drm/vc4: Bind the HVS before we bind the individual CRTCs. Eric Dumazet (1): bonding: prevent out of bound accesses Eric Huang (24): drm/amdgpu: add powerplay sclk OD support through sysfs (v2) drm/amd/powerplay: add sclk OD support on Fiji drm/amd/powerplay: add sclk OD support on Tonga drm/amd/powerplay: add sclk OD support on Polaris10 drm/amdgpu: add the new common pm code to select the clock levels drm/amdgpu: add the new common pm code to support sclk OD drm/amdgpu: add the CI code to enable clock level selection drm/amdgpu: add the CI code to enable sclk OD(OverDrive) drm/amdgpu: add the common code to support mclk OD drm/amdgpu: add mclk OD(overdrive) support for CI drm/amd/powerplay: add mclk OD(overdrive) support for Tonga drm/amd/powerplay: add mclk OD(overdrive) support for Fiji drm/amd/powerplay: add mclk OD(overdrive) support for Polaris10 drm/amd/powerplay: set UVD clocks bypass mode for Polaris10 drm/amd/powerplay: keep soft_pp_table pointer value for re-uploading drm/amd/powerplay: add event task of disable dynamic state management drm/amd/powerplay: add function disable_dpm_tasks for Fiji drm/amd/powerplay: add disable dpm tasks for Tonga drm/amd/powerplay: add disable dpm tasks for Polaris10 drm/amd/powerplay: change backend allocation to backend init drm/amd/powerplay: add uploading pptable and resetting powerplay support drm/amd/powerplay: remove useless pp_table codes for Tonga/Fiji/Polaris10 drm/amd/powerplay: remove useless soft pptable in Asic related backend drm/amdgpu: some improvement in parsing inputs Florian Fainelli (1): net: bcmsysport: Device stats are unsigned long Frank Binns (1): drm/amd/amdgpu: Set DRIVER_MODESET feature flag at build time Ganapatrao Kulkarni (1): arm64: Enable workaround for Cavium erratum 27456 on thunderx-81xx Ganesh Goudar (1): 
cxgb4: update latest firmware version supported Haishuang Yan (1): geneve: fix max_mtu setting Hans Verkuil (1): [media] v4l2-ioctl: fix stupid mistake in cropcap condition Huang Rui (4): drm/amdgpu: add powercontainment module parameter drm/amdgpu: factor out the AMDGPU_INFO_FW_VERSION case branch into amdgpu_firmware_info drm/amdgpu: introduce a firmware debugfs to dump all current firmware versions drm/amdgpu: change pcie_gen_cap magic code to macro Hugh Dickins (1): tmpfs: fix regression hang in fallocate undo James Bottomley (1): Merge branch 'jejb-fixes' into fixes James Morse (1): arm64: kernel: Save and restore UAO and addr_limit on exception entry Jan Beulich (4): xenbus: don't BUG() on user mode induced condition xenbus: don't bail early from xenbus_dev_request_and_reply() xenbus: simplify xenbus_dev_request_and_reply() xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7 Jarod Wilson (1): e1000e: keep Rx/Tx HW_VLAN_CTAG in sync Jeff Layton (1): posix_acl: de-union a_refcount and a_rcu Jeff Mahoney (2): Revert "ecryptfs: forbid opening files without mmap handler" ecryptfs: don't allow mmap when the lower fs doesn't support it Jens Axboe (1): Merge branch 'stable/for-jens-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus Joerg Roedel (1): iommu/amd: Fix unity mapping initialization race Johan Hovold (1): Revert "gpiolib: Split GPIO flags parsing and GPIO configuration" Jon Mason (1): MAINTAINERS: Update the Calgary IOMMU entry Josh Poimboeuf (2): perf/x86: Fix 32-bit perf user callgraph collection objtool: Fix STACK_FRAME_NON_STANDARD macro checking for function symbols Julia Lawall (2): ecryptfs: drop null test before destroy functions drm/nouveau/gr/gk20a: delete unneeded second newline Junwei Zhang (1): drm/amdgpu/dce8: fix flash with white screen on monitor Karol Herbst (2): drm/nouveau/volt: save the voltage range we are able to set drm/nouveau/hwmon: add in_min and in_max Ken Wang (3): drm/amdgpu: remove gfx8 
registers that vary between asics drm/amdgpu: Add a missing register to Polaris golden setting drm/amdgpu: fix power distribution issue for Polaris10 XT Laurent Pinchart (1): [media] adv7604: Don't ignore pad number in subdev DV timings pad operations Linus Torvalds (28): Merge tag 'chrome-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform Merge tag 'sound-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge tag 'configfs-for-4.7' of git://git.infradead.org/users/hch/configfs Merge branch 'for-linus' of git://git.kernel.dk/linux-block Merge tag 'pm-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Merge tag 'acpi-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Merge tag 'drm-fixes-for-v4.7-rc7' of git://people.freedesktop.org/~airlied/linux Merge tag 'gpio-v4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Merge tag 'for-linus-4.7b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Merge tag 'iommu-fixes-v4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Merge tag 'ecryptfs-4.7-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Merge branch 'upstream' of 
git://git.linux-mips.org/pub/scm/ralf/upstream-linus Linux 4.7-rc7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Merge tag 'qcom-smd-list-voltage' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Merge tag 'acpi-urgent-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Merge tag 'media/v4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Merge branches 'perf-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Linus Walleij (1): Revert "gpio: gpiolib-of: Allow compile testing" Lionel Landwerlin (1): drm/i915: add missing condition for committing planes on crtc Lucas Stach (2): drm/etnaviv: improve error reporting in GPU init path drm/etnaviv: remove generic GPU init failure reporting Lukas Wunner (3): x86/quirks: Apply nvidia_bugs quirk only on root bus x86/quirks: Reintroduce scanning of secondary buses x86/quirks: Add early quirk to reset Apple AirPort card Lv Zheng (3): ACPICA: Namespace: Fix namespace/interpreter lock ordering ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal ACPI / EC: Fix code ordering issue in ec_remove_handlers() Lyude (6): drm/radeon: Poll for both connect/disconnect on analog connectors drm/amdgpu: Poll for both connect/disconnect on analog connectors drm/i915/vlv: Make intel_crt_reset() per-encoder drm/i915/vlv: Reset the ADPA in vlv_display_power_well_init() drm/i915/vlv: Disable HPD in valleyview_crt_detect_hotplug() drm/i915: Enable polling when we don't have hpd Marek Szyprowski (5): drm/exynos: iommu: move dma_params configuration code to separate functions drm/exynos: iommu: add a check if all sub-devices have iommu controller drm/exynos: iommu: remove unused entries from 
exynos_drm_private strcuture drm/exynos: iommu: move ARM specific code to exynos_drm_iommu.h drm/exynos: iommu: add support for ARM64 specific code for IOMMU glue Marek Vasut (1): configfs: Remove ppos increment in configfs_write_bin_file Mario Kleiner (1): drm/vc4: Implement precise vblank timestamping. Mark Rutland (1): perf/core: Fix pmu::filter_match for SW-led groups Martin KaFai Lau (1): ipv6: Fix mem leak in rt6i_pcpu Masanari Iida (1): x86/Documentation: Fix various typos in Documentation/x86/ files Matt Corallo (1): net: stmmac: Fix null-function call in ISR on stmmac1000 Matthew Auld (1): drm/i915: remove superfluous i915_gem_object_free_mmap_offset call Matthew Finlay (1): net/mlx5e: Copy all L2 headers into inline segment Mauro Carvalho Chehab (1): Merge tag 'v4.7-rc2' into v4l_for_linus Michel Dänzer (1): drm/amdgpu: Unpin BO if we can't get fences in amdgpu_crtc_page_flip Mohamad Haj Yahia (4): net/mlx5: Fix teardown errors that happen in pci error handler net/mlx5: Avoid calling sleeping function by the health poll thread net/mlx5: Fix potential deadlock in command mode change net/mlx5: Add timeout handle to commands with callback Monk Liu (2): drm/amdgpu: clear RB at ring init drm/amdgpu: fix ring debugfs bug Nicolai Hähnle (5): drm/amdgpu: add amdgpu.cg_mask and amdgpu.pg_mask parameters drm/amdgpu: remove cgs_acpi_method_argument member method_length drm/amdgpu: add disable_cu parameter drm/amdgpu/gfx7: set USER_SHADER_ARRAY_CONFIG based on disable_cu parameter drm/amdgpu/gfx8: set USER_SHADER_ARRAY_CONFIG based on disable_cu parameter Oded Gabbay (1): drm/amdkfd: destroy mutex if process creation fails Omar Sandoval (1): block: fix use-after-free in sys_ioprio_get() Or Gerlitz (1): net/mlx5: Avoid setting unused var when modifying vport node GUID Paul Burton (2): irqchip/mips-gic: Map to VPs using HW VPNum irqchip/mips-gic: Match IPI IRQ domain by bus token only Peter Chen (5): gpu: drm: vc4_hdmi: add missing of_node_put after calling 
of_parse_phandle gpu: drm: omapdrm: connector-dvi: add missing of_node_put after calling of_parse_phandle gpu: drm: omapdrm: dss-of: add missing of_node_put after calling of_parse_phandle gpu: drm: exynos_hdmi: add missing of_node_put after calling of_parse_phandle gpu: drm: arcpgu_drv: add missing of_node_put after calling of_parse_phandle Peter Zijlstra (2): sched/fair: Fix effective_load() to consistently use smoothed load sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion Rafael J. Wysocki (7): x86/power/64: Fix kernel text mapping corruption during image restoration Merge branches 'pm-cpuidle-fixes' and 'pm-sleep-fixes' Merge branches 'acpica-fixes', 'acpi-pci-fixes' and 'acpi-debug-fixes' Revert "ACPICA: Namespace: Fix namespace/interpreter lock ordering" Revert "ACPICA: Namespace: Fix deadlock triggered by MLC support in dynamic table loading" Revert "ACPI 2.0 / AML: Improve module level execution by moving the If/Else/While execution to per-table basis" Merge branches 'acpica-fixes' and 'acpi-ec-fixes' Rana Shahout (2): net/mlx5e: Fix select queue callback net/mlx5e: Validate BW weight values of ETS Randy Dunlap (1): init/Kconfig: keep Expert users menu together Rex Zhu (8): drm/amd/powerplay: functions's return state was reversed drm/amd/powerplay: change condition judgment as function's return value changed. drm/amdgpu: get number of shade engine by cgs interface. drm/amd/powerplay: add mvdd dpm support. drm/amd/powerplay: add shared definitions for di/dt feature. drm/amd/powerplay: add definitions related to di/dt feature for fiji and polaris. drm/amdgpu: add read/write function for GC CAC programming drm/amd/powerplay: don't add invalid voltage. 
Richard Alpe (1):
      tipc: fix nl compat regression for link statistics

Rob Herring (1):
      drm: vc4: enable XBGR8888 and ABGR8888 pixel formats

Roy Spliet (2):
      drm/nouveau/clk/gf100-: Clean up PLL locking test
      drm/nouveau/clk/gf100: Read secondary bypass postdiv when required

Russell King (1):
      drm/etnaviv: enable GPU module level clock gating support

Russell King - ARM Linux (1):
      net: mvneta: fix open() error cleanup

Sergio Valverde (1):
      enc28j60: Fix race condition in enc28j60 driver

Shaker Daibes (1):
      net/mlx5e: Log link state changes

Shmulik Ladkani (1):
      ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output

Shreyas B. Prabhu (1):
      cpuidle: Fix last_residency division

Sinan Kaya (3):
      ACPI,PCI,IRQ: factor in PCI possible
      Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()"
      ACPI,PCI,IRQ: separate ISA penalty calculation

Sinclair Yeh (7):
      drm/vmwgfx: Add a check to handle host message failure
      drm/vmwgfx: Work around mode set failure in 2D VMs
      drm/vmwgfx: Add an option to change assumed FB bpp
      drm/ttm: Make ttm_bo_mem_compat available
      drm/vmwgfx: Check pin count before attempting to move a buffer
      drm/vmwgfx: Delay pinning fbdev framebuffer until after mode set
      drm/vmwgfx: Fix error paths when mapping framebuffer

Sony Chacko (1):
      qlcnic: add wmb() call in transmit data path.

Soohoon Lee (1):
      usbnet: Stop RX Q on MTU change

Stefan Hauser (1):
      net: phy: dp83867: Fix initialization of PHYCR register

Stephane Eranian (1):
      perf/x86/intel: Update event constraints when HT is off

Tahsin Erdogan (1):
      writeback: inode cgroup wb switch should not call ihold()

Thomas Gleixner (1):
      cpu/hotplug: Keep enough storage space if SMP=n to avoid array out of bounds scribble

Thomas Hellstrom (1):
      drm/vmwgfx: Fix corner case screen target management

Tobias Jakobi (4):
      drm/rockchip: make fbdev support really optional
      drm/rcar-du: make fbdev support really optional
      drm/atmel-hlcdc: make fbdev support really optional
      drm/nouveau: make fbdev support really optional

Tom St Denis (15):
      drm/amdgpu/gfx8: Enable GFX PG on CZ
      drm/amdgpu/gfx8: Add serdes wait for idle in CGCG en/disable
      drm/amd/amdgpu: Convert ring debugfs entries to binary
      drm/amd/amdgpu: ring debugfs is read in increments of 4 bytes
      drm/amdgpu/trace: Add tracepoints to MMIO read/writes
      drm/amdgpu/gfx8: Switch Stoney to share CZ's RLC functions
      drm/amdgpu/gfx8: Enable CG on Stoney
      drm/amdgpu/gfx8: Enable PG on Stoney
      drm/amdgpu/gfx8: Tidy up various PG helpers
      drm/amdgpu/gfx80: Add QUICK_PG bit to GFX header and use it.
      drm/amdgpu/uvd6: De-numberify startup
      drm/amd/gfx: add instance field to select_se_sh (v3)
      drm/amd/amdgpu: Add gca config debug entry (v4)
      drm/amd/amdgpu: Add bank selection for MMIO debugfs (v3)
      drm/amd/powerplay: Unify family defines

Tvrtko Ursulin (6):
      drm/i915: Prepare for engine init unification
      drm/i915: Unify engine init loop
      drm/i915: Make more use of the shared engine irq setup
      drm/i915: Simplify intel_init_ring_buffer prototype
      drm/i915: Move common engine setup into intel_engine_cs.c
      drm/i915: Pull out some more common engine init code

Ursula Braun (1):
      qeth: delete napi struct when removing a qeth device

Vegard Nossum (4):
      RDS: fix rds_tcp_init() error path
      net: fix decnet rtnexthop parsing
      apparmor: fix oops, validate buffer size in apparmor_setprocattr()
      perf/x86: Fix bogus kernel printk, again

Ville Syrjälä (4):
      x86/perf/intel/rapl: Fix module name collision with powercap intel-rapl
      drm/i915: Ignore panel type from OpRegion on SKL
      drm/i915: Unbreak interrupts on pre-gen6
      drm/i915: Ignore panel type from OpRegion on SKL

WANG Cong (1):
      net_sched: fix mirrored packets checksum

Wei Yongjun (1):
      drm/hisilicon: Fix return value check in ade_dts_parse()

Wei Yuan (1):
      eCryptfs: fix typos in comment

Xin Long (1):
      ixgbevf: ixgbevf_write/read_posted_mbx should use IXGBE_ERR_MBX to initialize ret_val

Xinliang Liu (1):
      drm/hisilicon: Fix ADE vblank on/off handling

Zoltan Kuscsik (1):
      drm/hisilicon: add select HISI_KIRIN_DW_DSI

hayeswang (2):
      r8152: clear LINK_OFF_WAKE_EN after autoresume
      r8152: fix runtime function for RTL8152

yanyang1 (1):
      drm/amdgpu: print smc fw info in CGS.
Made the coarse autobisect-script even better with git bisect skip. Bisected the problem to first bad commit:

BISECT_BEFORE 62e1baa128f98006261308182fe3006d66b1bf61
BISECT_AFTER b7137e0cf1e55b5b0cb88fbd85425a1bc0d24c3a

git://anongit.freedesktop.org/drm-intel
commit b7137e0cf1e55b5b0cb88fbd85425a1bc0d24c3a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Jul 13 09:10:37 2016 +0100

    drm/i915: Defer enabling rc6 til after we submit the first batch/context

Manually tested good and bad commit, checks out. Tested with i915.enable_rc6=0, bad is still bad. Kernels compiled without debugging options for performance reasons.
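For readers unfamiliar with skipping during bisection, here is a minimal sketch of the idea in Python. It is not the autobisect script mentioned above and not git's actual algorithm (git picks skip candidates differently); the function name `bisect_with_skip` and the `status` callback are hypothetical. It assumes the history is linear and that every commit after the first bad one is also bad.

```python
def bisect_with_skip(commits, status):
    """Find the good/bad boundary in a linear history.

    commits: list ordered oldest to newest; commits[0] is known good and
             commits[-1] is known bad.
    status:  hypothetical callback returning "good", "bad" or "skip"
             (for example "skip" when the commit does not build).
    Returns (last_good, first_bad, skipped_between).
    """
    lo, hi = 0, len(commits) - 1
    skipped = set()
    while True:
        # Untested commits strictly between the current good and bad ends.
        candidates = [i for i in range(lo + 1, hi) if i not in skipped]
        if not candidates:
            break
        mid = (lo + hi) // 2
        # Test the candidate closest to the midpoint.
        i = min(candidates, key=lambda c: abs(c - mid))
        result = status(commits[i])
        if result == "good":
            lo = i
        elif result == "bad":
            hi = i
        else:
            skipped.add(i)
    between = [commits[i] for i in sorted(skipped) if lo < i < hi]
    return commits[lo], commits[hi], between
```

If `skipped_between` is non-empty, the culprit cannot be narrowed down further than that range, which is why the good and bad commits were also verified manually.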
Is there something in i915 rc6 handling that differs between SKL GT2 and GT3e? PS. Forgot to mention earlier that testing was done on Ubuntu 16.04 + Unity, with kernel/X/Mesa built from Git. Gap is there both with CPU P-state powersave (Ubuntu default) and performance governors, and whether compiz is compositing doesn't affect the results.
Created attachment 125227 [details] dmesg b7137e0cf1e55b5b0cb88fbd85425a1bc0d24c3a drm.debug=0xe
(In reply to Eero Tamminen from comment #5)
> Is there something in i915 rc6 handling that differs between SKL GT2 and
> GT3e?

Actually there is. NEEDS_WaRsDisableCoarsePowerGating() is skl gt3/gt4. Not sure how that relates to the patch; comment 6 shows that we are still enabling RC6 pretty early, so it should not be a complete failure...

Tomi, could you also post the dmesg from b7137e0cf1e55b5b0cb88fbd85425a1bc0d24c3a^ (the last good commit)?
NEEDS_WaRsDisableCoarsePowerGating() also impacts the guc it seems - another variable to check (whether or not this bug is affected by enabling/disabling the guc).
Created attachment 125236 [details] [review]
Enable RC6 immediately

The complexity of that patch is no longer required, so let's try a simpler version.
Hmm, that was overkill (Mika will complain again about not waiting for a context before enabling RC6). A better couple of patches would be https://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=bug97017
Created attachment 125239 [details] [review]
Update ring freqs after runtime resume

This is the most likely candidate from https://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=bug97017
Created attachment 125252 [details]
dmesg 62e1baa128f98006261308182fe3006d66b1bf61 drm.debug=0xe

Dmesg from last good
(In reply to Tomi Sarvela from comment #12)
> Created attachment 125252 [details]
> dmesg 62e1baa128f98006261308182fe3006d66b1bf61 drm.debug=0xe
>
> Dmesg from last good

RC6 on occurs at the same time, so I think it's not the change in deferral mechanism.
Tomi, can you see if the attached https://bugs.freedesktop.org/attachment.cgi?id=125239 helps?
Tested the following combinations

Nightly: Nightly_666, regression noticed, commit 30eabca
Patch: https://bugs.freedesktop.org/attachment.cgi?id=125239
GuC options: i915.enable_guc_submission= i915.enable_guc_loading=

Nightly + GuC enabled = bad
Nightly + Guc disabled = good
Nightly + Patch + GuC enabled = bad
Nightly + Patch + GuC disabled = good
First bad + GuC disabled = good
First bad + Guc default (enabled) = bad
(In reply to Tomi Sarvela from comment #15)
> Tested the following combinations
> Nightly: Nightly_666, regression noticed, commit 30eabca
> Patch: https://bugs.freedesktop.org/attachment.cgi?id=125239
> GuC options: i915.enable_guc_submission= i915.enable_guc_loading=
>
> Nightly + GuC enabled = bad
> Nightly + Guc disabled = good
> Nightly + Patch + GuC enabled = bad
> Nightly + Patch + GuC disabled = good
> First bad + GuC disabled = good
> First bad + Guc default (enabled) = bad

Last good + Guc enabled = good
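The seven results reported so far can be checked against a simple hypothesis: the regression shows up only when the kernel contains the first bad commit and the GuC is enabled. The sketch below just encodes the rows from the two comments above; the tuple layout and the `predicted` helper are illustrative, not part of any test tooling.

```python
# (configuration, kernel contains first bad commit, GuC enabled, reported result)
RESULTS = [
    ("Nightly",           True,  True,  "bad"),
    ("Nightly",           True,  False, "good"),
    ("Nightly + Patch",   True,  True,  "bad"),
    ("Nightly + Patch",   True,  False, "good"),
    ("First bad",         True,  False, "good"),
    ("First bad",         True,  True,  "bad"),
    ("Last good",         False, True,  "good"),
]


def predicted(has_first_bad_commit, guc_enabled):
    """Hypothesis: the drop needs both the commit and the GuC."""
    return "bad" if has_first_bad_commit and guc_enabled else "good"


def mismatches(results):
    """Return the rows that the hypothesis fails to explain."""
    return [row for row in results if predicted(row[1], row[2]) != row[3]]


if __name__ == "__main__":
    print(mismatches(RESULTS))
```

An empty list means every reported row is consistent with the hypothesis. The "Last good + GuC disabled" combination was not reported, so it is not covered here.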
Highest+Blocker due to Regression w/o workaround
The drop in performance (5-10%) due to the presence of the GuC firmware is still seen when running the benchmarks. The numbers do improve when running without the GuC. This is on top of nightly (Oct 3rd) + Mesa 11.01 + GuC/HuC patch series, with both i915.enable_guc_loading=2 and i915.enable_guc_submission=2 set.

@Tomi, when you say "Last good + Guc enabled = good", what was the last good commit? I missed that during my investigation, as I thought just introducing the GuC was the cause.
Last good is BISECT_BEFORE 62e1baa128f98006261308182fe3006d66b1bf61
Note that the WaRsDisableCoarsePowerGating is also meant to be improved by GuC 9.x
The latest update on this front:

Next steps:

1. Continue experimenting w/ at least 4 more different builds to prove that the performance issue is somehow related to the interdependencies between the changes from the first bad commit (commit b7137e0cf1e55b5b0cb88fbd85425a1bc0d24c3a - "drm/i915: Defer enabling rc6 til after we submit the first batch/context") and having the system w/ GuC enabled for load/submission. The 4 builds include:
   a. Last good commit + no GuC
   b. Last good commit + GuC
   c. First bad commit + no GuC
   d. First bad commit + GuC

2. Try with the latest (approved from VPG) GuC w/ version 9.x as suggested by Chris W to see if there is an improvement to WaRsDisableCoarsePowerGating(), which is related to the RC6 changes affecting SKL. (Note, on SKL we are in fact using an older version of the GuC).
Link to the data sheet showing results of GuC 6.1 vs GuC 9.13 using drm-nightly (Oct 17):

https://docs.google.com/a/intel.com/spreadsheets/d/1Y6VBlZsZ6NRRHUeyJtST4XLiEjIOc0SzGV1cjGRM1T0/edit?usp=sharing

The gap is still there (GuC 9.x vs without GuC), but it is notably larger only on certain tests (~10%, but not across the board). See the data above for reference.
This issue will be investigated by the VPG team. We are able to bring the performance back to the level seen before the GuC by forcing the GPU to run at a lower frequency (i.e., 300MHz). So far the analysis points to an imbalance between the GPU and CPU frequencies when GuC submission is enabled, which causes the GPU to run at too high a frequency. On certain test loads the lower CPU frequency may be causing the drop in fps.
Jeff, can you update this bug with your team's plan to investigate and fix?
Assigning to Sagar from my team. He has been leading PnP efforts around GuC. Sagar - can you give a summary?
See the description at the top. If you have new GuC FW that Eero can retest with, let him know. He can likely retest faster than you can.
SLPC Turbo helps resolve these regressions. This was tested with v9 GuC firmware and patches at https://patchwork.freedesktop.org/series/17537/. Perf-meter output shows fluctuations in the frequency with Host RPS, whereas SLPC Turbo keeps running at high frequency. CPG is disabled for GT3 SKU, so CPG forcewake latency may not be stalling the submission rate as in APL. More debug/analysis in progress.
Performance drop is happening due to intermittent lowering of GT frequency by Host RPS. This lowering is caused by a burst of RP UP interrupts that makes the Host RPS adjustment go bad and overflow to a negative value. Kernel patch has been posted for trybot testing at https://patchwork.freedesktop.org/patch/133064/. This fix will apply to all platforms.
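A minimal model of this failure mode, assuming Host RPS doubles a signed 32-bit adjustment on each consecutive RP UP interrupt and clamps the resulting frequency to a [min, max] range. This is a simplified Python sketch, not the i915 code: the function names, the `reset_at_limit` flag and the 300/950 frequency values used in the usage note are illustrative assumptions. The flag models the idea of zeroing the adjustment when the frequency is already at the maximum.

```python
def to_int32(x):
    """Wrap an integer to signed 32-bit, like a C int that overflows."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def simulate_up_burst(n_interrupts, min_freq, max_freq, start_freq, reset_at_limit):
    """Return the frequency after each of n_interrupts consecutive UP events."""
    freq = start_freq
    adj = 0
    history = []
    for _ in range(n_interrupts):
        # Each UP event doubles a positive adjustment, otherwise restarts at 1.
        adj = to_int32(adj * 2) if adj > 0 else 1
        # Modelled fix (assumption): drop the adjustment once already at max.
        if reset_at_limit and freq >= max_freq:
            adj = 0
        # Apply the adjustment and clamp to the allowed range.
        freq = max(min_freq, min(max_freq, freq + adj))
        history.append(freq)
    return history
```

For example, `simulate_up_burst(40, 300, 950, 950, False)` stays at 950 for 31 events and then falls to 300 on the 32nd, when the doubled adjustment wraps negative; with `reset_at_limit=True` it stays at 950 throughout.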
Eero, as original reporter, can you try Sagar's patchset and then confirm the status? Thanks.
Performance cycle is completed with fix (with v9 GuC firmware) and all these regressions are fixed. Results for DRM-Tip 362e5eb + http://pixel.fi.intel.com/~tsa/non-slpc-ww3.5.mbox + v9 GuC firmware at benchsrv Custom/SKL_6260U_nuci5/2017-01-24T13:45:02Z Base results: Custom/SKL_6260U_nuci5/2017-01-09T14:17:38Z
(In reply to yann from comment #30)
> Eero, as original reporter, can you try Sagar's patchset and confirm then
> the status?

Fix verification can be done only after the fix actually is in drm-tip (and AFAIK it's not there yet)...

Why is the bug in NEEDINFO state?
Although the fix from the GuC perspective is not merged, another patch, which makes Host RPS handle the erroneous adjustment properly, is merged from https://patchwork.freedesktop.org/series/18252/. Specifically, the following patch, which is merged in drm-tip, helps resolve the issue:

7e79a68 drm/i915: Set adjustment to zero on Up/Down interrupts if freq is already max/min

With the currently available GuC firmware 6.1 for SKL I could verify this fix for workloads. Scores for 3 runs:

default:
glb_egypt_fixedtime = 129.0 <= 130.0 <= 131.0
gfxbench3_alu_offscreen = 287.9 <= 288.0 <= 288.1

with guc enabled:
glb_egypt_fixedtime = 130.0 <= 130.0 <= 130.0
gfxbench3_alu_offscreen = 287.9 <= 288.2 <= 289.5

with fix reverted:
glb_egypt_fixedtime = 119.0 <= 119.0 <= 120.0
gfxbench3_alu_offscreen = 254.5 <= 254.5 <= 254.5

If this bug is not gated by GuC submission enabling, we can close this. Hence I am marking this as resolved. Kindly correct if this needs to be changed.
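To put the scores in terms of the percentages used in the original report, the small Python sketch below computes the relative drop of the "fix reverted" run against the default run. It assumes the middle value of each triple is the median of the 3 runs; the dictionary layout and function names are illustrative.

```python
# Middle values of the reported score triples (assumed to be medians).
MEDIANS = {
    "glb_egypt_fixedtime": {
        "default": 130.0, "guc_enabled": 130.0, "fix_reverted": 119.0,
    },
    "gfxbench3_alu_offscreen": {
        "default": 288.0, "guc_enabled": 288.2, "fix_reverted": 254.5,
    },
}


def drop_percent(baseline, value):
    """Relative drop of value against baseline, in percent."""
    return (baseline - value) / baseline * 100.0


def summarize(medians):
    """Drop of the fix-reverted run vs. default, rounded to one decimal."""
    return {
        test: round(drop_percent(runs["default"], runs["fix_reverted"]), 1)
        for test, runs in medians.items()
    }


if __name__ == "__main__":
    for test, drop in summarize(MEDIANS).items():
        print(f"{test}: {drop}% lower with the fix reverted")
```

With these numbers the drop is about 8.5% for glb_egypt_fixedtime and about 11.6% for gfxbench3_alu_offscreen, in line with the roughly 10% regression originally reported.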