Created attachment 32151 [details] dmesg booting working linux 2.6.31 kernel

Following an upgrade from 2.6.31 to 2.6.32 I'm running into DRM issues at boot. Here's a diff of the dmesg between them (- is 31, + is 32) starting from [drm]:

 [drm] Initialized drm 1.1.0 20060810
 i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
 i915 0000:00:02.0: setting latency timer to 64
 mtrr: type mismatch for d0000000,10000000 old: write-back new: write-combining
 [drm] MTRR allocation failed. Graphics performance may suffer.
 i915 0000:00:02.0: irq 27 for MSI/MSI-X
-[drm] TMDS-8: set mode 720x480 d
-Console: switching to colour frame buffer device 90x30
-[drm] fb0: inteldrmfb frame buffer device
-[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
+[drm] set up 7M of stolen space
+nommu_map_sg: overflow 22e000000+4096 of device mask ffffffff
+[drm:drm_agp_bind_pages] *ERROR* Failed to bind AGP memory: -12
+[drm:i915_driver_load] *ERROR* failed to init modeset
+i915: probe of 0000:00:02.0 failed with error -28

I'm attaching the full dmesg output and configs for both kernels, as well as an lspci under 2.6.32.
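For readers following the log: the nommu_map_sg line can be read as a range check, where the page being mapped has to fit under the device's DMA mask. Below is a minimal Python sketch of that arithmetic, using the address, length and mask printed in the log. The function name fits_mask and the exact comparison are assumptions for illustration, not the kernel's code.

```python
def fits_mask(addr: int, size: int, dma_mask: int) -> bool:
    """Return True if the byte range [addr, addr + size) lies within dma_mask.

    Assumed comparison for illustration; the kernel's own check may differ
    in detail.
    """
    return addr + size - 1 <= dma_mask


ADDR = 0x22E000000          # bus address from the log
SIZE = 4096                 # one page, from the log
MASK_32 = (1 << 32) - 1     # 0xffffffff, the mask the log reports
MASK_36 = (1 << 36) - 1     # a 36-bit mask, as mentioned later in this report

print(fits_mask(ADDR, SIZE, MASK_32))  # False
print(fits_mask(ADDR, SIZE, MASK_36))  # True
```

The address is above 4 GiB, so it cannot pass with a 32-bit mask but would pass with a 36-bit one.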
Created attachment 32152 [details] dmesg booting broken linux 2.6.32 kernel
Created attachment 32153 [details] working linux 2.6.31 kernel configuration
Created attachment 32154 [details] broken linux 2.6.32 kernel configuration
Created attachment 32155 [details] lspci generated under linux 2.6.32
Considering that David's patch

commit ec402ba97a6479dd80488b4404a73275e894289f
Author: David Woodhouse <dwmw2@infradead.org>
Date:   Wed Nov 18 10:22:46 2009 +0000

    agp/intel-agp: Set dma_mask for capable chipsets before agp_add_bridge()

is already in 2.6.32, it seems that it somehow failed to set the DMA mask to 36 bits on a 32-bit system (although dmesg doesn't show a failure there...). Can you try enabling CONFIG_SWIOTLB, or enabling VT-d in the BIOS and forcing "intel_iommu=on"? I'm not quite sure whether there's a bug in the nommu case.
The BIOS on the ASUS P2E-VM HDMI board where I'm experiencing this issue appears to lack any option for enabling VT-d. I'll hopefully get time to build a custom kernel with CONFIG_SWIOTLB this weekend and provide some feedback on that front.
Should CONFIG_SWIOTLB be expected to work on 32-bit i386 arch? There's no make menuconfig option for it where arch/x86/Kconfig suggests it ought to appear, but the option search shows me:

Symbol: SWIOTLB [=n]
Selected by: GART_IOMMU [=n] && X86_64 [=n] && PCI [=y] || CALGARY_IOMMU [=n] && X86_64 [=n] && PCI [=y] && EXPERIMENTAL [=y] || AMD_IOMMU [=n] && X86_64 [=n] && PCI [=y] && ACPI [=y]

...and checking those options in the original config:

$ grep -e CONFIG_GART_IOMMU= -e CONFIG_X86_64= -e CONFIG_PCI= -e CONFIG_CALGARY_IOMMU= -e CONFIG_EXPERIMENTAL= -e CONFIG_AMD_IOMMU= -e CONFIG_ACPI= .config
CONFIG_EXPERIMENTAL=y
CONFIG_ACPI=y
CONFIG_PCI=y

I'm at a loss as to why GART_IOMMU [=n] && X86_64 [=n] && PCI [=y] isn't setting this (or, more likely, I'm misunderstanding the kernel option conditionals here). What am I missing? Apologies in advance...
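One way to read the "Selected by" line: it is a boolean expression, each [=n] or [=y] shows the current value of that symbol, and every alternative in it requires X86_64. The sketch below evaluates it in Python with the values from the search output and the grep. The helper swiotlb_selected is hypothetical, and treating symbols absent from .config as n is an assumption.

```python
# Symbol values taken from the search output and the grep result
# (symbols missing from .config are treated as n, i.e. False).
cfg = {
    "GART_IOMMU": False,
    "CALGARY_IOMMU": False,
    "AMD_IOMMU": False,
    "X86_64": False,
    "PCI": True,
    "EXPERIMENTAL": True,
    "ACPI": True,
}


def swiotlb_selected(c: dict) -> bool:
    """Evaluate the 'Selected by' expression; && binds tighter than ||."""
    return (
        (c["GART_IOMMU"] and c["X86_64"] and c["PCI"])
        or (c["CALGARY_IOMMU"] and c["X86_64"] and c["PCI"] and c["EXPERIMENTAL"])
        or (c["AMD_IOMMU"] and c["X86_64"] and c["PCI"] and c["ACPI"])
    )


print(swiotlb_selected(cfg))  # False
```

With X86_64 at n, all three alternatives are false regardless of PCI, EXPERIMENTAL or ACPI, so nothing selects SWIOTLB in this configuration.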
Created attachment 32308 [details] lspci Thinkpad X61 Tablet
Hi, I'm receiving the same error messages, running 2.6.32 in 64-bit. To narrow the error down a bit, I tried 2.6.31.2, 2.6.31.9, and 2.6.32-rc1; the first two work, but rc1 gives me the same problems. I checked my config and found CONFIG_SWIOTLB=y. The only other thing I've noticed is that 2.6.32 works fine if I have 2 GB of RAM in my computer, but fails if I have 8 GB. I'm also attaching my lspci. Thanks.
I, too, am experiencing this on an 8GiB system with PAE support in the kernel. Switching to a non-PAE generic kernel for the same version results in this dmesg diff starting from drm (- is PAE, + is non-PAE):

 [drm] Initialized drm 1.1.0 20060810
 i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
 i915 0000:00:02.0: setting latency timer to 64
 mtrr: type mismatch for d0000000,10000000 old: write-back new: write-combining
 [drm] MTRR allocation failed. Graphics performance may suffer.
 i915 0000:00:02.0: irq 27 for MSI/MSI-X
 [drm] set up 7M of stolen space
-nommu_map_sg: overflow 22e000000+4096 of device mask ffffffff
-[drm:drm_agp_bind_pages] *ERROR* Failed to bind AGP memory: -12
-[drm:i915_driver_load] *ERROR* failed to init modeset
-i915: probe of 0000:00:02.0 failed with error -28
+[drm] TMDS-8: set mode 1280x720 c
+Console: switching to colour frame buffer device 160x45
+fb0: inteldrmfb frame buffer device
+registered panic notifier
+[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

I'll attach a full dmesg and config for the non-PAE kernel shortly.
Created attachment 32314 [details] dmesg booting non-PAE linux 2.6.32 kernel
Created attachment 32315 [details] non-PAE linux 2.6.32 kernel configuration
Could you try bisecting the kernel? The 32-bit mask in the error log seems weird.
Created attachment 32317 [details] [review] agp debug patch

Please apply this debug patch against your failing kernel and attach the dmesg log.
I tried the patch on my computer and it didn't work. I'm attaching the dmesg output. Also, I forgot to mention that I found another symptom: there is corruption in the penguin logo during boot. The corruption looks like black lines that appear on top of the image. To me it looks like the console output is mapped there. The patterns are (almost) the same on every boot.
Created attachment 32318 [details] Corrupted penguins
Created attachment 32319 [details] dmesg 2.6.32.2 with agp patch
The current workaround should be to disable CONFIG_DMAR on machines without VT-d. I think in agpgart we might fall back to the original memory insert function when no VT-d is available.
Hi, The DMAR deactivation does the trick! I can boot 2.6.32 without problems. Anyway, since I had it configured in my previous kernels, I understand this is just a temporary fix. I'm currently bisecting the kernel (4 more iterations...) so as soon as I have the result, I'll post it. Thanks!
The kernel bisect says that the first bad commit is 176616814d700f19914d8509d9f65dec51a6ebf7, "intel_agp: Use PCI DMA API correctly on chipsets new enough to have IOMMU". This is consistent with the proposed workaround. I hope this helps; if you need, I can try patches on my computer. Thanks for your help!
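For anyone unfamiliar with how a bisect arrives at a "first bad commit": it is a binary search over the history between a known-good and a known-bad kernel. Below is a minimal Python sketch of the idea. The commit names and the is_bad predicate are made up for illustration, and this is not git's actual implementation.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered history for the first commit where is_bad is True.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1   # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):   # build and boot-test this commit
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history: commit "c5" introduces the regression.
history = [f"c{i}" for i in range(10)]
print(first_bad(history, lambda c: int(c[1:]) >= 5))  # c5
```

Each test halves the remaining range, which is why only a handful of build-and-boot iterations are needed even across a full release cycle.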
Created attachment 32369 [details] [review] disable pci dma mapping in non-iommu case

This one tries to revert to the original behavior in the non-iommu case. I'm not quite sure if this is the ideal solution for now, but it should fix the issue you currently see.
Created attachment 32379 [details] [review] set dma mask in i915 driver

Could you help test this patch instead? The real problem here might be that we failed to set up the 36-bit DMA mask properly. Thanks.
Created attachment 32383 [details] dmesg with dma patch

It looks like it is not the 36-bit DMA mask. Anyway, I'm attaching the whole dmesg output.
Unfortunately, the "set dma mask in i915 driver" patch does not solve my issue. The dmesg booting with this patch is basically identical to the one I get when booting with Debian/sid's current linux-image-2.6.32-686-bigmem package. I continue to see the "nommu_map_sg: overflow XXXX00000+4096 of device mask ffffffff" error followed by "Failed to bind AGP memory: -12" (and no working DRM, obviously). I'm in no hurry for a fix, as I can still adequately test non-PAE 2.6.32 x86 kernels for now (albeit without access to all the RAM on this machine), but am happy to assist in whatever way I can until you have access to a suitable machine with 4+GiB RAM.
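A side note on the numbers: the negative values in these messages are negated errno codes. Here is a quick way to decode them with Python's standard library; 12 is the value from "Failed to bind AGP memory" and 28 is the probe failure code from the original dmesg. The message text printed by os.strerror depends on the platform.

```python
import errno
import os

for code in (12, 28):
    # errno.errorcode maps numbers to symbolic names; os.strerror gives text
    print(f"-{code}: {errno.errorcode[code]} ({os.strerror(code)})")
```

On Linux these resolve to ENOMEM and ENOSPC respectively.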
Jeremy, try compiling your kernel with PAE enabled and DMAR disabled. I think it should work (in my case, 64 bit and DMAR disabled works).
Confirmed that disabling DMA remapping in the kernel config also works around this for x86+PAE.
Created attachment 32430 [details] [review] remove dma mask setting in drm_pci_alloc()

This is the refreshed patch after investigating the real cause of the DMA mask setting failure. Please help verify that it fixes your problem.
Dear Wang Zhenyu, your patch works nicely! Thanks for your help! If you need to try something else, just drop me an email.
Booting an x86+PAE kernel with your new "remove dma mask setting in drm_pci_alloc()" results in an interesting warning backtrace in dmesg (and seems to break KMS):

WARNING: at /tmp/linux-2.6-2.6.32/debian/build/source_i386_none/drivers/gpu/drm/drm_crtc_helper.c:1032 drm_helper_initial_config+0x33/0x4c [drm_kms_helper]()

I'll attach new dmesg and Xorg.0.log examples shortly.
Created attachment 32442 [details] dmesg booting 2.6.32 kernel without dma mask setting
Created attachment 32443 [details] Xorg.0.log under 2.6.32 kernel without dma mask setting
Okay, actually, forget my last update. I was testing this remotely from work on my HTPC at home, and these errors were probably because the HDTV was switched to a different input or turned off altogether. I'll have to retest this evening when I get home. Apologies for the confusion.
Okay, yes, the "remove dma mask setting in drm_pci_alloc()" patch fixes the regression under Linux 2.6.32 x86+PAE on my 8GiB RAM machine. The dmesg now looks essentially the same as it did for 2.6.31 (better, in fact, since it now picks a superior default resolution). Thanks again!
Patch is in Linus's tree. Close.

commit e6be8d9d17bd44061116f601fe2609b3ace7aa69
Author: Zhenyu Wang <zhenyu.z.wang@intel.com>
Date:   Tue Jan 5 11:25:05 2010 +0800

    drm: remove address mask param for drm_pci_alloc()

    drm_pci_alloc() has input of address mask for setting pci dma mask on the device, which should be properly setup by drm driver. And leave it as a param for drm_pci_alloc() would cause confusion or mistake would corrupt the correct dma mask setting, as seen on intel hw which set wrong dma mask for hw status page. So remove it from drm_pci_alloc() function.

    Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
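The failure mode described in the commit message can be pictured with a toy model: an allocation helper that takes an address mask and applies it to the device overwrites whatever mask the driver configured earlier. The Python sketch below is illustration only. Device, alloc_with_mask and alloc_without_mask are hypothetical names, not kernel or DRM APIs, and the real code paths are more involved.

```python
class Device:
    """Toy stand-in for a PCI device that carries a single DMA mask."""

    def __init__(self) -> None:
        self.dma_mask = (1 << 32) - 1   # 32-bit default


def alloc_with_mask(dev: Device, maxaddr: int) -> None:
    """Old-style helper: takes an address mask and applies it to the device."""
    dev.dma_mask = maxaddr              # side effect outlives the allocation


def alloc_without_mask(dev: Device) -> None:
    """New-style helper: leaves the device's mask alone."""
    pass


dev = Device()
dev.dma_mask = (1 << 36) - 1            # driver sets up a 36-bit mask
alloc_with_mask(dev, 0xFFFFFFFF)        # e.g. a 32-bit-limited buffer
print(hex(dev.dma_mask))                # 0xffffffff: the driver's setting is lost
```

In this model, later mappings of pages above 4 GiB would be checked against the 32-bit mask again, which matches the "device mask ffffffff" overflow seen throughout this report.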