Bug 76619 - [BDW regression] System hung while booting on 32 bit OS
Summary: [BDW regression] System hung while booting on 32 bit OS
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: unspecified
Hardware: Other All
: highest critical
Assignee: Daniel Vetter
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
: 76640 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-03-26 07:03 UTC by Guo Jinxian
Modified: 2016-10-12 10:44 UTC (History)
4 users (show)

See Also:
i915 platform:
i915 features:


Attachments
hung while booting on Ubuntu32bit (41.41 KB, text/plain)
2014-03-26 07:03 UTC, Guo Jinxian
no flags Details
Replace size_t with u64 (7.46 KB, patch)
2014-03-26 08:23 UTC, Chris Wilson
no flags Details | Splinter Review
dmesg with patch (50.45 KB, text/plain)
2014-03-27 03:43 UTC, Guo Jinxian
no flags Details
Clamp to 2G address space with 32b platforms (870 bytes, patch)
2014-03-27 18:22 UTC, Ben Widawsky
no flags Details | Splinter Review
dmesg with patch 69472 (73.18 KB, text/plain)
2014-04-04 03:28 UTC, Guo Jinxian
no flags Details
kernel build config i386 (104.05 KB, text/plain)
2014-05-15 01:30 UTC, Guo Jinxian
no flags Details
Ben's fix v4. (1.77 KB, patch)
2014-05-19 22:49 UTC, Rodrigo Vivi
no flags Details | Splinter Review
dmesg (42.22 KB, text/plain)
2014-05-20 01:36 UTC, Guo Jinxian
no flags Details
v5 (1.84 KB, patch)
2014-05-27 23:54 UTC, Ben Widawsky
no flags Details | Splinter Review
dmesg (74.82 KB, text/plain)
2014-05-28 08:50 UTC, Guo Jinxian
no flags Details

Description Guo Jinxian 2014-03-26 07:03:19 UTC
Created attachment 96401 [details]
hung while booting on Ubuntu32bit

system Environment:
--------------------------
platform: BDW HSW IV 
Kernel:	(drm-intel-nightly) 07071a9d41aed59f4f2ba66afb82ec35557a80c1

Bug detailed description:
-------------------------
System hung while booting on Ubuntu 32 bit, please check the dmesg in attachment.

This bug still can be reproduce on latest -next-queued(57e00b2e1b348b3855f4a3b681d583f8c9cf39fb), but it works well on latest -fixes(0f4706d2740f2a221cd502922b22e522009041d9).

It's first time to run testing on Ubuntu 32bit.

Reproduce steps:
--------------------
1.Booting the system
Comment 1 Chris Wilson 2014-03-26 07:46:41 UTC
BUG_ON(dev_priv->gtt.base.total < dev_priv->gtt.mappable_end);
Comment 2 Chris Wilson 2014-03-26 07:49:14 UTC
size_t considered harmful.
Comment 3 Chris Wilson 2014-03-26 08:23:25 UTC
Created attachment 96403 [details] [review]
Replace size_t with u64
Comment 4 Chris Wilson 2014-03-26 14:22:54 UTC
*** Bug 76640 has been marked as a duplicate of this bug. ***
Comment 5 Ben Widawsky 2014-03-26 17:45:40 UTC
blargh... off by 1 on the size_t. size_t is [always] big enough to represent the full GGTT address space, but not the actual size. Bummer.

Patch seems correct to me.
Comment 6 Chris Wilson 2014-03-26 17:50:44 UTC
I've booted the patch on little machines (both 32/64) just need volunteers for bdw in both configs.
Comment 7 Daniel Vetter 2014-03-26 18:17:22 UTC
NEEDINFO or QA wont look. Please test Chris' patch.
Comment 8 Guo Jinxian 2014-03-27 03:43:54 UTC
Created attachment 96440 [details]
dmesg with patch

With the patch, system still hung while booting, please check the dmesg in attachment for details. thanks.
Comment 9 Ben Widawsky 2014-03-27 04:57:01 UTC
Chris, you missed something. Maybe others?

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 926cb4f..659a00f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1623,9 +1623,9 @@ static void i915_gtt_color_adjust(struct drm_mm_node *node,
 }
 
 void i915_gem_setup_global_gtt(struct drm_device *dev,
-                              unsigned long start,
+                              uint64_t start,
                               unsigned long mappable_end,
-                              unsigned long end)
+                              uint64_t end)
 {
        /* Let GEM Manage all of the aperture.
         *
@@ -1681,7 +1681,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 void i915_gem_init_global_gtt(struct drm_device *dev)
 {
        struct drm_i915_private *dev_priv = dev->dev_private;
-       unsigned long gtt_size, mappable_size;
+       uint64_t gtt_size, mappable_size;
 
        gtt_size = dev_priv->gtt.base.total;
        mappable_size = dev_priv->gtt.mappable_end;
Comment 10 Ben Widawsky 2014-03-27 05:12:27 UTC
BTW, why is 32b OS a high/critical bug? Absolutely nothing ships in this config by default (to my knowledg).
Comment 11 Chris Wilson 2014-03-27 07:40:48 UTC
Ubuntu, steamos to name but two.
Comment 12 Daniel Vetter 2014-03-27 07:50:18 UTC
While you hunt down 32bit issues: drm_mm.c uses long instead of uint64_t, too. This is more work.

And I guess it's sufficient amount of work to block backporting the 4g stuff ...

Also note that this is all in 3.15 already.
Comment 13 Ben Widawsky 2014-03-27 18:12:56 UTC
Daniel, I think we should keep 4g, and just compile it out for 32b. Patch coming up.
Comment 14 Ben Widawsky 2014-03-27 18:22:58 UTC
Created attachment 96472 [details] [review]
Clamp to 2G address space with 32b platforms

We can probably do 3, but whatever.
Comment 15 Timo Aaltonen 2014-03-28 17:03:58 UTC
www.ubuntu.com offers the 64bit image by default, and AIUI the preinstall images use that too, fwiw
Comment 16 Guo Jinxian 2014-04-01 02:39:52 UTC
(In reply to comment #14)
> Created attachment 96472 [details] [review] [review]
> Clamp to 2G address space with 32b platforms
> 
> We can probably do 3, but whatever.

I am not sure the exactly mean of this. Should this patch be work with other patches mention above, or it can works only? 

with this patch only, the system still hung while booting.
Comment 17 Chris Wilson 2014-04-02 07:18:38 UTC
Considering the prevalence of unsigned long used for address space tracking, I think we do need to enforce the clamping as well as making sure the probe is correct.
Comment 18 Ben Widawsky 2014-04-03 01:06:09 UTC
Guo Jianxian, can you paste the dmesg with my patch?
Comment 19 Guo Jinxian 2014-04-04 03:28:29 UTC
Created attachment 96879 [details]
dmesg with patch 69472

Ben Widawsky, Here is the dmesg in attachment. thanks.
Comment 20 Daniel Vetter 2014-04-05 10:28:17 UTC
(In reply to comment #17)
> Considering the prevalence of unsigned long used for address space tracking,
> I think we do need to enforce the clamping as well as making sure the probe
> is correct.

Concurred, getting this to work for real on 32bit platforms is serious work. Imo we should simply limit to 3G on gen3+ and call it a day. 3G is also the limit userspace has available.
Comment 21 Ben Widawsky 2014-05-07 05:01:38 UTC
Patch posted to mailing list.
Comment 22 Jani Nikula 2014-05-13 08:08:08 UTC
(In reply to comment #16)
> (In reply to comment #14)
> > Created attachment 96472 [details] [review] [review] [review]
> > Clamp to 2G address space with 32b platforms
> > 
> > We can probably do 3, but whatever.
> 
> I am not sure the exactly mean of this. Should this patch be work with other
> patches mention above, or it can works only? 
> 
> with this patch only, the system still hung while booting.

Are you running a 64 bit kernel and 32 bit userspace? Does the patch from comment #14 make a difference for 32 bit kernel build?
Comment 23 Rodrigo Vivi 2014-05-13 19:15:04 UTC
Hi Guo Jinxian,

Could you please check if you also face the hang with uxa?
Just checking if it is related to other hangs we are facing already.

To check that just create a file /etc/X11/xorg.conf.d/01-uxa.conf with:

Section "Device"
        Identifier "Intel Graphics"
        Driver "intel"
        Option "AccelMethod" "uxa"
EndSection
Comment 24 Chris Wilson 2014-05-13 19:50:04 UTC
(In reply to comment #23)
> Hi Guo Jinxian,
> 
> Could you please check if you also face the hang with uxa?
> Just checking if it is related to other hangs we are facing already.
> 
> To check that just create a file /etc/X11/xorg.conf.d/01-uxa.conf with:
> 
> Section "Device"
>         Identifier "Intel Graphics"
>         Driver "intel"
>         Option "AccelMethod" "uxa"
> EndSection

That's completely irrelevant, as about as useful as testing AccelMethod "blt".
Comment 25 Guo Jinxian 2014-05-14 09:08:50 UTC
(In reply to comment #23)
> Hi Guo Jinxian,
> 
> Could you please check if you also face the hang with uxa?
> Just checking if it is related to other hangs we are facing already.
> 
> To check that just create a file /etc/X11/xorg.conf.d/01-uxa.conf with:
> 
> Section "Device"
>         Identifier "Intel Graphics"
>         Driver "intel"
>         Option "AccelMethod" "uxa"
> EndSection

Change to uxa and run testing on latest -nightly(2be456541ea41728002ccca2de5235f48d14326e). The system still hang.
Comment 26 Jani Nikula 2014-05-14 13:27:29 UTC
(In reply to comment #22)
> Are you running a 64 bit kernel and 32 bit userspace? Does the patch from
> comment #14 make a difference for 32 bit kernel build?

Please attach your kernel config.
Comment 27 Guo Jinxian 2014-05-15 01:30:07 UTC
Created attachment 99051 [details]
kernel build config i386
Comment 28 Guo Jinxian 2014-05-15 01:30:56 UTC
(In reply to comment #26)
> (In reply to comment #22)
> > Are you running a 64 bit kernel and 32 bit userspace? Does the patch from
> > comment #14 make a difference for 32 bit kernel build?
> 
> Please attach your kernel config.

Please check attachment 99051 [details] for kernel config, Thanks.
Comment 29 Rodrigo Vivi 2014-05-19 22:49:11 UTC
Created attachment 99361 [details] [review]
Ben's fix v4.

Could you please verify if this latest patch attached actually fix the issue?

Thanks in advance,
Rodrigo.
Comment 30 Guo Jinxian 2014-05-20 01:36:40 UTC
Created attachment 99366 [details]
dmesg

(In reply to comment #29)
> Created attachment 99361 [details] [review] [review]
> Ben's fix v4.
> 
> Could you please verify if this latest patch attached actually fix the issue?
> 
> Thanks in advance,
> Rodrigo.

On latest -next-queued(b6fdd0f2b990006daba19eec676b632faa523fc8) with this patch, this issue still can be reproduce. Please check the dmesg for details.
Comment 31 Ben Widawsky 2014-05-27 23:54:43 UTC
Created attachment 99989 [details] [review]
v5
Comment 32 Ben Widawsky 2014-05-27 23:54:53 UTC
Please test v5
Comment 33 Guo Jinxian 2014-05-28 08:50:26 UTC
Created attachment 100018 [details]
dmesg

(In reply to comment #32)
> Please test v5
With the patch v5, the system is able to boot normally.
Comment 34 Ben Widawsky 2014-05-28 17:09:42 UTC
Daniel wants a tested-by on the patch. Can you please add it?
The msg-id is 1401234788-17506-1-git-send-email-benjamin.widawsky@intel.com
Comment 35 Ben Widawsky 2014-05-29 05:34:55 UTC
Daniel, please see Intel emails on how to help QA finish this bug.
Comment 36 Jani Nikula 2014-06-02 13:54:04 UTC
(In reply to comment #16)
> (In reply to comment #14)
> > Created attachment 96472 [details] [review] [review] [review]
> > Clamp to 2G address space with 32b platforms
> > 
> > We can probably do 3, but whatever.
> 
> I am not sure the exactly mean of this. Should this patch be work with other
> patches mention above, or it can works only? 
> 
> with this patch only, the system still hung while booting.

I do not understand why #ifndef CONFIG_64BIT does not work...

(In reply to comment #33)
> Created attachment 100018 [details]
> dmesg
> 
> (In reply to comment #32)
> > Please test v5
> With the patch v5, the system is able to boot normally.

...but #ifdef CONFIG_X86_32 works.

The latter is probably more correct, but really, under what circumstances would CONFIG_64BIT=y hold when CONFIG_X86_32=y? Indeed the attached kernel config does *not* have CONFIG_64BIT=y.

Something is wrong with the test results.
Comment 37 Daniel Vetter 2014-06-03 07:46:19 UTC
commit 9fd631617ec9ddaf6e795caa8ea31b39866a07a1
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Tue May 27 16:53:08 2014 -0700

    drm/i915/bdw: Only use 2g GGTT for 32b platforms
Comment 38 Guo Jinxian 2014-06-05 01:54:01 UTC
Tested on latest -next-queued(06946f6ea02bdd96190e20ef51a8bb9fc1f4af69), the result was passed.
Comment 39 Jari Tahvanainen 2016-10-12 10:44:27 UTC
Closing verified+fixed as commit ecb889e includes the fix.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.