Bug 90973 - Fails to reconnect after suspend/resume
Summary: Fails to reconnect after suspend/resume
Status: RESOLVED FIXED
Alias: None
Product: ModemManager
Classification: Unclassified
Component: general (show other bugs)
Version: git master
Hardware: Other All
: medium normal
Assignee: ModemManager bug user
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-06-14 07:52 UTC by Tore Anderson
Modified: 2015-12-06 11:49 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
MM/NM journal (562.76 KB, text/plain)
2015-06-14 07:52 UTC, Tore Anderson
Details
MM suspend/resume journal from Git g4c036eb (304.19 KB, text/plain)
2015-12-03 18:14 UTC, Tore Anderson
Details
mbim-proxy debug output (49.75 KB, text/plain)
2015-12-04 13:07 UTC, Tore Anderson
Details
MM debug output (258.92 KB, text/plain)
2015-12-04 13:08 UTC, Tore Anderson
Details

Description Tore Anderson 2015-06-14 07:52:34 UTC
Created attachment 116483 [details]
MM/NM journal

After suspend/resume, MM/NM fails to automatically reconnect to mobile broadband. I suspect this is due to some invalid state left over since before the suspend, leading MM to believe the bearer is still connected, while in reality it is not.

This is with MM git master from 2015-06-13 (0ac7feb), and otherwise standard Fedora 22 packages (NetworkManager-1.0.2-1.fc22.x86_64).

I'm attaching a journal. Noteworthy timestamps:

09:17:40: Booted the system, let NM auto-connect to mobile broadband (works fine)
09:18:15: Set MM/NM debugging log level
09:18:49: Suspending the system by closing laptop lid (at this point, mobile broadband remains connected)
09:19:12: Resumed system, automatic reconnect commences (fails, after a few retries)
09:22:12: Restarted ModemManager and set debugging log level
09:22:56: New connection attempt (manually launched). This succeeds.

The precise reason why the automatic reconnect attempt fails is that NM does not receive any ICMPv6 Router Advertisements from the network. This is necessary in order to fully enable an IPv6-only network connection (which this the case here). If I do the same test, using an IPv4-only APN, then the reconnection ostensibly succeeds (from NM's point of view), but it doesn't actually work - there's no internet connectivity. My explanation is that MM re-uses the IPCP/IPV6CP config received before the suspend in both cases, but that with IPv4 this contains sufficient information for NM to fully enable the connection. With IPv6, on the other hand, it is not - ICMPv6 RA is also necessary.

A big "smoking gun" for the improper re-use of IPCP/IPV6CP parameter can be seen by looking at the assigned addresses. Both before the suspend, and in all automatic reconnect attempts, MM claims that the network has assigned a link-local address of fe80::23:4b65:b401. Since the link-local address is a randomly assigned 64-bit value, it is extremely improbable that this actually happened. You'll see that after MM was restarted, a different link-localaddress was assigned in the final reconnection attempt (fe80::29:83e0:2c01).

Finally, these log lines seen after the suspend/resume look rather suspicious to me - I doubt it is truly the case that the bearer is "already connected" here:

juni 14 09:19:38 envy.fud.no ModemManager[725]: <debug> Couldn't disconnect bearer '/org/freedesktop/ModemManager1/Bearer/2'
juni 14 09:19:38 envy.fud.no ModemManager[725]: <info>  Modem /org/freedesktop/ModemManager1/Modem/0: state changed (disconnecting -> connected)
juni 14 09:19:38 envy.fud.no ModemManager[725]: <info>  Simple connect state (5/8): Register
juni 14 09:19:38 envy.fud.no ModemManager[725]: <debug> Already registered in network '24201', automatic registration not launched...
juni 14 09:19:38 envy.fud.no ModemManager[725]: <info>  Simple connect state (6/8): Bearer
juni 14 09:19:38 envy.fud.no ModemManager[725]: <debug> Using already existing bearer at '/org/freedesktop/ModemManager1/Bearer/2'...
juni 14 09:19:38 envy.fud.no ModemManager[725]: <info>  Simple connect state (7/8): Connect
juni 14 09:19:38 envy.fud.no ModemManager[725]: <debug> Bearer at '/org/freedesktop/ModemManager1/Bearer/2' is already connected...
Comment 1 Tore Anderson 2015-06-18 17:38:25 UTC
I see there's NotOpened errors in the log during the failed re-connection attempts. I think maybe bug #84992 has regressed.

For what it's worth I tried to update libmbim to today's Git master. That didn't help either.
Comment 2 Aleksander Morgado 2015-12-03 11:38:25 UTC
Hey Tore, sorry for taking so long to check this.

I've checked your log and I don't see any message showing systemd notifications. I've tried this with my embedded MBIM modem in my laptop and I do get the notification, after which the modem gets explicitly closed:

ModemManager[6266]: <debug> [1449142207.844067] [mm-sleep-monitor-systemd.c:138] signal_cb(): [sleep-monitor] received PrepareForSleep signal: 1
ModemManager[6266]: <debug> [1449142207.844098] [main.c:63] sleeping_cb(): Removing devices... (sleeping)
ModemManager[6266]: <debug> [1449142207.844337] [mm-base-bearer.c:895] base_bearer_dbus_unexport(): Removing from DBus bearer at '/org/freedesktop/ModemManager1/Bearer/1'
ModemManager[6266]: <debug> [1449142207.844364] [mm-device.c:393] unexport_modem(): Unexported modem '/sys/devices/pci0000:00/0000:00:14.0/usb2/2-4' from path '/org/freedesktop/ModemManager1/Modem/0'
ModemManager[6266]: [/dev/cdc-wdm1] Sent message...
<<<<<< RAW:
<<<<<<   length = 12
<<<<<<   data   = 02:00:00:00:0C:00:00:00:20:00:00:00
ModemManager[6266]: [/dev/cdc-wdm1] Sent message (translated)...
<<<<<< Header:
<<<<<<   length      = 12
<<<<<<   type        = close (0x00000002)
<<<<<<   transaction = 32
ModemManager[6266]: <debug> [1449142207.844645] [mm-port-serial.c:1421] port_serial_close_force(): (cdc-wdm2) forced to close port
ModemManager[6266]: <debug> [1449142207.844660] [mm-port-serial.c:1421] port_serial_close_force(): (ttyACM2) forced to close port
ModemManager[6266]: <debug> [1449142207.844673] [mm-port-serial.c:1421] port_serial_close_force(): (ttyACM0) forced to close port
ModemManager[6266]: <debug> [1449142207.844828] [mm-sleep-monitor-systemd.c:68] drop_inhibitor(): [sleep-monitor] dropping systemd sleep inhibitor
ModemManager[6266]: [/dev/cdc-wdm1] Received message...
>>>>>> RAW:
>>>>>>   length = 16
>>>>>>   data   = 02:00:00:80:10:00:00:00:20:00:00:00:00:00:00:00

Is there any chance you have compiled MM git master without --with-suspend-resume=systemd ?
Comment 3 Tore Anderson 2015-12-03 12:56:26 UTC
Hi Aleksander,

> Is there any chance you have compiled MM git master without
> --with-suspend-resume=systemd ?

Absolutely, as I was just re-using the spec file that came with Fedora's MM RPM packages. I'll try to re-build with that flag tonight and let you know how that goes.

It's curious, though, that it *used* to work at one point, and I'm fairly certain I've never used --with-suspend-resume=systemd (first time I've heard of it being necessary). Maybe something else on my system changed.

Tore
Comment 4 Tore Anderson 2015-12-03 18:14:15 UTC
Created attachment 120312 [details]
MM suspend/resume journal from Git g4c036eb

Ok, so with MM Git g4c036eb built on an otherwise current F23 installation, configured with --with-suspend-resume=systemd, I get a bit further - but reconnection still does not work after a suspend/resume cycle. It appears that the modem isn't correctly initialised after the suspend (leading NM to think that mobile broadband is disabled), as shown by "mmcli -m 1":

/org/freedesktop/ModemManager1/Modem/1 (device id 'unknown')
  -------------------------
  Hardware |   manufacturer: 'unknown'
           |          model: 'unknown'
           |       revision: 'unknown'
           |      supported: 'none'
           |        current: 'none'
           |   equipment id: 'unknown'
  -------------------------
  System   |         device: '/sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.5'
           |        drivers: 'cdc_acm, cdc_mbim, cdc_wdm'
           |         plugin: 'Ericsson MBM'
           |   primary port: 'cdc-wdm1'
           |          ports: 'ttyACM1 (at), wwp0s26u1u5i6 (net), cdc-wdm0 (at), cdc-wdm1 (mbim), cdc-wdm2 (at), ttyACM2 (gps), ttyACM0 (at)'
  -------------------------
  Numbers  |           own : 'unknown'
  -------------------------
  Status   |           lock: 'unknown'
           | unlock retries: 'unknown'
           |          state: 'unknown'
           |    power state: 'unknown'
           |    access tech: 'unknown'
           | signal quality: '0' (cached)
  -------------------------
  Modes    |      supported: 'allowed: any; preferred: none'
           |        current: 'allowed: any; preferred: none'
  -------------------------
  Bands    |      supported: 'unknown'
           |        current: 'unknown'
  -------------------------
  IP       |      supported: 'none'
  -------------------------
  SIM      |           path: 'none'

  -------------------------
  Bearers  |          paths: 'none'

The attached journal shows debug-level messages from MM leading up to this. First I started MM, then connected to WWAN, before I suspended and resumed the system. As before, restarting MM manually resolves the problem.
Comment 5 Aleksander Morgado 2015-12-04 09:29:43 UTC
Ok, we're close I think. It looks like the mbim-proxy may be the issue here.

Could you re-run the same test but this time also getting mbim-proxy debug logs? You can do this as follows:

1) Stop ModemManager
2) killall -9 mbim-proxy
3) /usr/libexec/mbim-proxy --verbose
4) /usr/sbin/ModemManager --debug    // in another terminal
Comment 6 Tore Anderson 2015-12-04 13:07:03 UTC
Created attachment 120340 [details]
mbim-proxy debug output

The version is libmbim-utils-1.12.2-1.fc23.x86_64 in case that's relevant.
Comment 7 Tore Anderson 2015-12-04 13:08:10 UTC
Created attachment 120341 [details]
MM debug output

Attaching a new MM debug output so you can sync up timestamps with the mbim-proxy output.
Comment 8 Tore Anderson 2015-12-04 13:32:38 UTC
I tried upgrading libmbim to git master (e2b3db1) too, and that seems to work much better. At least I've suspended/resumed a few times and it have reconnected fine every time.

So I'll just close this ticket as fixed now, and I guess I'll just have to live with having to build custom packages until MM v1.5 (hopefully using --with-suspend-resume=systemd) and libmbim v1.13 hits the Fedora repos.

Thanks for helping me figure out I needed the --with-suspend-resume=systemd configure option!
Comment 9 Aleksander Morgado 2015-12-06 11:49:52 UTC
(In reply to Tore Anderson from comment #8)
> I tried upgrading libmbim to git master (e2b3db1) too, and that seems to
> work much better. At least I've suspended/resumed a few times and it have
> reconnected fine every time.
> 

Yes, libmbim git master also has fixes related to the suspend/resume cycle, e.g.:

http://cgit.freedesktop.org/libmbim/libmbim/commit/?id=9d8acbe9e5edbb1fb9bfdbc47d5fda1631578523


> So I'll just close this ticket as fixed now, and I guess I'll just have to
> live with having to build custom packages until MM v1.5 (hopefully using
> --with-suspend-resume=systemd) and libmbim v1.13 hits the Fedora repos.
> 

Great! We should be releasing a new version soon, hopefully.

> Thanks for helping me figure out I needed the --with-suspend-resume=systemd
> configure option!

Np :)


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.