Originally reported at: https://bugzilla.gnome.org/show_bug.cgi?id=733086 Please refer to the original bug report if more details are needed. When suspending, ModemManager keeps the state of the modem (MBIM open), but the modem itself gets fully reseted and needs an explicit new MBIM open, or otherwise the MBIM port will keep on saying that it was not opened: [/dev/cdc-wdm1] Received message (translated)... >>>>>> Header: >>>>>> length = 16 >>>>>> type = function-error (0x80000004) >>>>>> transaction = 12 >>>>>> Contents: >>>>>> error = 'NotOpened' (0x00000005)
An additional report which is showing the same issue, but hibernating instead of suspending: https://bugzilla.gnome.org/show_bug.cgi?id=725377 http://lists.freedesktop.org/archives/modemmanager-devel/2014-February/000893.html
Created attachment 111370 [details] [review] Patch 1/3
Created attachment 111371 [details] [review] Patch 2/3
Created attachment 111372 [details] [review] Patch 3/3
Tore, can you apply those 3 patches and see if you get the issue fixed? The outcome should be that once MM detects that the port isn't opened, it will mark the modem as invalid and it should then re-probe it.
(In reply to Aleksander Morgado from comment #5) > Tore, can you apply those 3 patches and see if you get the issue fixed? The > outcome should be that once MM detects that the port isn't opened, it will > mark the modem as invalid and it should then re-probe it. Wait... not yet, I'll need to review it because when marked as invalid, the modem ports are not re-probed...
(In reply to Aleksander Morgado from comment #6) > (In reply to Aleksander Morgado from comment #5) > > Tore, can you apply those 3 patches and see if you get the issue fixed? The > > outcome should be that once MM detects that the port isn't opened, it will > > mark the modem as invalid and it should then re-probe it. > > Wait... not yet, I'll need to review it because when marked as invalid, the > modem ports are not re-probed... Tore, I pushed a new "aleksander/mbim-not-open" branch to upstream git; could you check it and tell me if that solves the issue? Debug logs would also help.
Created attachment 111380 [details] Debug logs Hi, I tested the branch (df06d3a7), it doesn't fix the problem. Attaching an debug log. Timestamps of various activities so you can correlate with the log: 23:41:23 ModemManager started 23:42:33 Test SMS successfully sent (using mmcli) 23:43:16 System suspended 23:43:41 System resumed 23:44:13 Test SMS failed to send, error message: «couldn't send the SMS: 'GDBus.Error:org.freedesktop.libmbim.Error.Core.Timeout: Couldn't send SMS part: Transaction timed out'» 23:45:30 Attempted to connect to mobile broadband using NetworkManager, fails with «Couldn't connect bearer: 'Transaction timed out'» After restarting ModemManager I can successfully send SMS and connect to mobile broadband again.
(In reply to Tore Anderson from comment #8) > Created attachment 111380 [details] > Debug logs > > Hi, I tested the branch (df06d3a7), it doesn't fix the problem. Attaching an > debug log. Timestamps of various activities so you can correlate with the > log: > > 23:41:23 ModemManager started > 23:42:33 Test SMS successfully sent (using mmcli) > 23:43:16 System suspended > 23:43:41 System resumed > 23:44:13 Test SMS failed to send, error message: «couldn't send the SMS: > 'GDBus.Error:org.freedesktop.libmbim.Error.Core.Timeout: Couldn't send SMS > part: Transaction timed out'» > 23:45:30 Attempted to connect to mobile broadband using NetworkManager, > fails with «Couldn't connect bearer: 'Transaction timed out'» > > After restarting ModemManager I can successfully send SMS and connect to > mobile broadband again. Gah... this is very unfortunate. We no longer get NotOpened errors; instead the modem doesn't even reply at all after suspend, and I guess that a new MBIM re-open solves it. One thing that can be done is to mark the modem as invalid when we get N consecutive timeouts in commands; like what we do with RS232 modems. Another possibility would be to explicitly listen for suspend/hibernate events via DBus, and detect when a modem doesn't respond after such events. It probably is worth to go on and start implementing logic for the latter case (monitoring suspend/hibernate), as I believe we could make use of that info on more situations... Anyway, checking consecutive timeouts is a quick hack we can try as well. Timeouts in MBIM are very rare, I would say, so just a threshold of 2 could be enough as a temporary solution until we monitor PM events.
(In reply to Aleksander Morgado from comment #9) > (In reply to Tore Anderson from comment #8) > > Created attachment 111380 [details] > > Debug logs > > > > Hi, I tested the branch (df06d3a7), it doesn't fix the problem. Attaching an > > debug log. Timestamps of various activities so you can correlate with the > > log: > > > > 23:41:23 ModemManager started > > 23:42:33 Test SMS successfully sent (using mmcli) > > 23:43:16 System suspended > > 23:43:41 System resumed > > 23:44:13 Test SMS failed to send, error message: «couldn't send the SMS: > > 'GDBus.Error:org.freedesktop.libmbim.Error.Core.Timeout: Couldn't send SMS > > part: Transaction timed out'» > > 23:45:30 Attempted to connect to mobile broadband using NetworkManager, > > fails with «Couldn't connect bearer: 'Transaction timed out'» > > > > After restarting ModemManager I can successfully send SMS and connect to > > mobile broadband again. > > Gah... this is very unfortunate. We no longer get NotOpened errors; instead > the modem doesn't even reply at all after suspend, and I guess that a new > MBIM re-open solves it. One thing that can be done is to mark the modem as > invalid when we get N consecutive timeouts in commands; like what we do with > RS232 modems. Another possibility would be to explicitly listen for > suspend/hibernate events via DBus, and detect when a modem doesn't respond > after such events. > > It probably is worth to go on and start implementing logic for the latter > case (monitoring suspend/hibernate), as I believe we could make use of that > info on more situations... > > Anyway, checking consecutive timeouts is a quick hack we can try as well. > Timeouts in MBIM are very rare, I would say, so just a threshold of 2 could > be enough as a temporary solution until we monitor PM events. Pushed a new branch, named "aleksander/mbim-not-open-2". Could you test that one? If we detect not-opened errors OR 2 or more consecutive MBIM command timeouts, we'll fully re-probe the modem.
Created attachment 111396 [details] Debug logs Doesn't quite do the trick either: 12:59:56 Booted system 13:05:12 SMS test #1 OK 13:06:52 Suspending system 13:07:08 System resumed 13:08:15 SMS test #2 fails: error: couldn't send the SMS: 'GDBus.Error:org.freedesktop.libmbim.Error.Core.Timeout: Couldn't send SMS part: Transaction timed out' 13:09:41 Trying to connect mobile broadband: fails 13:10:11 MBIM timeout occurs and MM tries to re-init the modem, but this ends up failing too: <warn> Modem couldn't be initialized: couldn't load current capabilities: Transaction timed out
(In reply to Tore Anderson from comment #11) > Created attachment 111396 [details] > Debug logs > > Doesn't quite do the trick either: > > 12:59:56 Booted system > 13:05:12 SMS test #1 OK > 13:06:52 Suspending system > 13:07:08 System resumed > 13:08:15 SMS test #2 fails: error: couldn't send the SMS: > 'GDBus.Error:org.freedesktop.libmbim.Error.Core.Timeout: Couldn't send SMS > part: Transaction timed out' > 13:09:41 Trying to connect mobile broadband: fails > 13:10:11 MBIM timeout occurs and MM tries to re-init the modem, but this > ends up failing too: > <warn> Modem couldn't be initialized: couldn't load current capabilities: > Transaction timed out We're getting closer :) Now it's the proxy the one in the middle. I wonder if it is the proxy itself the one getting the NotOpened errors and whether if it is the one causing the timeout errors in ModemManager. Could you retry the same branch, but this time running the proxy itself in another terminal? E.g.: term1 $> sudo /usr/libexec/mbim-proxy --verbose term2 $> sudo /usr/sbin/ModemManager --debug
Created attachment 111422 [details] MM debug log Certainly! 11:52:26 Started MM 11:53:48 SMS #1 - OK 11:54:33 Suspended system 11:54:54 Resumed system 11:55:49 SMS #2 - fails (error: couldn't send the SMS: 'GDBus.Error:org.freedesktop.libmbim.Error.Core.Timeout: Couldn't send SMS part: Transaction timed out') 11:56:45 WWAN connection attempt - fails: (<warn> MBIM transactions timed out, reporting invalid modem...) Afterwards, it tries to re-init the modem, but all the info is wrong: /org/freedesktop/ModemManager1/Modem/1 (device id 'unknown') ------------------------- Hardware | manufacturer: 'unknown' | model: 'unknown' | revision: 'unknown' | supported: 'none' | current: 'none' | equipment id: 'unknown' ------------------------- System | device: '/sys/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.5' | drivers: 'cdc_acm, cdc_mbim, cdc_wdm' | plugin: 'Ericsson MBM' | primary port: 'cdc-wdm2' | ports: 'ttyACM1 (at), wwp0s26u1u5i6 (net), ttyACM0 (at), cdc-wdm2 (mbim), ttyACM2 (at)' ------------------------- Numbers | own : 'unknown' ------------------------- Status | lock: 'unknown' | unlock retries: 'unknown' | state: 'unknown' | power state: 'unknown' | access tech: 'unknown' | signal quality: '0' (cached) ------------------------- Modes | supported: 'allowed: any; preferred: none' | current: 'allowed: any; preferred: none' ------------------------- Bands | supported: 'unknown' | current: 'unknown' ------------------------- IP | supported: 'none' ------------------------- SIM | path: 'none' ------------------------- Bearers | paths: 'none' 11:59:08 Stopped MM Tore
Created attachment 111423 [details] mbim-proxy verbose output
(In reply to Tore Anderson from comment #14) > Created attachment 111423 [details] > mbim-proxy verbose output Ah, there they are the NotOpened errors. The thing now is that MM tries to re-open the device, but the proxy still has the same /dev/cdc-wdm port flagged as being open, while it really isn't after the suspend/resume. We now need a libmbim fix to handle this case.
Ok, next try. I've pushed two branches, one for libmbim named "aleksander/function-error-in-transaction" and another one in ModemManager named "aleksander/mbim-not-open-3". Could you retry with these and get me debug logs as before both in the proxy and in MM?
(In reply to Aleksander Morgado from comment #16) > I've pushed two branches, one for libmbim named > "aleksander/function-error-in-transaction" and another one in ModemManager > named "aleksander/mbim-not-open-3". I don't see the 'mbim-not-open-3' branch: http://cgit.freedesktop.org/ModemManager/ModemManager/log/?h=aleksander/mbim-not-open-3 Could you double check if you pushed it correctly? Tore
(In reply to Tore Anderson from comment #17) > (In reply to Aleksander Morgado from comment #16) > > > I've pushed two branches, one for libmbim named > > "aleksander/function-error-in-transaction" and another one in ModemManager > > named "aleksander/mbim-not-open-3". > > I don't see the 'mbim-not-open-3' branch: > > http://cgit.freedesktop.org/ModemManager/ModemManager/log/?h=aleksander/mbim- > not-open-3 > > Could you double check if you pushed it correctly? > > Tore Sorry, pushed now.
Clear signs of progress now, I can connect to mobile broadband after resuming, but I still couldn't send an SMS. 16:15:12 Booted system 16:17:21 Test SMS - OK 16:18:57 Suspended system 16:45:36 Resumed system 16:49:38 Test SMS - fails: error: couldn't send the SMS: 'GDBus.Error:org.freedesktop.libmbim.Error.Protocol.NotOpened: Couldn't send SMS part: MBIM protocol error: NotOpened' 16:57:33 Activating WWAN connection from NM - OK 16:58:09 Deactivated WWAN 16:58:25 Stopped MM
Created attachment 111473 [details] MM debug log
Created attachment 111474 [details] mbim-proxy verbose output
(In reply to Tore Anderson from comment #19) > Clear signs of progress now, I can connect to mobile broadband after > resuming, but I still couldn't send an SMS. > > 16:15:12 Booted system > 16:17:21 Test SMS - OK > 16:18:57 Suspended system > 16:45:36 Resumed system > 16:49:38 Test SMS - fails: error: couldn't send the SMS: > 'GDBus.Error:org.freedesktop.libmbim.Error.Protocol.NotOpened: Couldn't send > SMS part: MBIM protocol error: NotOpened' > 16:57:33 Activating WWAN connection from NM - OK > 16:58:09 Deactivated WWAN > 16:58:25 Stopped MM I think that could actually be the expected result. The thing here is that ModemManager will, *as soon as it detects* that the MBIM device is not open, fully reprobe it. The key point here is the "as soon as it detects" it. In your test, you tried to send an SMS after resuming from suspend, and it was in that same moment when ModemManager removed the modem and re-created it from scratch (remember that the state of the modem was reseted). So, there will be at least 1 action which will error out with "Not Opened": this can either be a user action (like your SMS) or another automatic action (like a signal quality check request). If you confirm that *after* seeing the modem recreated by MM you can re-connect and re-send SMS, then I think we can conclude this is fixed.
I did some more testing, but only with WWAN connections this time: 17:17:03 Started MM 17:18:30 Activated WWAN - OK 17:18:44 Disconnected WWAN - OK 17:19:09 Suspended system 17:19:18 Resumed system 17:20:40 Activated WWAN - failed (failed to connect modem: MBIM protocol error: NotOpened) 17:21:34 Activated WWAN - failed (failed to connect modem: Message did not receive a reply (timeout by message bus) 17:21:39 ModemManager segfaults and is restarted by systemd 17:22:20 Activated WWAN - failed (failed to connect modem: MaxActivatedContexts) 17:23:10 Activated WWAN - failed (failed to connect modem: MaxActivatedContexts) 17:23:19 Activated WWAN - failed (failed to connect modem: MaxActivatedContexts) 17:26:01 Stopped ModemManager Anyway, from my "dumb user" point of view I think it would be ideal that these NotOpened errors never propagate up so that my user-requested actions (whether that is sending an SMS or connecting to WWAN) never fail because of them. While I certainly can live with having to click the connect button twice in order to get connected after resume, it would be better if I only had to click it once (and possibly have to wait a little bit longer while the modem was re-initialised if necessary).
Created attachment 111476 [details] MM debug log - WWAN-only testing
Created attachment 111477 [details] mbim-proxy verbose output - WWAN-only testing - pre-segfault
Created attachment 111478 [details] mbim-proxy verbose output - WWAN-only testing - post-segfault
(In reply to Tore Anderson from comment #23) > I did some more testing, but only with WWAN connections this time: > > 17:17:03 Started MM > 17:18:30 Activated WWAN - OK > 17:18:44 Disconnected WWAN - OK > 17:19:09 Suspended system > 17:19:18 Resumed system > 17:20:40 Activated WWAN - failed (failed to connect modem: MBIM protocol > error: NotOpened) > 17:21:34 Activated WWAN - failed (failed to connect modem: Message did not > receive a reply (timeout by message bus) > 17:21:39 ModemManager segfaults and is restarted by systemd > 17:22:20 Activated WWAN - failed (failed to connect modem: > MaxActivatedContexts) > 17:23:10 Activated WWAN - failed (failed to connect modem: > MaxActivatedContexts) > 17:23:19 Activated WWAN - failed (failed to connect modem: > MaxActivatedContexts) > 17:26:01 Stopped ModemManager > > Anyway, from my "dumb user" point of view I think it would be ideal that > these NotOpened errors never propagate up so that my user-requested actions > (whether that is sending an SMS or connecting to WWAN) never fail because of > them. While I certainly can live with having to click the connect button > twice in order to get connected after resume, it would be better if I only > had to click it once (and possibly have to wait a little bit longer while > the modem was re-initialised if necessary). We currently cannot make that action succeed with the current codebase. The action errors out with NotOpened, and we end up fully recreating the Modem object. The newly created modem knows anything about that action which was sent to that other modem object (which had actually a different DBus object path). The newly created modem object has nothing to do with the old modem object... So either we just propagate the NotOpened error as we do now, or we end up timing out that connection attempt. The MaxActivatedContexts is yet another thing to consider I think... Could be that the modem is connected at some point but ModemManager doesn't assume that and tries to re-connect again. And the segfault is very unfortunate; I wonder if you could get some backtrace just of that one. Anyway, I think I'll need to spend some more time reviewing the logic. The fact that I cannot reproduce this, even if I have a Ericsson MBIM modem in my thinkpad, is also unfortunate :/
(In reply to Aleksander Morgado from comment #22) > this can either be a user action (like your SMS) or another automatic action > (like a signal quality check request). FWIW I don't think the automatic signal quality checks are sufficient to trigger the re-initialisation of the modem. When I suspend/resume without doing any "user actions" afterwards, all I see in the log is this: <debug> Signal quality value not updated in 60s, marking as not being recent I'll try to get a backtrace of the segfault...
Here's a backtrace of the crash. I started ModemManager, suspended, resumed, then tried to up a WWAN connection which failed with "Error: Connection activation failed: Active connection removed before it was initialized" from nmcli, then I tried to up it again, and MM segfaulted (the nmcli command timed out after 90 secs). [root@envy ~]# gdb /usr/sbin/ModemManager GNU gdb (GDB) Fedora 7.8.1-36.fc21 Copyright (C) 2014 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /usr/sbin/ModemManager...Reading symbols from /usr/lib/debug/usr/sbin/ModemManager.debug...done. done. (gdb) run Starting program: /usr/sbin/ModemManager [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". [New Thread 0x7ffff4e92700 (LWP 14479)] ModemManager[14475]: <info> ModemManager (version 1.5.0-0.2.fc21) starting in system bus... [New Thread 0x7fffeffff700 (LWP 14480)] [New Thread 0x7fffef7fe700 (LWP 14481)] ModemManager[14475]: opening device... ModemManager[14475]: cannot connect to proxy: Could not connect: Connection refused ModemManager[14475]: spawning new mbim-proxy (try 1)... Detaching after fork from child process 14483. ModemManager[14475]: [/dev/cdc-wdm1] Read max control message size from descriptors file: 512 ModemManager[14475]: <info> Creating modem with plugin 'Ericsson MBM' and '7' ports ModemManager[14475]: <info> Modem for device at '/sys/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.5' successfully created ModemManager[14475]: opening device... ModemManager[14475]: [/dev/cdc-wdm1] Read max control message size from descriptors file: 512 ModemManager[14475]: <warn> Couldn't find support for device at '/sys/devices/pci0000:00/0000:00:19.0': not supported by any plugin ModemManager[14475]: <warn> Couldn't find support for device at '/sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0': not supported by any plugin ModemManager[14475]: <info> Modem: state changed (unknown -> disabled) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (disabled -> enabling) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (unknown -> searching) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (enabling -> searching) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (searching -> registering) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP location updated (MCC: '242', MNC: '1', Location area code: '0', Cell ID: '0') ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (registering -> home) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (searching -> registered) [Thread 0x7fffeffff700 (LWP 14480) exited] ModemManager[14475]: <info> Creating modem with plugin 'Generic' and '1' ports ModemManager[14475]: <warn> Could not grab port (tty/ttyS0): 'Cannot add port 'tty/ttyS0', unhandled serial type' ModemManager[14475]: <warn> Couldn't create modem for device at '/sys/devices/pci0000:00/0000:00:16.3': Failed to find primary AT port ModemManager[14475]: <info> Simple connect started... ModemManager[14475]: <info> Simple connect state (4/8): Wait to get fully enabled ModemManager[14475]: <info> Simple connect state (5/8): Register ModemManager[14475]: <info> Simple connect state (6/8): Bearer ModemManager[14475]: <info> Simple connect state (7/8): Connect ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (registered -> connecting) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connecting -> connected) ModemManager[14475]: <info> Simple connect state (8/8): All done ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connected -> disconnecting) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (disconnecting -> registered) ModemManager[14475]: <info> Simple connect started... ModemManager[14475]: <info> Simple connect state (4/8): Wait to get fully enabled ModemManager[14475]: <info> Simple connect state (5/8): Register ModemManager[14475]: <info> Simple connect state (6/8): Bearer ModemManager[14475]: <info> Simple connect state (7/8): Connect ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (registered -> connecting) ModemManager[14475]: <warn> MBIM device is not opened, reporting invalid modem... ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connecting -> registered) [New Thread 0x7fffeffff700 (LWP 14831)] ModemManager[14475]: opening device... ModemManager[14475]: [/dev/cdc-wdm1] Read max control message size from descriptors file: 512 (ModemManager:14475): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'MMIfaceModem3gpp' (ModemManager:14475): GLib-GObject-CRITICAL **: g_object_get: assertion 'G_IS_OBJECT (object)' failed (ModemManager:14475): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'MMIfaceModem3gpp' (ModemManager:14475): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'GObject' (ModemManager:14475): GLib-GObject-CRITICAL **: g_object_get_qdata: assertion 'G_IS_OBJECT (object)' failed (ModemManager:14475): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'GObject' (ModemManager:14475): GLib-GObject-CRITICAL **: g_object_set_qdata_full: assertion 'G_IS_OBJECT (object)' failed (ModemManager:14475): GLib-GObject-CRITICAL **: g_object_get: assertion 'G_IS_OBJECT (object)' failed (ModemManager:14475): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'MMIfaceModem' (ModemManager:14475): GLib-GObject-CRITICAL **: g_object_get: assertion 'G_IS_OBJECT (object)' failed ModemManager[14475]: <info> Creating modem with plugin 'Ericsson MBM' and '7' ports ModemManager[14475]: <info> Modem for device at '/sys/devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.5' successfully created ModemManager[14475]: opening device... ModemManager[14475]: [/dev/cdc-wdm1] Read max control message size from descriptors file: 512 ModemManager[14475]: <info> Modem: state changed (unknown -> disabled) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/1: state changed (disabled -> enabling) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/1: 3GPP Registration state changed (unknown -> registering) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/1: 3GPP Registration state changed (registering -> home) ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/1: 3GPP location updated (MCC: '242', MNC: '1', Location area code: '0', Cell ID: '0') ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/1: state changed (enabling -> registered) (ModemManager:14475): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'MMIfaceModem' (ModemManager:14475): GLib-GObject-CRITICAL **: g_object_get: assertion 'G_IS_OBJECT (object)' failed [Thread 0x7fffeffff700 (LWP 14831) exited] ModemManager[14475]: <info> Simple connect started... ModemManager[14475]: <info> Simple connect state (4/8): Wait to get fully enabled ModemManager[14475]: <info> Simple connect state (5/8): Register ModemManager[14475]: <info> Simple connect state (6/8): Bearer ModemManager[14475]: <info> Simple connect state (7/8): Connect ModemManager[14475]: <info> Modem /org/freedesktop/ModemManager1/Modem/1: state changed (registered -> connecting) (ModemManager:14475): GLib-GObject-CRITICAL **: g_object_get: assertion 'G_IS_OBJECT (object)' failed Program received signal SIGSEGV, Segmentation fault. 0x00007ffff7152eb5 in g_type_check_instance_is_fundamentally_a (type_instance=type_instance@entry=0x41, fundamental_type=fundamental_type@entry=80) at gtype.c:3979 3979 if (!type_instance || !type_instance->g_class) (gdb) bt full #0 0x00007ffff7152eb5 in g_type_check_instance_is_fundamentally_a (type_instance=type_instance@entry=0x41, fundamental_type=fundamental_type@entry=80) at gtype.c:3979 node = <optimized out> #1 0x00007ffff7134bd7 in g_object_unref (_object=0x41) at gobject.c:3067 _g_boolean_var_ = <optimized out> object = 0x41 #2 0x00005555555caa2b in device_notification_cb (notification=0x5555558d8720, self=0x5555558a0310) at mm-broadband-modem-mbim.c:1730 session_id = 0 activation_state = MBIM_ACTIVATION_STATE_ACTIVATING context_type = 0x5555558daecc bearer_list = 0x41 service = MBIM_SERVICE_BASIC_CONNECT __FUNCTION__ = "device_notification_cb" #3 0x00005555555caa2b in device_notification_cb (notification=0x5555558d8720, self=0x5555558a0310) at mm-broadband-modem-mbim.c:1838 service = MBIM_SERVICE_BASIC_CONNECT __FUNCTION__ = "device_notification_cb" #4 0x00005555555caa2b in device_notification_cb (device=<optimized out>, notification=0x5555558d8720, self=0x5555558a0310) at mm-broadband-modem-mbim.c:1973 service = MBIM_SERVICE_BASIC_CONNECT __FUNCTION__ = "device_notification_cb" #5 0x00007ffff7132d13 in g_cclosure_marshal_VOID__BOXEDv (closure=0x5555558c9270, return_value=<optimized out>, instance=<optimized out>, args=<optimized out>, marshal_data=<optimized out>, n_params=<optimized out>, param_types=0x55555587e120) at gmarshal.c:1160 cc = 0x5555558c9270 data1 = <optimized out> data2 = <optimized out> callback = <optimized out> arg0 = 0x5555558d8720 args_copy = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 0x7fffffffe180, reg_save_area = 0x7fffffffe0c0}} #6 0x00007ffff712ff64 in _g_closure_invoke_va (closure=closure@entry=0x5555558c9270, return_value=return_value@entry=0x0, instance=instance@entry=0x7fffe8004f80, args=args@entry=0x7fffffffe0a0, n_params=<optimized out>, param_types=0x55555587e120) at gclosure.c:831 marshal = <optimized out> marshal_data = <optimized out> in_marshal = 0 real_closure = 0x5555558c9250 __FUNCTION__ = "_g_closure_invoke_va" #7 0x00007ffff7149b60 in g_signal_emit_valist (instance=0x7fffe8004f80, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7fffffffe0a0) at gsignal.c:3218 return_accu = 0x0 accu = {g_type = 0, data = {{v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}} accumulator = 0x0 emission = {next = 0x0, instance = 0x7fffe8004f80, ihint = {signal_id = 19, detail = 0, run_type = G_SIGNAL_RUN_FIRST}, state = EMISSION_RUN, chain_type = 93824995638656} signal_id = 19 instance_type = 93824995638656 emission_return = {g_type = 0, data = {{v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}} rtype = 4 static_scope = 0 fastpath_handler = <optimized out> closure = 0x5555558c9270 run_type = <optimized out> l = <optimized out> fastpath = <optimized out> instance_and_params = <optimized out> signal_return_type = <optimized out> param_values = <optimized out> node = <optimized out> i = <optimized out> n_params = <optimized out> __FUNCTION__ = "g_signal_emit_valist" #8 0x00007ffff714a3af in g_signal_emit (instance=instance@entry=0x7fffe8004f80, signal_id=<optimized out>, detail=detail@entry=0) at gsignal.c:3365 var_args = {{gp_offset = 24, fp_offset = 48, overflow_arg_area = 0x7fffffffe180, reg_save_area = 0x7fffffffe0c0}} #9 0x00007ffff76fc200 in indication_ready (self=0x7fffe8004f80 [MbimDevice], res=0x7fffe800b0f0) at mbim-device.c:436 error = 0x0 indication = <optimized out> ---Type <return> to continue, or q <return> to quit--- #10 0x00007ffff73e3807 in g_simple_async_result_complete (simple=0x7fffe800b0f0 [GSimpleAsyncResult]) at gsimpleasyncresult.c:763 current_source = 0x5555558dafb0 current_context = <optimized out> __FUNCTION__ = "g_simple_async_result_complete" #11 0x00007ffff73e3869 in complete_in_idle_cb (data=<optimized out>) at gsimpleasyncresult.c:775 simple = <optimized out> #12 0x00007ffff6e30aeb in g_main_context_dispatch (context=0x555555836a00) at gmain.c:3111 dispatch = 0x7ffff6e2d630 <g_idle_dispatch> prev_source = 0x0 was_in_call = 0 user_data = 0x7fffe800b0f0 callback = 0x7ffff73e3860 <complete_in_idle_cb> cb_funcs = 0x7ffff711e8c0 <g_source_callback_funcs> cb_data = 0x5555558db050 need_destroy = <optimized out> source = 0x5555558dafb0 current = 0x5555558345b0 i = 0 #13 0x00007ffff6e30aeb in g_main_context_dispatch (context=context@entry=0x555555836a00) at gmain.c:3710 #14 0x00007ffff6e30e88 in g_main_context_iterate (context=0x555555836a00, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3781 max_priority = 0 timeout = 0 some_ready = 1 nfds = <optimized out> allocated_nfds = 9 fds = 0x555555896300 #15 0x00007ffff6e311b2 in g_main_loop_run (loop=0x55555583b050) at gmain.c:3975 __FUNCTION__ = "g_main_loop_run" #16 0x0000555555577f59 in main (argc=<optimized out>, argv=<optimized out>) at main.c:150 inner = <optimized out> err = 0x0 name_id = 1 __FUNCTION__ = "main"
Thanks for the backtrace and logs, it clearly is some missing reference to the modem object somewhere, hopefully in the new code. Also, I think it's time we end up plugging in ModemManager to PM events; I'll go and do that as well as part of this branch. It's already too much for a stable bugfix release, so it'll go only to git master for now.
(In reply to Aleksander Morgado from comment #30) > Thanks for the backtrace and logs, it clearly is some missing reference to > the modem object somewhere, hopefully in the new code. Ack, let me know if I can provide more debug info or anything else. > Also, I think it's time we end up plugging in ModemManager to PM events; > I'll go and do that as well as part of this branch. It's already too much > for a stable bugfix release, so it'll go only to git master for now. That sounds like a sensible thing to do. If you make MM to a resume/thaw event by doing something active that triggers the NotOpened error and subsequent reprobing, then I think that would be as good as it gets. You could, for example, explicitly probe for network registration status and signal strength. I think it makes sense to do so immediately after resume/thaw even if you don't care aout the NotOpened stuff, because the laptop might have moved a great distance while it's been suspended/hibernated and so the previous information from before the suspend/hibernation might be completely incorrect. Tore
> > Also, I think it's time we end up plugging in ModemManager to PM events; > > I'll go and do that as well as part of this branch. It's already too much > > for a stable bugfix release, so it'll go only to git master for now. > > That sounds like a sensible thing to do. If you make MM to a resume/thaw > event by doing something active that triggers the NotOpened error and > subsequent reprobing, then I think that would be as good as it gets. > > You could, for example, explicitly probe for network registration status and > signal strength. I think it makes sense to do so immediately after > resume/thaw even if you don't care aout the NotOpened stuff, because the > laptop might have moved a great distance while it's been > suspended/hibernated and so the previous information from before the > suspend/hibernation might be completely incorrect. > That is very true. My only concern here is that in most cases what we do see during a suspend/resume is the modem ports (either USB TTYs or cdc-wdm ports) being removed from the kernel, and re-created. In that case the modem is already probed by default from scratch, so that's good; it's this case you're reporting the only one I know of (if I'm not mistaken) where the control ports aren't removed/recreated during the cycle. So, an explicit signal or registration check is a good idea, but we also need to make sure that these actions get properly cancelled if the ports end up getting removed during the cycle, which is the most common case. I'll try to draft something :)
Tore, could you retry the suspend/resume tests with libmbim 1.12 and the 'aleksander/suspend-resume' branch in ModemManager git?
(In reply to Aleksander Morgado from comment #33) > Tore, could you retry the suspend/resume tests with libmbim 1.12 and the > 'aleksander/suspend-resume' branch in ModemManager git? BTW, please configure the branch using --with-suspend-resume=systemd (or --with-suspend-resume=upower if you're not using systemd).
Tested it, there is some clear progress but it's not 100% reliable yet. The modem does get re-initialised/discovered automatically upon resume (and assigned a new index), but sometimes this fails and the modem ends up in an defective state. I'll attach debug files when I reproduce this, after the first suspend/resume cycle it is broken, but after another suspend/resume it works again. Timeline for correlating with the log files: 13:26:54 MM started after initial boot mmcli -m 0 at this point looks OK. 13:28:31 System suspended 13:28:48 System resumed mmcli -m 1 at this point does not look OK, most of the information is "unknown" or "none". I did nothing to actively make MM re-discover the modem. Sending SMS fails with "error: modem not enabled yet", and the NetworkManager systray applet indicates that mobile broadband is disabled, and won't allow me to try to establish WWAN connections. 13:31:30 System suspended 13:31:47 System resumed mmcli -m 2 at this point looks OK again. Again, I did nothing to actively make M M re-discover the modem. 13:32:34 Test SMS sent - OK! 13:33:09 Estalished WWAN connection through NM - OK! 13:33:35 Shut down MM Tore
Created attachment 112386 [details] ModemManager debug output cf. comment #35
Created attachment 112387 [details] mbim-proxy debug output cf. comment #35
Created attachment 112388 [details] mmcli -m $idx output cf. comment #35
Looking better, yes. The problem in this last try you did was that mbim-proxy didn't detect the not-opened issue until the first message was sent from MM. At this point, it probably is worth to update MBIM port probing in MM so that not only open/close is done (which doesn't send any message to the modem when proxy used), but also a real command message, which never gets filtered out in the proxy. With that, and some retries logic, the detection of the need of a reopen should be almost instant. Let me prepare something...
Ok, I was finally able to setup an environment to test this myself. My test setup is a Telit LN930 in MBIM mode connected via an externally powered minipci<->usb adapter. When I set the PC to suspend the power adapter of the modem is still on. I then just need to disconnect the USB and reconnect it before resuming from suspend, and I'll get NotOpened errors right away. I developed some additional fixes in libmbim, so that when a client comes with a new Proxy Config Set request, we'll explicitly check if the session is open, using a Device Caps Query message. This request will get a NotOpened error and we'll silently re-open the device internally. Please test branches: * libmbim "aleksander/internal-open-checks" * ModemManager "aleksander/suspend-resume"
(In reply to Aleksander Morgado from comment #40) > Please test branches: > * libmbim "aleksander/internal-open-checks" > * ModemManager "aleksander/suspend-resume" Done. Couldn't break these. Booted up, did a 5-6 suspend/resume cycles. The modem was re-initialised correctly after each resume, and was confirmed to be operational (e.g. I could send SMS or connect to WWAN) after each cycle. Let me know if you're interested in debug logs to confirm anything, but as far as I'm concerned this bug could be closed after these branches are merged onto master. Thanks! :-) Tore
(In reply to Tore Anderson from comment #41) > (In reply to Aleksander Morgado from comment #40) > > > Please test branches: > > * libmbim "aleksander/internal-open-checks" > > * ModemManager "aleksander/suspend-resume" > > Done. Couldn't break these. Booted up, did a 5-6 suspend/resume cycles. The > modem was re-initialised correctly after each resume, and was confirmed to > be operational (e.g. I could send SMS or connect to WWAN) after each cycle. > > Let me know if you're interested in debug logs to confirm anything, but as > far as I'm concerned this bug could be closed after these branches are > merged onto master. Thanks! :-) > yey! Dan, any comment on the suspend-resume branch? I'm not sure the upower backend should be supported any more, as the Suspending/Resuming signals were long gone from upowerd's DBus interfaces...
This was merged to git master already.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.