Bug 26807

Summary: Mute interface for Channel.Type.Call
Product: Telepathy
Component: tp-spec
Status: RESOLVED FIXED
Severity: normal
Priority: high
Version: unspecified
Hardware: Other
OS: All
Whiteboard: draft 1 in 0.19.6, Call
Reporter: Andres Salomon <dilinger>
Assignee: Telepathy bugs list <telepathy-bugs>
QA Contact: Telepathy bugs list <telepathy-bugs>
CC: lassi.syrjala, olivier.crete, will
Bug Depends on:
Bug Blocks: 21847, 24894

Description Andres Salomon 2010-02-28 13:37:47 UTC
Calls need some kind of Mute interface.  An example is given in the rtcom spec (but for StreamedMedia):

http://git.collabora.co.uk/?p=rtcom-telepathy-glib.git;a=blob;f=rtcom-telepathy-glib/Channel_Interface_Mute.xml;h=d6b07a7c7c90f34eefb77b36e3aa2a3b89c831ae;hb=HEAD
Comment 1 Andres Salomon 2010-02-28 13:47:17 UTC
Here's an initial draft spec for Mute:

http://git.collabora.co.uk/?p=user/dilinger/telepathy-spec;a=shortlog;h=refs/heads/mute
Comment 2 Simon McVittie 2010-03-11 09:18:16 UTC
Note that the API provided by RTCom is per-channel, but that's because it was only used for GSM audio calls, where you only have one content and one stream anyway. In Jingle, individual contents can be muted (you can mute audio but leave the webcam on), and video can be "muted" (i.e. to signal "I've turned off my webcam").

Things that this spec draft doesn't say anything about:

* What's the difference between setting the content/stream to receive-only, and muting it? When would you do one, and when would you do the other?

(wjt suggests that the answer is: if you physically have a webcam/microphone, and in general you want to use it within this call, but you've temporarily switched it off, then the direction should be bidirectional and mute should be signalled)

* Interaction with the protocol: does calling the method actually do anything significant or does it just send an informational message? In Hold, requesting hold causes the MediaSignalling interface (or Call equivalent) to ask the streaming implementation (i.e. telepathy-farsight) to stop sending and receiving; does Mute have similar behaviour?

* In general, what use cases are we providing for here?

-------------

Something else this spec doesn't seem to address at all is receiving the muted flag from the other guy(s), and indicating in the UI that they've muted themselves.

If it was per-call, then in Call, this would be a new flag in Call_Member_Flags, and in StreamedMedia, this would be a new flag in the CallState interface. However, it's really per-content.

Do we want to help out simple UIs by providing a "merged" flag on the Channel? If so, what are its semantics? (wjt thinks the answer is "if any of their streams are muted, then the channel as a whole is considered to be muted".)

Do we want to indicate per stream whether the stream is muted? (Probably? so a multi-way video call UI that displays a pane per participant could grey out the video pane for a "video-muted" caller, or hide it entirely.)
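
A minimal sketch of the "merged" flag semantics wjt suggests above, in Python. The function name and the plain list-of-booleans input are illustrative assumptions, not part of any draft spec:

```python
from typing import Iterable


def channel_is_muted(stream_muted_flags: Iterable[bool]) -> bool:
    """Merged flag: the channel counts as muted as soon as any stream is muted."""
    return any(stream_muted_flags)


# Audio still live, video "muted": the merged flag reports muted.
print(channel_is_muted([False, True]))
```

A simple UI could show a single indicator from this value, while a richer UI would still need the per-content or per-stream flags to know which medium was switched off.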
Comment 3 Andres Salomon 2010-03-12 17:10:43 UTC
(In reply to comment #2)
> Note that the API provided by RTCom is per-channel, but that's because it was
> only used for GSM audio calls, where you only have one content and one stream
> anyway. In Jingle, individual contents can be muted (you can mute audio but
> leave the webcam on), and video can be "muted" (i.e. to signal "I've turned off
> my webcam").
> 

Hm, you're right.  This should probably be handled at the Content level, I think (see my comments at the end).


> Things that this spec draft doesn't say anything about:
> 
> * What's the difference between setting the content/stream to receive-only, and
> muting it? When would you do one, and when would you do the other?


For protocols where you have fine-grained control of the stream data, and where tearing down a stream doesn't tear down the entire connection, perhaps there is no difference.  For protocols such as GSM, setting a content/stream to receive-only involves either resetting the connection, or some other kind of coordination between both contacts.  Muting involves merely disabling the audio mic; the content/stream remain untouched.


> 
> (wjt suggests that the answer is: if you physically have a webcam/microphone,
> and in general you want to use it within this call, but you've temporarily
> switched it off, then the direction should be bidirectional and mute should be
> signalled)
> 
> * Interaction with the protocol: does calling the method actually do anything
> significant or does it just send an informational message? In Hold, requesting
> hold causes the MediaSignalling interface (or Call equivalent) to ask the
> streaming implementation (i.e. telepathy-farsight) to stop sending and
> receiving; does Mute have similar behaviour?
> 
> * In general, what use cases are we providing for here?
> 
> -------------
> 
> Something else this spec doesn't seem to address at all is receiving the muted
> flag from the other guy(s), and indicating in the UI that they've muted
> themselves.
> 
> If it was per-call, then in Call, this would be a new flag in
> Call_Member_Flags, and in StreamedMedia, this would be a new flag in the
> CallState interface. However, it's really per-content.
> 
> Do we want to help out simple UIs by providing a "merged" flag on the Channel?
> If so, what are its semantics? (wjt thinks the answer is "if any of their
> streams are muted, then the channel as a whole is considered to be muted".)
> 
> Do we want to indicate per stream whether the stream is muted? (Probably? so a
> multi-way video call UI that displays a pane per participant could grey out the
> video pane for a "video-muted" caller, or hide it entirely.)
> 

I see Mute being for local muting.  That is, it's a quick way (at the UI level) to block sending of video and/or audio.  One can imagine a UI that allows you to separately control muting for voice and for video.  

I don't see a sane way to handle the use case of selectively muting some streams in a multi-party call, as it will lead to crazy complexity.  Ie, if you're in a multi-party call with Alice, Bob, and Carol, and you've decided that you want to support only muting Alice, what happens when you speak and Bob responds?  Does Bob's response get muted as well?  Does Alice simply get completely muted?  I feel that this kind of thing is better handled explicitly by simply removing a contact from a conference.

Therefore, I'm not convinced that we'd ever want to be able to mute at the Stream level; I think muting at the Content level is enough.

It's up to the CM that's implementing Mute to decide what to actually do at the protocol level.  For a protocol like GSM, you mute the mic; the GSM hardware keeps transmitting audio.  For other VOIP protocols, perhaps you actually stop sending stream data where the protocol itself allows such a thing.  If it's not supported, you just send silence (or in the case of a video Content/Stream, you send a greyed-out image).  The important thing here is that Mute is a shortcut for keeping *your* audio or video data from being transmitted.  It shouldn't require any changing of stream states, or any coordination w/ remote contacts.

If I'm off-base w/ any of my assumptions, please let me know..


Comment 4 Will Thompson 2010-03-26 12:26:21 UTC
I just bounced some thoughts off Rob, and came up with:

• The interface should support the UI muting or unmuting any of its own streams, and expose your contacts notifying you that their streams are muted.
• The UI calling Mute() needs to be relayed to the streaming implementation, if there is one; ditto Unmute(). It could be similar to Hold.
• However, Hold is complicated by the fact that the streaming implementation is meant to give up the resources, so unholding can fail. I don't think unmuting should be able to fail like that. However, like Hold, the UI should be able to know when the mic's actually been muted by the streaming implementation, so that the user knows when they can speak freely.

So I think we want:
• methods for Mute and Unmute on Content, along with a MuteStateChanged signal to report when it's actually happened;
• some way on Content.Interface.Media for the CM to tell tp-fs to do the right thing, and for tp-fs to report that it has;
• a representation on ... maybe Stream? to say "the other side has muted their mic/camera".

> Ie, if
> you're in a multi-party call with Alice, Bob, and Carol, and you've decided
> that you want to support only muting Alice, what happens when you speak and Bob
> responds?

I don't think this API needs to be for silencing Alice: it's for silencing your own microphone, and exposing your contacts reporting that they've silenced their microphones.
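
A rough sketch of the Mute/Unmute plus MuteStateChanged idea, in Python rather than D-Bus. The pending states are modelled loosely on Hold; the class, enum and method names are hypothetical and only illustrate how the UI could learn when the streaming implementation has actually muted:

```python
from enum import Enum
from typing import Callable, List


class MuteState(Enum):
    UNMUTED = 0
    PENDING_MUTE = 1
    MUTED = 2
    PENDING_UNMUTE = 3


class Content:
    """Hypothetical stand-in for a Call content with a mute state."""

    def __init__(self) -> None:
        self.state = MuteState.UNMUTED
        self._listeners: List[Callable[[MuteState], None]] = []

    def connect_mute_state_changed(self, callback: Callable[[MuteState], None]) -> None:
        self._listeners.append(callback)

    def _set_state(self, state: MuteState) -> None:
        # Emit the "signal" only on a real change.
        if state is not self.state:
            self.state = state
            for callback in self._listeners:
                callback(state)

    def mute(self) -> None:
        if self.state in (MuteState.UNMUTED, MuteState.PENDING_UNMUTE):
            self._set_state(MuteState.PENDING_MUTE)

    def unmute(self) -> None:
        if self.state in (MuteState.MUTED, MuteState.PENDING_MUTE):
            self._set_state(MuteState.PENDING_UNMUTE)

    def streaming_confirmed(self) -> None:
        """Called when the streaming implementation reports it has complied."""
        if self.state is MuteState.PENDING_MUTE:
            self._set_state(MuteState.MUTED)
        elif self.state is MuteState.PENDING_UNMUTE:
            self._set_state(MuteState.UNMUTED)
```

In this sketch the user would be told they can speak freely only once the state reaches MUTED, not when Mute() is first called.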
Comment 5 Andres Salomon 2010-03-27 09:08:55 UTC
(In reply to comment #4)
> I just bounced some thoughts off Rob, and came up with:
> 
> • The interface should support the UI muting or unmuting any of its own
> streams, and expose your contacts notifying you that their streams are muted.

Ok.  This is different from what I'd envisioned.



> • The UI calling Mute() needs to be relayed to the streaming implementation,
> if there is one; ditto Unmute(). It could be similar to Hold.
> • However, Hold is complicated by the fact that the streaming implementation
> is meant to give up the resources, so unholding can fail. I don't think
> unmuting should be able to fail like that. However, like Hold, the UI should be
> able to know when the mic's actually been muted by the streaming
> implementation, so that the user knows when they can speak freely.

If unmuting can't fail, then resources shouldn't be freed up when a stream/content is muted.  I'm also failing to see the use case for notifying the remote side that you've muted your mic/camera.  For Hold, it makes sense; you want them to know when they can speak.  For mute, I'm not sure that you want them to know that you've muted your mic (and if you've muted your camera, that should be pretty obvious).


> 
> So I think we want:
> • methods for Mute and Unmute on Content, along with a MuteStateChanged
> signal to report when it's actually happened;
> • some way on Content.Interface.Media for the CM to tell tp-fs to do the
> right thing, and for tp-fs to report that it has;
> • a representation on ... maybe Stream? to say "the other side has muted
> their mic/camera".
> 
Comment 6 Will Thompson 2010-03-29 05:43:11 UTC
The interface in rtcom-tp-glib just allows the UI to tell the CM "mute this call; unmute this call", right? A first crack at this wouldn't necessarily need to expose mute notifications from other contacts; seems largely orthogonal.
Comment 7 Andres Salomon 2010-03-29 08:56:15 UTC
(In reply to comment #6)
> The interface in rtcom-tp-glib just allows the UI to tell the CM "mute this
> call; unmute this call", right? A first crack at this wouldn't necessarily need
> to expose mute notifications from other contacts; seems largely orthogonal.
> 

Exactly, unless we decide that (for remote notification purposes) the API *must* be on the stream interface.
Comment 8 Andres Salomon 2010-03-29 13:54:49 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > The interface in rtcom-tp-glib just allows the UI to tell the CM "mute this
> > call; unmute this call", right? A first crack at this wouldn't necessarily need
> > to expose mute notifications from other contacts; seems largely orthogonal.
> > 
> 
> Exactly, unless we decide that (for remote notification purposes) the API
> *must* be on the stream interface.
> 

For now, I've pushed changes to move Mute onto Call.Content.
Comment 9 Will Thompson 2010-03-30 11:27:55 UTC
Discussion from a spec meeting, mostly open questions:

Do we need to tell Farsight to mute things, or just leave it up to the
UI?

Do we want to relay the fact that we've muted our mic to the other
parties? We don't see any harm in it: it's useful metadata, and isn't an
information leak because you can hear that someone's gone silent anyway.

If we make the UI control the actual muting (or in the Maemo case, make
it talk to s-e directly) then it could control whether or not to tell the
other guy by whether or not it calls Mute() on the CM. But for GSM it
needs to call Mute() on the CM because that's the thing which cares. So
two semantics here: one informational, one a request.
Comment 10 Olivier Crête 2010-03-30 12:45:12 UTC
(In reply to comment #9)
> Do we need to tell Farsight to mute things, or just leave it up to the
> UI?

Mute can be implemented in two ways:

1. Turn the volume down to 0
2. Actually stop sending data

In case 1, this is only needed for GSM, since for everything else the UI can do it by itself (as happens right now in Fremantle). That said, Jingle has a "mute" informational message that may be useful.

For case 2, that's what "direction = receive" is.

> Do we want to relay the fact that we've muted our mic to the other
> parties? We don't see any harm in it: it's useful metadata, and isn't an
> information leak because you can hear that someone's gone silent anyway.

For Jingle, we definitely want to do that. I don't think other protocols have that.

> If we make the UI control the actual muting (or in the Maemo case, make
> it talk to s-e directly) then it could control whether or not to tell the
> other guy by whether or not it calls Mute() on the CM. But for GSM it
> needs to call Mute() on the CM because that's the thing which cares. So
> two semantics here: one informational, one a request.

On maemo right now, s-e does not do mute, the UI talks directly to pulseaudio (or audiopolicy or something).

I think the real question is: Does GSM have a direction=receive that's different from Mute? If yes, then we need a Mute() call, if not just change the direction.


Comment 11 Will Thompson 2010-04-21 08:42:06 UTC
(In reply to comment #10)
> I think the real question is: Does GSM have a direction=receive that's
> different from Mute? If yes, then we need a Mute() call, if not just change the
> direction.

On XMPP, streams can't have direction: none. So under your suggestion, the call would end if we both pressed mute... (Unless you're suggesting changing the direction locally but not on the wire?)
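
To make the objection concrete, here is a small Python sketch of what happens if mute is mapped naively onto stream direction. The function and the direction labels are illustrative assumptions, not Jingle or Telepathy identifiers:

```python
def direction_after_mutes(local_muted: bool, remote_muted: bool) -> str:
    """Naive mapping of each side's mute onto the stream direction (local view)."""
    if local_muted and remote_muted:
        return "none"     # neither side sends: the case XMPP cannot express
    if local_muted:
        return "receive"  # we only receive
    if remote_muted:
        return "send"     # we only send
    return "both"


print(direction_after_mutes(True, True))
```

Under this mapping, both parties pressing mute produces the direction that streams on XMPP can't have, which is why the call would end.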
Comment 12 Olivier Crête 2010-04-21 09:31:53 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > I think the real question is: Does GSM have a direction=receive that's
> > different from Mute? If yes, then we need a Mute() call, if not just change the
> > direction.
> 
> On XMPP, streams can't have direction: none. So under your suggestion, the call
> would end if we both pressed mute... (Unless you're suggesting changing the
> direction locally but not on the wire?)

That's exactly why I wanted senders=none in Jingle. Can you have a call with no contents ? (probably not?). Anyway, if XMPP is broken, we should fix it. That also probably means that Gabble needs to be hacked to always send sound even if the direction doesn't say it.

That said, there is the added complication of WiFi power management: it buffers packets for a long time if it does not receive anything. So we have to always be sending regularly anyway. So Mute for VoIP really has to mean "I'll send sound buffers, they just happen to have no sound in them".

So, unless mute exists for GSM, I would suggest just changing the direction on the TP level and have the VoIP CMs interpret that not as a change in direction, but as a "please send no sound at all".
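
A minimal sketch of "sound buffers with no sound in them", in Python. It assumes raw signed linear PCM frames, where all-zero bytes are silence; a real VoIP pipeline would do this before the encoder, and the function name is hypothetical:

```python
def outgoing_frame(captured: bytes, muted: bool) -> bytes:
    """Return the frame to transmit: the captured audio, or equally long silence."""
    if muted:
        # All-zero samples are silence for signed linear PCM (assumption).
        return bytes(len(captured))
    return captured


frame = b"\x10\x20\x30\x40"
print(outgoing_frame(frame, muted=True))
```

The frame length and send cadence stay the same whether or not the stream is muted, so packets keep flowing regularly.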
Comment 13 Will Thompson 2010-05-04 11:35:41 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > (In reply to comment #10)
> > > I think the real question is: Does GSM have a direction=receive that's
> > > different from Mute? If yes, then we need a Mute() call, if not just change the
> > > direction.
> > 
> > On XMPP, streams can't have direction: none. So under your suggestion, the call
> > would end if we both pressed mute... (Unless you're suggesting changing the
> > direction locally but not on the wire?)
> 
> That's exactly why I wanted senders=none in Jingle. Can you have a call with no
> contents ? (probably not?). Anyway, if XMPP is broken, we should fix it.

You can't.

> That
> also probably means that Gabble needs to be hacked to always send sound even if
> the direction doesn't say it.

What? Why would you do that?

> That said, there is the added complication of WiFi power management: it
> buffers packets for a long time if it does not receive anything. So we have to
> always be sending regularly anyway. So Mute for VoIP really has to mean "I'll
> send sound buffers, they just happen to have no sound in them".
> 
> So, unless mute exists for GSM, I would suggest just changing the direction on
> the TP level and have the VoIP CMs interpret that not as a change in direction,
> but as a "please send no sound at all".

That seems kind of perverse.

I remain convinced that there's a semantic distinction between a unidirectional stream, and a bidirectional stream where one party has muted their microphone...
Comment 14 Olivier Crête 2010-05-04 11:55:52 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > (In reply to comment #11)
> > That
> > also probably means that Gabble needs to be hacked to always send sound even if
> > the direction doesn't say it.
> 
> What? Why would you do that?

For the wireless power management thing. Actually, this is how it's implemented on maemo. You can't have a unidirectional Audio stream (for voip). It only mutes pulseaudio.

Another possibility is to just not implement Mute in XMPP at all or add a property to the interface to say it's only an informational message (and that it does not actually Mute anything).
Comment 15 Will Thompson 2010-05-04 12:08:09 UTC
(In reply to comment #14)
> Another possibility is to just not implement Mute in XMPP at all or add a
> property to the interface to say it's only an informational message (and that it
> does not actually Mute anything).

I think a D-Bus method called Mute() which doesn't actually mute anything would be surprising... ☺
Comment 16 Simon McVittie 2010-05-05 04:23:23 UTC
*** Bug 21846 has been marked as a duplicate of this bug. ***
Comment 17 Senko Rasic 2010-05-06 01:21:50 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > Another possibility is to just not implement Mute in XMPP at all or add a
> > property to the interface to say it's only an informational message (and that it
> > does not actually Mute anything).
> 
> I think a D-Bus method called Mute() which doesn't actually mute anything would
> be surprising... ☺

Instead of (Request)Mute() that behaves erratically, or hinting muting by changing directions which CMs should then ignore (even more bizarre :), can we just have a Muted(b:muted) method, and say:

"""
This is an informational message from the client to the CM, to inform it that the client has muted the Call.Content. The client is responsible for (un)muting itself (e.g. by muting the microphone or shutting off the camera). The client MUST call this whenever it mutes or unmutes the content.

Rationale: for some protocols, the fact that the content is muted needs to be transmitted to the peer; for others, the notification to the peer is only informational (eg. XMPP), and some protocols may have no notion of muting at all.
"""

(Also: in this case, do we need elaborate state changes?)
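
A sketch of how a CM might treat such an informational Muted(b) call, in Python. The class name, the optional notify_peer callback and the recorded signal tuples are hypothetical; the point is only that the CM records the state, tells other clients, and relays it to the peer where the protocol has a way to do so:

```python
from typing import Callable, List, Optional, Tuple


class InformationalContent:
    """Hypothetical content object: the client mutes itself, then informs the CM."""

    def __init__(self, notify_peer: Optional[Callable[[bool], None]] = None) -> None:
        self.mute_state = False
        self.signals: List[Tuple[str, bool]] = []
        # None models a protocol with no notion of muting at all.
        self._notify_peer = notify_peer

    def muted(self, muted: bool) -> None:
        if muted == self.mute_state:
            return  # nothing changed, nothing to announce
        self.mute_state = muted
        self.signals.append(("MuteStateChanged", muted))
        if self._notify_peer is not None:
            self._notify_peer(muted)
```

Because the call only reports something that has already happened, there is a single boolean and no pending state to track.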
Comment 18 Senko Rasic 2010-05-09 01:46:17 UTC
(In reply to comment #17)
> Instead of (Request)Mute() that behaves erratically, or hinting muting by
> changing directions which CMs should then ignore (even more bizarre :), can we
> just have Muted(b:muted) method..

Pushed a branch with this suggestion implemented:
http://git.collabora.co.uk/?p=user/ptlo/tp-spec-senko/.git;a=shortlog;h=refs/heads/mute

Since the method is informational, there's no need for transitional states, so it also simplifies the interface a bit.

Related point: AIUI it's implicitly understood that only the channel handler should mute the call (after all, it's probably the one actually controlling mic/camera anyways). Should we mandate this, or should we explicitly say that any client can do it? (in which case, all clients should be watching for MuteStateChanged to know what's the current state and whether they can mute or unmute it).

(TBH I don't have a good use case for why anyone but handler would need it.)
Comment 19 Olivier Crête 2010-05-09 10:25:25 UTC
On GSM, I guess the method would actually mute the stream ? Wouldn't it ?
Comment 20 Senko Rasic 2010-05-10 03:29:15 UTC
(In reply to comment #19)
> On GSM, I guess the method would actually mute the stream ? Wouldn't it ?

Yes, if muting's done on protocol level (can't confirm it). So the stream itself would be muted (by the CM, as part of the protocol), *and* the mic (by the client) would be muted.
Comment 21 Simon McVittie 2010-05-10 04:48:17 UTC
Editorial nitpicking:

+      <p>Although it's client's responsibility to actually mute the microphone
+        or turn off the camera, using this interface the client can also
+        inform the CM and other clients of that fact.
+        <tp:rationale>
+          For some protocols, the fact that the content is muted needs to be
+          transmitted to the peer; for others, the notification to the peer is
+          only informational (eg. XMPP), and some protocols may have no notion
+          of muting at all.
+        </tp:rationale></p>

Please end the paragraph after "fact", and move <p></p> inside the rationale. <tp:rationale> is conceptually "larger than" <p>.
Comment 22 Simon McVittie 2010-05-10 04:52:57 UTC
More non-merge-blocker editorial nitpicking:

+          A boolean indicating whether or not the content was muted.

"True if the content is now muted"

+        The current mute state of the call content.

"True if the content is muted"

+          A boolean indicating whether or not the content is muted by the
+          client.

"True if the client has muted the content"

+        <tp:error name="org.freedesktop.Telepathy.Error.Disconnected"/>

I don't think a Content method should be able to raise Disconnected (which really means "not yet connected"). NetworkError is probably the only useful error here?
Comment 23 Simon McVittie 2010-05-25 06:59:45 UTC
All of my editorial complaints have been fixed and draft 1 is in master, so there's no patch here.
Comment 24 Jonny Lamb 2010-10-13 06:47:56 UTC
This is fixed as a draft in git, closing.
