Bug 90894

Summary: Problem selecting server reflexive candidates
Product: nice
Component: General
Version: unspecified
Hardware: Other
OS: All
Status: RESOLVED MOVED
Severity: normal
Priority: medium
Reporter: Jakub Adam <jakub.adam>
Assignee: Olivier Crête <olivier.crete>
QA Contact:
CC: ilya.konstantinov, kakaroto
Whiteboard:
Attachments:
  conncheck: generate candidate pair for valid srflx candidate
  conncheck: rename priv_process_response_check_for_peer_reflexive()

Description Jakub Adam 2015-06-08 06:45:56 UTC
Created attachment 116353 [details] [review]
conncheck: generate candidate pair for valid srflx candidate

In priv_process_response_check_for_peer_reflexive(), the mere presence of a candidate in local_candidates doesn't mean that some candidate pair in conncheck_list also uses it. For instance, the candidate may be server reflexive, and no check pairs are initially created for such candidates (see conn_check_add_for_candidate_pair()).

If we fail to find a corresponding pair upon receiving such a candidate's IP in a conncheck response's XOR-MAPPED-ADDRESS attribute, we should add a new one, much as we would add a new pair for a newly discovered peer reflexive candidate.

The previous priv_process_response_check_for_peer_reflexive() implementation would return NULL in this case, causing a CandidateCheckPair with a local candidate of type HOST to be wrongly selected even though the local host IP might not be directly accessible by the remote counterpart (e.g. it's an address on a private network segment). In practice this showed up as a duplex connection that libnice reported as properly established, while only one direction of the communication was actually working.
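
The intended lookup can be sketched roughly as follows. This is a minimal, self-contained C illustration and not the attached patch: the types Candidate, CheckPair and CheckList and the function pair_for_mapped_address() are hypothetical simplifications invented for this sketch, not libnice's actual structures or API, and addresses are compared as plain strings to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for libnice's internal types. */
typedef enum { CAND_HOST, CAND_SRFLX, CAND_PRFLX, CAND_RELAYED } CandType;

typedef struct {
    CandType type;
    char addr[64];              /* textual address, for brevity */
} Candidate;

typedef struct {
    const Candidate *local;
    const Candidate *remote;
} CheckPair;

#define MAX_PAIRS 16

typedef struct {
    CheckPair pairs[MAX_PAIRS];
    size_t count;
} CheckList;

/* Find the local candidate whose address equals the mapped address. */
static const Candidate *find_local_by_addr(const Candidate *locals,
                                           size_t n_locals,
                                           const char *mapped)
{
    for (size_t i = 0; i < n_locals; i++)
        if (strcmp(locals[i].addr, mapped) == 0)
            return &locals[i];
    return NULL;
}

/* Find an existing check pair that uses exactly this local/remote pair. */
static CheckPair *find_pair(CheckList *list, const Candidate *local,
                            const Candidate *remote)
{
    for (size_t i = 0; i < list->count; i++)
        if (list->pairs[i].local == local && list->pairs[i].remote == remote)
            return &list->pairs[i];
    return NULL;
}

/*
 * Return the pair matching the XOR-MAPPED-ADDRESS of a successful check.
 * Returns NULL only when the mapped address matches no known local
 * candidate; the caller would then treat it as a new peer reflexive one.
 */
static CheckPair *pair_for_mapped_address(CheckList *list,
                                          const Candidate *locals,
                                          size_t n_locals,
                                          const Candidate *remote,
                                          const char *mapped)
{
    const Candidate *local = find_local_by_addr(locals, n_locals, mapped);
    if (local == NULL)
        return NULL;

    CheckPair *pair = find_pair(list, local, remote);
    if (pair == NULL) {
        /* Known local candidate (e.g. srflx) without a pair: add one
         * instead of giving up and falling back to the host pair. */
        assert(list->count < MAX_PAIRS);
        pair = &list->pairs[list->count++];
        pair->local = local;
        pair->remote = remote;
    }
    return pair;
}
```

The point of the sketch is the second branch: a local candidate that is known but has no pair yet gets a pair created on demand, so the caller never ends up with NULL for an address it already knows about.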
Comment 1 Jakub Adam 2015-06-08 06:46:47 UTC
Created attachment 116354 [details] [review]
conncheck: rename priv_process_response_check_for_peer_reflexive()
Comment 2 Philip Withnall 2015-06-09 16:49:15 UTC
A description of a simple network configuration which triggers this would be appreciated, and would make the review a lot easier to reason about.
Comment 3 Jakub Adam 2015-06-10 07:57:41 UTC
(In reply to Philip Withnall from comment #2)
> A description of a simple network configuration which triggers this would be
> appreciated, and would make the review a lot easier to reason about.

The setup is like the following one:

                  DMZ|                          |Company intranet
                     |                          |
                     |                          |
 initiator (nice)    |                          |  Lync client
     ____   __       |           TURN           |   ____   __
    |    | |==|      |        __________        |  |    | |==|
    |____| |  |      |       [_...__...°]       |  |____| |  |
    /::::/ |__|      |        1.1.109.21        |  /::::/ |__|
  192.168.101.77     |                          |192.168.122.166
  srflx 10.10.215.103|                          |srflx 10.14.128.54
                     |                          |

Machines in 'DMZ' and 'Company intranet' can't connect directly to each other and have to use TURN. A connection check response that 'initiator' gets from 'Lync client' through the TURN server contains the initiator's srflx IP in XOR-MAPPED-ADDRESS (10.10.215.103 in the example). However, after the connection check succeeds, libnice selects - incorrectly, IMO - its local host candidate (IP 192.168.101.77), which is reachable neither by the other participant logged in from the intranet nor by the TURN server.

The communication is conducted over UDP.
Comment 4 Philip Withnall 2015-06-26 14:13:16 UTC
Migrated to Phabricator: http://phabricator.freedesktop.org/T115
