Tuesday, September 11, 2012

On the design of the STUN and TURN URI formats

The first goal of this post is to write down my reasoning for the formats I am promoting for the future STUN and TURN URIs, mostly because I keep forgetting it and have to reconstruct it from scratch each time I have this discussion with other people (and sadly also with myself). This post may also be of interest if you are confused about what STUN and TURN are, and how they can be used.

Let's start with STUN (RFC 5389). It is important to immediately separate the STUN protocol from the STUN usages. The STUN protocol covers how bits are organized on the wire and how STUN packets are sent, received and retransmitted - all details that are not terribly important for this discussion, except in how they contribute to the confusion. The really interesting part is the list of STUN usages, which is the list of different things that can be done with STUN. At the time this post is written there are four different STUN usages, each of which involves a STUN client and a STUN server:
  • NAT Discovery, specified in RFC 5389, which is used to find under which IP address and port a STUN client is visible to a STUN server. If the STUN client is behind a NAT and the STUN server is on the Internet, then the NAT Discovery Usage makes it possible to find the IP address of the NAT.
  • NAT Behavior Discovery, specified in RFC 5780, which is used to find what type of NAT separates a STUN client from a STUN server. It is a bad idea to use this information for anything other than collecting debugging data, which is why this RFC is experimental and why we will not discuss it.
  • Connectivity Check, specified in RFC 5245 (aka ICE), which is used to find if a STUN server can be reached by a STUN client.
  • Keep-alive, specified in RFC 5626, which is used to a) detect if a STUN server can still be reached by a STUN client, b) detect if the NAT/Firewall IP address or port changed and c) keep the NAT/Firewall open.

STUN is defined to be used over UDP, TCP or TLS. STUN cannot yet be used over DTLS (i.e. TLS over UDP), or over any more recent transport like SCTP or DCCP. One fundamental point to understand for this discussion is that the choice of the transport used by STUN depends only on the application needing it. If for instance the NAT Discovery Usage is used with RTP, only STUN over UDP can be of use to this application. STUN over TCP cannot help at all, so the choice of the transport is not left to the user of the application or to the administrators of the STUN server - it is purely a consequence of what the application is trying to achieve.

TURN (RFC 5766) is an application layer tunneling protocol. Although TURN has absolutely nothing to do with any of the Usages described above, it shares the same protocol as STUN - same bits on the wire, same way the packets are sent, received and retransmitted. This is the first reason for the confusion between STUN and TURN, the second being that, to save a round-trip, the TURN allocate transaction returns the exact same information that the STUN NAT Discovery Usage returns. In spite of these similarities with STUN, the job of the TURN protocol is completely different: it is to carry application data between the TURN client and the TURN peer, through the TURN server. This application data can be anything, e.g. RTP packets. It can even be STUN packets, in which case the TURN client can also be a STUN client and the TURN peer (not the TURN server) can also be a STUN server.

Like STUN, TURN is defined to be used over UDP, TCP or TLS between the TURN client and the TURN server. But this is the transport used for the tunnel itself, and the transport used inside the tunnel (i.e. for our RTP or STUN packets) can be different. RFC 5766 defines only UDP as TURN allocation (this is what the inside transport is called in the specification), but RFC 6062 extends TURN by adding support for TCP allocations, although with the limitation that a TCP allocation cannot be used over a UDP transport (i.e. a UDP tunnel cannot carry TCP inside).
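The relation between the tunnel transport and the allocation described above can be written down as a small compatibility check. This is only a sketch of the rule as stated in this post; the names can_carry and usable_tunnels are made up for the illustration and do not come from any RFC or library.

```python
# Tunnel transports defined between the TURN client and the TURN server.
TUNNEL_TRANSPORTS = ("UDP", "TCP", "TLS")


def can_carry(tunnel: str, allocation: str) -> bool:
    """Return True if a tunnel over `tunnel` can carry an `allocation`."""
    if tunnel not in TUNNEL_TRANSPORTS:
        raise ValueError(f"unknown tunnel transport: {tunnel!r}")
    if allocation == "UDP":
        # RFC 5766: a UDP allocation works over any tunnel transport.
        return True
    if allocation == "TCP":
        # RFC 6062: a TCP allocation cannot be used over a UDP tunnel.
        return tunnel != "UDP"
    raise ValueError(f"unknown allocation type: {allocation!r}")


def usable_tunnels(allocation: str) -> list:
    """List, in alphabetical order, the tunnels that can carry `allocation`."""
    return sorted(t for t in TUNNEL_TRANSPORTS if can_carry(t, allocation))
```

For an application that needs UDP towards the peer, usable_tunnels("UDP") contains all three tunnel transports, which is why such an application has no reason to care which one is picked.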

The very important point here is that the application does not care which transport is used for the TURN tunnel - it can be any tunnel transport that can carry the inside transport that the application needs to use with the peer. So if the application needs UDP to send STUN or RTP to the peer, it does not matter if the tunnel transport is UDP, TCP or TLS.

On the other hand, which tunnel transport is available can matter for the provider of the TURN server. Unlike STUN servers, TURN servers use real resources (ports, bandwidth, CPU), so the administrators of these TURN servers may want to be able to balance the load, fail over between servers, and so on. One of the other things that an administrator may want to manage is the priority between the different tunnel transports that a TURN client can use, and this is exactly what RFC 5928 provides.

But before going into RFC 5928, let's have a look at the way the DNS interacts with STUN and TURN. A TURN server, or a STUN server for the first two STUN Usages listed above (NAT Discovery and NAT Behavior Discovery), is generally deployed on a fixed public Internet address, and so it is useful to use the DNS to associate a name with it (in an A or AAAA record). Because more than one instance of these servers is generally required to run a service, SRV records can be used to distribute the load between servers, to manage fail-over and to assign a port to the servers. What RFC 5928 adds to this is the definition of a NAPTR record to select the transport.

Under RFC 5928, when an application wants to use a TURN server, it has to provide two sets of information. The first set contains the list of tunnel transports that the application implements. The second set, which is probably stored in the configuration of the application, contains the name of the domain for the TURN server, an optional port, an optional transport and an optional secure flag. The algorithm in RFC 5928 takes these two sets of information and spits out an ordered list of IP address, port and tunnel transport triples that the TURN client can try, in order, to establish the tunnel. As soon as the tunnel is established, the TURN client can request a TCP or a UDP allocation to send and receive packets, depending, as explained above, on the purpose of the application.
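To make the two input sets and the output list concrete, here is a heavily simplified sketch. It only handles the case where the host is already an IP address, so the NAPTR and SRV lookups, which are the part that lets administrators order the transports, are left out entirely. The function names are hypothetical, the secure flag is modelled as a plain boolean, and the default ports (3478 for UDP and TCP, 5349 for TLS) are an assumption taken from RFC 5766 rather than from this post.

```python
from typing import List, Optional, Tuple

# Assumed default ports: 3478 for UDP and TCP, 5349 for TLS.
DEFAULT_PORTS = {"UDP": 3478, "TCP": 3478, "TLS": 5349}


def candidate_transports(supported: List[str], secure: bool,
                         transport: Optional[str] = None) -> List[str]:
    """Filter the tunnel transports implemented by the application."""
    if transport is None:
        # Nothing forced: keep TLS when secure, UDP and TCP otherwise,
        # in the order given by the application.
        return [t for t in supported if (t == "TLS") == secure]
    if secure:
        # Secure flag plus TCP means TLS; there is no secure UDP tunnel.
        wanted = "TLS" if transport == "TCP" else None
    else:
        wanted = transport if transport in ("UDP", "TCP") else None
    return [wanted] if wanted in supported else []


def resolve_ip_literal(ip: str, supported: List[str], secure: bool = False,
                       port: Optional[int] = None,
                       transport: Optional[str] = None
                       ) -> List[Tuple[str, int, str]]:
    """Return the (IP address, port, tunnel transport) triples to try."""
    return [
        (ip, port if port is not None else DEFAULT_PORTS[t], t)
        for t in candidate_transports(supported, secure, transport)
    ]
```

With a domain name instead of an IP address, the real algorithm would consult the DNS at this point, and the order of the resulting list would reflect the preferences of the TURN server administrators instead of the order of the application's own list.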

Because there is no point in having the STUN server administrators choose the transport, there is no need to define something equivalent to RFC 5928 for STUN.

The TURN URI as currently designed carries all the information that is in the second set passed to the RFC 5928 algorithm. The URI "turn:example.org" fills the host parameter with "example.org", and sets the secure flag, the transport and the port to undefined. The URI "turns:[2001:DB8::1]:2345;transport=TCP" sets the host to the IPv6 address 2001:DB8::1, the secure flag to on, the port to 2345 and the transport to TCP.
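The mapping from URI to parameters can be sketched as a small parser. This follows the format as described in this post (with a ";transport=" parameter), not any finalized specification. The TurnUri type and the parse_turn_uri function are hypothetical names, the secure flag is simplified to a boolean that is off for "turn:", and the IPv6 literal is only checked loosely.

```python
import re
from typing import NamedTuple, Optional


class TurnUri(NamedTuple):
    host: str
    secure: bool
    port: Optional[int]        # None means undefined
    transport: Optional[str]   # None means undefined


_TURN_URI = re.compile(
    r"^(?P<scheme>turns?):"
    # Either a bracketed IPv6 literal (loosely matched) or a plain host.
    r"(?:\[(?P<v6>[0-9A-Fa-f:.]+)\]|(?P<host>[^\[\]:;]+))"
    r"(?::(?P<port>\d+))?"
    r"(?:;transport=(?P<transport>[A-Za-z]+))?$"
)


def parse_turn_uri(uri: str) -> TurnUri:
    """Split a TURN URI into host, secure flag, port and transport."""
    m = _TURN_URI.match(uri)
    if m is None:
        raise ValueError(f"not a TURN URI: {uri!r}")
    return TurnUri(
        host=m.group("v6") or m.group("host"),
        secure=m.group("scheme") == "turns",
        port=int(m.group("port")) if m.group("port") else None,
        transport=m.group("transport"),
    )
```

Applied to the two URIs above, parse_turn_uri("turn:example.org") leaves the port and transport undefined, while parse_turn_uri("turns:[2001:DB8::1]:2345;transport=TCP") fills in all four fields.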

Let's now put the TURN URI back in the WebRTC context, which is the reason it is needed in the first place. The TURN URI is passed from the Web server to the browser in the JavaScript code. In normal operations, the TURN URI will probably be something like "turns:example.org", meaning that the tunnel transport will be negotiated between the capabilities of the browser and what the administrators of the TURN servers in the example.org domain prefer. But the administrators of the Web server may want, for debugging reasons, to use a specific server and port, e.g. "turn:[2001:DB8::1]:1234". They may also want to force a specific transport, knowing that the other transports have an unfixed bug, by using something like "turn:example.org;transport=UDP". This flexibility is even more useful knowing that even with the cooperation of the DNS administrators, it will take some time for new DNS records to propagate. So in this context, it makes sense that the TURN URI has a transport parameter.

On the other hand, a transport parameter on a STUN URI would make no sense, because the transport used by STUN is dictated by the application. If the UDP transport has a bug in the STUN servers, switching to a TCP transport cannot help an application that is trying to send RTP packets.

One of the alternative formats that were proposed for the TURN and STUN URIs was to lose the "s" suffix in the "turns" and "stuns" schemes and to consolidate it inside a ";proto=" parameter. With this alternative format, "turns:[2001:DB8::1]:2345;transport=TCP" becomes "turn:[2001:DB8::1]:2345;proto=TLS". But because, as demonstrated previously, the STUN URI does not need a transport parameter, it is not possible to remove its "s" suffix by folding it into a ";proto=" parameter. One way would be to convert "stuns:example.org" to "stun:example.org;secure", but one can ask how this is better than the original STUN URI.

For all these reasons, and because it would look strange if STUN used the "s" suffix and TURN did not, I think that the right format is to allow both the "turns" and "stuns" schemes, and to use the ";transport=" parameter only for TURN URIs.

Updated 09/12/2012: Added a bit more text about the interaction between STUN/TURN and the DNS.