Network Configuration for VoIP Providers

Revision as of 13:37, 6 May 2010 by Hubbertz
Network Configuration for VoIP Providers provides information about implementing VoIP communication with the SIP protocol between private and public IP networks. The first chapters give the technical background for possible problems and their solutions. Practical recommendations for the SEN small and medium platforms can be found here.
This document gives recommendations on how the local network (LAN) should be configured to use VoIP providers (ITSP).

Please note that while some of the guidelines are valid in general, most of them apply to SEN SME telephony systems only.

The problem with NAT & Firewalls

SIP & SDP

Internet telephony uses the Session Initiation Protocol (SIP) to establish phone calls (or other multimedia sessions). SIP messages can contain a body with Session Description Protocol (SDP) data, which contains at least one IP address and port used for sending and receiving the audio (voice) data (RTP).

Some important SIP headers also contain the local socket address (IP address and port).

Network address translation (NAT)

Figure 1: Typical Internet access with NAT

This is a problem if a NAT router is present between the two telephony endpoints. The NAT router translates private IP addresses to the public one (e.g. the ADSL endpoint address). In the example network in Figure 1, the router translates the private network 192.168.1.0/24 to the single IP address 149.246.229.150. In many cases the port is changed as well; this is sometimes called network address port translation (NAPT). In this document, the term NAT may always also mean NAPT.

Figure 2: TCP/IP Layer Model (Source: Wikimedia Commons)

The address change is made in IP/UDP header only – the data (SIP+SDP) is left unchanged.

Binding life-time

Practically all NAT implementations need to store some information about their state to recognize response packets and change them accordingly. This state information is called a binding.

These bindings have a time-to-live (TTL) that is implementation-specific and may also depend on the port number used and the number of transmitted packets (e.g. DNS requests/responses have a shorter lifetime than other packets).

Stateful firewalls that do not perform NAT store a “binding” that has a specific time-to-live, too.

That means that if there is no response (or new request) on an open binding, the binding is closed after some time, and responses or new incoming requests can no longer traverse the router or firewall.

It also means that an incoming packet can be received only after a binding has been opened.
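
The binding behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class name NatBindingTable, the 60 second lifetime and the rule that traffic in either direction refreshes the timer are chosen for illustration and do not describe any particular router.

```python
from dataclasses import dataclass, field


@dataclass
class NatBindingTable:
    """Toy model of a NAT/firewall binding table (hypothetical, for illustration)."""

    ttl: float = 60.0  # assumed lifetime in seconds; real devices differ
    _last_seen: dict = field(default_factory=dict, init=False)

    def outgoing(self, local, remote, now):
        # An outgoing packet opens the binding or restarts its timer.
        self._last_seen[(local, remote)] = now

    def incoming_allowed(self, remote, local, now):
        # An incoming packet passes only if a binding exists and has not expired.
        last = self._last_seen.get((local, remote))
        if last is None:
            return False
        if now - last > self.ttl:
            del self._last_seen[(local, remote)]  # binding timed out
            return False
        # Assumption: accepted incoming traffic also refreshes the binding.
        self._last_seen[(local, remote)] = now
        return True
```

Time is passed in explicitly (the now argument) so that the expiry logic can be followed without waiting for real timers.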

Problems resulting from router behavior

Payload establishment

The private IP address used in the SDP c-line and the port(s) used in the m-line(s) are not changed by the NAT, because they belong to the UDP data.

This private IP address is received by the other telephony endpoint (peer). The peer is not able to use this address as the destination address for its payload.

Furthermore, even if the address were correct, an open binding may be needed to pass a firewall. This is especially a problem for incoming one-way payload (e.g. announcements like "Called number temporarily unavailable").
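
As a rough illustration of the payload problem, the following sketch extracts the media address from an SDP body and reports entries that use a private address. The function names are hypothetical, and the parsing is deliberately simplified: it assumes a single session-level c-line that precedes the m-lines and plain numeric ports.

```python
import ipaddress


def media_addresses(sdp):
    """Return (ip, port) pairs built from the c-line and the m-lines of an SDP body."""
    connection_ip = None
    result = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            connection_ip = line.split()[2]
        elif line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            result.append((connection_ip, int(line.split()[1])))
    return result


def privately_addressed_media(sdp):
    """Return the media addresses that a peer on the public Internet cannot reach."""
    return [
        (ip, port)
        for ip, port in media_addresses(sdp)
        if ip is not None and ipaddress.ip_address(ip).is_private
    ]
```

Applied to an SDP body with c=IN IP4 192.168.138.70 and m=audio 30520, the second function reports that media address, whereas a public address such as 80.144.207.205 is not reported.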

Routing of responses and new requests within the dialog

The SIP signaling is affected as well. The SIP RFC states that responses must be sent back to the port given in the topmost Via header (or to the default port 5060 if no port is present).

So if the NAT router changes the port, the response cannot be sent back to the client. The same applies for new requests within the dialog (e.g. a BYE sent by the called party).

If the ITSP works in proxy mode and does not add the Record-Route header, there is another problem for these requests:

The called party sends new requests within the dialog directly to the peer (the caller), e.g. a BYE request to release the call.

In most configurations, the NAT router or firewall then has no open binding and cannot forward the request to the telephony system. As a result, for example, the caller's side of the call is not released when the callee releases the call.

Registration (Incoming calls)

To receive calls from the ITSP, the telephony system needs to register itself at the ITSP. The address to which incoming calls should be routed is written to the Contact header of the SIP REGISTER request. If a private IP address is written there, the ITSP cannot use it, so no incoming calls are possible.

Even if the public address is written correctly to the Contact header, there must be an open binding on the NAT router or firewall to receive the call. While the typical binding life-time is in the range of 30 to 120 seconds, the typical ITSP re-registration interval is at least 120 seconds, sometimes much longer (e.g. up to 6 hours), so the SIP REGISTER message and its response do not keep the binding open long enough.


Possible Solutions

Application Layer Gateways

There are devices with Application Layer Gateway (ALG) functionality that also inspect the UDP data for IP addresses and thus change SIP and SDP data. Experience shows that those devices are often unreliable, so it is best to switch off any ALG functionality.

STUN (RFC 3489, RFC 5389)

One solution is for the endpoint to find out how its private transport address is changed by the NAT router and to use this information to write correct SIP headers and SDP data. The STUN protocol exists for this purpose. A public STUN server responds to requests with an answer that contains the public address it has seen in the IP/UDP header of the request. Other protocols that query the NAT router directly for the public address are not widely used due to security considerations (UPnP) or are still in draft state (Midcom).
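
To make the request/response exchange more concrete, here is a minimal sketch of the RFC 5389 message format: a 20-byte Binding Request and a parser for the XOR-MAPPED-ADDRESS attribute of the response. The function names are hypothetical, only IPv4 is handled, and the older MAPPED-ADDRESS attribute used by RFC 3489 servers is not covered. Sending the request over UDP and handling retransmissions are left out.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442       # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Build a STUN Binding Request without attributes (header only)."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(response):
    """Return (ip, port) from a Binding Success Response, or None if not found."""
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, 20 + length
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # Port and address are XORed with the magic cookie.
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None
```

The address returned by the parser is the public transport address that the endpoint would then write into its SIP headers and SDP data.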

Remote (provider-side) solutions: SBC, symmetric response routing

Another solution is to ignore the addresses in the SIP and SDP data and simply route responses and payload back to the originating address seen in the IP/UDP header. Obviously, this works only if at least one endpoint uses a public IP address, only one side uses this mechanism, and the other side starts sending payload first. Many ITSPs use media gateways (sometimes also called Session Border Controllers, SBC) that implement this (far-end NAT support or symmetric response routing).

To address all the problems mentioned above, the ITSP has to support symmetric response routing for

  • media: send RTP payload back to the transport-address as seen in IP / UDP header instead of the one received in c/m-lines of the SDP data
  • SIP responses: see rport
  • SIP Requests: Store received IP and UDP source address as contact instead of the contact-header and add Record-Route header if working in proxy mode
  • SIP Registration: store received IP and UDP source address as contact instead of the contact-header
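
The media part of symmetric response routing can be sketched as follows. This is a hypothetical illustration of the idea, not the implementation of any ITSP gateway: the class name SymmetricRtpTarget is invented, and a real gateway would also validate the source of the packets it latches onto.

```python
class SymmetricRtpTarget:
    """Pick the RTP destination from the observed source instead of the SDP data."""

    def __init__(self, sdp_address):
        self.sdp_address = sdp_address  # (ip, port) taken from the c/m-lines
        self.latched = None             # (ip, port) seen in the IP/UDP header

    def packet_received(self, source):
        # Remember where the peer's packets really come from (after NAT).
        self.latched = source

    def destination(self):
        # Until the peer has sent something, only the SDP address is known.
        return self.latched or self.sdp_address
```

The sketch also shows why the side behind the NAT has to send first: before any packet has arrived, the gateway only knows the (possibly private) SDP address.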

rport (RFC 3581)

The rport mechanism changes the SIP routing behavior, so that responses can be received through a NAT even if private addresses are used in the SIP headers. It does not offer a solution for SIP registration and RTP establishment.

For those interested, some more details: The SIP RFC specifies that responses to requests are sent back to the IP address seen in the IP header, but to the port given in the SIP Via header. To achieve this in a multi-proxy environment, every proxy adds a received parameter containing the source IP address of the received packet to the Via header of a SIP message it forwards.

When the rport-Parameter is added by the client to an outgoing SIP request, the receiving proxy also fills this parameter with the source port it sees in the received UDP header and uses it to route back the responses.
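
The following sketch illustrates the mechanism on the value of a Via header. The function names stamp_via and response_target are hypothetical, and the parsing is simplified (no IPv6 literals, case-sensitive parameter names).

```python
def stamp_via(via_value, src_ip, src_port):
    """Fill in received and rport as a receiving proxy would (RFC 3581)."""
    sent_by, *params = via_value.split(";")
    stamped = []
    for param in params:
        if param.strip() == "rport":
            # An empty rport asks the proxy to record the real source address.
            stamped += [f"received={src_ip}", f"rport={src_port}"]
        else:
            stamped.append(param)
    return ";".join([sent_by] + stamped)


def response_target(via_value, default_port=5060):
    """Return the (ip, port) a response is sent to, based on the topmost Via."""
    sent_by, *params = via_value.split(";")
    host, _, port = sent_by.split()[-1].partition(":")
    values = dict(param.strip().partition("=")[::2] for param in params)
    ip = values.get("received", host)
    port = values.get("rport") or port or default_port
    return ip, int(port)
```

For a Via value such as SIP/2.0/UDP 192.168.138.70:5060;rport;branch=..., the response would go to the private address if the header is left untouched, and to the NAT's public address and port once the proxy has filled in received and rport.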

Offered solutions by the HiPath / OpenScape telephony systems

Our telephony systems support STUN and can trigger payload establishment with a provider-side media gateway. (This is needed if the ITSP wants to send payload before we start sending, e.g. if the provider wants to play back an announcement like “number not available” before the local phone starts sending payload.) Furthermore, the rport mechanism is always used (if not deactivated in the ITSP profile), because it does no harm and can help in diagnostics. To keep NAT/firewall bindings open, empty UDP packets are sent to every SIP proxy and registration gateway of all registered ITSPs.


STUN (RFC 3489 and RFC 5389)

NAT type detection

The STUN protocol makes it possible to detect how the router performs NAT. This NAT type detection works with most, but not all, available routers.

Different NAT types were defined in the original STUN RFC 3489. Experience showed that these are ambiguous sometimes and the detection method specified in RFC 3489 is unreliable for some routers. For these reasons RFC 4787 introduced new terms and now distinguishes between NAT behavior and filter behavior. The new STUN RFC 5389 makes use of the new terms.

RFC 3489 (STUN v1) | RFC 4787 (NAT behavioral requirements for UDP) | Colour
Symmetric NAT | NAT with endpoint-dependent mapping (EDM-NAT) | Red
Port Restricted Cone NAT | NAT with endpoint-independent mapping (EIM-NAT) and Address and Port-Dependent Filtering | Green
Restricted Cone NAT | NAT with endpoint-independent mapping (EIM-NAT) and Address-Dependent Filtering | Green
Full Cone NAT | NAT with endpoint-independent mapping (EIM-NAT) and Endpoint-Independent Filtering | Green
Symmetric UDP firewall | No NAT, but filtering | Grey
Open Internet | No NAT, no filtering | Grey

Table 1: Comparison of old and new NAT type definitions

Red: STUN does not help in NAT traversal
Green: STUN can be used for NAT traversal
Grey: No NAT traversal necessary


The filter behavior is of no interest to the HiPath / OpenScape telephony system: it always assumes that it needs to open a binding to the IP address and port of its peers, and thus sends empty UDP packets to open connections (RTP) or to keep bindings alive (SIP registration).

Example SIP packets

Outgoing INVITE without STUN

INVITE sip:08970070@sip2.fsys.co.uk:5060 SIP/2.0
Accept: application/sdp
Via: SIP/2.0/UDP 192.168.138.70:5060;rport;branch=z9hG4bKf08fd3da8b8f1cab9
Max-Forwards: 70
From: <sip:84415955@fsys.co.uk>;tag=5509ef8b35
To: <sip:08970070@sip2.fsys.co.uk>
Call-ID: 5675ca5b59b3a97e
CSeq: 1076853466 INVITE
Allow: INVITE, ACK, OPTIONS, BYE, CANCEL, REGISTER, INFO
Contact: <sip:192.168.138.70:5060>
Supported: 100rel
User-Agent: OpenScapeOffice M5T SIP Stack/4.0.26
X-Siemens-Call-Type: ST-insecure
Content-Type: application/sdp
Content-Length: 355

v=0
o=MxSIP 0 1965008630 IN IP4 192.168.138.70
s=SIP Call
c=IN IP4 192.168.138.70
t=0 0
m=audio 30520 RTP/AVP 9 8 0 18 101
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:18 G729/8000
a=rtpmap:101 telephone-event/8000
a=silenceSupp:off - - - -
a=fmtp:18 annexb=no
a=fmtp:101 0-15
a=sendrecv

Example 1: Outgoing INVITE without usage of STUN

As you can see, the system's own private address is used for both SIP signaling (Contact and Via headers) and payload negotiation (c- and m-lines). The o-line also contains a private IP address, but this is not critical for payload establishment. A NAT router changes the IP address and may change the UDP port in the IP / UDP header, but it does not look into the packet's payload (body), so the addresses in the SIP headers and SDP body are no longer valid.

Outgoing INVITE with STUN

INVITE sip:08970070@sip2.fsys.co.uk:5060 SIP/2.0
Accept: application/sdp
Via: SIP/2.0/UDP 80.144.207.205:62600;rport;branch=z9hG4bKf0791ba4f4197c0dd
Max-Forwards: 70
From: <sip:84415955@fsys.co.uk>;tag=c404812abd
To: <sip:08970070@sip2.fsys.co.uk>
Call-ID: b0f373a4d6609fa9
CSeq: 1151429907 INVITE
Allow: INVITE, ACK, OPTIONS, BYE, CANCEL, REGISTER, INFO
Contact: <sip:80.144.207.205:62600>
Supported: 100rel
User-Agent: OpenScapeOffice M5T SIP Stack/4.0.26
X-Siemens-Call-Type: ST-insecure
Content-Type: application/sdp
Content-Length: 354

v=0
o=MxSIP 0 857906743 IN IP4 80.144.207.205
s=SIP Call
c=IN IP4 80.144.207.205
t=0 0
m=audio 30520 RTP/AVP 9 8 0 18 101
a=rtpmap:9 G722/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:18 G729/8000
a=rtpmap:101 telephone-event/8000
a=silenceSupp:off - - - -
a=fmtp:18 annexb=no
a=fmtp:101 0-15
a=sendrecv

Example 2: Outgoing INVITE with usage of STUN

With STUN, the public IP address and port can be learned with a STUN Binding Request. These addresses are then inserted into the SIP headers and SDP body by the telephony system.


Behavior of the HiPath / OpenScape system

Overview of the different STUN modes

Auto

The STUN component performs a NAT type detection. If the result shows that STUN can be used and is needed, STUN is switched on; otherwise it is switched off. In Table 1, the green NAT types use STUN. For the red type (symmetric NAT) STUN cannot be used, and the grey types do not need STUN, so STUN is switched off for these NAT types in Auto mode.

Always

The NAT type detection is done, but STUN is switched on regardless of the result. The result is shown in WBM and log files, so it can be used for diagnostics.

Off

STUN is switched off completely.
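
The decision logic of the three modes Auto, Always and Off can be summarized in a short sketch. The names NatType and stun_enabled are hypothetical and only restate the rules given above; they are not taken from the telephony system's software.

```python
from enum import Enum


class NatType(Enum):
    SYMMETRIC_NAT = "Symmetric NAT"
    PORT_RESTRICTED_CONE = "Port Restricted Cone NAT"
    RESTRICTED_CONE = "Restricted Cone NAT"
    FULL_CONE = "Full Cone NAT"
    SYMMETRIC_UDP_FIREWALL = "Symmetric UDP firewall"
    OPEN_INTERNET = "Open Internet"


# NAT types for which STUN can be used and is needed (the cone NATs).
STUN_USABLE = {
    NatType.PORT_RESTRICTED_CONE,
    NatType.RESTRICTED_CONE,
    NatType.FULL_CONE,
}


def stun_enabled(mode, detected):
    """Return True if STUN is switched on for the given mode and detected NAT type."""
    if mode == "always":
        return True   # detection result is only used for diagnostics
    if mode == "off":
        return False
    if mode == "auto":
        return detected in STUN_USABLE
    raise ValueError(f"mode not covered by this sketch: {mode}")
```

The remaining modes (Port preserving router, Static IP, IP Monitoring) do not depend on a NAT type detection and are therefore not covered by this sketch.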

Port preserving router

No NAT type detection is performed. STUN obtains its public SIP signaling address, but does not issue any request to look up the address for payload. Instead, it assumes that the payload IP address is the same as the SIP signaling IP address and that the RTP port is not changed by the router. In practice this means that STUN monitors the public address of port 5060 by sending a binding request frequently (every 15 seconds). No binding request is made for ITSP calls.

Static IP

This mode is similar to “Port preserving router”, but the IP address is not monitored at all. Instead, the address entered in WBM is used as the SIP signaling address. For the SDP body, the public IP address is used and the port remains unchanged.

IP Monitoring

The STUN component monitors the public SIP signaling address by sending out frequent binding requests on the SIP signaling port. If it detects a change in the address, all ITSP calls are released and the ITSP registration is renewed.

Binding opening / keep-alive

Media stream

As soon as the two endpoints of the media connection are known, the system sends out an empty UDP packet to the received media address (c and m-line in received SDP data) to open a NAT/firewall binding. This empty packet also triggers the ITSP’s media gateway to start sending payload (if in “symmetric response routing” mode).

Registration

The fully qualified domain name (FQDN) of any active ITSP that is successfully registered or needs no registration is added to a list.

Every 15 seconds, a DNS query is made for every list entry. The name is resolved in the same way as for an outgoing call (SIP INVITE) to this ITSP; that is, if the configured port is 0, DNS SRV is used for resolving.

Then an empty UDP packet is sent to every IP address returned by the DNS.

With that mechanism, a binding is opened to all possible proxy servers of the ITSP that may send us an INVITE for an incoming call.

Note that a multi-level cache makes sure that the impact of name resolving on system or network performance is as small as possible.
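
A minimal sketch of one keep-alive round is shown below. The function names are hypothetical, and the sketch makes simplifying assumptions: it resolves plain IPv4 addresses only (no DNS SRV lookup, no caching), and the caller is expected to pass the socket that is bound to the SIP signaling port, because only packets sent from that port refresh the binding used for incoming SIP requests.

```python
import socket


def keepalive_targets(fqdn, port):
    """Resolve all IPv4 addresses of an ITSP host (address records only)."""
    infos = socket.getaddrinfo(fqdn, port, socket.AF_INET, socket.SOCK_DGRAM)
    # info[4] is the (ip, port) socket address of each result.
    return sorted({info[4] for info in infos})


def send_keepalives(sock, targets):
    """Send one empty UDP datagram to every target; return the number sent."""
    for target in targets:
        sock.sendto(b"", target)  # zero-length payload, only the headers matter
    return len(targets)
```

Calling both functions for every list entry once per 15 second interval would reproduce the schedule described above.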


Interworking of STUN and NAT routers in practice

'Nice' routers (compliant with RFC 4787)

A well-behaving router works fine 'out of the box'.

A good router works fine in combination with STUN Mode 'AUTO'.

Port preserving routers

The biggest problems with STUN are caused by routers that behave non-deterministically. The NAT type detection detects a NAT that works fine, but under load the router changes its behavior.

For example, a lot of routers try to preserve the port if possible. NAT type detection then usually detects a Port restricted Cone NAT and uses STUN. Everything is fine, until it happens that the port needs to be changed, because it is already in use by another host or was used before by the same host and the binding has not timed out yet.

Some routers then fall back to a symmetric NAT, some keep their Cone NAT behavior. For the former it often helps to change the mode to Port preserving router; for the latter the Auto (or Always) mode works better.

The best solution in both cases is to open the RTP port range on the firewall/NAT router and configure a port-forwarding rule that forwards the whole range unchanged to the telephony system. Then the Port preserving router mode can be used reliably.

Application Layer Gateways (ALG)

Some routers offer more or less well implemented application layer gateways.

An ALG looks into the packet and changes the transported addresses in the same way as the NAT does. A good ALG needs to know the protocol to do this correctly. SIP is supported by most ALGs, but sometimes the implementation has significant problems (see below).

In general it is a bad idea to activate both STUN on the telephony system and an ALG on the router at the same time, because they both try to do the same thing. While in theory a good ALG should cope with that, experience shows that this often results in really strange behavior (e.g. some ALGs then leave outgoing packets unchanged, but change the public IP address inserted by STUN back to the private IP address on incoming packets).

Either enable STUN (system) and disable the ALG (router) or vice versa.

Please be aware that if an ALG is used, it is the responsibility of the ALG to do the NAT traversal. Experience shows that some ALGs are badly implemented and for example cannot handle two simultaneous connections, because they mix up the used RTP ports.

Symmetric NAT

Symmetric NAT cannot be traversed by STUN, because the information gained with the help of the STUN server is not valid for packets that are sent to another server (SIP proxy).

A solution would be to send the STUN requests to the SIP proxy and payload partner directly (SIP Outbound / ICE). This is currently supported by neither our telephony systems nor any ITSP we know.

However, by enabling port forwarding on the router, a symmetric NAT router can be “changed” to a “port preserving router”.

Open the RTP port range on the firewall/NAT router and configure a forwarding rule that the whole range is forwarded unchanged to the telephony system. Then the “Port preserving router” mode can be used reliably.

Session border controller

The term session border controller (SBC) is not clearly defined. In general an SBC is a device that terminates (in the sense of “communication endpoint”) both the SIP signaling and the RTP media stream.

As such, SBCs have full control over signaling and voice payload and can and must perform NAT traversal on their own.

Sometimes devices or configurations used by ITSPs for far-end NAT traversal are also called “SBC”. For details about these, see section Far-end NAT traversal.

Some typical SBC features are also provided by the HiPath / OpenScape telephony systems, so there is usually no need to install an SBC in the local network.

If a customer wishes to install an SBC within the local network, the local network must be configured according to the SBC’s documentation and guidelines.

If an SBC is used within the local network, it is recommended to disable STUN.

Far-end NAT traversal

Some ITSPs use far-end NAT traversal mechanisms like symmetric response routing.

There are differences in their implementations, so there is no general recommendation on how to configure the HiPath / OpenScape telephony system.

Some ITSPs activate their far-end NAT traversal support only if they see a known private IP address (e.g. 192.168.0.1 or any other RFC 1918 address) in the SDP.

It may also make a difference if a call is made to (or coming from) another user of the ITSP or a different provider/phone company.

Even if the ITSP supports a far-end NAT traversal mechanism, activating STUN may help to establish payload or to get shorter (lower latency) speech paths. On the other hand, with some ITSPs and network configurations, disabling STUN will yield better results.

It is recommended to test incoming and outgoing calls to both external numbers (e.g. a mobile phone) and another user account of the same ITSP to make sure the voice connection is fine for all calls. The other user account has to be registered at a different telephone system using another public IP address (e.g. another DSL connection).


Examples of misbehaving network devices

In this chapter some problems we had in the past are mentioned. This is neither a complete list nor a detailed analysis of the problem’s cause. It is meant to give a ‘feeling’ for the kind of problems that may occur when setting up an IP network.

NAT routers

Port preserving with symmetric fallback

Some routers try to preserve the used port if possible, and are detected as "Port restricted Cone NAT". However, if they cannot preserve the port, some of them fall back to a symmetric behaviour.

Linux Kernel

For example, the behaviour of the Linux kernel and its netfilter ("iptables") component depends on some configuration details. The kernel shows a "symmetric NAT" behaviour if only "masquerading" or simple NAT is configured and the ITSP starts sending payload before the local telephony system does. Otherwise it tries to preserve the port and keeps a "port restricted cone NAT" behaviour even if this isn't possible.

If a Linux system is used as router, the simplest way to avoid a "symmetric NAT" behaviour is to add a filter rule that drops any incoming UDP packet that has no conntrack entry. A default policy of "DROP" for incoming packets on the WAN interface should be sufficient.


  • load problems

ALG

An ALG (Application Layer Gateway) is a component running on the NAT router (or a device tightly coupled with it). It inspects routed packets in detail (application payload), analyses the content and changes the IP addresses and ports it contains in the same way as the NAT component changes the IP / UDP header of the packet.

The problem is that the ALG needs to know the transported protocol (e.g. SIP and SDP) to do this correctly. As this is very complicated to implement, most known ALGs have flaws that often lead to problems that are hard to analyse.

Furthermore, ALGs are often activated by default or go by different names (e.g. "generic NAT support", "P2P Support" etc.), so it is hard to find them in the configuration menus.

Examples

For example, we have seen a problem with a NAT router with a built-in ALG that was used together with STUN. Due to STUN usage, the outgoing messages (INVITE) already contained the public address, so the ALG did not change anything. When the response (e.g. 200 OK) came back, the ALG saw the public address and changed it to the internal one. The SIP stack of the telephony system then discarded the packet because the received Contact header was not the same as the sent one. Turning STUN off solved that problem (the ALG then worked fine), but only for one call: if a second call was started simultaneously, the ALG mixed up the RTP ports and signaled the RTP port of the first call in the second one.

Such problems can only be found with a detailed look at network traces (e.g. with Wireshark), so we strongly recommend switching off ALG functionality in the router, especially in cheap "home routers".