To understand the options for integrating Lync
Server with an existing voice infrastructure, it is important to
understand some fundamentals of telephony. This section discusses some
basic concepts in telephony and how they apply to Lync Server.
Public Switched Telephone Network
The
Public Switched Telephone Network (PSTN) is the common network of
telephony systems across the world. Similar to the Internet, it can be
considered a cloud through which phone systems (as opposed to computers)
are connected. Protocol standards implemented across many different
vendors are what allow a common set of services, such as making and
receiving phone calls, to work across the PSTN regardless of where the
calls are placed. Connections to the PSTN can be analog phone lines, cellular
connections, satellite links, or any other form, as shown in Figure 1. The PSTN serves as the backbone for voice services around the world.
Private Branch Exchange
A Private Branch Exchange (PBX) is a device that
organizations typically have on-premises, which enables them to connect
internal phones, fax machines, or other devices together. An on-premises PBX
allows users within the organization to call each other without
traversing the PSTN and incurring charges. A PBX also usually has trunk
lines that connect to the PSTN so that internal users can make and
receive calls with PSTN users when required, as shown in Figure 2.
As telephony has evolved over the years, different
types of PBXs have been used by companies. Usually they fall into one of
three categories:
Traditional PBX—
A traditional PBX is one that does not have IP capabilities. These are
generally very old or low-end systems with limited feature sets. These
systems are usually entirely based on analog handsets for end users.
IP PBX—
An IP PBX is a system that is entirely based on Voice over IP (VoIP).
It does not support analog devices natively and all endpoints are
IP-based network devices.
Hybrid PBX— Many
PBXs have the capability to function as both a traditional PBX with
analog endpoints and as an IP PBX through the purchase of expansion
modules and software upgrades. These PBXs offer the most flexibility for
an organization because they can connect many different types of
devices, as shown in Figure 3, as the business transitions to IP telephony.
Signaling
For users to call each other, there must be some
information exchanged between the PBX and the end users, such as the
phone number of the caller and the phone number of the callee. This is
referred to as the signaling information,
and usually contains more than just phone numbers. However, for the
sake of this text, it can be considered what controls the calls. The
signaling information is how a call is placed, transferred, or ended.
The actual voice traffic, or the audio a user speaks and hears, is
considered the media.
Signaling information can be carried in-band or out-of-band. In-band
means the information shares the same channel or line as the media. The
most common form of in-band signaling is dual-tone multi-frequency
(DTMF) signaling, which is sent when pressing keys on a phone. Each key
transmits a unique tone, indicating a different piece of information to
the PBX.
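To make the idea of one unique tone per key concrete, the following minimal sketch looks up the tone pair for a key using the standard DTMF frequency grid, in which each key combines one row (low) frequency and one column (high) frequency. The function name `dtmf_frequencies` is hypothetical and chosen only for illustration; the code performs a lookup and does not generate audio.

```python
# Standard DTMF grid: each key combines one row (low) and one column (high) tone.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = (
    "123A",
    "456B",
    "789C",
    "*0#D",
)


def dtmf_frequencies(key: str) -> tuple:
    """Return the (low, high) frequency pair in Hz for a single keypad key."""
    if len(key) == 1:
        for row, keys in enumerate(KEYPAD):
            col = keys.find(key)
            if col != -1:
                return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")
```

For example, `dtmf_frequencies("5")` returns `(770, 1336)`, meaning the phone sends 770 Hz and 1336 Hz together when the 5 key is pressed.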
Signaling can also be carried out-of-band, which is
typical for PBX trunk lines to the PSTN or when connecting directly to
another PBX. Out-of-band signaling uses a dedicated channel for the
signaling information while the media or actual voice traffic is carried
in different channels. Using a T1 connection as an example, there are
24 channels, each with 64 kbps of bandwidth available. The first 23
channels carry the voice traffic, so 23 simultaneous calls are
supported. Channel 24 carries the signaling information for all of
the first 23 channels. This is considered out-of-band because the
signaling and media are in separate channels on the connection, as shown
in Figure 4.
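The channel arithmetic in the T1 example can be checked with a short sketch. The function name `t1_layout` and the dictionary keys are hypothetical; the numbers are the ones given in the text (24 channels of 64 kbps, one of them reserved for signaling).

```python
CHANNELS = 24           # total channels on the T1 in this example
CHANNEL_KBPS = 64       # bandwidth available per channel
SIGNALING_CHANNELS = 1  # channel 24 carries signaling for the other 23


def t1_layout(channels=CHANNELS, signaling=SIGNALING_CHANNELS, kbps=CHANNEL_KBPS):
    """Split a channelized connection into voice and signaling capacity."""
    voice = channels - signaling
    return {
        "simultaneous_calls": voice,         # one call per voice channel
        "voice_kbps": voice * kbps,          # bandwidth carrying media
        "signaling_kbps": signaling * kbps,  # bandwidth carrying signaling
        "total_kbps": channels * kbps,       # all channels combined
    }
```

Calling `t1_layout()` gives 23 simultaneous calls, 1472 kbps of voice capacity, 64 kbps of signaling capacity, and 1536 kbps across all 24 channels.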
Voice over IP
As internal networks began to grow, Voice over IP
(VoIP) based PBXs began to emerge. Instead of using traditional analog
lines to connect internal users, the VoIP handsets connected to the PBX
over IP, just like a computer or any other device on the
network. This allowed voice and data traffic to share a common
infrastructure, which cut down on wiring and management overhead.
Just like with traditional PBXs, VoIP requires some
form of signaling to control the calls. An early form of signaling used
for VoIP was H.323, and the Media Gateway Control Protocol (MGCP) has
also gained widespread adoption.
The
Session Initiation Protocol, or SIP, has also emerged as a standard
that many IP PBXs use for signaling. Lync Server uses SIP for all of its
internal signaling and for integrations with other PBX vendors because
it provides a common framework for controlling calls. Vendors can also
implement extensions on top of SIP to provide additional signaling
capabilities. These extensions make SIP extremely flexible, but can lead
to interoperability problems between different IP PBXs because each
vendor develops its own extensions.
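As a rough illustration of what SIP signaling looks like on the wire, the sketch below parses a text-based SIP request into its method, target, and headers. The sample message is made up: the addresses, tags, and the `X-Example-Vendor-Feature` header are placeholders standing in for a vendor extension, not anything Lync Server or a particular PBX actually sends. The parser is deliberately simplified and ignores details that real SIP stacks must handle, such as repeated headers, compact header names, and case-insensitive matching.

```python
# Illustrative SIP request; addresses, tags, and the X- header are made up.
SAMPLE_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/TLS client.example.com:5061;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "From: <sip:alice@example.com>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710@client.example.com\r\n"
    "CSeq: 314159 INVITE\r\n"
    "X-Example-Vendor-Feature: enabled\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_sip_request(message: str) -> dict:
    """Split a SIP request into its request line and a header dictionary."""
    head, _, _body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: METHOD, request URI, protocol version.
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "uri": uri, "version": version, "headers": headers}


request = parse_sip_request(SAMPLE_INVITE)
# Headers this sketch does not recognize hint at vendor-specific extensions.
extensions = [name for name in request["headers"] if name.startswith("X-")]
```

A receiving system that does not understand a header such as the placeholder one above may ignore it or behave differently, which is one way the interoperability problems described here can arise.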
Media
Although SIP meets the needs for signaling
information, VoIP PBXs still require a method to transmit the media
stream. The Real-Time Transport Protocol (RTP) is used in almost every
VoIP implementation and was developed specifically for transmitting
audio and video traffic across networks. Encryption of the media traffic
was later added in the form of Secure Real-Time Transport Protocol
(SRTP), which is what Lync Server uses by default to ensure that the
media cannot be intercepted and played back.
SRTP only provides a standard for carrying the media
traffic, which can be encoded with various media codecs. Media codecs are a way of
translating audio and video data into bits that can be transmitted
across a network. For two users to have an audio conversation, both
parties must agree on and use the same codec to correctly encode and decode the
traffic. Figure 5 displays this split of signaling and media traffic, with the media using a specific codec such as RTAudio or G.711.
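The requirement that both parties settle on a common codec can be sketched as a simple preference match. This is an assumption-laden illustration, not the actual negotiation mechanism used by Lync Server or any PBX: the function name `negotiate_codec` is hypothetical, and codecs are compared by name only.

```python
def negotiate_codec(offered, supported):
    """Return the first codec in the caller's preference order that the
    other party also supports, or None if there is no common codec."""
    supported_names = {name.lower() for name in supported}
    for codec in offered:
        if codec.lower() in supported_names:
            return codec
    return None
```

With a caller that prefers RTAudio but also offers G.711, `negotiate_codec(["RTAudio", "G.711"], ["G.711"])` selects G.711, while two parties with no codec in common get `None` and cannot exchange audio directly.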
Lync Server 2010 endpoints have the ability to use
two different audio codecs. The default codec is Microsoft’s proprietary
RTAudio codec, which can dynamically adjust its bandwidth to ensure a
certain level of call quality. In certain scenarios, Lync endpoints can now also take
advantage of the G.711 codec, which many VoIP
implementations have used for years.
When Lync endpoints cannot communicate directly with
another endpoint, the Mediation Server role can be used to transcode
between RTAudio and G.711 codecs in a media
stream. This is typical when Lync endpoints communicate with a
Mediation Server via RTAudio, but the Mediation Server communicates
with a media gateway via G.711. The Mediation Server acts as a
translator in these scenarios.
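The translator role can be pictured by walking the legs of a call and noting where the codec changes. The sketch below is only a model of the scenario described here; the function name `transcoding_points` and the tuple layout are hypothetical and do not reflect how the Mediation Server is implemented.

```python
def transcoding_points(legs):
    """Given call legs as (from_node, to_node, codec) tuples in order,
    return (node, codec_in, codec_out) for each node where the codec changes."""
    points = []
    for (_, node, codec_in), (next_from, _, codec_out) in zip(legs, legs[1:]):
        # A node must transcode when it joins two legs that use different codecs.
        if node == next_from and codec_in != codec_out:
            points.append((node, codec_in, codec_out))
    return points


# The scenario from the text: RTAudio on one side, G.711 on the other.
call_legs = [
    ("Lync endpoint", "Mediation Server", "RTAudio"),
    ("Mediation Server", "Media gateway", "G.711"),
]
```

For `call_legs`, the function reports the Mediation Server as the single point where RTAudio is converted to G.711; if both legs used the same codec, the result would be an empty list.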