The Voice, Video, and Home Applications Configuration Guide shows you how to configure your Cisco router or access server to support voice, video, and broadband transmission. Cisco's voice and video support is implemented using voice packet technology, in which voice signals are packetized and transported in compliance with H.323, the ITU-T specification for transmitting multimedia (voice, video, and data) across a local-area network (LAN).
This overview is divided into two parts:
The "Configuration Guide Overview" section describes the voice technologies represented in the Voice, Video, and Home Applications Configuration Guide. The "Voice Primer" section provides supplementary information for those users unfamiliar with voice telephony.
The Voice, Video, and Home Applications Configuration Guide document is divided into three parts:
Each of these parts contains one or more chapters that describe configuration procedures for each respective technology. The following sections describe the chapter contents for each part of this configuration guide.
Cisco offers two different implementations of voice technology, depending on the particular Cisco device you are using:
All of these implementations are in compliance with ITU-T specification H.323.
The key to understanding Cisco's voice implementation is to understand the use of dial peers. Dial peers describe the entities to or from which a call is established. All of the voice technologies use dial peers to define the characteristics associated with a call leg. A call leg is a discrete segment of a call connection that lies between two points in the connection, as shown in Figure 2 and Figure 3. An end-to-end call comprises four call legs: two from the perspective of the source router, as shown in Figure 2, and two from the perspective of the destination router, as shown in Figure 3. You use dial peers to apply specific attributes to call legs and to identify call origin and destination. Attributes applied to a call leg include quality of service (QoS), compression/decompression (CODEC), voice activity detection (VAD), and fax rate.
There are basically two different kinds of dial peers with each voice implementation:
Voice port commands for both the Cisco 3600 series and the Cisco MC3810 define the characteristics associated with a particular voice-port signaling type. Voice ports for both the Cisco 3600 series routers and the Cisco MC3810 provide support for three basic voice signaling formats:
The Cisco 3600 series currently provides only analog voice ports for its implementation of Voice over IP. The type of signaling associated with these analog voice ports depends on the interface module installed into the device.
The Cisco MC3810 hardware comes in two models, each providing different configuration options for voice ports: either six analog voice interfaces or one digital voice module (DVM) that provides up to 24 voice channels. The type of signaling associated with the analog voice ports depends on the analog personality modules (APMs) that are installed on the analog voice module (AVM). Each APM provides FXO, FXS, or E&M signaling. You can have different combinations of APMs installed on the analog version of the Cisco MC3810.
The voice-port syntax depends on the hardware platform being configured. On the Cisco 3600 series, the voice-port syntax is voice-port slot-number/subunit-number/port. On the Cisco MC3810, the voice-port syntax is voice-port slot/port.
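The two syntaxes can be told apart mechanically by counting the slash-separated fields. The following Python sketch is illustrative only: the helper name `parse_voice_port` and the port numbers in the examples are hypothetical, and the patterns assume that every field is a plain decimal number.

```python
import re

# Hypothetical helper: the patterns mirror the two syntaxes described above
# and assume each field is a plain decimal number.
_PATTERNS = {
    "Cisco 3600": re.compile(r"^voice-port (\d+)/(\d+)/(\d+)$"),   # slot-number/subunit-number/port
    "Cisco MC3810": re.compile(r"^voice-port (\d+)/(\d+)$"),       # slot/port
}


def parse_voice_port(line):
    """Return (platform, fields) for a voice-port line, or None if it fits neither form."""
    text = line.strip()
    for platform, pattern in _PATTERNS.items():
        match = pattern.match(text)
        if match:
            return platform, tuple(int(group) for group in match.groups())
    return None


print(parse_voice_port("voice-port 1/0/0"))  # three fields: Cisco 3600 form
print(parse_voice_port("voice-port 1/3"))    # two fields: Cisco MC3810 form
```

A line that matches neither pattern returns None, which makes it easy to flag a voice-port statement written for the wrong platform.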
Cisco IOS Release 12.0 offers the following voice technologies:
Voice over IP enables a Cisco 3600 series router to carry voice traffic (for example, telephone calls and faxes) over an IP network. In Voice over IP, the DSP segments the voice signal into frames, which are then coupled in groups of two and stored in voice packets. These voice packets are transported using IP in compliance with ITU-T specification H.323. Because Voice over IP is a delay-sensitive application, you need a well-engineered end-to-end network to use it successfully. Fine-tuning your network to adequately support Voice over IP involves a series of protocols and features geared toward quality of service (QoS). Traffic shaping must also be taken into account to ensure the reliability of the voice connection.
Voice over IP is primarily a software feature; however, to use this feature on a Cisco 3600 series router, you must install a voice network module (VNM). The VNM can hold either two or four voice interface cards (VICs), each of which is specific to a particular signaling type associated with a voice port.
Voice over Frame Relay enables a Cisco MC3810 concentrator to carry voice traffic (for example, telephone calls and faxes) over a Frame Relay network. Voice over Frame Relay on the Cisco MC3810 is supported on serial ports 0 and 1, as well as on the T1/E1 trunk.
When voice traffic is sent over Frame Relay, it is segmented and encapsulated for transit across the Frame Relay network. The segmentation engine is similar to FRF.12. The data segmentation size can range from 80 to 1600 bytes, and the configured size must match the line rate or the port access rate. To ensure a stable voice connection, the same data segmentation size must be configured on the Cisco MC3810 concentrators on both sides of the voice connection. When voice segmentation is configured, all priority queuing, custom queuing, and weighted fair queuing are disabled on the interface.
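One way to read the requirement that the segmentation size match the line rate is that each data segment should occupy the line for only a short, bounded time, so that voice packets are never stuck behind a long data frame. The Python sketch below works from that reading. The 10-millisecond serialization target and the function name are assumptions made for illustration, not values taken from this guide; only the 80-to-1600-byte range comes from the text above.

```python
MIN_SEGMENT_BYTES = 80     # lower bound stated in the text
MAX_SEGMENT_BYTES = 1600   # upper bound stated in the text


def segment_size_bytes(line_rate_bps, target_delay_ms=10):
    """Suggest a segment size whose serialization time is about target_delay_ms.

    target_delay_ms is an illustrative assumption, not a documented default.
    """
    # bytes = (bits per second * seconds) / 8, kept in integer arithmetic
    size = line_rate_bps * target_delay_ms // 8000
    # Clamp to the configurable range
    return max(MIN_SEGMENT_BYTES, min(MAX_SEGMENT_BYTES, size))


for rate in (64_000, 128_000, 512_000, 2_048_000):
    print(rate, "bps ->", segment_size_bytes(rate), "bytes")
```

Whatever value is chosen, the same size still has to be configured on both sides of the voice connection.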
When you configure voice and data traffic over the same Frame Relay DLCI, traffic shaping must be taken into account to ensure the reliability of the voice connection.
Voice over ATM enables a Cisco MC3810 to carry voice traffic (for example, telephone calls and faxes) over an ATM network. The Cisco MC3810 supports compressed Voice over ATM on ATM port 0 only.
When voice traffic is sent over ATM, it is encapsulated using a special AAL5 encapsulation for multiplexed voice. The ATM PVC must be configured to support real-time voice traffic, and the AAL5 voice encapsulation must be assigned to the PVC. The PVC must also be configured to support variable bit rate (VBR) for real-time networks for traffic shaping between voice and data PVCs.
Traffic shaping is necessary so that the carrier does not discard the incoming calls from the MC3810. To configure voice and data traffic shaping, you must configure the peak, average, and burst options for voice traffic. Configure the burst value if the PVC will be carrying bursty traffic. The peak, average, and burst values are needed so the PVC can effectively handle the bandwidth for the number of voice calls.
Voice over HDLC enables a Cisco MC3810 concentrator to carry live voice traffic (for example, telephone calls and faxes) back-to-back to a second Cisco MC3810. Voice over HDLC on the Cisco MC3810 is supported on serial ports 0 or 1, or on 0:x (the T1/E1 trunk, where x represents the channel group number). Voice over HDLC traffic is carried over a serial line. As a result, configuration is simpler than for Voice over IP, Voice over Frame Relay, or Voice over ATM.
Layer 2 implementations of voice have additional factors that need to be taken into consideration during the configuration procedure:
Frame Relay-ATM interworking uses the FRF.5 Frame Relay/ATM network interworking function to transport either data or voice traffic over an ATM cloud via a virtual interface within the Cisco MC3810. Using this encapsulation process, you can migrate from Frame Relay to ATM, or you can tunnel Frame Relay traffic across an ATM backbone to a second Cisco MC3810 or other Frame Relay device, where the traffic is extracted from ATM back to Frame Relay.
You need to configure a single synchronous master clock source on the Cisco MC3810 to prevent data corruption and data loss when receiving and transmitting voice and video data streams. Because voice and video streams are real-time and continuous, the information is normally generated by the source device and received by the destination device at a synchronized fixed rate. If the source and destination clocking is not synchronized, meaning the devices generate information at different rates, there will be a loss of information as one side overruns and the other side underruns.
You can configure the Cisco MC3810 to obtain the master clock from a device attached to one of the two T1/E1 controllers, from a device attached to serial port 0, or to generate the master clock from the internal network clock phase lock loop (PLL). If you obtain the master clock from a device attached to a T1/E1 controller, make sure the clock source of only one controller is configured to the line setting. If you obtain the master clock from a device attached to serial port 0, or generate it from the network clock PLL, make sure the clock source of both T1/E1 controllers is configured to the internal setting (in most cases). To prevent clocking problems, make sure only one master clock is active at a time.
You can also configure a hierarchy of clock sources so that if the master clock source fails, a secondary clock source can take over.
If the clock source is not configured correctly on the Cisco MC3810, the voice traffic will be disrupted.
At the present time, Cisco offers video support on the Cisco MC3810 only. The Cisco MC3810 supports video traffic within a data stream in two ways:
For this release, Cisco offers broadband functionality through the Cisco uBR7246 universal broadband router. Cisco universal broadband features allow the Cisco uBR7246 to communicate with a hybrid fiber coaxial (HFC) cable network via a Cisco MC11 cable modem card. This cable modem card allows you to connect a cable modem on the HFC network to a Cisco uBR7246 in a Community Antenna Television (CATV) headend facility and provides the interface between the Cisco uBR7246 peripheral component interconnect (PCI) bus and the radio frequency (RF) signal on the HFC network.
The Cisco uBR7246 supports both two-way and telephone return modems on a single downstream channel. The Cisco uBR7246 therefore allows both one-way and two-way cable plants to provide cable modem service and gives cable operators the flexibility to roll out service in systems that are only partially upgraded to two-way.
To understand Cisco's voice implementations, it helps to have some understanding of analog and digital transmission and signaling. This section provides some very basic, abbreviated voice telephony information as background to help you configure Voice over IP, Voice over Frame Relay, Voice over ATM, and Voice over HDLC and includes the following topics:
The standard public switched telephone network (PSTN) is basically a large, circuit-switched network. It uses a specific numbering scheme that complies with the ITU-T E.164 recommendations. For example, North America uses the North American Numbering Plan (NANP), which consists of an area code, an office code, and a station code. Area codes are assigned geographically, office codes are assigned to specific switches, and station codes identify a specific port on that switch. The format in North America is 1Nxx-Nxx-xxxx, with N = digits 2 through 9 and x = digits 0 through 9. Internationally, each country is assigned a one- to three-digit country code; the country's dialing plan follows the country code. In Cisco's voice implementations, numbering schemes are configured using the destination-pattern command.
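The NANP format can be written as a regular expression, which makes the roles of N and x concrete. The following Python sketch is an illustration only; the function name and the sample numbers are hypothetical, and the check covers just the 1Nxx-Nxx-xxxx layout given above, with hyphens required.

```python
import re

# 1Nxx-Nxx-xxxx, where N is a digit 2 through 9 and x is a digit 0 through 9
NANP_PATTERN = re.compile(r"^1[2-9][0-9]{2}-[2-9][0-9]{2}-[0-9]{4}$")


def is_nanp_number(number):
    """Return True if the string follows the 1Nxx-Nxx-xxxx layout."""
    return NANP_PATTERN.match(number) is not None


print(is_nanp_number("1408-555-1212"))  # area code and office code both start with 2-9
print(is_nanp_number("1108-555-1212"))  # area code starts with 1, so it is rejected
```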
Until recently, the telephone network was based on an analog infrastructure. Analog transmission is not particularly robust or efficient at recovering from line noise. Because analog signals degrade over distance, they need to be periodically amplified; this amplification boosts both the voice signal and ambient line noise, resulting in degradation of the quality of the transmitted sound.
In response to the limitations of analog transmission, the telephony network migrated to digital transmission using pulse code modulation (PCM) or adaptive differential pulse code modulation (ADPCM). In both cases, analog sound is converted into digital form by sampling the analog sound 8000 times per second and converting each sample into a numeric code.
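The sampling rate translates directly into a bit rate once the size of each numeric code is known. The sketch below assumes 8 bits per sample for PCM and 4 bits per sample for ADPCM; those sample sizes are common values for 64-kbps PCM and 32-kbps ADPCM but are not stated in the text above, so treat them as assumptions.

```python
SAMPLES_PER_SECOND = 8000  # sampling rate given in the text


def bit_rate_kbps(bits_per_sample, samples_per_second=SAMPLES_PER_SECOND):
    """Bit rate of a sampled voice channel in kilobits per second."""
    return samples_per_second * bits_per_sample / 1000


# Assumed sample sizes: 8 bits for PCM, 4 bits for ADPCM
print("PCM:  ", bit_rate_kbps(8), "kbps")
print("ADPCM:", bit_rate_kbps(4), "kbps")
```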
PCM and ADPCM are examples of "waveform" CODEC techniques. Waveform CODECs are compression techniques that exploit the redundant characteristics of the waveform itself. In addition to waveform CODECs, there are source CODECs that compress speech by sending only simplified parametric information about voice transmission; these CODECs require less bandwidth. Source CODECs include linear predictive coding (LPC), code-excited linear prediction (CELP), and multi-pulse, multi-level quantization (MP-MLQ).
Coding techniques are standardized by the ITU-T in its G-series recommendations. The most popular coding standards for telephony and voice packet are:
In Cisco's voice implementations, compression schemes are configured using the codec command.
Each CODEC provides a certain quality of speech. The quality of transmitted speech is a subjective response of the listener. A common benchmark used to determine the quality of sound produced by specific CODECs is the mean opinion score (MOS). With MOS, a wide range of listeners judge the quality of a voice sample (corresponding to a particular CODEC) on a scale of 1 (bad) to 5 (excellent). The scores are averaged to provide the mean opinion score for that sample. Table 3 shows the relationship between CODECs and MOS scores.
| Compression Method | Bit Rate (kbps) | Framing Size | MOS Score |
G.729 x 2 Encodings
G.729 x 3 Encodings
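The averaging behind a mean opinion score can be sketched in a few lines of Python. The listener ratings below are made-up sample data, not measurements for any CODEC, and the function name is hypothetical.

```python
def mean_opinion_score(ratings):
    """Average a list of listener ratings on the 1 (bad) to 5 (excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-5 scale")
    return sum(ratings) / len(ratings)


# Hypothetical ratings from five listeners for one voice sample
sample_ratings = [4, 4, 3, 5, 4]
print(mean_opinion_score(sample_ratings))
```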
Although it might seem logical from a financial standpoint to convert all calls to low-bit-rate CODECs to save on infrastructure costs, you should exercise additional care when designing voice networks with low-bit-rate compression. There are drawbacks to compressing voice. One of the main drawbacks is signal distortion due to multiple encodings (called tandem encodings). For example, when a G.729 voice signal is tandem encoded three times, the MOS score drops from 3.92 (very good) to 2.68 (unacceptable). Another drawback is the CODEC-induced delay of low-bit-rate CODECs.
One of the most important design considerations in implementing voice is minimizing one-way, end-to-end delay. Voice traffic is real-time traffic; if there is too long a delay in voice packet delivery, speech will be unrecognizable. Delay is inherent in voice-networking and is caused by a number of different factors. An acceptable delay is less than 200 milliseconds.
There are two kinds of delay inherent in today's telephony networks: propagation delay and handling delay. Propagation delay is caused by the finite speed at which signals travel through fiber-optic or copper media. Handling delay (sometimes called serialization delay) is caused by the devices that handle voice information. Handling delays have a significant impact on voice quality in a packetized network.
CODEC-induced delays are considered a handling delay. Table 4 shows the delay introduced by different CODECs.
| CODEC | Bit Rate (kbps) | Compression Delay (ms) |
3 to 5
Another handling delay is the time it takes to generate a voice packet. In Voice over IP, the DSP generates a frame every 10 milliseconds. Two of these frames are then placed within one voice packet; the packet delay is therefore 20 milliseconds.
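The arithmetic is simple enough to state as code. This Python sketch uses the frame interval and frames-per-packet values given above; the function names are hypothetical, and the packets-per-second figure is just the reciprocal of the packet delay.

```python
FRAME_INTERVAL_MS = 10   # the DSP generates one frame every 10 ms
FRAMES_PER_PACKET = 2    # two frames are placed in each voice packet


def packetization_delay_ms(frame_interval_ms=FRAME_INTERVAL_MS,
                           frames_per_packet=FRAMES_PER_PACKET):
    """Time needed to collect enough frames to fill one voice packet."""
    return frame_interval_ms * frames_per_packet


def packets_per_second(delay_ms):
    """Number of voice packets sent each second for a given packet delay."""
    return 1000 / delay_ms


delay = packetization_delay_ms()
print("packet delay:", delay, "ms")
print("packets per second:", packets_per_second(delay))
```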
Another source of handling delay is the time it takes to move the packet to the output queue. Cisco IOS software expedites the process of determining packet destination and getting the packet to the output queue. The actual delay at the output queue is another source of handling delay and should be kept to under 10 milliseconds whenever possible by using whatever queuing methods are optimal for your network. Output queue delays are a quality of service (QoS) issue in Voice over IP for the Cisco 3600 series and are discussed in the "Configure IP Networks for Real-Time Voice Traffic" section.
In Voice over Frame Relay, you need to make sure that voice traffic is not crowded out by data traffic. Strategies on how to manage Voice over Frame Relay voice traffic are discussed in the "Configuring Voice over Frame Relay" chapter.
Jitter is another factor that affects delay. Jitter occurs when there is a variation between when a voice packet is expected to be received and when it actually is received, causing a discontinuity in the real-time voice stream. Voice devices such as the Cisco 3600 series and the Cisco MC3810 compensate for jitter by setting up a playout buffer to play back voice in a smooth fashion. Playout control is handled through RTP encapsulation, by selecting either adaptive or non-adaptive playout-delay mode. In either mode, the default value for nominal delay is sufficient.
Figuring out the end-to-end delay is not difficult if you know the end-to-end signal paths and data paths, the CODEC, and the payload size of the packets. Adding the delays from the end points to the CODECs at both ends, the encoder delay (which is 5 milliseconds for the G.711 and G.726 CODECs and 10 milliseconds for the G.729 CODEC), the packetization delay, and the fixed portion of the network delay yields the end-to-end delay for the connection.
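The sum described above can be sketched as a small delay budget in Python. The encoder delays come from the text; the access, packetization, and network figures in the example are hypothetical inputs chosen only to show the calculation, and the 200-millisecond check reuses the acceptable-delay figure given earlier in this section.

```python
# Encoder delays stated in the text, in milliseconds
ENCODER_DELAY_MS = {"G.711": 5, "G.726": 5, "G.729": 10}

ACCEPTABLE_DELAY_MS = 200  # acceptable one-way delay given earlier in this section


def end_to_end_delay_ms(codec, access_delay_ms, packetization_ms, network_fixed_ms):
    """Sum the delay components for one direction of a voice connection.

    access_delay_ms is the combined delay from both end points to their CODECs.
    """
    return (access_delay_ms
            + ENCODER_DELAY_MS[codec]
            + packetization_ms
            + network_fixed_ms)


# Hypothetical connection: 2 ms of access delay at each end, 20 ms packets,
# and 60 ms of fixed network delay
total = end_to_end_delay_ms("G.729", access_delay_ms=4,
                            packetization_ms=20, network_fixed_ms=60)
print("end-to-end delay:", total, "ms")
print("within budget:", total < ACCEPTABLE_DELAY_MS)
```

Variable delay such as output queuing and jitter buffering is not part of this sum and has to be considered separately.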
Echo is hearing your own voice in the telephone receiver while you are talking. When timed properly, echo is reassuring to the speaker; if the echo exceeds approximately 25 milliseconds, it can be distracting and cause breaks in the conversation. In a traditional telephony network, echo is normally caused by an impedance mismatch where the 4-wire network switch is converted to the 2-wire local loop, and it is controlled by echo cancellers. In voice packet-based networks, echo cancellers are built into the low-bit-rate CODECs and are operated on each DSP. Echo cancellers are limited by design by the total amount of time they will wait for the reflected speech to be received, which is known as an echo trail. The echo trail is normally 32 milliseconds.
In Cisco's voice implementations, echo cancellers are enabled using the echo-cancel enable command. The echo trail is configured using the echo-cancel-coverage command. For example, Voice over IP has configurable echo trails of 16, 24, and 32 milliseconds.
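Because an echo canceller only waits as long as its echo trail, the configured coverage has to be at least as long as the echo delay on the line. The Python sketch below picks the shortest Voice over IP coverage value that still covers a measured echo delay. Choosing the shortest sufficient value is an illustrative policy rather than a recommendation from this guide, and the function name is hypothetical.

```python
# Configurable Voice over IP echo trails stated in the text, in milliseconds
ECHO_COVERAGE_CHOICES_MS = (16, 24, 32)


def choose_echo_coverage(echo_delay_ms):
    """Return the shortest coverage that is at least echo_delay_ms, or None.

    None means the echo arrives later than any configurable echo trail,
    so the canceller cannot cover it.
    """
    for coverage in sorted(ECHO_COVERAGE_CHOICES_MS):
        if echo_delay_ms <= coverage:
            return coverage
    return None


for delay in (10, 20, 32, 40):
    print(delay, "ms echo ->", choose_echo_coverage(delay))
```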
Although there are various types of signaling used in telecommunications today, this document describes only those with direct applicability to Cisco's voice implementations. The first one involves access signaling, which determines when a line has gone off-hook or on-hook (in other words, dial tone). FXO and FXS are types of access signaling. There are two common methods of providing this basic signal:
In Cisco's voice implementations, access signaling is configured using the signal command.
Another signaling technique used mainly between PBXes or other network-to-network telephony switches is known as E&M. There are five types of E&M signaling, as well as two different wiring methods. Cisco's voice implementation supports E&M types I, II, III, and V, using both 2-wire and 4-wire implementations. In Cisco's voice implementations, E&M signal types are configured using the type command.