By implementing QoS, you can grant the appropriate service levels to your mission-critical applications. Because remote-access users do not usually care about the network topology or the high level of security/encryption or firewalls that handle their traffic, your solution must be able to give them what they do care about: an acceptable response time for their applications.
Your users' tolerance for delay varies with the application they are using at the time. A delay that is acceptable for FTP might not be acceptable when accessing a database or running voice over IP.
QoS provides the mechanisms to deliver this level of performance and to ensure that all applications coexist and function acceptably. The primary QoS features you will be concerned with, especially when dealing with VPNs, are as follows:
Packet classification using committed access rate (CAR)
Bandwidth management by policing with CAR, shaping with Generic Traffic Shaping/Frame Relay Traffic Shaping (GTS/FRTS), and bandwidth allocation with WFQ
Congestion avoidance using WRED
Continuity of packet priority over Layer 2 and Layer 3 VPNs with tag switching/Multiprotocol Label Switching (MPLS)
Each of these features is discussed in the following sections.
The end result of packet classification efforts is to group packets based on your predefined criteria so that the resulting groups of packets can then be subjected to specific packet treatments. This can include faster forwarding by intermediate devices or reducing the probability of a packet's being dropped because of lack of buffering resources. It is often necessary that your traffic be classified before tunneling and encryption, because a tunnel header appended to an IP packet might make the QoS markings in the IP header invisible to intermediate routers/switches.
With classification, you can base decisions on a number of match criteria before your traffic leaves:
IP addresses
TCP/UDP port numbers
IP precedence: the 3 bits in the type of service (ToS) field of the IP packet header
URL and sub-URL
MAC addresses
Time of day
As soon as your packets are classified based on your match criteria, the next step is to mark, or color, the packets with a unique ID to ensure that your classification is honored from end to end. The easiest way to do this is to set the IP ToS field in the header of an IP datagram. This marking of packets is the means you use to ensure that downstream QoS features, such as scheduling and queuing, are used for the proper treatment of the packets you have marked.
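As a rough illustration of the classify-then-mark step, the following Python sketch matches a packet against a small policy and writes the chosen IP precedence into the top 3 bits of the ToS byte. The `classify` policy, the port numbers, and the dictionary packet representation are assumptions made for this example only; on a real device the matching and marking are done by the router's own QoS features.

```python
def set_ip_precedence(tos: int, precedence: int) -> int:
    """Return the ToS byte with its top 3 bits replaced by `precedence`."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS is a single byte")
    if not 0 <= precedence <= 7:
        raise ValueError("IP precedence is a 3-bit value")
    # Keep the low 5 bits, overwrite the high 3 bits.
    return (tos & 0x1F) | (precedence << 5)


def classify(packet: dict) -> int:
    """Hypothetical policy: map a packet to an IP precedence value."""
    port = packet.get("dst_port", 0)
    if packet.get("protocol") == "udp" and 16384 <= port <= 32767:
        return 5  # assumed port range for voice-like traffic
    if port == 1521:
        return 3  # assumed port for a database application
    return 0      # everything else is treated as best effort


# Classify first, then mark the packet so downstream devices can honor it.
packet = {"protocol": "tcp", "dst_port": 1521, "tos": 0x00}
packet["tos"] = set_ip_precedence(packet["tos"], classify(packet))
```

The marking only helps if every downstream queuing and scheduling feature is configured to act on the same precedence values.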
Differentiated services let network traffic receive premium treatment at the expense of other less-critical traffic on the same WAN link.
After your selected traffic has been classified, the next step is to ensure that it receives the special treatment it requires from the devices. You do this through the use of queuing and scheduling.
You have the choice of two different implementations of Weighted Fair Queuing (WFQ):
Flow-based WFQ Packet classification is based on a flow. Each flow is placed in a separate output queue. When your packet is identified as belonging to a particular flow, it is placed in the associated queue. During times of congestion, WFQ allocates a portion of the available bandwidth for use by each active queue.
Class-based WFQ Packets receive the functionality of WFQ with user-defined traffic classes. You create these traffic classes through mechanisms such as access control lists. After the traffic is classified, you can assign it a fraction of the output interface bandwidth.
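The two WFQ variants can be sketched in a few lines of Python. The sketch below assumes a flow is identified by its address, protocol, and port 5-tuple, and that bandwidth is divided among active queues in proportion to a per-queue weight; the field names and weight values are illustrative assumptions, not the scheduler a router actually runs.

```python
from collections import defaultdict, deque

# One output queue per flow, created on first use.
queues = defaultdict(deque)


def flow_key(pkt: dict) -> tuple:
    """Identify a flow by its 5-tuple (assumed definition of a flow)."""
    return (pkt["src"], pkt["dst"], pkt["proto"], pkt["sport"], pkt["dport"])


def enqueue(pkt: dict) -> None:
    """Place the packet in the queue that belongs to its flow."""
    queues[flow_key(pkt)].append(pkt)


def bandwidth_shares(link_kbps: float, weights: dict) -> dict:
    """Split the link among active queues in proportion to their weights."""
    total = sum(weights.values())
    return {name: link_kbps * w / total for name, w in weights.items()}


# Two packets of the same flow share one queue; a new flow gets its own.
enqueue({"src": "10.0.0.1", "dst": "10.0.1.1", "proto": 6, "sport": 1025, "dport": 80})
enqueue({"src": "10.0.0.1", "dst": "10.0.1.1", "proto": 6, "sport": 1025, "dport": 80})
enqueue({"src": "10.0.0.2", "dst": "10.0.1.1", "proto": 17, "sport": 5000, "dport": 5004})

# Class-based view: user-defined classes with assumed weights on a 1000-kbps link.
shares = bandwidth_shares(1000, {"bulk": 1, "interactive": 3})
```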
Traffic shaping lets you shape Layer 3 traffic into a desired set of rate parameters to enforce a maximum traffic rate. Its end result is a smooth traffic stream at the IP layer through the use of traffic-shaping queues based on the Service Level Agreement (SLA).
Traffic shaping is based on the concept that bursty traffic can be queued, causing the TCP sender to back off its rate of sending, ultimately ensuring that future transmissions conform to your desired rate.
Policing is used to drop excess traffic, and shaping is used to allow excess traffic to be queued. Shaping can be a better choice where applications are concerned, because shaped traffic does not require a retransmission (dropped traffic does). In this case, Generic Traffic Shaping (GTS) might be the better tool.
Be aware that excessive shaping can result in very deep queues on the shaping device. This might cause the sender to retransmit because of a perceived delay. Policing/dropping of excess traffic is the better choice for IP multicasts or TCP-based traffic related to non-mission-critical applications.
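The difference between policing and shaping can be seen in a small token-bucket simulation. This is a minimal sketch under simplifying assumptions: time advances in fixed ticks, at most one packet arrives per tick, and `rate` and `burst` are expressed in bytes. It is not a model of CAR or GTS internals.

```python
from collections import deque


def police(arrivals, rate, burst):
    """Policer: send a packet only if enough tokens exist, else drop it."""
    tokens, sent, dropped = burst, [], 0
    for size in arrivals:
        tokens = min(burst, tokens + rate)  # refill, capped at the burst size
        if size <= tokens:
            tokens -= size
            sent.append(size)
        else:
            dropped += size                 # excess traffic is discarded
            sent.append(0)
    return sent, dropped


def shape(arrivals, rate, burst):
    """Shaper: hold excess packets in a queue until tokens are available."""
    tokens, backlog, sent = burst, deque(), []
    for size in arrivals:
        tokens = min(burst, tokens + rate)
        if size:
            backlog.append(size)
        out = 0
        while backlog and backlog[0] <= tokens:
            tokens -= backlog[0]
            out += backlog.popleft()
        sent.append(out)
    return sent, sum(backlog)


burst_of_traffic = [300, 300, 0, 0]
policed = police(burst_of_traffic, rate=100, burst=300)
shaped = shape(burst_of_traffic, rate=100, burst=300)
```

With these inputs the policer drops the second packet, while the shaper delays it by two ticks and delivers it. The backlog returned by `shape` is the queue depth that grows when shaping is excessive.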
Congestion avoidance is the ability to recognize and act on congestion in the output direction of an interface in an attempt to reduce or minimize the effects of that congestion.
Congestion produces unwanted effects on a VPN and should be avoided if possible. Tools such as Weighted Random Early Detection (WRED), an implementation of the Random Early Detection (RED) algorithm, let you treat classes of traffic differently by adding per-class queue thresholds that determine when packet drops will occur. These thresholds can be configured by the user.
Packet dropping is based on the idea that adaptive flows such as TCP will back off and retransmit when they detect congestion. By monitoring the average output queue depth and by dropping packets from selected flows, WRED tries to prevent the ramp-up of too many TCP sources at once. Without WRED, TCP synchronization might result.
WRED works by dropping packets from low-priority traffic before it drops packets from high-priority traffic. WRED allows you to select up to six such traffic classes.
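The per-class thresholds can be illustrated with the standard RED drop curve: no drops below a minimum threshold, certain drop at or above a maximum threshold, and a linearly rising probability in between. The profile values, the averaging weight, and the function names below are assumptions chosen for the example, not defaults taken from any device.

```python
import random

# Hypothetical profiles: precedence -> (min threshold, max threshold, max drop probability)
PROFILES = {
    0: (20, 40, 0.10),  # low priority starts dropping early
    5: (35, 40, 0.10),  # high priority is protected until the queue is deeper
}


def update_average(avg: float, current_depth: int, weight: float = 1 / 512) -> float:
    """Exponentially weighted moving average of the output queue depth."""
    return (1 - weight) * avg + weight * current_depth


def drop_probability(avg_depth: float, min_th: float, max_th: float, max_p: float) -> float:
    """RED drop curve for one traffic class."""
    if avg_depth < min_th:
        return 0.0
    if avg_depth >= max_th:
        return 1.0
    return max_p * (avg_depth - min_th) / (max_th - min_th)


def should_drop(precedence: int, avg_depth: float, rng=random.random) -> bool:
    """Decide whether to drop an arriving packet of the given precedence."""
    min_th, max_th, max_p = PROFILES.get(precedence, PROFILES[0])
    return rng() < drop_probability(avg_depth, min_th, max_th, max_p)
```

At an average depth of 30 packets, precedence 0 traffic in this sketch faces a 5 percent drop probability while precedence 5 traffic is not dropped at all, which is the differentiated behavior the thresholds are meant to produce.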
One issue you might face when implementing QoS in a VPN tunnel is the requirement that the QoS parameter you normally find in the header of the IP packet needs to be reflected in the tunnel packet header regardless of the type of tunnel you choose to use. The four primary tunneling protocols used with VPNs are
Layer 2 Tunneling Protocol (L2TP)
IPSec
Layer 2 Forwarding (L2F)
GRE
L2TP is commonly used for node-to-node applications, with the tunnel terminating at the edge of the user's network. L2TP is an IETF standard that merges Cisco's L2F tunnel protocol with Microsoft's Point-to-Point Tunneling Protocol (PPTP). L2TP uses third-party security schemes such as IPSec to provide security to packet-level information. L2TP is used primarily with PPP traffic.
GRE is defined in RFC 1701, and RFC 1702 describes GRE over IPv4 networks; together they allow any protocol to be tunneled inside an IP packet. You can encapsulate data using either IPSec or GRE, both of which can copy the IP ToS values from the packet header into the tunnel header.
This allows devices between GRE-based tunnel endpoints to adhere to the precedence bits you set, improving the routing of premium-service packets. This also gives you the means to use QoS technologies such as policy routing, WFQ, and WRED on intermediate devices between GRE tunnel endpoints.
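To make the ToS copy concrete, the sketch below builds a basic GRE-over-IPv4 packet by hand and carries the inner packet's ToS byte into the outer header. It assumes a plain 20-byte outer IPv4 header with no options, a 4-byte GRE header with no optional fields, and documentation-range tunnel addresses; the function names are hypothetical and this is not how a router implements the feature.

```python
import socket
import struct

# version/IHL, ToS, total length, id, flags/fragment, TTL, protocol, checksum, src, dst
IPV4_HEADER = "!BBHHHBBH4s4s"
GRE_PROTOCOL = 47  # IP protocol number assigned to GRE


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def gre_encapsulate(inner: bytes, tunnel_src: str, tunnel_dst: str) -> bytes:
    """Wrap an IPv4 packet in GRE, copying its ToS byte to the outer header."""
    tos = inner[1]                       # the ToS byte is the second header byte
    gre = struct.pack("!HH", 0, 0x0800)  # no flags, payload type IPv4
    fields = [0x45, tos, 20 + len(gre) + len(inner), 0, 0, 64, GRE_PROTOCOL, 0,
              socket.inet_aton(tunnel_src), socket.inet_aton(tunnel_dst)]
    fields[7] = ipv4_checksum(struct.pack(IPV4_HEADER, *fields))
    return struct.pack(IPV4_HEADER, *fields) + gre + inner


# Inner header stub with ToS 0xA0 (IP precedence 5); the rest is zero-filled.
inner = bytes([0x45, 0xA0]) + bytes(18)
outer = gre_encapsulate(inner, "192.0.2.1", "198.51.100.1")
```

Because the outer header now carries the same ToS value, a device in the middle of the tunnel can apply precedence-based treatment without looking inside the encapsulated packet.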
Differentiated Services, or DiffServ (DS), can redefine the IP ToS byte into a DiffServ byte (the DS byte). The DS byte relays a packet's required QoS level. It is also used to classify packets. DS uses per-hop behaviors (PHBs) to enable common QoS behaviors in the network. The aim is to provide the basis for standards-based QoS in a VPN from end to end.
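The relationship between the older precedence bits and the DS byte is easiest to see as bit arithmetic. In the DiffServ definition (RFC 2474), the upper 6 bits of the byte carry the differentiated services code point (DSCP), so the 3 precedence bits are its top half. The helper names below are assumptions for illustration.

```python
def dscp_from_ds_byte(ds_byte: int) -> int:
    """The DSCP occupies the upper 6 bits of the DS byte."""
    return (ds_byte & 0xFF) >> 2


def precedence_from_ds_byte(ds_byte: int) -> int:
    """A precedence-only device still sees the top 3 bits."""
    return (ds_byte & 0xFF) >> 5


def ds_byte_from_dscp(dscp: int) -> int:
    """Place a 6-bit DSCP in the DS byte, leaving the low 2 bits clear."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2


# DSCP 46 (expedited forwarding) appears as precedence 5 to older devices.
ef_byte = ds_byte_from_dscp(46)
```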
CAR implements both classification services and policing through rate limiting. You can use CAR's classification services to set the IP precedence for packets entering your network. This allows you to partition your network into multiple priority levels or classes of service. Networking devices within your network can then use the assigned IP precedence values to determine how to treat the traffic. You can use the 3 precedence bits in the ToS field of the IP header to define up to six classes of service.
Your policies can be based on physical port, source or destination IP or MAC address, application port, IP protocol type, or other criteria that can be specified by access lists or extended access lists. You also have the option of classifying packets by categories that are external to the network (for example, by customer). After a packet has been classified, a network can either accept or override and reclassify the packet according to a specified policy. CAR includes commands you can use to classify and reclassify packets.
Custom queuing (CQ) is designed to handle traffic by specifying the number of packets or bytes to be serviced for each class of traffic. It services the queues in a round-robin fashion, sending only the allocated portion of bandwidth for each queue before moving to the next queue. If a queue is empty, the device moves to the next queue and sends packets from it, assuming that it has packets ready to send.
When you enable CQ on an interface, the system creates and maintains 17 output queues for that interface. You have the option of configuring queues 1 through 16 by associating a configurable byte count, specifying how many bytes of data to send before moving to the next queue.
Queue 0 is a reserved system queue and is emptied before any of the other queues are processed. The system queue is used for high-priority packets, such as keepalive packets and signaling packets. Other traffic cannot use this queue.
For queues 1 through 16, the system cycles through the queues sequentially, sending the configured byte count from each queue in each cycle, delivering packets in the current queue before moving on to the next one. When a particular queue is being processed, packets are sent until the number of bytes sent exceeds the queue byte count or the queue is empty. You can specify the bandwidth a particular queue can use indirectly by specifying a byte count and queue length. CQ is statically configured and does not automatically adapt to changing network conditions.
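One pass of this round-robin servicing can be sketched as follows. The sketch covers only the user-configurable queues and leaves out the system queue; packets are represented by their sizes in bytes, and the queue numbers and byte counts are example values, not recommendations.

```python
from collections import deque


def custom_queue_cycle(queues: dict, byte_counts: dict) -> list:
    """Service each queue once, in order, and return (queue, size) in send order."""
    sent = []
    for qid in sorted(queues):
        queue, used = queues[qid], 0
        # Keep sending whole packets until the byte count is reached or
        # exceeded, or the queue is empty.
        while queue and used < byte_counts[qid]:
            size = queue.popleft()
            used += size
            sent.append((qid, size))
    return sent


queues = {
    1: deque([500, 500, 500, 500]),
    2: deque([1500, 1500]),
}
first_pass = custom_queue_cycle(queues, {1: 1500, 2: 1500})
```

Because packets are never split, a queue can overshoot its byte count by up to one packet, which is why byte counts are usually chosen with typical packet sizes in mind.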
The bandwidth that a custom queue is allocated is determined by the following formula:
(queue byte count / total byte count of all queues) * the interface's bandwidth capacity
where bandwidth capacity equals the interface bandwidth minus the bandwidth for priority queues.
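As a worked example of the formula, the short function below computes each queue's share. The byte counts and the 1544-kbps interface speed are assumed values chosen for the arithmetic.

```python
def custom_queue_bandwidth(byte_counts: dict, interface_kbps: float,
                           priority_kbps: float = 0) -> dict:
    """Apply (queue byte count / total byte count) * bandwidth capacity."""
    capacity = interface_kbps - priority_kbps  # bandwidth left after priority queues
    total = sum(byte_counts.values())
    return {q: capacity * count / total for q, count in byte_counts.items()}


# Queue 1 has half of the total byte count, so it gets half of the capacity.
allocation = custom_queue_bandwidth({1: 3000, 2: 1500, 3: 1500}, interface_kbps=1544)
```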
Priority queuing (PQ) is used to define how traffic is prioritized in your network. You can configure up to four traffic priorities with a series of filters based on packet characteristics to place traffic in these four queues. The queue with the highest priority is serviced first until it is empty, and then the lower queues are serviced in sequence.
This means that PQ gives priority queues absolute preferential treatment over low-priority queues. Packets are classified based on criteria you specify and are placed in one of the four output queues (high, medium, normal, or low) based on your assigned priority. Packets that you do not classify by priority are placed in the normal queue.
You can set a queue's maximum length by defining the length limit. When a queue is longer than the queue limit, all additional packets are dropped.
A priority list defines a set of rules on how packets are assigned to priority queues. A priority list can also define a default priority or the queue size limits of the various priority queues.
You can classify packets by the following criteria:
Protocol or subprotocol type
Incoming interface
Packet size
Fragments
Access list
Keepalive packets sourced by the device are always assigned to the high-priority queue. You must specifically configure all other management traffic into queues. Packets that are not classified by the priority list mechanism are assigned to the normal queue.
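The strict-priority behavior, the default to the normal queue, and the tail drop at the queue limit can be sketched as a small class. The class name and the queue limits used here are illustrative assumptions.

```python
from collections import deque

LEVELS = ("high", "medium", "normal", "low")  # serviced in this order


class PriorityQueuing:
    def __init__(self, limits: dict):
        self.limits = limits
        self.queues = {level: deque() for level in LEVELS}
        self.dropped = 0

    def enqueue(self, packet, level: str = "normal") -> bool:
        """Unclassified packets default to the normal queue."""
        queue = self.queues[level]
        if len(queue) >= self.limits[level]:
            self.dropped += 1  # queue is full, so the packet is dropped
            return False
        queue.append(packet)
        return True

    def dequeue(self):
        """Always take from the highest-priority queue that is not empty."""
        for level in LEVELS:
            if self.queues[level]:
                return self.queues[level].popleft()
        return None


pq = PriorityQueuing({"high": 2, "medium": 2, "normal": 2, "low": 2})
pq.enqueue("bulk transfer", "low")
pq.enqueue("interactive session", "high")
pq.enqueue("unclassified")
order = [pq.dequeue(), pq.dequeue(), pq.dequeue()]
```

The same loop that gives the high queue absolute preference also means that sustained high-priority traffic can keep the lower queues from ever being serviced.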
Frame Relay Traffic Shaping (FRTS) builds on existing support of congestion control by adding capabilities that improve a Frame Relay network's scalability and performance, increasing the density of VCs and improving response time.
FRTS can be used to eliminate bottlenecks in Frame Relay networks that have high-speed connections at your central site and low-speed connections at your branch sites. You can configure rate enforcement, a peak rate that limits outbound traffic, to cap the rate at which data is sent down a VC at your central site.
By using FRTS, you can configure rate enforcement on a per-VC basis to either the committed information rate (CIR) or some other defined value, such as the excess information rate. This ability allows you to share the medium with multiple VCs. Bandwidth can be allocated to each VC, essentially creating a virtual time-division multiplexing (TDM) network.
You also can define PQ, CQ, and WFQ at the VC or subinterface level to achieve finer granularity in the prioritization and queuing of traffic, giving you more control over the traffic flow on an individual VC. If you combine per-VC queuing and rate enforcement with CQ, your VCs can carry multiple traffic types, such as IP, SNA, and Internetwork Packet Exchange (IPX), with a bandwidth guaranteed for each traffic type.
By using backward explicit congestion notification (BECN)-tagged packets, FRTS can dynamically throttle traffic by holding packets in the router's buffers to reduce the data flow from the router into the Frame Relay network. The throttling is done on a per-VC basis. The transmission rate is adjusted based on the number of BECN-tagged packets received.
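The BECN-driven throttling can be pictured as a simple per-VC feedback loop evaluated once per measurement interval. The step sizes, the floor rate, and the function name below are assumptions made for this sketch; the actual adjustment rules depend on the device and its configuration.

```python
def adjust_rate(rate: float, becn_count: int, cir: float, min_rate: float,
                decrease: float = 0.25, increase: float = 1 / 16) -> float:
    """Return the VC's sending rate for the next interval.

    If any BECN-tagged packets arrived, cut the rate (never below min_rate);
    otherwise recover gradually toward the configured rate (never above cir).
    """
    if becn_count > 0:
        return max(min_rate, rate * (1 - decrease))
    return min(cir, rate + cir * increase)


# A VC at 64 kbps backs off under congestion and then recovers.
rate = 64000.0
history = []
for becns in (3, 0, 0):
    rate = adjust_rate(rate, becns, cir=64000.0, min_rate=32000.0)
    history.append(rate)
```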