Breaking the data replication bottleneck

Opinion
Dec 18, 2007 · 3 mins
Data Management, Networking, WAN

* Size does matter, but it is not sufficient to guarantee successful data replication

The vast majority of IT organizations are concerned about disaster recovery. While there is wide disagreement as to whether the most appropriate backup is a cold, warm or hot site, IT organizations do agree on the need to back up key business data. Backing up large volumes of data is not trivial and, as we’ll show in the next three newsletters, data replication takes a lot more than just a big pipe.

Because of the widespread interest in disaster recovery, vendors often tout how their products enable data replication. Central to the claims of most of these vendors is the size of the WAN link that they support. While we have to acknowledge that size does matter, a high-bandwidth WAN link is not sufficient to guarantee successful data replication. In addition to the throughput of the WAN link, network engineers should also look at the “goodput” of the link. Throughput counts every bit sent over the link, including retransmissions, whereas goodput counts only the data that is successfully received. For example, if a 1,000-bit packet has to be transmitted 10 times in a second before it is successfully received, the throughput is 10,000 bps and the goodput is 1,000 bps.
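The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the definitions above; the function name `throughput_and_goodput` and its parameters are hypothetical and do not come from any vendor tool or library.

```python
def throughput_and_goodput(packet_bits, transmissions, seconds, delivered_packets=1):
    """Return (throughput_bps, goodput_bps) for a simple retransmission scenario.

    packet_bits       -- size of each packet in bits
    transmissions     -- total number of times packets were put on the wire
    seconds           -- length of the measurement interval
    delivered_packets -- packets that were actually received successfully
    """
    # Throughput counts every bit sent, including retransmissions.
    throughput_bps = packet_bits * transmissions / seconds
    # Goodput counts only the bits that were successfully received.
    goodput_bps = packet_bits * delivered_packets / seconds
    return throughput_bps, goodput_bps


# The example from the text: one 1,000-bit packet sent 10 times in one second.
throughput, goodput = throughput_and_goodput(1000, 10, 1.0)
print(f"throughput = {throughput:,.0f} bps, goodput = {goodput:,.0f} bps")
```

With these inputs, only one tenth of the traffic on the link is useful data, which is why the size of the pipe alone says little about how quickly data is actually replicated.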

To better understand the data replication bottleneck, it is important to point out how data replication differs from other, more typical data applications.

Most typical data applications, such as inquiry/response applications and VoIP, are characterized by their transfer of moderate amounts of information for brief periods of time. This is in contrast to a data replication application, which continually transfers huge amounts of information. In addition, WAN links between branch offices and a data center typically support tens or perhaps hundreds of typical data applications, while the links between data centers that support data replication may well support just one or two data streams. Hence, part of the design goal for the link between a branch office and a data center is to support large numbers of simultaneous data transfers, none of which is terribly high volume.

Another part of the design goal is to overcome the issues that are associated with chatty protocols and WAN characteristics such as delay, while simultaneously making sure that bandwidth hogs do not interfere with important, delay-sensitive applications. The design goal associated with the WAN links between data centers is a lot simpler: what has to be done to get the highest possible goodput on that link?

In the next newsletter we will discuss the amazing impact that small amounts of packet loss have on goodput. In the meantime, more information on this topic can be found at Webtorials. Also, check out Network World’s IT Buyer’s Guide on data backup and replication, and data replication and mirroring.

Jim has a broad background in the IT industry. This includes serving as a software engineer, an engineering manager for high-speed data services for a major network service provider, a product manager for network hardware, a network manager at two Fortune 500 companies, and the principal of a consulting organization. In addition, Jim has created software tools for designing customer networks for a major network service provider and directed and performed market research at a major industry analyst firm. Jim’s current interests include both cloud networking and application and service delivery. Jim has a Ph.D. in Mathematics from Boston University.
