In the past few years, cloud providers have been offering their users ever more capacity and bandwidth on the cloud side for hosted physical and virtual servers running network services. While more bandwidth directly contributes to higher connection speed and a better end-user experience, especially for large file downloads, video streaming, or network services with many simultaneous users, only the biggest providers offer the possibility of hosting a user's servers and network services at different physical locations and thereby influencing another parameter crucial for high connection speed: latency.
Bandwidth, the maximum capacity of a network connection measured in bits per second, defines how much data can be sent through the network pipe; high bandwidth is therefore desirable for a server connection, since it enables servers to send more network traffic to end-users in the same period of time. Latency, also known as RTT (Round Trip Time) or ping time, is measured in (milli)seconds and defines how long it takes a network packet to travel from the client to the server and back. Contrary to bandwidth, low latency values are desirable, since higher latency means longer delays in network packet delivery and therefore a poorer end-user experience in terms of network service or application response time. Although latency depends on several parameters, including bottlenecks and the types of connections on the network path, the most important one is the network distance between client and server: the smaller the distance, the faster the network packet reaches its destination. In most cases higher latency has a greater influence on server or service response time than lower bandwidth. Furthermore, on high-bandwidth connections, network latency has an even larger impact on effective network speed and can dominate application response time regardless of how much bandwidth the server has.
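As a back-of-the-envelope illustration of why latency often dominates response time, the following Python sketch uses a simple first-order model (a handful of round trips plus pure transfer time). The 50 kB object size, the two bandwidth/RTT pairs, and the assumption of two round trips per request are illustrative choices, not measurements of any particular network.

```python
def response_time_ms(object_size_bytes: int, bandwidth_bps: float, rtt_ms: float,
                     round_trips: int = 2) -> float:
    """First-order response time model: a few round trips (connection setup plus
    request/response) plus the pure transfer time of the object."""
    transfer_ms = object_size_bytes * 8 / bandwidth_bps * 1000
    return round_trips * rtt_ms + transfer_ms

# A 50 kB web object:
size = 50_000
# High bandwidth but distant server (100 Mbit/s, 100 ms RTT)
print(response_time_ms(size, 100e6, 100))   # -> 204.0 ms, dominated by latency
# Modest bandwidth but nearby server (10 Mbit/s, 10 ms RTT)
print(response_time_ms(size, 10e6, 10))     # -> 60.0 ms, dominated by transfer
```

Even though the first link has ten times more bandwidth, the nearby, lower-latency server answers more than three times faster in this model, because for small objects the round trips, not the transfer itself, set the response time.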
High latency also affects bandwidth utilization, since long delays prevent servers from using all the available bandwidth. This situation is known as a long fat network (LFN), where the bandwidth-delay product, which defines how many bits can be in flight on the link before a TCP acknowledgement is required, is very large. To reduce the impact of LFNs on network performance, the TCP window scale option and higher values of the initial congestion window are used, but the best solution is to prevent LFNs from occurring in the first place by lowering latency, that is, by reducing the distance between client and server. The most commonly used method for doing this is Content Delivery Network (CDN) servers located near the client. CDNs are a good solution for static content delivery, where the original content from the source server is cached on CDN servers and end-users access those closer servers. For dynamic content delivery, for example a real-time or voice application, the best solution is to use multiple replicated origin servers located at different geographic locations, managed by a powerful Global Server Load Balancer (GSLB) capable of directing every user to a low-latency server at any time. With modern client-side GSLB technology, the end-user's client always measures the latency to the currently available servers, in contrast to the commonly used but imprecise geographic-location approximation, and chooses a low-latency, high-performance online server, ensuring the best user experience and avoiding unacceptable content buffering or lagging.
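To make the bandwidth-delay product and client-side server selection concrete, the following minimal Python sketch probes a set of replica servers by timing TCP handshakes, reports the bandwidth-delay product each measured RTT would imply on an assumed 1 Gbit/s link, and picks the lowest-latency replica. The hostnames, port, sample count, and the 1 Gbit/s figure are illustrative assumptions rather than part of any specific GSLB product.

```python
import socket
import time

# Hypothetical replica locations; a real deployment would obtain this list
# from the GSLB service.
CANDIDATE_SERVERS = ["eu.example.com", "us.example.com", "asia.example.com"]

def tcp_rtt_ms(host: str, port: int = 443, samples: int = 3) -> float:
    """Estimate RTT to host:port in milliseconds by timing TCP handshakes."""
    best = float("inf")
    for _ in range(samples):
        start = time.perf_counter()
        try:
            # A TCP handshake completes in roughly one round trip,
            # so the connect() duration approximates the RTT.
            with socket.create_connection((host, port), timeout=2):
                pass
        except OSError:
            continue  # unreachable replica: skip this sample
        best = min(best, (time.perf_counter() - start) * 1000)
    return best

def bandwidth_delay_product_bits(bandwidth_bps: float, rtt_ms: float) -> float:
    """BDP = bandwidth * RTT: bits that can be in flight before an ACK arrives."""
    return bandwidth_bps * (rtt_ms / 1000.0)

def pick_lowest_latency(servers):
    """Client-side selection: probe every replica and return the fastest one."""
    rtts = {s: tcp_rtt_ms(s) for s in servers}
    return min(rtts, key=rtts.get), rtts

if __name__ == "__main__":
    best, rtts = pick_lowest_latency(CANDIDATE_SERVERS)
    for server, rtt in rtts.items():
        # Example: on a 1 Gbit/s link, 100 ms RTT gives a BDP of 100 Mbit,
        # i.e. roughly 12.5 MB must be "in flight" to keep the pipe full.
        bdp = bandwidth_delay_product_bits(1_000_000_000, rtt)
        print(f"{server}: RTT {rtt:.1f} ms, BDP at 1 Gbit/s ~ {bdp / 1e6:.0f} Mbit")
    print(f"Selected low-latency server: {best}")
```

Timing the TCP handshake is a deliberately simple probe; a production client-side GSLB would more likely use application-level probes or historical measurements, but the selection principle, measure the available replicas and pick the lowest RTT, stays the same.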
Thus, only the scenario in which both network parameters are optimal, that is, high bandwidth combined with low latency, ensures a fast and uninterrupted network connection for end-users, without congestion or long delays on the network path.