As the Internet's growth charges ahead at a phenomenal pace and networks opt to share traffic only where private peering agreements are in place, distributing an Internet server's load across multiple machines becomes increasingly imperative. It is my hope that this resource can be developed into one that is comprehensive in its coverage of the technology and applications related to these needs.

Articles that discuss these issues:

Distributing content in a way that minimizes the network latency end users must endure is highly desirable for content providers. While network-topology-based distribution such as Cisco's Distributed Director and IBM's Interactive Network Dispatcher sounds attractive, a large portion of Internet end-user traffic comes from large networks such as AOL (via ans.net), Netcom, att.net, uu.net and CompuServe, whose network performance is subject to router backhauling and NAP configurations. The benefits of content distribution under such a regime are unclear at this time.
Date: Wed Mar 1 11:04:43 EST 1995
Subject: Q4.10 - Distributing load using named

Q: If you attempt to distribute the load on a system using named, won't the first response be cached, and then later queries use the cached value? (This would be for requests that come through the same server.)

A: Yes. So it can be useful to use a lower TTL on records where this is important. You can use values like 300 or 500 seconds. If your local caching server has ROUND_ROBIN, it does not matter what the authoritative servers have -- every response from the cache is rotated. But if it doesn't, and the authoritative server site is depending on this feature (or the old "shuffle-A") to do load balancing, then if one doesn't use small TTLs, one could conceivably end up with a really nasty situation, e.g., hundreds of workstations at a branch campus pounding on the same front end at the authoritative server's site during class registration. Not nice.

A: Paul Vixie has an example of the ROUND_ROBIN code in action. Here is something that he wrote regarding his example:

> I want users to be distributed evenly among those 3 hosts. Believe it or not :-), BIND offers an ugly way to do this. I offer for your collective amusement the following snippet from the ugly.vix.com zone file:

    hydra   cname   hydra1
            cname   hydra2
            cname   hydra3
    hydra1  a       10.1.0.1
            a       10.1.0.2
            a       10.1.0.3
    hydra2  a       10.2.0.1
            a       10.2.0.2
            a       10.2.0.3
    hydra3  a       10.3.0.1
            a       10.3.0.2
            a       10.3.0.3

> Note that having multiple CNAME RR's at a given name is meaningless according to the DNS RFCs but BIND doesn't mind (in fact it doesn't even complain). If you call gethostbyname("hydra.ugly.vix.com") (try it!) you will get results like the following. Note that there are two round robin rotations going on: one at ("hydra",CNAME) and one at each ("hydra1",A) et al. I used a layer of CNAME's above the layer of A's to keep the response size down. If you don't have nine addresses you probably don't care and would just use a pile of CNAME's pointing directly at real host names.

    {hydra.ugly.vix.com}
    name: hydra2.ugly.vix.com
    aliases: hydra.ugly.vix.com
    addresses: 10.2.0.2 10.2.0.3 10.2.0.1

    {hydra.ugly.vix.com}
    name: hydra3.ugly.vix.com
    aliases: hydra.ugly.vix.com
    addresses: 10.3.0.2 10.3.0.3 10.3.0.1

    {hydra.ugly.vix.com}
    name: hydra1.ugly.vix.com
    aliases: hydra.ugly.vix.com
    addresses: 10.1.0.2 10.1.0.3 10.1.0.1

    {hydra.ugly.vix.com}
    name: hydra2.ugly.vix.com
    aliases: hydra.ugly.vix.com
    addresses: 10.2.0.3 10.2.0.1 10.2.0.2

    {hydra.ugly.vix.com}
    name: hydra3.ugly.vix.com
    aliases: hydra.ugly.vix.com
    addresses: 10.3.0.3 10.3.0.1 10.3.0.2
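For readers who want to watch the rotation Vixie describes, a minimal Python sketch along these lines repeats the same lookup and prints the canonical name and address list each time (the name hydra.ugly.vix.com is only used as in his example; substitute a round-robin name you actually serve, and note that a caching resolver may pin the answer for the record's TTL, which is exactly why the FAQ above recommends small TTL values):

    # observe DNS round-robin rotation by repeating the same lookup
    # (illustrative only; use a name you actually serve)
    import socket

    def watch_rotation(hostname, attempts=5):
        for i in range(attempts):
            # gethostbyname_ex returns (canonical name, alias list, address list)
            name, aliases, addresses = socket.gethostbyname_ex(hostname)
            print("lookup %d: name=%s addresses=%s" % (i + 1, name, addresses))

    watch_rotation("hydra.ugly.vix.com")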
These are switches that operate at a higher level of the protocol stack, the fourth OSI layer (the transport layer), hence 'layer 4 switching.' Because they are aware of the load each attached server is carrying, layer 4 switches can intelligently decide which machine is best suited to fulfill a request.
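The selection step itself can be as simple as picking the least-burdened back end. The following sketch is purely illustrative (the server list, connection counts, and pick_server function are hypothetical, not any vendor's implementation) and shows a least-connections choice of the kind a layer 4 device might make for each new TCP connection:

    # hypothetical least-connections selection, as a layer 4 switch might perform
    servers = {
        "10.0.0.1": 12,   # address -> current number of active connections
        "10.0.0.2": 4,
        "10.0.0.3": 9,
    }

    def pick_server(active_connections):
        # choose the server currently carrying the fewest connections
        return min(active_connections, key=active_connections.get)

    chosen = pick_server(servers)
    servers[chosen] += 1   # the new connection is assigned to that server
    print("forwarding new connection to", chosen)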
In the event of a server or router failure, the HyperFlow SP800 transparently redistributes traffic to another server or router. When a server is ready to be returned to the cluster, the HyperFlow SP800's "auto convergence" feature allows it to rejoin without interruption. The same feature lets customers add and delete servers from the cluster for scalability and maintenance without disruption of service.
The HyperFlow SP800 has eight auto-sensing 10/100 Mbps, full-duplex ports and can balance bandwidth against server requirements. This allows the customer to add servers of any operating system and any performance level, at any time, while the cluster is running, providing maximum utilization of existing investment and the ability to respond quickly to periods of high demand.
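Stripped of vendor specifics, the general pattern behind this kind of failover is a periodic health check that drops unresponsive servers from the pool and re-admits them once they answer again. The sketch below is an assumption-laden illustration of that pattern (the addresses, port, timeout, and check interval are arbitrary); it is not a description of how the HyperFlow SP800 actually works:

    # generic health-check loop: drop dead servers from the pool, re-admit them
    # when they respond again (illustrative; not the HyperFlow mechanism)
    import socket, time

    SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # hypothetical cluster members
    active = set(SERVERS)

    def is_alive(address, port=80, timeout=2.0):
        # a server counts as alive if it accepts a TCP connection on the service port
        try:
            with socket.create_connection((address, port), timeout=timeout):
                return True
        except OSError:
            return False

    while True:
        for server in SERVERS:
            if is_alive(server):
                active.add(server)      # server rejoins the pool without disruption
            else:
                active.discard(server)  # failover: stop sending traffic here
        print("serving traffic from:", sorted(active))
        time.sleep(10)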
Many databases have built-in facilities for replication and synchronization; most file systems do not. For most high-availability and high-performance Internet servers, serving files from a common NFS mount offers little if any benefit. Most system load is due to disk subsystem I/O, so distributing that load across more disk spindles and more disk controllers will offer better performance than centralizing these functions. Many implementations of NFS also suffer from performance and security problems. A possible remedy for NFS's drawbacks is cachefs, but this is only supported by a few platforms (I know of Solaris and IRIX; perhaps there are others I haven't heard of yet).

Some of the technologies that come to mind for content synchronization include integrity checkers such as tripwire and source control/distribution tools such as CVSup, but adapting these systems to your Internet server's content may be cumbersome.
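A low-tech variation on the integrity-checker idea is to compare per-file checksums between the master copy and each mirror and flag (or re-copy) anything that differs. The sketch below only shows the comparison step, not a complete distribution system, and the document-root paths are made-up assumptions:

    # compare per-file checksums of a mirror against a master manifest
    # (illustrative sketch; the paths below are assumptions)
    import hashlib, os

    def build_manifest(document_root):
        manifest = {}
        for directory, _, filenames in os.walk(document_root):
            for filename in filenames:
                path = os.path.join(directory, filename)
                with open(path, "rb") as f:
                    digest = hashlib.md5(f.read()).hexdigest()
                manifest[os.path.relpath(path, document_root)] = digest
        return manifest

    master = build_manifest("/export/master/htdocs")   # hypothetical paths
    mirror = build_manifest("/export/mirror/htdocs")

    for relative_path, digest in master.items():
        if mirror.get(relative_path) != digest:
            print("out of sync:", relative_path)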