As a former employee of Hostway Corporation, now the largest web hosting company in America, I periodically check back on my former employer’s site to see what’s new in the web hosting marketplace. Let’s face it, the world is turning digital, so you have to stay on the cutting edge in order to stay competitive.

So lo and behold, I see something that certainly catches my eye, labeled: Hostway Edge Caching Content Delivery Network (CDN). Now as someone who has spent more than 3 1/2 years in the profession, this certainly piqued my interest.

Let’s go over some of the key points that Hostway is making:

  • Faster load times: Our global network of servers can dramatically improve your load times by speeding data along the shortest route. It also makes audio and video files play smoothly by speeding the data transmission and eliminating pauses in streaming content.
  • High availability: With content distributed across the globe, a server failure or disaster in one area won’t impact your Web site. Your content will be sent along the next-best channel to your customers.
  • Scalability: Multiple servers give you greatly expanded bandwidth capacity. They can easily process temporary burst requests and other traffic surges.

Now let’s go over a few things here. First off, Hostway is a web-hosting company, not an ISP, which means they do not own their own fibre network. This is the case with most hosting companies, such as Rackspace or Aplus. Hosting companies rely on backbone providers for network connectivity. I’m sure you’ve heard of these companies; some of the big names are Level3 Communications, Verio, AT&T, Cogent, Time Warner, and good old UUNET (later absorbed into MCI). If network traffic is flying around the internet, chances are it is traveling over one of these providers. In fact, for me to get from my house to Google.com, I traverse one of these providers (Level3), as this traceroute excerpt shows:

6 COMCAST-IP.car2.Chicago1.Level3.net (4.71.182.34) 41.871 ms 20.378 ms 20.328 ms
7 te-4-2.car2.Chicago1.Level3.net (4.71.182.33) 21.493 ms 21.509 ms 21.508 ms
8 ae-32-52.ebr2.Chicago1.Level3.net (4.68.101.62) 18.984 ms 18.920 ms 32.621 ms
9 ae-1-100.ebr1.Chicago1.Level3.net (4.69.132.41) 32.545 ms 32.553 ms 18.205 ms
10 ae-2.ebr2.NewYork1.Level3.net (4.69.132.66) 47.277 ms 50.774 ms 50.763 ms
11 ae-62-62.csw1.NewYork1.Level3.net (4.69.134.82) 62.983 ms ae-72-72.csw2.NewYork1.Level3.net (4.69.134.86) 51.118 ms ae-82-82.csw3.NewYork1.Level3.net (4.69.134.90) 46.543 ms
12 ae-41-99.car1.NewYork1.Level3.net (4.68.16.195) 43.577 ms ae-11-69.car1.NewYork1.Level3.net (4.68.16.3) 43.531 ms ae-31-89.car1.NewYork1.Level3.net (4.68.16.131) 44.038 ms
13 GOOGLE-INC.car1.NewYork1.Level3.net (4.71.172.86) 42.583 ms 43.977 ms 45.734 ms
14 216.239.43.146 (216.239.43.146) 65.769 ms 72.14.236.213 (72.14.236.213) 48.585 ms 91.108 ms

As an end user, you typically don’t know that this is the case. Let’s face it, folks like AOL really screwed it up for the rest of us, as they led people to believe that the internet IS AOL (lol!). Really the internet is just several layers of networks on top of networks, with servers at various levels and points. When all is said and done, when you go to your browser and try to pull something up, such as http://www.google.com/, you are not going from your computer directly to Google; you are passing through a dozen or more routers, often across several providers, to eventually get to your content page.

OK, so back to the CDN. Hostway doesn’t own their own networks; they probably lease capacity and create virtual private networks where they can prioritize their traffic over their network space. It’s sort of like leasing a car: you drive the car, you maintain the car, you pile your friends into the car and go off to god knows where, but when all is said and done the car belongs to the dealership and not you. The same thing applies here. So if the network isn’t theirs, where does that leave us?

From what it sounds like, this is just an elaborate attempt to have HA over a cluster of servers that are spread out over various datacenters. Now let’s break that down. HA, what is that? HA is High Availability. If you are someone who has a blog (like me!) you do not need HA. If your blog goes down, you whine and complain to your hosting company (like Hostway!), but at the end of the day I didn’t lose money from it, my business was not hurt, and I didn’t lose a bunch of traffic that was destined for my site from click-through ad campaigns. The blog goes down and it’s not the end of the world (although there are some people out there who pretend that it is, let’s keep things on the up and up folks!).

For other people, like banks and Fortune 500 companies, HA means everything. HA is basically a system in which there is no single point of failure. Take http://www.chase.com/ as an example, and let’s break down what Chase’s HA architecture could potentially look like.

First, at the domain level, the name servers are distributed. What this means is that when you resolve the domain chase.com, the name servers that are authoritative for that domain are not all located in one datacenter. Think of it as putting all your eggs in one basket: if the name servers that are authoritative for your domain are all located in the same datacenter, what happens if that datacenter goes off the grid? All traffic to your site stops, because there are no authoritative name servers left to tell anyone where the servers for things like web traffic and email are located. How do I know this to be the case? Do a simple whois lookup, or if you are running a Linux box at home, run the command:

host -t ns chase.com

and you will see an output similar to the below:

host -t ns chase.com
chase.com name server ns1.jpmorganchase.com.
chase.com name server ns2.jpmorganchase.com.
chase.com name server ns05.jpmorganchase.com.
chase.com name server ns06.jpmorganchase.com.

If you go further into this, you will see that each of the name servers points to a separate IP address:

host ns1.jpmorganchase.com
ns1.jpmorganchase.com has address 159.53.46.53
Columbus, OH

host ns2.jpmorganchase.com
ns2.jpmorganchase.com has address 159.53.78.53
Lisle, IL

host ns05.jpmorganchase.com
ns05.jpmorganchase.com has address 159.53.110.152
Chicago, IL

host ns06.jpmorganchase.com
ns06.jpmorganchase.com has address 159.53.110.153
Chicago, IL

Each of their IP addresses will be hosted in different datacenters strategically located throughout your general geographic area (GeoIP most likely). As you can see, the above name servers are located in my part of the country (I’m in Chicago, IL), so if one of their datacenters goes down in Chicago or its surrounding suburbs, there will be other name servers to take its place and resolve chase.com. As you can imagine, a global company like Chase will likely have many more name servers out there to ensure that no one datacenter, no one network, and no one provider can ever put them in a situation where people cannot resolve their domain.
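To make the "not all in one basket" check a bit more concrete, here is a minimal Python sketch that groups the name server addresses from the lookups above by network. The /24 grouping is my own assumption and only a rough heuristic: two addresses in the same /24 probably sit at the same site, but addresses in different networks are not proof of different datacenters. The function name `group_by_network` is made up for this example.

```python
import ipaddress
from collections import defaultdict

# Addresses copied from the `host` lookups above.
NAME_SERVERS = {
    "ns1.jpmorganchase.com": "159.53.46.53",
    "ns2.jpmorganchase.com": "159.53.78.53",
    "ns05.jpmorganchase.com": "159.53.110.152",
    "ns06.jpmorganchase.com": "159.53.110.153",
}


def group_by_network(servers, prefix_len=24):
    """Group name servers by the network their address falls in.

    Assumption: sharing a /24 hints at sharing a site. This is only a
    heuristic; real placement needs GeoIP or provider data.
    """
    groups = defaultdict(list)
    for name, addr in servers.items():
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(network)].append(name)
    return dict(groups)


if __name__ == "__main__":
    for network, names in group_by_network(NAME_SERVERS).items():
        print(network, "->", ", ".join(names))
```

Run against the four addresses above, this puts ns05 and ns06 in the same network, which lines up with both of them showing Chicago, IL.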

So that’s the name server level; now let’s look at their website. Most likely what they have is a complex network of front line servers. These servers are most likely load balancers that look at a variety of things:

  • Where you are coming from geographically
  • What host name you are requesting, either www.chase.com or chase.com (these can resolve to two different addresses, in case you are not familiar with how DNS works)

Among other things. What the load balancers do is basically hand off your request to the best available front line web server to process it. This means that at the initial stage there is no single point of failure, as there are several load balancers that can take your request and pass it along to a web server. If one load balancer goes down, another one comes up (N+1 failover) or the request is routed to another load balancer that is already taking requests. The same concept applies when a load balancer is getting overloaded with requests: it will either pass the traffic to a less congested load balancer, or standby load balancers will be activated to handle the high traffic. When the traffic dies down, the extra load balancers stop taking requests and sit there waiting for the next period of high load.

So that prevents the load balancing tier from ever being a single point of failure. Now at the web server level, we look at clusters, and not just clusters, but server farms. That is to say, instead of www.domain.com resolving to a single web server that is the one authoritative place for content to be served, there are hundreds of web server nodes in place just to process web traffic. This can be done at multiple levels, in software or hardware, but the end result is that if a web server node goes down or experiences high load, another one will take its place (N+1) or the traffic will shift to another, less busy web server node.
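The "if one node goes down, traffic shifts to another" idea can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the `Pool` class and the node names are hypothetical, health is flipped by hand instead of by a real heartbeat check, and real load balancers weigh far more than a simple rotation.

```python
import itertools


class Pool:
    """Round-robin pool that skips nodes marked unhealthy."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._cycle = itertools.cycle(self.nodes)

    def mark_down(self, node):
        """Record that a node failed its (hypothetical) health check."""
        self.healthy.discard(node)

    def mark_up(self, node):
        """Put a known node back into rotation."""
        if node in self.nodes:
            self.healthy.add(node)

    def pick(self):
        """Return the next healthy node, or raise if none are left."""
        # Look at each node at most once per call so we never loop forever.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes available")


if __name__ == "__main__":
    pool = Pool(["web1", "web2", "web3"])
    pool.mark_down("web2")  # simulate a failed node
    print([pool.pick() for _ in range(4)])
```

The point of the sketch is that callers never need to know a node died; they keep calling `pick()` and the failed node is simply skipped until it is marked up again.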

Now this is all well and good, but you may be asking yourself about the data that is actually being served: the content files or the database calls. As you can probably tell from the above, we can use similar architectural steps all the way down the line, through the file storage system to the database system, to cluster, balance, and fail over to servers or storage arrays located either in the same datacenter or in datacenters around the world. The great thing about the information age is that we are not limited to any one option; we can blend and expand both vertically (more hardware) and horizontally (more datacenters) to achieve better performance, uptime and ultimately a better customer experience.

So this, in a nutshell, is HA. The basic concept is fairly simple: remove any single point of failure, from the network, to the datacenter, to the server, to the data being served, and design a system whereby you can go into a single datacenter, flip the off switch, and watch your site not experience a single second of downtime or any change in the customer experience.

OK, so that was a long, drawn-out explanation of HA, and it still just scratches the surface. Let’s go back to Hostway’s CDN for a moment. I said that it looks like an elaborate HA cluster spread over their private network space. What does this mean for you, and do the benefits outweigh the costs?

Let’s examine the speed aspects for a moment to see if an end user observes a difference. On their webpage, they provide two URLs you can use to test the speed difference: one for the original, non-enhanced website (I’m assuming that this is being served from just one server), and one for a page that is being served via CDN (multiple servers in multiple datacenters strategically placed around the world). First, let’s establish what I’m testing this on. My connection is a Comcast 6 Mbps line w/ PowerBoost (meaning that if there is extra bandwidth on the network, I can burst up to 12 Mbps instead of 6 Mbps). I tested from http://speakeasy.net/speedtest/ using the Chicago, IL location. I ran the test 3 times over a period of 5 minutes, and below are my baseline results:

Download Speed: 7076 kbps (884.5 KB/sec transfer rate)
Upload Speed: 1413 kbps (176.6 KB/sec transfer rate)

Download Speed: 6676 kbps (834.5 KB/sec transfer rate)
Upload Speed: 1386 kbps (173.3 KB/sec transfer rate)

Download Speed: 7586 kbps (948.3 KB/sec transfer rate)
Upload Speed: 1445 kbps (180.6 KB/sec transfer rate)

Average Download Transfer Rate = 889.1KB/sec
Average Upload Transfer Rate = 176.83KB/sec

As you can tell from the above, the speed tests were fairly consistent from one attempt to the next, with no great deviations.
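If you want to check the arithmetic yourself, the conversion is just kilobits divided by 8 to get kilobytes, then a plain average. A small Python sketch using the three readings above (the function names are my own, nothing here comes from the speed test site):

```python
from statistics import mean

# The three speed test readings reported above, in kilobits per second.
DOWNLOAD_KBPS = [7076, 6676, 7586]
UPLOAD_KBPS = [1413, 1386, 1445]


def kbps_to_kilobytes_per_sec(kbps):
    """Convert kilobits per second to kilobytes per second (8 bits per byte)."""
    return kbps / 8


def average_rate(samples_kbps):
    """Average transfer rate in KB/sec across a list of kbps samples."""
    return mean(kbps_to_kilobytes_per_sec(s) for s in samples_kbps)


if __name__ == "__main__":
    print(f"Average download: {average_rate(DOWNLOAD_KBPS):.1f} KB/sec")
    print(f"Average upload:   {average_rate(UPLOAD_KBPS):.2f} KB/sec")
```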

Next, I ran 10 tests over a period of 5 minutes where I launched the first page (original single web server) and clicked on the Click to Reload link in the upper right-hand corner. My results (load times in seconds) were:

5.094
3.18
3.316
4.313
4.209
3.901
3.402
3.949
3.563
4.455
Average = 3.94

So an average of 3.94 seconds, that’s not bad! Next I tested the enhanced page (multiple web servers) using the same test method as above, and got the below results:

3.883
3.61
3.491
3.612
3.689
3.376
3.694
3.436
3.484
3.927
Average = 3.62

Hmm, now wait, that’s not as impressive as I thought. We are talking about an average gain of roughly 0.3 seconds (about 8%) from the original page (single server) to the enhanced page (multiple servers). Let’s think about this. Why would this happen?
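For anyone who wants to rerun the comparison with their own numbers, here is a minimal Python sketch using the two sets of load times above. The `compare` function is my own naming, and ten samples per page is a small sample, so treat the output as a rough indication and not a benchmark.

```python
from statistics import mean, stdev

# Load times in seconds, copied from the two test runs above.
ORIGINAL = [5.094, 3.18, 3.316, 4.313, 4.209, 3.901, 3.402, 3.949, 3.563, 4.455]
CDN = [3.883, 3.61, 3.491, 3.612, 3.689, 3.376, 3.694, 3.436, 3.484, 3.927]


def compare(baseline, candidate):
    """Summarize how much faster `candidate` is than `baseline` on average."""
    base_avg = mean(baseline)
    cand_avg = mean(candidate)
    saved = base_avg - cand_avg
    return {
        "baseline_avg": base_avg,
        "candidate_avg": cand_avg,
        "seconds_saved": saved,
        "percent_saved": 100 * saved / base_avg,
        # Spread of each run, to see which page loads more consistently.
        "baseline_stdev": stdev(baseline),
        "candidate_stdev": stdev(candidate),
    }


if __name__ == "__main__":
    for key, value in compare(ORIGINAL, CDN).items():
        print(f"{key}: {value:.3f}")
```

One thing the averages hide is that the enhanced page’s times are bunched much closer together than the original page’s, so the sketch reports the spread of each run as well.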

Well, the first thing I can think of is the premise that multiple servers spread out over different geographical areas is a good thing. From the Hostway site, I see that their servers are located in:

North America
Europe
Asia
Australia

So if the idea behind CDN is to find the network that will give me the fastest results, then as a US-based end user we can immediately scratch Europe, Asia, and Australia from the mix, as the network latency alone in connecting to one of these datacenters would destroy any gains in performance from having many servers working to serve my content.

And in North America, Hostway operates datacenters in Vancouver (NetNation), Chicago (Hostway), Austin (Dedicated Central), Tampa (PowerMedium), and Ft Lauderdale (Affinity). So if I am a US-based business looking to use CDN to speed up my traffic, my traffic would potentially be served from any of the above 5 datacenters, based on the location of the requestor (the person in front of a browser looking for the website or content from the website).

Now that is great from a network perspective, but what about server and data storage? The questions that need to be asked, and that the website does not answer, are:

  1. Is this a true HA cluster, in that there are truly multiple servers acting as front end web server nodes? So if Chicago goes down, will one of the other datacenters seamlessly pick up the traffic and not allow for any downtime?
  2. What about the backend, how is the data being served up? Are they using high end file stores, such as NetApps, to serve my data? Is there data replication between each datacenter so that if data is lost due to corruption in Chicago, a copy will safely be in each of the other locations?
  3. How often is data being replicated? You can’t have a large data storage network like this with real time replication (well you can, but the overhead is quite large). So how often is data updated between all the various storage backends?
  4. Are there backups that go to the other locations, such as Europe and Asia, so that in the event all internet traffic dies in the US, the datacenters elsewhere in the world also have a copy of the data?

Given the gains that I can see from an end user perspective, I’m not sure I’m convinced that CDN is all that it’s hyped up to be. What are the chances that a dedicated server in a Tier 1 hosting provider’s datacenter will go down for a prolonged time? Actually, when it comes to Hostway, this has happened (Netcraft reported on it)!

Best practices. So don’t be shocked; this isn’t the first time, and it won’t be the last, but as a consumer you have to be educated and prepared for the marketplace. Some of the things that you can do to ensure you have a good, stable platform are:

  1. Be provider independent. A lot of people try to get all their services from one provider, and while this makes your CFO happy because he only has one bill, it doesn’t give you any competitive advantage when it comes to negotiating rates, the ability to pull the plug if something goes wrong, or flexibility in terms of geographic locations and server types.
  2. You should always consider having redundant, geographically distributed name servers with different providers.
  3. You should have several web servers that serve out your content. While in some HA clusters this is a complex process, you can simplify it greatly by using neat tricks on DNS (discussed below) and single server architectures.
  4. Use a globally recognized redundant mail platform. Web hosts boast of a robust platform, this, that and the other. Why don’t you just throw in the kitchen sink while you are at it! Give up on this, and go with either the free version of Google Apps or the paid version for Enterprise use. I’ve used Google Apps for about a year now on my domain, and I have been quite happy with its performance, availability, storage and features. Best of all, it’s free! Your email will be fully redundant because your MX records will point to mail servers in various datacenters run by Google, and the system is entirely web-based (although you can use Outlook or something on a POP or IMAP connection, but who needs it when you are already on the web!). The only downside is if you are a company that uses Blackberries: sorry to tell you, but Google Apps doesn’t have a nice integration with BES (Blackberry Enterprise Server). Hopefully the good Ph.D. folks at Google will think up something brilliant as they always do. But think about it, your email is on the same platform that google.com is hosted on. Have you ever seen google.com offline? Netcraft doesn’t have any reported downtime on Google, so you can be confident that your email will be there when you need it.
  5. Make your IT person work for their pay! Goodness, this is a no-brainer. Your business process should not exist to make the IT job easier; it should strike a balance between ease of use, economics, and best practice. If your IT staff have to put in place a manual process of updating information on 2-3 sources when making a change on the web, so be it. It will just force them to find a way to automate it through scripts and nifty replication techniques. Let your IT staff loose and they might wow you a bit!

So these are just a few tips, but you might be interested in what is mentioned in item 3 above. Let’s say you are a company that needs geographic redundancy, servers in multiple datacenters with different providers, and 99.999% or even 100% uptime. One option is to build a gigantic load balanced solution using hardware/software load balancers, with heartbeat monitoring on servers and N+1 architecture, blah blah blah. Yes, you can do this, and yes, this is the right way to do it, but there are other ways to achieve similar results.

First off, you can start at the DNS level. To do this, you must first understand the basic DNS setup. For a domain, let’s say evolutioncreations.com as an example, your DNS would look like:

evolutioncreations.com IN A 192.168.1.1
www.evolutioncreations.com IN A 192.168.1.1

The above says that if you go to either http://evolutioncreations.com or http://www.evolutioncreations.com, both will resolve to the same IP address, 192.168.1.1, and presumably your IT person or hosting company has set up proper host headers in the web server so it knows what to do for either domain name.

The other way you can do this is to use a round robin strategy. One caveat: DNS does not allow several CNAME records on the same name, nor a CNAME on the bare domain alongside its other records, so the round robin is built from multiple A records instead. This can be accomplished by setting up something similar to:

evolutioncreations.com IN A 192.168.1.2
evolutioncreations.com IN A 192.168.1.3
evolutioncreations.com IN A 192.168.1.4
evolutioncreations.com IN A 192.168.1.5
www.evolutioncreations.com IN A 192.168.1.2
www.evolutioncreations.com IN A 192.168.1.3
www.evolutioncreations.com IN A 192.168.1.4
www.evolutioncreations.com IN A 192.168.1.5

then, so that each server can also be reached by its own name, you would have:

www1.evolutioncreations.com IN A 192.168.1.2
www2.evolutioncreations.com IN A 192.168.1.3
www3.evolutioncreations.com IN A 192.168.1.4
www4.evolutioncreations.com IN A 192.168.1.5

This basically means that when someone goes to either www.evolutioncreations.com or evolutioncreations.com in a browser, DNS will round robin through 4 different records and could point the customer to any 1 of 4 servers, which can be located in different datacenters with different providers. This way you would just need a single beefy server (dual proc/dual core) with loads of RAM at each location. If you are running Apache and PHP, make sure mod_php is enabled and your Apache conf is properly tuned, with sensible MIN/MAX values, so that enough threads can be created. If you are running a site that uses a database, make sure your my.cnf file allows enough concurrent connections (max_connections, and max_user_connections if you set it) to ensure you don’t hit a limit.

What this means is that if your customers are having problems getting to your site, they will either:

  1. Reload the current page, which asks the current server (one of www1-4.evolutioncreations.com) to try serving the content again.
  2. Go back to the main site, http://www.evolutioncreations.com or http://evolutioncreations.com, at which point DNS will most likely route them to another server that should be less loaded.
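To see why the customer may have to retry at all, here is a toy Python model of DNS round robin across the four example addresses. Everything in it is an assumption for illustration: the `RoundRobinZone` class is made up, it pretends every client uses only the first address in each answer, and it ignores DNS caching entirely, which real resolvers and browsers do not.

```python
from collections import Counter, deque

# The four example server addresses used in this post.
RECORDS = ["192.168.1.2", "192.168.1.3", "192.168.1.4", "192.168.1.5"]


class RoundRobinZone:
    """Toy model of a name server that rotates its answer list per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        """Return the current ordering, then rotate for the next query."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # next query starts one record later
        return answer


def simulate(zone, queries, down=frozenset()):
    """Count which server each client lands on, using the first address only."""
    served = Counter()
    failed = 0
    for _ in range(queries):
        first = zone.resolve()[0]
        if first in down:
            failed += 1  # round robin has no health check, so this client gets an error
        else:
            served[first] += 1
    return served, failed


if __name__ == "__main__":
    served, failed = simulate(RoundRobinZone(RECORDS), 400, down={"192.168.1.3"})
    print(dict(served), "failed:", failed)
```

In this model the traffic splits evenly four ways, and a dead server keeps receiving its quarter of the visitors until someone pulls its record, which is exactly when the reload-or-revisit behavior above kicks in.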

This is a simple way to have your site load balanced and provider independent. It does require that your IT staff update up to 4 different sites when making file changes, and that version control processes are in place to ensure that all of the servers are identical. It does take some work, but it’s not impossible.
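One hedged way to check that the copies really are identical is to hash every file in each document root and compare the results. The Python sketch below assumes you can get at each server’s files as a local directory (for example after pulling a copy down); the function names and the idea of using SHA-256 manifests are my own choices for illustration, not a description of any particular deployment tool.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            with open(full_path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            manifest[os.path.relpath(full_path, root)] = digest
    return manifest


def diff_manifests(reference, other):
    """Return sorted paths that are missing, extra, or changed in `other`."""
    ref_paths, other_paths = set(reference), set(other)
    missing = sorted(ref_paths - other_paths)
    extra = sorted(other_paths - ref_paths)
    changed = sorted(
        path for path in ref_paths & other_paths
        if reference[path] != other[path]
    )
    return missing, extra, changed
```

Build a manifest for the server you trust, build one for each of the others, and any non-empty list from `diff_manifests` tells you which files have drifted.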

Oh, and one other thing: I just did a Google search for “content delivery network”, and Hostway is not the first. It looks like other providers such as Peer1 are also doing this, and Peer1 is actually a network provider.