Continuous Availability for Your Business Networks
This silo-based methodology almost always results in the use of several different availability and DR technologies from an array of vendors, with noticeably different designs and capabilities and few or no integration points. For example, an online web ordering system might deploy network load balancing for front-end web servers, some form of data mirroring or clustering for back-end database servers, and a third-party availability solution for middleware. Point-of-sale solutions, CRM tools, and even BlackBerry messaging environments follow a similar prescription, employing a different technology for every layer in the application stack.
Running business continuity for your company's application ecosystem this way has many drawbacks. First, consider the cost implications of using different technologies within a continuous availability or DR architecture. The most apparent cost is the capital outlay for the hardware and software alone. By electing (or being forced) to use different solutions from different vendors, you lose the ability to leverage economies of scale. Most hardware and software companies offer volume-based pricing incentives for larger orders, but this option is lost when products from several vendors are used.
Moreover, if each solution relies on different underlying hardware, disk, or OS technologies, the total cost of ownership rises further. Of course, cost extends beyond hardware and software to include implementation, training, and ongoing management. Imagine deploying even a comparatively basic three-tier application architecture. In the online web ordering example described earlier, one would face the somewhat daunting task of learning not only the intricacies of SQL clustering, but also the deployment and management of network load balancing and any middleware components needed. Each time a new version of any of these solutions is released, there's the added cost of relearning the technology.
Next, consider the complexity of integrating disparate availability technologies from multiple vendors. Are they guaranteed to interoperate with one another? Is such interoperability built in (doubtful), or will some level of customization and manual scripting be required (very probably) so that every tier can communicate with the others? If custom scripting is necessary, what happens when even a single part of the availability architecture changes? Will additional custom consulting work be needed to rework and retest existing scripts? Last but not least, if and when something breaks down, whose responsibility is it to establish the root cause? With different solutions from different vendors, one must be wary of the inevitable finger-pointing when things go wrong.
Naturally, one option is simply not to integrate the solutions. After all, so long as every part is doing its job, isn't it safe to conclude that the whole system is operational? Not really. Consider, for example, the deployment of a multi-tier, distributed architecture across physical sites for DR purposes. If the entire primary production site fails, will the servers start up in the correct order and fashion at the remote site, or will some degree of intervention be needed from an administrator?
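To make the start-up ordering question concrete, here is a minimal sketch of how a recovery script might derive a safe start order from declared dependencies. The tier names and the `DEPENDS_ON` map are assumptions chosen to match the online ordering example; they are not taken from any particular product.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tiers for the online ordering example. Each key lists the
# tiers that must already be running before that tier can start.
DEPENDS_ON = {
    "database": set(),
    "middleware": {"database"},
    "web": {"middleware"},
}


def startup_order(depends_on):
    """Return tier names ordered so that dependencies start first."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for tier in startup_order(DEPENDS_ON):
        print(f"start {tier}")
```

Even a script this small has to be revisited whenever a tier is added or a dependency changes, which is exactly the maintenance burden described above.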
Now consider the more probable type of failure, in which just one component fails rather than an entire site. Unless you've set up a combined High Availability + Disaster Recovery solution, chances are that the single failed component will resume operations at the DR site. But in most cases, the latency between sites will be too high for a multi-tier application to function correctly. In this situation, it's best to fail all of the components over to the remote site as a single, cohesive unit. But once again, how does this coordination take place? Either we are back to scripting the failover in some fashion, or some hands-on administrator involvement is required. When that happens, recovery times certainly increase; when recovery times increase, so does the bottom-line cost of the outage to the business.
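The "fail over as a single unit" rule can be sketched as a small decision function. This is an illustration under stated assumptions only: the `Component` type, the site names, and `plan_failover` are hypothetical, and a real solution would also have to handle replication state, health checks, and start order.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str            # hypothetical tier name, e.g. "web"
    site: str            # site where the component currently runs
    healthy: bool = True


def plan_failover(components, dr_site):
    """Plan moves so the whole stack ends up at dr_site if any part fails.

    Returns a list of (name, from_site, to_site) tuples. The list is empty
    when every component is healthy, so nothing moves.
    """
    if all(c.healthy for c in components):
        return []
    # One failure moves every tier, so the tiers never talk across the
    # high-latency link between sites.
    return [(c.name, c.site, dr_site) for c in components if c.site != dr_site]


if __name__ == "__main__":
    stack = [
        Component("web", "primary"),
        Component("middleware", "primary"),
        Component("database", "primary", healthy=False),
    ]
    for name, src, dst in plan_failover(stack, "dr"):
        print(f"move {name}: {src} -> {dst}")
```

The design choice here is to treat the application, not the individual server, as the unit of failover: a single unhealthy tier triggers a move of every tier.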