This basic pattern focuses on avoiding unnecessary network latency.
Communication between nodes is faster when the nodes are close together. Distance adds network latency. In the cloud, “close together” means in the same data center (sometimes even closer, such as on the same rack).
There are good reasons for nodes to be in different data centers, but this article focuses on ensuring that nodes that should be in the same data center actually are. Accidentally deploying across multiple data centers can result in terrible application performance and unnecessarily inflated costs due to data transfer charges.
This applies to nodes running application code, such as compute nodes, and nodes implementing cloud storage and database services. It also encompasses related decisions, such as where log files should be stored.
The Colocation Pattern effectively deals with the following challenges:
One node makes frequent use of another node, such as a compute node accessing a database
Application deployment is basic, with no need for more than a single data center
Application deployment is complex, involving multiple data centers, but nodes within each data center make frequent use of other nodes, which can be colocated in the same data center
In general, resources that are heavily reliant on each other should be colocated.
A multitier application generally has a web or application server tier that accesses a database tier. It is often desirable to minimize network latency across these tiers by colocating them in the same data center. This helps maximize performance between these tiers and can avoid the costs of cloud provider data transmission.
This pattern is typically used in combination with the Valet Key and CDN Patterns. Reasons to deviate from this pattern, such as proximity to consumers and overall reliability, will be discussed in another article.
Cost Optimization, Scalability, User Experience
When you think about it, this may seem an obvious pattern, and in many respects it is. Depending on the structure of your company’s hardware infrastructure (whether a private data center or rented space), it may have been very difficult to do anything other than colocate databases and the servers that accessed them.
With public cloud providers, multiple data centers are typically offered across multiple continents, sometimes with more than one data center per continent or region. If you plan to deploy to a single data center, there may be more than one reasonable choice. This flexibility cuts both ways: because it is easy to choose any data center as a deployment target, it is just as easy to deploy a database to one data center while the servers that access it are deployed to another.
The performance penalty of a split deployment can be severe for a data-intensive application.
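To get a feel for the size of that penalty, a rough back-of-the-envelope sketch may help. It assumes the application issues its database queries one after another, so every query pays a full network round trip. The round-trip times and the query count below are hypothetical placeholders, not measurements of any provider; substitute numbers measured in your own environment.

```python
def network_wait_seconds(round_trips: int, rtt_ms: float) -> float:
    """Time spent waiting on the network for sequential round trips."""
    if round_trips < 0 or rtt_ms < 0:
        raise ValueError("round_trips and rtt_ms must be non-negative")
    return round_trips * rtt_ms / 1000.0


# Hypothetical figures for illustration only; measure your own.
SAME_DC_RTT_MS = 1.0      # assumed round trip inside one data center
CROSS_DC_RTT_MS = 80.0    # assumed round trip between two data centers
QUERIES_PER_PAGE = 50     # assumed sequential queries per page request

same_dc = network_wait_seconds(QUERIES_PER_PAGE, SAME_DC_RTT_MS)
cross_dc = network_wait_seconds(QUERIES_PER_PAGE, CROSS_DC_RTT_MS)

print(f"colocated:        {same_dc:.2f} s of network wait per page")
print(f"split deployment: {cross_dc:.2f} s of network wait per page")
```

The point of the sketch is the multiplication: the more round trips a request makes, the more a per-trip latency increase is amplified.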
Enforcing colocation is really not a technological problem, but rather a process issue. However, it can be mitigated with automation.
You will want to take proactive steps to avoid accidentally splitting a deployment across data centers. Automating deployments into the cloud is a good practice, as it limits human error from repetitive tasks. If your application spans multiple data centers but each site operates essentially independently, add checks to ensure that data access is not accidentally spanning data centers.
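One way such a check might look is sketched below, as a step an automated deployment could run before releasing. It is not tied to any cloud provider's API: the node names, the data center labels, and the function `find_split_dependencies` are all hypothetical, and the sketch assumes you can describe your deployment as a mapping of nodes to data centers plus a list of "uses" relationships.

```python
from typing import Dict, List, Tuple


def find_split_dependencies(
    locations: Dict[str, str],
    dependencies: List[Tuple[str, str]],
) -> List[Tuple[str, str, str, str]]:
    """Return every dependency whose two nodes sit in different data centers.

    locations maps a node name to its data center.
    dependencies lists (caller, callee) pairs, e.g. a web server and its database.
    """
    violations = []
    for caller, callee in dependencies:
        for node in (caller, callee):
            if node not in locations:
                raise ValueError(f"no data center recorded for node {node!r}")
        caller_dc, callee_dc = locations[caller], locations[callee]
        if caller_dc != callee_dc:
            violations.append((caller, caller_dc, callee, callee_dc))
    return violations


# Hypothetical deployment description, for illustration only.
locations = {
    "web-1": "dc-east",
    "orders-db": "dc-east",
    "reports-worker": "dc-west",  # deployed to the wrong data center
}
dependencies = [("web-1", "orders-db"), ("reports-worker", "orders-db")]

for caller, caller_dc, callee, callee_dc in find_split_dependencies(
    locations, dependencies
):
    print(f"{caller} ({caller_dc}) depends on {callee} ({callee_dc})")
```

A deployment script could fail the release whenever the returned list is non-empty, or allow specific pairs that are intentionally split.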
Outside of automation, your cloud platform may have specific features that make colocation mistakes less likely.
Operations will generally be less expensive if your databases and the compute resources that access them are in the same data center. Splitting them across data centers can incur data transfer charges.
As of this writing, the Amazon Web Services and Windows Azure platforms do not charge for network traffic to enter a data center, but do charge for network traffic leaving a data center (even if it is to another of their data centers). There are no traffic charges when data stays within a single data center.
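The charging model just described can be turned into a rough estimate. In a split deployment, traffic in each direction leaves one of the two data centers, so under that model both directions would be billed as outbound. The sketch below assumes a single flat per-gigabyte rate; the rate and the traffic volumes are hypothetical placeholders, not any provider's published prices, and real pricing is often tiered.

```python
def split_transfer_cost(
    gb_app_to_db: float,
    gb_db_to_app: float,
    egress_rate_per_gb: float,
) -> float:
    """Estimated transfer charge when app and database are in different data centers.

    Each direction leaves one data center, so both count as outbound traffic.
    """
    if min(gb_app_to_db, gb_db_to_app, egress_rate_per_gb) < 0:
        raise ValueError("volumes and rate must be non-negative")
    return (gb_app_to_db + gb_db_to_app) * egress_rate_per_gb


def colocated_transfer_cost() -> float:
    """Traffic that stays within a single data center is not charged."""
    return 0.0


# Hypothetical monthly figures for illustration only.
GB_APP_TO_DB = 20.0     # queries and writes sent to the database
GB_DB_TO_APP = 500.0    # result sets returned to the application
RATE_PER_GB = 0.10      # placeholder rate, not a real price

split = split_transfer_cost(GB_APP_TO_DB, GB_DB_TO_APP, RATE_PER_GB)
print(f"split deployment: {split:.2f} per month in transfer charges")
print(f"colocated:        {colocated_transfer_cost():.2f} per month")
```

Check your provider's current price list before relying on any estimate of this kind.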
Cloud Architecture Patterns
By: Bill Wilder