What Is Apache Spark

Apache Spark is a cluster computing platform designed to be fast and general-purpose. On the speed side, Spark extends the popular MapReduce model to efficiently support more types of computations, including interactive queries and stream processing. Speed is important in processing large datasets, as it means the difference between exploring data interactively and waiting minutes […]

Understanding Service Statelessness

Statelessness concerns whether a service stores variable values internally. If a service is truly stateless, we should be able to call any method or reference any property on the service, and as long as we pass the same parameters, the service should behave in the same way. In other words, no values […]
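
As a rough illustration of the "same parameters, same behavior" test described above, the sketch below contrasts a class that keeps a value internally with one that does not. The class and method names (`StatefulCounter`, `StatelessAdder`, `add`) are hypothetical and chosen only for this example.

```python
class StatefulCounter:
    """Keeps a running total internally, so results depend on call history."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> int:
        self.total += amount
        return self.total


class StatelessAdder:
    """Holds no internal values; the caller passes in everything needed."""

    def add(self, current_total: int, amount: int) -> int:
        return current_total + amount


stateful = StatefulCounter()
first = stateful.add(5)
second = stateful.add(5)      # same parameter, different result

stateless = StatelessAdder()
third = stateless.add(0, 5)
fourth = stateless.add(0, 5)  # same parameters, same result
```

The stateless version pushes responsibility for the running total onto the caller, which is the usual trade-off for predictable behavior.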

The Golden Rule of API Design

API DESIGN IS TOUGH, PARTICULARLY IN THE ENTERPRISE. If you are designing an API that is going to have hundreds or thousands of users, you have to think about how you might change it in the future and whether your changes might break client code. Beyond that, you have to think how users of your […]

Continuously Integrate

THE BUILD AS A “BIG BANG” EVENT in project development is dead. The architect, whether an application or enterprise architect, should promote and encourage the use of continuous integration methods and tools for every project. The term continuous integration (CI) was coined by Martin Fowler in a design pattern. CI refers to a set […]

Pay Down Your Technical Debt

ON ANY PROJECT THAT IS IN PRODUCTION (i.e., it has customers that are using it), there will come a time when a change must be made: either a bug needs fixing, or a new feature must be added. At that point there are two possible choices: you can take the time needed to “do it […]

Colocate Pattern

This basic pattern focuses on avoiding unnecessary network latency. Communication between nodes is faster when the nodes are close together. Distance adds network latency. In the cloud, “close together” means in the same data center (sometimes even closer, such as on the same rack). There are good reasons for nodes to be in different data […]

Busy Signal Pattern

This pattern focuses on how an application should react when a cloud service responds to a programmatic request with a busy signal rather than success. This pattern reflects the perspective of a client, not the service. The client is programmatically making a request of a service, but the service replies with a busy signal. The […]
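
The excerpt stops before saying how the client should react. One common reaction, assumed here rather than taken from the text, is to retry the request after a growing delay. In this minimal sketch the `ServiceBusy` exception, the `call_with_retry` helper, and its parameters are all hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ServiceBusy(Exception):
    """Hypothetical error a client library raises on a busy signal."""


def call_with_retry(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); on a busy signal, wait and try again.

    The delay doubles after each busy response (0.5s, 1s, 2s, ...).
    The last busy signal is re-raised once max_attempts is reached.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except ServiceBusy:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets a test substitute a recording function instead of actually waiting.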

Multitenancy and Commodity Hardware Primer

This primer introduces multitenancy and commodity hardware and explains why they are used by cloud platforms. Cloud platforms are optimized for cost-efficiency. This optimization is partially driven by the high utilization of services running on cost-efficient hardware that manifests as multitenant services running on commodity hardware. The decisions made in building the cloud platform also […]

Queue-Centric Workflow Pattern

This essential pattern for loose coupling focuses on asynchronous delivery of command requests sent from the user interface to a back-end service for processing. This pattern is a subset of the CQRS pattern. The pattern is used to allow interactive users to make updates through the web tier without slowing down the web server. It […]
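
The flow described above, where the web tier hands a command to a queue and a back-end service processes it later, might look roughly like the following. This is an in-process sketch using Python's standard `queue` module as a stand-in for a real cloud queue service; the function names, the command shape, and the `store` dictionary are assumptions made for illustration.

```python
import queue

# Stand-in for a durable cloud queue; a real system would use a queue service.
command_queue = queue.Queue()


def handle_web_request(user_id: str, new_email: str) -> str:
    """Web tier: enqueue the command and return without doing the update."""
    command_queue.put(
        {"type": "update_email", "user_id": user_id, "email": new_email}
    )
    return "accepted"


def process_pending(store: dict) -> int:
    """Back-end service: drain the queue and apply each command to the store."""
    processed = 0
    while True:
        try:
            command = command_queue.get_nowait()
        except queue.Empty:
            return processed
        store[command["user_id"]] = command["email"]
        command_queue.task_done()
        processed += 1
```

The web handler returns as soon as the command is enqueued, so the update itself never slows down the request path.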

Database Sharding Pattern

This advanced pattern focuses on horizontally scaling data through sharding. To shard a database is to start with a single database and then divvy up its data across two or more databases (shards). Each shard has the same database schema as the original database. Most data is distributed such that each row appears in exactly […]
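
To make the idea of dividing rows across shards concrete, here is a minimal sketch that routes each row to a shard by hashing its key. The excerpt does not say how rows are assigned, so hash-based routing, the shard count, and the names `shard_for`, `put`, and `get` are all assumptions; in-memory dictionaries stand in for separate databases with the same schema.

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of shards

# Each dict stands in for one database that shares the same schema.
shards = [dict() for _ in range(SHARD_COUNT)]


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a row key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def put(key: str, row: dict) -> None:
    """Store the row in the single shard that owns its key."""
    shards[shard_for(key)][key] = row


def get(key: str):
    """Look up the row in the shard that owns its key, or return None."""
    return shards[shard_for(key)].get(key)
```

A cryptographic hash is used here only because Python's built-in `hash()` for strings varies between processes; simple modulo routing like this also makes adding shards later expensive, since most keys would move.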

Eventual Consistency Primer

The Eventual Consistency primer introduces eventual consistency and explains some ways to use it. This primer uses the CAP Theorem to highlight the challenges of maintaining data consistency across a distributed system and explains how eventual consistency can be a viable alternative. In an eventually consistent database, simultaneous requests for the same data value can […]
