Operating System 27 | Introduction to Data Centers and Cloud Computing

Series: Operating System

  1. Introduction to Data Centers

(1) Internet Services

An Internet service is any type of service that is accessible via a web interface. Most commonly, end users send web requests via web browsers and then receive a response. These services are typically composed of three components,

  • a presentation component (static content) that interfaces with the end users
  • a business logic component (dynamic and user-specific content)
  • a database tier component responsible for all the data storage and management
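
As a purely illustrative sketch, a request might flow through the three tiers like this. All of the names and the in-memory "database" are hypothetical; real deployments would use a web server, an application server, and an actual database instead.

```python
DATABASE = {"alice": ["order-1", "order-2"]}  # stand-in for the database tier

def database_tier(user):
    """Data storage and management: look up the user's records."""
    return DATABASE.get(user, [])

def business_logic_tier(user):
    """Dynamic, user-specific content: build a response for this user."""
    orders = database_tier(user)
    return {"user": user, "order_count": len(orders)}

def presentation_tier(user):
    """Presentation: render the result for the end user."""
    result = business_logic_tier(user)
    return f"<p>{result['user']} has {result['order_count']} orders</p>"

# Each incoming request flows through all three tiers.
print(presentation_tier("alice"))
```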

(2) Technology Solutions

For implementing these three components, there are many open-source and proprietary solutions for each of the tiers. For example, Apache can be used to support the presentation tier and a PHP server can be used to support the business logic.

One important point is that for services organized as multiple processes, the communication between those processes is carried out via some form of IPC, such as RPC, or RMI for Java-based architectures. Some deployments also use shared-memory-based optimizations when the different processes run on the same machine. These are examples of the inter-process communication mechanisms that we already discussed in this course, and they are relevant in the context of real-world deployments of Internet services.
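
To make the RPC idea concrete, here is a hedged sketch using Python's standard-library xmlrpc module. The `lookup_price` procedure is a made-up example of business logic, and the "server" runs in a background thread only to keep the example self-contained; in a real service it would be a separate process.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def lookup_price(item):
    # Hypothetical business-logic procedure exposed over RPC.
    return {"book": 12, "pen": 2}.get(item, 0)

# Bind to port 0 so the OS picks a free port for this example.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lookup_price)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the call looks like a local function call, but the
# arguments and result actually travel over HTTP to the server.
proxy = xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.lookup_price("book")
print(result)  # 12
server.shutdown()
```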

(3) Functionally Homogeneous Architecture

For services that need to deal with high or variable request rates, a configuration that involves multiple processes running on potentially multiple nodes becomes necessary. The solution is to scale out the service deployment by launching the same service on multiple machines. Scaling out means that we can add more workers, nodes, and resources as needed.

We will have a front-end dispatching component that routes the incoming requests to the appropriate machine in the Internet service. This is in some sense similar to the boss-worker pattern. If every single node is capable of executing any possible step required to process a request, this is called a functionally homogeneous architecture, because all the nodes are equal.

The benefit of this architecture is that we can make the front-end node quite simple: it can send each request, in a round-robin manner, to the next available node. This design doesn't mean that every single node has to store all of the data, all of the state needed for the Internet service. Instead, the data may be replicated or distributed across the nodes, and each node can simply access all the data.

One downside of this approach is that there is little opportunity to benefit from caching. If the front end is simple and just passes requests round-robin to the other nodes, it cannot exploit the fact that a node may already have cached state from servicing a similar request.
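
A minimal sketch of such a simple front end (node names are invented) shows both the simplicity and the caching problem: identical requests land on different nodes, so per-node caches rarely get hits.

```python
import itertools

# Functionally homogeneous front end: every node can handle every
# request, so the dispatcher just rotates through the nodes.
class RoundRobinFrontEnd:
    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def dispatch(self, request):
        node = next(self._cycle)
        return node, request

front_end = RoundRobinFrontEnd(["node-0", "node-1", "node-2"])
assignments = [front_end.dispatch("same-request")[0] for _ in range(6)]
# Even an identical request keeps moving to a different node.
print(assignments)
```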

(4) Functionally Heterogeneous Architecture

There is also another possibility: nodes can be specialized to execute only some specific step or steps of the request processing, or they can be specialized for certain types of requests. This is called a functionally heterogeneous architecture, in which nodes are specialized for certain functions.

The benefits of this architecture are,

  • every single node in the system is more specialized for the tasks that it performs
  • it can take advantage of caching and locality

However, there are also some obvious trade-offs,

  • the front end will now need to be more complex
  • the overall management of the system becomes more complex. When nodes fail or request rates increase, we may need to deploy more machines; some nodes can become hotspots, so we have to configure the machines to balance the load
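
One way such a more complex front end could work is to route by request type, and within a type by a stable hash of the request key, so the same data keeps hitting the same specialized node and its cache stays warm. This is only a sketch; the node pools and request types are invented.

```python
import hashlib

# Hypothetical pools of specialized nodes in a functionally
# heterogeneous deployment.
NODES = {
    "image": ["img-node-0", "img-node-1"],
    "video": ["vid-node-0", "vid-node-1"],
}

def route(request_type, key):
    pool = NODES[request_type]
    # Stable hash: the same key always maps to the same node,
    # preserving cache locality.
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return pool[digest % len(pool)]

# Repeated requests for the same item hit the same specialized node.
print(route("image", "cat.jpg") == route("image", "cat.jpg"))  # True
```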

  2. Introduction to Cloud Computing

(1) Cloud Computing Example: Animoto

Animoto was a poster child of cloud computing in its early years, around 2008 and 2009.

Before we talk about Animoto's success, we have to introduce AWS. In the mid-2000s, Amazon was already a dominant online retailer servicing large volumes of online sales transactions. The vast majority of these transactions took place during the US holiday shopping season, between Thanksgiving and Christmas, and to deal with this peak load Amazon provisioned hardware resources to make sure it had a sufficient number of servers for that particular load. What that means is that for the rest of the year a lot of these resources were idle.

What Amazon ended up doing in 2006 was to open up those same types of APIs to the rest of the world, which allowed third-party workloads to run on Amazon's servers, for a fee. This was the birth of Amazon Web Services (AWS) and Amazon's Elastic Compute Cloud (EC2).

One of the companies that appeared around the same time as Amazon's EC2 cloud was Animoto. They decided to focus their resources on developing the mechanisms that make better videos instead of deploying servers. So instead of buying and running their own equipment, they rented compute infrastructure that was part of Amazon's compute cloud.

Then in April 2008, Animoto became available on the Facebook platform, and what happened afterward was the definition of going viral. Within 3 days Animoto signed up 50,000 new users, and their server count grew from 400 on Tuesday to 3,400 by Friday of the same week. They would not have been able to bring in, install, wire, and configure that many machines in such a short time; this was only possible because they used a cloud-based deployment and leveraged the capabilities that cloud computing offers.

(2) Requirements for Traditional Approach

Traditionally, businesses would buy and configure the resources needed for their services. This approach requires answering questions such as:

  • how many resources are needed
  • what the capacity of each resource should be
  • what the business expectations are

Now, if the expectations turn out to be inaccurate and the demand ends up exceeding the provisioned capacity, the business ends up in a situation in which requests have to be dropped and there is a lost opportunity, as would have been the case for Animoto.

(3) Requirements for Cloud Computing

Instead, in the ideal case, the following would happen. The capacity, i.e., the available resources, should scale elastically with the demand, and the scaling should be instantaneous: as soon as the demand increases, the capacity should increase too, and as soon as the demand decreases, the capacity should decrease as well. This means that the cost to operate these resources, the cost to support the service, should be proportional to the demand, and therefore to the revenue opportunity. All of this should happen automatically, without the need for some hacking wizardry, and all of these resources should be accessible anytime, from anywhere.
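
The elastic-scaling idea can be sketched as a toy capacity controller. The per-server throughput figure and the 20% headroom are made-up parameters, not anything a real cloud provider specifies; the point is only that capacity, and hence cost, tracks demand.

```python
import math

REQS_PER_SERVER = 100   # assumed capacity of one server (invented figure)
HEADROOM = 1.2          # keep 20% spare capacity (invented policy)

def servers_needed(demand):
    """Scale the number of servers elastically with the request rate."""
    return max(1, math.ceil(demand * HEADROOM / REQS_PER_SERVER))

# Capacity grows and shrinks as demand changes, so operating cost
# stays roughly proportional to the revenue opportunity.
for demand in [50, 500, 5000, 500]:
    print(demand, "->", servers_needed(demand), "servers")
```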

However, one potential drawback here is that you wouldn’t necessarily own these resources that magically appear on demand. But that may be something that you’re willing to compromise on provided that you really do achieve these kinds of benefits.

(4) Cloud Computing Overview

So finally, cloud computing provides the following things,

  • a pool of shared resources: the cloud provider rents out physical machines as well as virtual machines, and may also rent out software and other services
  • APIs for access and configuration: the resources need to be accessed and manipulated remotely over the Internet; this typically includes web-based APIs, libraries that wrap those APIs, or command-line interfaces
  • billing and accounting services: there can be different pricing models, often based on discrete quantities (e.g. tiny, medium, large, …)
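
A billing model based on discrete tiers could look like the following sketch. The tier names echo the "tiny, medium, large" example above, but the hourly prices are entirely invented.

```python
# Hypothetical hourly prices for discrete instance tiers (invented numbers).
TIERS = {"tiny": 0.01, "medium": 0.05, "large": 0.20}

def bill(usage):
    """Total charge for a list of (tier, hours) records."""
    return sum(TIERS[tier] * hours for tier, hours in usage)

# 24 hours of a tiny instance plus 2 hours of a large one:
# 0.01 * 24 + 0.20 * 2
print(round(bill([("tiny", 24), ("large", 2)]), 2))  # 0.64
```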

(5) Reasons for Cloud Computing

There are two reasons why cloud computing works. The first is the law of large numbers: an individual customer's needs can vary widely, but the average need across a large number of customers is roughly constant. The second is economies of scale: the unit cost of providing resources drops at the margin.
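
The law-of-large-numbers argument can be illustrated with a small simulation (the uniform demand distribution and the customer counts are arbitrary choices): one customer's day-to-day demand swings wildly, while the average demand across many customers barely moves.

```python
import random
import statistics

random.seed(42)  # fixed seed so the simulation is repeatable

def demand():
    # One customer's highly variable daily demand (arbitrary distribution).
    return random.randint(0, 100)

# 30 days of demand for a single customer...
one_customer = [demand() for _ in range(30)]
# ...versus 30 days of average demand across 10,000 customers.
many_customers = [statistics.mean(demand() for _ in range(10_000))
                  for _ in range(30)]

# The aggregate's day-to-day variation is far smaller.
print(round(statistics.stdev(one_customer), 1))
print(round(statistics.stdev(many_customers), 1))
```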

(6) Cloud Deployment Models

There can be several models when we consider cloud deployment,

  • public model: third-party customers can access the cloud infrastructure
  • private model: the company leverages the cloud technology internally
  • hybrid model: some companies choose this, e.g. to deal with failover and testing
  • community model: a number of third parties in the same community share access to the cloud

(7) Cloud Service Models

What's more, based on the services a cloud provides, we can also distinguish three types of clouds:

  • Infrastructure as a Service (IaaS) model: for example, Amazon EC2
  • Platform as a Service (PaaS) model: for example, Google App Engine
  • Software as a Service (SaaS) model: for example, Gmail

(8) Requirements for the Cloud

Finally, there are also some requirements of cloud computing,

  • it must provide fungible resources: resources can be easily repurposed to support different customers
  • it must elastically and dynamically adjust the resources: we have to maintain flexibility as demand changes
  • it must be manageable at very large scales: we may have to support thousands of nodes
  • it should be able to deal with failures
  • it should deal with multi-tenancy: because resources are shared across multiple tenants, we have to guarantee performance and isolation
  • it should address security

(9) Cloud-Enabling Technologies

Given the requirements listed above for what a cloud system has to incorporate, several different technologies come into play,

  • Virtualization: because we want to hide the hardware details
  • Resource provisioning: because we need to schedule resources, using frameworks such as Mesos, YARN, etc.
  • Big Data: because we need to scale processing, with Hadoop, Spark, etc., and to scale storage, with distributed file systems, NoSQL stores, etc.
  • Software-defined everything: we also need software-defined support for networking, storage, data centers, etc.
  • Monitoring: we also have to implement efficient monitoring techniques, such as real-time log processing with Flume, CloudWatch, Log Insight, etc.