
Design to Scale in the Cloud - Part 3

Gouri_CM on ‎06-05-2014 07:33 AM ‎09-30-2015 07:52 AM

By: Joost van der Vlies, HP ES Account Chief Technologist

One of the characteristics that make cloud unique is elasticity, the capability to scale outward and inward, based on demand. This flexibility, combined with pay for use, makes cloud technology very attractive.

But this is easier said than done. Scaling is not only an infrastructure play; it is just as much a play for the application on top of it.

This is the final part of a three-part blog series on cloud scaling, and it focuses on how to scale.


It’s all about the application:

As introduced before, scaling is not only an infrastructure play; it is also a play for the application on top of it. A scaling design always incorporates both the application and its supporting infrastructure. Most existing applications can only scale within boundaries that were defined a long time ago. An application might run in virtual compartments on a physical server where there is still room to add instances, or a new physical server might have to be bought to accommodate more instances when the workload is expected to grow. Vertical scaling may also have been enough to absorb slight growth over the past years, as some applications grow very slowly. In addition, tight coupling between application and database can limit scaling altogether when the application design supports no more than one instance. The fact is that most existing applications are not designed to scale indefinitely.


To be able to do so, the application needs at least the following architectural styles:

1. The primary architectural style for these applications is Service Oriented Architecture. Loose coupling is essential within cloud applications to be able to scale, and to be able to respond to different computing styles.

2. The secondary architectural style is Parallel Computing. Designing and developing application functionality in distinct service types that can run stateless and more or less independently is essential for scaling, flexibility, integration, etc.

This makes it possible to scale a subset of services (e.g. the services that analyze data) instead of all services of a solution (e.g. the presentation services might not need scaling). A solution can then be mapped onto several infrastructure service types (e.g. on premise and, temporarily, cloud services). Using horizontal scaling with these types of services could in theory lead to unlimited capacity. The mechanism for service-to-service communication, such as simple queues, also needs to be decided; such mechanisms are available on several cloud platforms.
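To make the idea of stateless services fed by a simple queue concrete, here is a minimal sketch in Python. It uses an in-process queue and threads as a stand-in for a cloud queue service and separate service instances; the names `analyze` and `run_workers` are made up for this illustration and do not come from any product.

```python
import queue
import threading


def analyze(record: int) -> int:
    """Stateless stand-in for an analysis service: output depends only on input."""
    return record * record


def run_workers(records, worker_count):
    """Process all records with `worker_count` identical workers fed by one queue."""
    tasks = queue.Queue()
    results = queue.Queue()
    for record in records:
        tasks.put(record)

    def worker():
        while True:
            try:
                record = tasks.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            results.put((record, analyze(record)))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All workers have finished, so the results queue can be emptied safely.
    output = {}
    while not results.empty():
        record, value = results.get()
        output[record] = value
    return output
```

Because the workers keep no state of their own, the result is the same whether one worker or ten are started. That property is what allows only the analysis services to be scaled out while the rest of the solution stays as it is.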


So we have an application designed to scale, we know the workload type, and we can get the data necessary to identify the need to scale. How do we move forward?


Tools are needed for this type of data detection. Tools that collect technical data about a server's resource usage and forward it to a Monitoring Component have been available for a long time. Defining thresholds in that component and then sending a scale event to a Scaling Manager is quite new. Collecting and monitoring functional data has been possible in enterprise integration solutions, but it is not standard for many enterprise applications. A Scaling Manager identifies an event (e.g. resource usage too high in a defined period) and then executes the defined scaling action (e.g. create an additional instance). This action needs to be picked up by a Resource Manager that can launch the requested infrastructure resources. Then the application needs to be provisioned. The next maturity level in this process is to predict that a certain workload is building up, instead of waiting for static thresholds to be crossed.
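The chain from Monitoring Component to Scaling Manager to Resource Manager can be sketched as follows. This is only an illustration of the flow described above, under assumed values: the class layout, the event names `usage_high` and `usage_low`, the 80% and 30% thresholds, and the three-sample period are all hypothetical, and the Resource Manager merely counts instances instead of launching real infrastructure.

```python
from collections import deque
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ResourceManager:
    """Stand-in for the component that launches or releases infrastructure."""
    instances: int = 1

    def apply(self, delta: int) -> None:
        # Never go below one running instance.
        self.instances = max(1, self.instances + delta)


@dataclass
class ScalingManager:
    """Maps incoming scale events to predefined scaling actions."""
    resources: ResourceManager

    def handle(self, event: str) -> None:
        if event == "usage_high":
            self.resources.apply(+1)
        elif event == "usage_low":
            self.resources.apply(-1)


@dataclass
class MonitoringComponent:
    """Averages usage samples per period and emits events on static thresholds."""
    scaler: ScalingManager
    high: float = 0.80   # assumed upper threshold
    low: float = 0.30    # assumed lower threshold
    window: int = 3      # assumed number of samples per period
    samples: deque = field(default_factory=deque)

    def record(self, cpu_usage: float) -> None:
        self.samples.append(cpu_usage)
        if len(self.samples) < self.window:
            return
        average = mean(self.samples)
        self.samples.clear()  # start a fresh period after each evaluation
        if average > self.high:
            self.scaler.handle("usage_high")
        elif average < self.low:
            self.scaler.handle("usage_low")
```

In this sketch, three consecutive samples averaging above 80% lead to one extra instance, and three averaging below 30% release one again. A predictive variant would replace the static comparison in `record` with a forecast of where the usage is heading.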


The infrastructure solution can range from providing virtual machines in existing farms on the same deployment node to launching resources in cloud services. All this depends on the overarching design. An essential characteristic of cloud computing is resource pooling, which means that in theory infrastructure resources are always available and can be provisioned much faster than has been done traditionally. Depending on workload type and response needs, infrastructure can be launched anywhere from manually to fully automated, the latter by accessing the programming interfaces (APIs) of the infrastructure layer. The response needs are a major factor in choosing an infrastructure solution: the need to launch in minutes (e.g. public cloud) versus hours (e.g. enterprise managed cloud). Not all solutions need scaling in minutes. Even for certain heavily used e-commerce sites, it can be enough to prepare additional infrastructure capacity in the few days before a campaign starts, and then release the capacity when the campaign ends. When the number of users accessing a site fluctuates unpredictably, horizontal scaling on the spot might be necessary. Decisions also need to be made on whether and how this additional (or released) infrastructure capacity is administered in Asset Management, the CMDB, Billing, and so on.
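For the campaign case, where capacity is prepared a few days in advance rather than launched within minutes, the planning logic can be very simple. The sketch below is an assumption-laden illustration: the function name `planned_instances` and the default values (2 baseline instances, 6 during a campaign, 3 days of lead time) are invented for the example.

```python
from datetime import date, timedelta


def planned_instances(day, campaign_start, campaign_end,
                      baseline=2, campaign=6, lead_days=3):
    """Return the instance count planned for `day`.

    Extra capacity is in place from `lead_days` before the campaign starts
    until the campaign's last day, and is released afterwards.
    """
    prepare_from = campaign_start - timedelta(days=lead_days)
    if prepare_from <= day <= campaign_end:
        return campaign
    return baseline
```

A schedule like this fits an infrastructure solution that responds in hours or days. Unpredictable fluctuations would still call for threshold-driven scaling on the spot.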


An important part of effective scaling is how the application resources are launched. When the application resources are bundled within an operating system image, the launch can be done quickly, especially when the instance is configured dynamically at launch rather than having fixed settings built into the image. In this case, the image needs to be managed well as application versions change. Another approach is to deploy the application on top of the newly received infrastructure resources. Either way, configuring the new application instances within the overall architecture is important, e.g. making sure the load balancing works correctly, data integration is possible, and so on.
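One common way to configure an image-based instance dynamically is to let the application read its settings from the environment when it starts, so the same image can be launched many times. The sketch below assumes this approach; the variable names `APP_INSTANCE_ID`, `APP_DATABASE_URL`, and `APP_LB_URL` are hypothetical and would be set by whatever launches the instance.

```python
import os
from dataclasses import dataclass

REQUIRED_SETTINGS = ("APP_INSTANCE_ID", "APP_DATABASE_URL", "APP_LB_URL")


@dataclass(frozen=True)
class InstanceConfig:
    instance_id: str
    database_url: str
    load_balancer_url: str


def load_config(env=None):
    """Build the instance configuration from launch-time settings.

    `env` defaults to the process environment; passing a dict makes testing easy.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_SETTINGS if key not in env]
    if missing:
        # Fail fast: a half-configured instance should not join the pool.
        raise RuntimeError("missing launch settings: " + ", ".join(missing))
    return InstanceConfig(
        instance_id=env["APP_INSTANCE_ID"],
        database_url=env["APP_DATABASE_URL"],
        load_balancer_url=env["APP_LB_URL"],
    )
```

Keeping the settings outside the image means a new application version requires a new image, but a new environment does not.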


In addition to the above, the following needs to be taken into account:

  • Data integration between existing environments and new environments (via horizontal scaling or bursting)
  • Making sure sessions routed from the existing load balancer to the new environment are handled correctly
  • Security (protecting the data in transit)
  • Asset management (include the new server details)
  • Cleaning (e.g. removing deployed apps, platforms, and data from servers not used anymore).
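Two of these points, session handling and cleaning, interact: an instance should only be cleaned once no sessions depend on it anymore. The sketch below shows one possible way to handle that with sticky sessions and draining. It is a toy, in-memory load balancer written for this illustration, not the behavior of any specific product, and all names in it are hypothetical.

```python
class LoadBalancer:
    """Toy load balancer with sticky sessions and instance draining."""

    def __init__(self, instances):
        self.active = list(instances)
        self.draining = set()
        self.sessions = {}   # session id -> instance name
        self._next = 0       # round-robin counter for new sessions

    def add_instance(self, name):
        self.active.append(name)

    def route(self, session_id):
        # Existing sessions stay on the instance that already serves them.
        if session_id in self.sessions:
            return self.sessions[session_id]
        candidates = [i for i in self.active if i not in self.draining]
        if not candidates:
            raise RuntimeError("no instance available for new sessions")
        target = candidates[self._next % len(candidates)]
        self._next += 1
        self.sessions[session_id] = target
        return target

    def end_session(self, session_id):
        self.sessions.pop(session_id, None)

    def drain(self, name):
        # A draining instance keeps its sessions but receives no new ones.
        self.draining.add(name)

    def remove_if_idle(self, name):
        """Remove a draining instance once no session depends on it."""
        if name in self.draining and name not in self.sessions.values():
            self.active.remove(name)
            self.draining.discard(name)
            return True
        return False
```

The moment `remove_if_idle` returns `True` is the natural point to clean the server and to update asset management with the released capacity.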


Note: The bold text refers to the components necessary to manage the scaling. They relate to the Delivery and Supply layers of the HP Cloud Functional Reference Architecture. What will make the difference is providing a fully integrated Monitoring Component, Scaling Manager, and Resource Manager.


Let me know your view on how to scale, and especially on tools and products that make this happen.


To learn more about the HP Helion portfolio of services, register for HP Discover, Las Vegas, and talk to the experts!

