Grounded in the Cloud

OpenStack Summit Sessions: Operations

Stephen_Spector

As mentioned in a previous post, the OpenStack community is now voting on submitted talks for the upcoming Summit in Paris.  I have broken the submissions down into separate blogs to make it easier to find the available HP Helion submitted talks for your voting consideration.

 

OPERATIONS

Managing Resource Sprawl in OpenStack Cloud

 

Ensuring the best possible end-user experience while keeping costs under control is a big new challenge in cloud environments. From loading landing pages to completing money transactions quickly and easily, delivering a great end-user experience with increasingly dynamic content, fuelled by a variety of data and content sources, is a growing challenge. Are adding resources to running instances or spawning new instances the only solutions, or are we missing some basic questions, such as:

a) Are the allocated compute, storage, and network resources being used optimally?

b) Are the critical business workloads running with the right configuration?

c) How soon will additional resources be needed to support increasing demand?

d) Can rarely used instances be deleted to reduce sprawl and save operational cost and additional capex?

e) How easy would it be to identify and rectify performance hot spots in your OpenStack cloud?

 

Solution:

Identifying and eliminating resource sprawl in an OpenStack cloud can result in large savings for cloud tenants. Making the OpenStack administrator (i.e., the private/hybrid cloud administrator) aware of the need for, and the ways of, optimizing resource capacity in the environment not only helps ensure that critical business workloads run optimally, but also reduces operational cost.

The proposed solution covers several resource management aspects (placement, optimization, forecasting, and alerting) using the following components (a minimal collector sketch follows the list):

a. An analytic DB component capable of storing trend data for virtual machines, tenants, and physical infrastructure such as storage, network, and hypervisors.

b. A collector that gathers utilization and performance trend data from the various virtualization entities using existing metering modules in OpenStack (Ceilometer, StackTach, and so on).

c. Analytic DB scripts containing the algorithms that produce optimization recommendations and forecasts for the various entities, thereby identifying optimal resource usage.

d. A Horizon plug-in (dashboard view) to launch optimization recommendations on a per-tenant basis. For example, a tenant can request a forecast of one of their virtual instances over the coming days to identify its resource usage pattern.

e. An alerting component that monitors resources and generates alerts based on user-defined threshold conditions, helping the cloud administrator decide on quick action for the affected entity.

f. Heat integration to enable easy, on-demand spawning of instances, with Heat template configurations recommended by the optimization algorithm.
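
To make the collector and analysis idea concrete, here is a minimal, hypothetical Python sketch (using python-ceilometerclient) of how components like (b) and (c) might pull cpu_util trend data from Ceilometer and flag rarely used instances as sprawl candidates. The threshold, credentials, and helper name are illustrative assumptions, not part of the actual proposal.

from ceilometerclient import client

# Illustrative values only: average CPU % below which an instance is a sprawl
# candidate, and how far back to aggregate samples.
CPU_THRESHOLD = 5.0
LOOKBACK_SECONDS = 7 * 24 * 3600

cclient = client.get_client(
    '2',
    os_username='admin', os_password='secret',
    os_tenant_name='demo', os_auth_url='http://keystone:5000/v2.0',
)

def sprawl_candidates(instance_ids):
    """Return instances whose average cpu_util stayed below the threshold."""
    candidates = []
    for instance_id in instance_ids:
        stats = cclient.statistics.list(
            meter_name='cpu_util',
            q=[{'field': 'resource_id', 'op': 'eq', 'value': instance_id}],
            period=LOOKBACK_SECONDS,
        )
        if stats and all(s.avg < CPU_THRESHOLD for s in stats):
            candidates.append(instance_id)
    return candidates

A list like this could then feed the Horizon plug-in (d) and the alerting component (e) described above.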

 

Simplified access to Swift's statsd monitoring metrics

As many people know, statsd is the standard reporting mechanism Swift uses to expose lots of performance statistics.  If you enable it and configure your cluster monitoring solution to import those stats, you can see all kinds of nifty graphs.  However, there are what I consider to be a few limitations, and I also hope to show how they can be overcome.

 

First of all, your monitoring infrastructure has to support statsd.  The good news is that many of the more popular ones do, but most of the lesser-used or home-grown ones do not.  Since statsd sends its statistics to a specific UDP port, it can also only talk to one tool at a time, forcing you to choose one and only one.

 

This talk will describe an open source solution that not only allows you to continue sending statsd data to your centralized management solution, but also exposes the statistics locally in a more 'tool-friendly' format, similar to what most tools that rely on statistics in /proc would expect to see.  This ultimately makes it easier for anyone to write their own Swift monitoring tool and to report those stats at a much finer granularity.
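
As a rough illustration of the idea (not the actual tool presented in the talk), a small relay could sit on the statsd UDP port, forward Swift's packets unchanged to the central collector, and keep a locally readable snapshot of the latest values. The ports, file path, and parsing below are assumptions for illustration.

import socket

LISTEN_ADDR = ('0.0.0.0', 8125)                    # where Swift sends statsd data
UPSTREAM_ADDR = ('collector.example.com', 8125)    # the central statsd endpoint
LOCAL_SNAPSHOT = '/var/run/swift-statsd-snapshot'  # /proc-style flat key/value view

listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(LISTEN_ADDR)
upstream = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

latest = {}
while True:
    packet, _ = listener.recvfrom(65535)
    upstream.sendto(packet, UPSTREAM_ADDR)          # pass through untouched
    for line in packet.decode('utf-8', 'ignore').splitlines():
        # statsd lines look like "bucket:value|type"
        name, _, rest = line.partition(':')
        value = rest.split('|', 1)[0]
        if name and value:
            latest[name] = value
    # Rewriting the snapshot on every packet is wasteful; a real tool would batch.
    with open(LOCAL_SNAPSHOT, 'w') as snapshot:
        for name in sorted(latest):
            snapshot.write('%s %s\n' % (name, latest[name]))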

 

I'll give an overview of the types of data statsd reports, describe how that data can be imported into tools that don't use statsd (or perhaps you may want to write your own), and finally show some examples of what this data looks like in a Swift environment under load and how you could use it to see exactly what Swift is doing at any point in time.

 

Monasca Overview: Monitoring as a Service

Monasca is a new cloud-scale, multi-tenant, high-performance, fault-tolerant OpenStack Monitoring as a Service (MONaaS) platform, described at https://wiki.openstack.org/wiki/Monasca, that is currently focused on real-time streaming metrics storage/retrieval, alarming, and notifications. Development of anomaly detection is in progress. StackTach.v3, https://github.com/StackTach, is a stream-based processing system for events that is being integrated with Monasca to provide a unified metrics and events processing system.

Join Roland Hochmuth, HP Software Architect, Rob Basham, IBM Cloud Systems Software Architect, and Sandy Walsh, Rackspace Senior Software Developer, to learn about the work that has been done and the road ahead.

A recent article on the state of affairs in monitoring, "What we learnt talking to 60 companies about monitoring" (http://blog.dataloop.io/2014/01/30/what-we-learnt-talking-to-60-companies-about-monitoring/), sums it up as "Yes, Monitoring still sucks & its going to get a lot worse." Find out how Monasca is positioned to address the problems faced by monitoring solutions, based on the real-world experiences of teams with a solid track record of running and supporting monitoring at cloud scale.

Monasca is built on a RESTful API and integrates with Horizon. Both internal operations monitoring use cases and customer-facing monitoring-as-a-service use cases, similar to AWS CloudWatch, are consolidated into a single solution. Streaming events processing is being integrated based on the StackTach v3 events processing pipeline. We will present early results from our integration of anomaly detection based on the open-source NuPIC Cortical Learning Algorithm (CLA), https://github.com/numenta/nupic, which is also used in GROK, http://numenta.com/grok/.
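
Since Monasca exposes a RESTful API, a quick way to get a feel for it is to post and read back a metric over HTTP. The sketch below is a hedged illustration using the Python requests library; the endpoint, token, metric name, and dimensions are placeholder assumptions, and the API itself is documented on the Monasca wiki linked above.

import time
import requests

MONASCA_API = 'http://monasca.example.com:8070/v2.0'   # placeholder endpoint
HEADERS = {'X-Auth-Token': 'KEYSTONE_TOKEN', 'Content-Type': 'application/json'}

# Post a single metric measurement (Monasca timestamps are in milliseconds).
metric = {
    'name': 'http_response_time',
    'dimensions': {'service': 'web', 'hostname': 'web-01'},
    'timestamp': int(time.time() * 1000),
    'value': 42.0,
}
requests.post(MONASCA_API + '/metrics', json=metric, headers=HEADERS)

# Read measurements back for that metric and host.
resp = requests.get(
    MONASCA_API + '/metrics/measurements',
    params={'name': 'http_response_time',
            'start_time': '2014-10-01T00:00:00Z',
            'dimensions': 'hostname:web-01'},
    headers=HEADERS,
)
print(resp.json())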

 

A Data Center Management Solution leveraging OpenStack  

Currently, there is a lot of software for managing machines in the data center, but none of it seems to provide a solution that fully addresses all data center requirements: management software provided by hardware vendors focuses on the machines they manufacture themselves, while existing third-party software provides only generic management because it has no knowledge of the specific hardware.

 

This session presents a new data center management solution that is customizable, modular, supports plug-ins provided by third parties or hardware vendors, and leverages OpenStack components. First, the solution facilitates easy integration and collaboration with a data center's existing IT systems. Second, everything, including monitoring targets and event policies, is customizable, which makes it far more convenient to use. Third, its plug-in mechanism allows hardware vendors to provide diagnostics or monitoring specific to the hardware they manufacture, enhancing its manageability (and, in turn, their market share), and allows users to handle corner cases in their own data centers without much effort.

 

Next-Gen Organizational Design – Growth hacking with “BusDevOps”  

What is "BusDevOps"? We’ve gathered together folks from DevOps, BD, Sales and Product functions to find out and talk about organizational design, culture, productivity and outcomes. What we’re talking about is a mission critical topic that’s core to the survival of any rtrying to scale – how do we get more stuff done?  And how does the Business work with the Product and Engineering teams to make the most of limited resources when you’re moving at light speed? We will talk about organizational, cultural, productivity-enhancing tips and best practices that work. Please bring questions and help us flush this one out!  
Thesis: Whether we’re talking BD Sales or GTM, all must be closely tied in deeply to the product team and aligned with Engineering if not also directly tied on and aligned with the roadmap. A Sales BD GTM (whatever) resource in the front end of the house that operates independently without enough of a hook / tie back in to eng/product is an under-effective resource. This panel will explore “Why?” and seek out best practices for the Community to build around.

 

Continuous delivery and the challenges for a public cloud  

In this presentation, we will discuss the opportunities and obstacles around continuous delivery in an OpenStack-based public cloud, and will explore the best practices HP has developed to effectively manage the impact of changes and their compatibility across services; the pros, cons, tips, and tricks of various deployment mechanisms; and the differences between image-based (TripleO) and package-based methods. Also covered is managing customer-facing downtime due to deployments: studying the impact and delivering relevant, timely communication to mitigate interruption.

 

CI/CD Pipeline to Deploy and Maintain an OpenStack IaaS Cloud  

In our role administering an OpenStack IaaS cloud, we’ve developed a release train that allows local development and testing of configuration management, testing in virtualized environments, and automated deployment to staging and production, for building and maintaining an IaaS cloud from upstream vendor OpenStack packages. We will discuss the high-level concepts, then review the details of our implementation and the tools we use, and created, to enable this pipeline.

 

We bring software engineering discipline to the administration of the cloud infrastructure, with peer review, source code management, and thorough testing before packaged releases.

 

Similarly, we bring system administration discipline to configuration management code development, with from-scratch deployments to volatile environments and upgrade deployments to stable environments.

Key elements of our approach are the use of upstream vendor packages for OpenStack (Ubuntu); configuration management (SaltStack); unit testing of configuration management (Test Kitchen with Kitchen-salt); Git, Gerrit and Gitshelf for source control management, peer review and packaging from git repos; volatile local development and remote test environments, using Vagrant and salt-cloud; and automated testing, packaging and deployment with the aid of Jenkins.

 

We will outline the CI/CD pipeline:

  1. Provision a multi-node, multi-network OpenStack development environment using Vagrant with VirtualBox (nova-qemu), with virtualised nodes representing a minimal region and a salt-master with a file_roots tree built as per production.
  2. Develop Salt configuration management code within this development environment. All installation and administration is managed with SaltStack, with both incremental upgrade and full bootstrap deployment testing.
  3. Push changes to Gerrit for both peer review and testing in discrete development environment instances, and unit testing with Test Kitchen via Jenkins.
  4. Automatically package a new release in a specific ‘deploy-kit’ repo, using Gitshelf to build release tarball artifacts from a YAML file of code repo SHA1s (a minimal sketch of this step follows the list).
  5. Automatically deploy to test environments in Public Cloud.
  6. Automatically deploy to real hardware in staging and production.
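
Purely to illustrate step 4 (this is not Gitshelf itself), a release could be pinned and packaged from a YAML manifest of repo SHA1s roughly as follows; the manifest layout, repo names, and paths are assumptions.

import os
import subprocess
import yaml

MANIFEST = 'release.yaml'      # e.g. {'salt-states': {'url': '...', 'sha1': 'abc123...'}, ...}
OUTPUT_DIR = '/tmp/deploy-kit'

with open(MANIFEST) as fh:
    repos = yaml.safe_load(fh)

if not os.path.isdir(OUTPUT_DIR):
    os.makedirs(OUTPUT_DIR)

for name, spec in repos.items():
    workdir = os.path.join(OUTPUT_DIR, name)
    # Clone the repo and check out the exact SHA1 recorded for this release.
    subprocess.check_call(['git', 'clone', spec['url'], workdir])
    subprocess.check_call(['git', '-C', workdir, 'checkout', spec['sha1']])
    # Produce a reproducible tarball artifact for the deploy-kit.
    subprocess.check_call([
        'git', '-C', workdir, 'archive',
        '--format=tar.gz',
        '--output=%s-%s.tar.gz' % (os.path.join(OUTPUT_DIR, name), spec['sha1'][:8]),
        spec['sha1'],
    ])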

Workload optimization using Heat Orchestration Template

Most cloud providers offer SLAs for availability and uptime, which is not sufficient for enterprise or scale-out applications. These applications require resources not only to be available but also to perform at an optimal level to meet customer needs. Guaranteed performance and seamless scale-out are SLAs that cloud providers often fail to meet, mainly because of the lack of the right workload placement decisions through which the requested Quality of Service (QoS) can be realized.

 

In this session, we will discuss how to use Heat Orchestration Templates (HOT) to optimize workloads with the right placement decisions so that the desired QoS is achieved while using resources efficiently and cost-effectively.

 

We will talk about optimizing HOT templates for workloads requiring high performance, load balancing, high availability, scale-out, and so on; a small placement-oriented example follows.
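
As a hedged illustration of placement-aware HOT usage (not the templates from the talk), the sketch below builds a template as a Python dictionary that uses a Nova server group with an anti-affinity policy so two workload instances land on different hypervisors. The image, flavor, and resource names are placeholder assumptions.

import yaml

template = {
    'heat_template_version': '2013-05-23',
    'resources': {
        'web_group': {
            'type': 'OS::Nova::ServerGroup',
            'properties': {'name': 'web-anti-affinity',
                           'policies': ['anti-affinity']},
        },
        'web_server_1': {
            'type': 'OS::Nova::Server',
            'properties': {
                'image': 'ubuntu-14.04',
                'flavor': 'm1.medium',
                'scheduler_hints': {'group': {'get_resource': 'web_group'}},
            },
        },
        'web_server_2': {
            'type': 'OS::Nova::Server',
            'properties': {
                'image': 'ubuntu-14.04',
                'flavor': 'm1.medium',
                'scheduler_hints': {'group': {'get_resource': 'web_group'}},
            },
        },
    },
}

# Write the template out; it could then be launched with the Heat CLI,
# for example: heat stack-create -f web_stack.yaml web_stack
with open('web_stack.yaml', 'w') as fh:
    yaml.safe_dump(template, fh, default_flow_style=False)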

 

PDMon: Scaling Icinga to HP’s Public Cloud  

Icinga & Nagios based monitoring solutions continue to be the most prevalent choice in many IT environments.  This is true for HP's Public Cloud, based on OpenStack, as well.  How can Icinga be scaled to monitor thousands of servers and services that occur at scale?  Can Icinga be hardened for HA environments?  Is there a way to give individual teams focused views into a cloud?  How can a NOC monitor their critical subset of service checks?  These issues, and more, are what HP Cloud's monitoring team faced.
This talk will cover the solution to "Icinga at scale" that has evolved in HP's OpenStack Public Cloud over the past few years: PDMon. Starting with the basic architecture of Icinga, a brief overview of integration points will be covered.  Multiple opportunities for extension will be explored based on the points of integration.  The conclusion is the solution to deploying Icinga at cloud scale.

 

Hybrid OpenStack & Non-OpenStack Infrastructure-as-a-Service  

Companies are racing towards the many promises that OpenStack holds for the enterprise and for service providers.  However, one of the most formidable obstacles faced by Infrastructure-as-a-Service (IaaS) providers is transitioning to OpenStack while operating an existing non-OpenStack IaaS offering. HP is no exception, having run its ECS Virtual Private Cloud and Private Cloud offerings for a number of years. This session focuses on the technical challenges overcome to enable OpenStack to run within an existing (legacy) IaaS offering, among them:

a) Extending TripleO to accommodate restricted, multi-factor authentication/authorization to IPMI for power cycling physical servers within secured environments

b) Enabling SAN in a hybrid environment consisting of pre-existing commodity hardware

c) Running OpenStack within existing networking constraints: controlled by the existing VLAN-based provider network, and leveraging existing network hardware (such as routers/switches, VPN, firewall, and load balancer)

d) Co-existence and co-mingling of OpenStack and non-OpenStack enclosures, blades, networking, and other hardware devices

e) No disruption to existing tenants

f) Enabling previously onboarded tenants, with authentication done prior to OpenStack and pre-existing billing/invoicing/purchase order processes

g) Leveraging existing monitoring functionality across both OpenStack and non-OpenStack servers

h) A single API endpoint for OpenStack and non-OpenStack services

Our session will detail how each of the above has been addressed to enable one of the first hybrid production-grade OpenStack and non-OpenStack IaaS offerings targeting large and medium-sized enterprises. The presenters are Dr. Parag Doshi and Chandra Kamalakantha, Office of the CTO, HP Enterprise Services, and Hrushikesh Gangur, HP Cloud.

 

OpenStack – Did You Know?

OpenStack has grown over the last few years as the result of a global collaboration of developers and cloud computing technologists producing a ubiquitous open source cloud computing platform for public and private clouds, with an ever-growing feature set of OpenStack modules addressing server, storage, and networking requirements.

This session, “OpenStack – Did You Know?”, aims to familiarize participants with the feature sets of the various OpenStack modules around server, storage, and networking. It highlights these feature sets in conjunction with specific use cases. It is a pointer to what is available today and aims to be a catalyst for richer feature sets in the OpenStack releases to come.

 

Some of the topics planned to be covered: 

Feature set of Nova compute and the various virt drivers

Comparison of virt drivers (KVM, ESX & Hyper-V) and Ironic

Feature set of Glance / Cinder / Swift

Feature set of Neutron and the various ML2 drivers

Finally, orchestrating everything through Heat

 

Each of the topics mentioned above could be a session on its own; however, this session will highlight the specific features that bring out the best of server, storage, and networking in OpenStack.

 

CMDB for OpenStack public cloud based assets  

Learn the best practices for applying a Configuration Management Database (CMDB) to provide a single pane of glass for data from OpenStack services, networks, operational tools, and physical assets. Discover how troubleshooting is improved by greater visibility into where a troubled node is located, which network switch is impacted, and what is deployed on the node.  The CMDB is also an essential component for impact analysis and for timely, pertinent customer notifications, and it provides the data mining needed to determine which customers (if any) are impacted when a node is in trouble.

 

Using Docker for rapid OpenStack development process  

Any OpenStack project development will usually involve the following frequent steps: 
1. Developer makes code changes
2. Community reviews changes
3. Developer fixes and pushes a new patch set
4. Rinse and repeat steps 1 through 3, then merge or abandon the patch
For steps 2 and 3 above, a readily testable development environment is essential. Most developers and reviewers will set up a virtualenv or pyenv and test there. While this is great for our own patches, when reviewing somebody else’s code, what if we want to quickly run something to validate a discussion point? Or, sometimes, to show that the patch may not address a particular situation (a state the database is in with a particular pipeline setting, for example)?

 

This presentation proposes an idea to address these areas using Docker (a lightweight wrapper on top of Linux containers). In this presentation, you will see how Docker could be used to:

a) Create a base image for each OpenStack project with a particular configuration (database, authentication providers, etc.)

b) Make the image available in a registry (Docker provides the default docker.io public registry, but one could be set up for OpenStack by HP). For example, the Barbican development team at HP uses the docker container helion/barbimaster for a Barbican environment pre-configured with Keystone and MySQL.

c) Configure Jenkins so that, for every OpenStack patch submission, it builds a docker container combining the project's base image for a particular configuration with the changes introduced by that patch

d) Let a reviewer or developer pull the above image and run it locally to debug and comment
This approach would make it truly possible for an OpenStack developer to not worry about dependencies and just quickly run a container from any machine that supports Docker.
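
As a hedged illustration of the reviewer workflow (the image name comes from the abstract; the mount point and test command are assumptions), pulling the pre-configured base image and running the project's tests against a locally checked-out patch might look like this:

import subprocess

BASE_IMAGE = 'helion/barbimaster'            # Barbican pre-configured with Keystone and MySQL
PATCH_CHECKOUT = '/home/reviewer/barbican'   # local tree containing the patch under review

# Fetch the pre-built environment from the registry.
subprocess.check_call(['docker', 'pull', BASE_IMAGE])

# Run the tests inside a throwaway container, with the patched source mounted in.
subprocess.check_call([
    'docker', 'run', '--rm',
    '-v', '%s:/opt/stack/barbican' % PATCH_CHECKOUT,
    BASE_IMAGE,
    'bash', '-c', 'cd /opt/stack/barbican && tox -e py27',
])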

 

Senior Manager, Cloud Online Marketing
About the Author

Stephen_Spector

I manage the HPE Helion social media and website teams promoting the enterprise cloud solutions at HPE for hybrid, public, and private clouds. I was previously at Dell promoting their Cloud solutions and was the open source community manager for OpenStack and Xen.org at Rackspace and Citrix Systems. While at Citrix Systems, I founded the Citrix Developer Network, developed global alliance and licensing programs, and even once added audio to the DOS ICA client with assembler. Follow me at @SpectorID
