AI Insights

Discover your biggest DevOps problems — by causing them


By: Floyd Piedad


The best way for organizations to identify their DevOps problems is to cause those problems deliberately. However, this "fire drill" approach to problem discovery should be implemented only when accompanied by a strategic focus on the "three Ps" of the DevOps ecosystem: product, platform and process.


The three Ps


Product refers to the software application that is being developed or deployed by the developers. The obvious quality metrics here relate to its performance, stability and features.


Platform pertains to the technical environment and resources on which the software is run: the servers, networking hardware, operating system and other software components. This is largely the responsibility of the operations team. To meet the application's requirements, the operations team must ensure the platform's ability to support the demands of users for reliability, scalability and flexibility.


Process covers the collaboration between the developers and the operations group, to allow for the execution of effective and efficient agile software development, deployment and maintenance activities. Today's super agile teams expect these processes to be easily adaptable, executable and as automated as possible.

Across all these areas are agreements on the technology, standards, tools and methodologies used by the organization.


Any problem or opportunity for improvement will either originate in, or have an impact on, all three Ps. You shouldn't make an evaluation without considering these three areas, and you can't expect to have a lasting impact on DevOps if you don't make adjustments across all three.

Break something


The "fire drill" approach to discovering DevOps problems should therefore involve intentionally breaking something in each of these three areas:


  • Product: Secretly push out badly broken code to discover how early the testing process catches the failures and how resilient the platform is to the wayward code that's continuously writing to the database.
  • Platform: Reformat half your application servers to see if the application product will continue to function properly, and benchmark how long it takes your operations process to restore the system to its pre-crisis state.
  • Process: Tell one of your operations personnel to take a week off. Then observe how much slower your product team is to deploy new builds, and determine whether your platform can quickly bring new server resources online.


Note that each drill above names all three Ps, illustrating how a fire drill should include an impact analysis across the entire DevOps ecosystem.
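As a rough illustration, the measurement side of the platform drill could be scripted along the following lines. This is only a sketch under stated assumptions: the function names, the server names and the `is_healthy` callback are hypothetical and not part of any particular tool, and the script does not take servers down itself. It only picks the drill targets and times how long the operations process needs to bring them back.

```python
import random
import time
from typing import Callable, List, Optional, Sequence


def pick_drill_targets(servers: Sequence[str],
                       fraction: float = 0.5,
                       seed: Optional[int] = None) -> List[str]:
    """Randomly choose a fraction of the servers to take out of service."""
    rng = random.Random(seed)
    count = int(len(servers) * fraction)
    return sorted(rng.sample(list(servers), count))


def time_to_recovery(is_healthy: Callable[[str], bool],
                     targets: Sequence[str],
                     poll_seconds: float = 5.0,
                     timeout_seconds: float = 3600.0,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[float]:
    """Poll until every target is healthy again.

    Returns the elapsed seconds, or None if the timeout is reached first.
    `is_healthy` is a hypothetical callback supplied by the caller, for
    example a wrapper around an existing monitoring check.
    """
    start = clock()
    while True:
        if all(is_healthy(server) for server in targets):
            return clock() - start
        if clock() - start >= timeout_seconds:
            return None
        sleep(poll_seconds)
```

Start the timer at the moment the servers are taken offline, so that the returned figure covers the full restoration effort. Recording that figure for each periodic drill gives a simple benchmark to compare over time.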


Some might say these suggestions are too risky, but often the only way to assess how people will survive a building fire is to conduct an unannounced fire drill. Having a periodic DevOps fire drill may be the easiest way for you to discover the biggest areas for improvement and to bring to the surface issues that may lead to critical problems in the future.


For more information, check out Best Practices: How IT megatrends impact infrastructure transformation to see how you can best use digital technologies to increase agility and boost innovation.





About the author

Floyd Piedad


Floyd Piedad has been in the IT industry for more than 25 years and currently leads the R&D and special projects division of Stratpoint Technologies, a software services company focused on cloud and mobile solutions. A graduate of Electrical Engineering, he is the author of High Availability: Design, Techniques, and Processes.


