Grounded in the Cloud

OpenStack Swift is for files. OpenStack Cinder is for disks.

Stephen_Spector

Guest Author: Rick Evans, Sr Evangelism Technologist, HP Helion 

 

There are two projects that manage data storage in OpenStack: Swift and Cinder. To help remove some of the confusion around them, here is an easy way to remember their differences:

  1. Swift enables you to access files via a URL
  2. Cinder manages virtual disks that get attached to the remote computers that are managed by Nova

 

In many discussions about OpenStack data storage, Swift and Cinder get muddled together, or their differences are given short shrift.

 

I think part of the confusion arises because both Swift and Cinder can be configured to use the same virtual storage technologies on the back-end; for example, storage arrays offered by different vendors.

 

The key difference is how you access the data: data managed by Swift is reached in a very different way from data managed by Cinder.

 

---

 

Objects, not files

 

Swift was one of the two original OpenStack projects (the other was Nova). Swift was designed to let you upload a file into "the cloud" and, more importantly, to guarantee that you can retrieve that file later on. You use a Uniform Resource Locator (URL) to access the files you store in Swift. In other words, you could download your stored file by pasting the correct URL into your browser's location bar.
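To make the "file behind a URL" idea concrete, here is a minimal Python sketch that builds the URL for one stored object. The path layout (`/v1/<account>/<container>/<object>`) follows the Swift object storage API; the host name, account, container, and object name are made up for illustration. Note that pasting such a URL into a browser only works if the container has been made publicly readable; otherwise the request needs an authentication token.

```python
from urllib.parse import quote


def swift_object_url(endpoint: str, account: str, container: str, obj: str) -> str:
    """Build <endpoint>/v1/<account>/<container>/<object> for one object."""
    # Percent-encode each part. Object names may contain "/" (often used as
    # pseudo-folders), so slashes are left alone in the object name only.
    parts = [
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj, safe="/"),
    ]
    return endpoint.rstrip("/") + "/v1/" + "/".join(parts)


# Hypothetical endpoint, account, container, and object name.
url = swift_object_url(
    "https://swift.example.com", "AUTH_demo", "photos", "2014/team photo.jpg"
)
print(url)
```

The same URL is what any web client (a browser, a script, a mobile app) would request to fetch the object.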

 

There is a bit of ambiguity around the vocabulary used by the Swift technology. Instead of calling the thing you store “a file,” the term used by the official OpenStack documentation is “object.” In some circles this is a technically correct term. But I am a programmer, and this is NOT the concept an Object-Oriented Programmer has in mind when they think of "objects." To a programmer, an object is “the encapsulation of state and behavior” … or, more loosely, an instance of a class, or a data structure with Methods and Fields. The kind of object used in the OpenStack literature is more like what database administrators (DBAs) call a “blob,” or “Binary Large Object.”

(“Objects” is also used in this form in some other undisciplined general internet discussions, probably by people with a database background.)

 

“Object,” in this sense, really just refers to a block of binary data. The idea is, when the block of data is retrieved by a client, the client interprets it. It MAY end up as a file in the File-System of an Operating System. Or maybe it ends up as data in RAM used by a client program. In my opinion, “resource” would’ve been a better word, as in Uniform Resource Identifier (URI).

 

The Internet service company Rackspace originally developed the Swift code with the explicit intent of providing a product to compete with Amazon Web Services (AWS) Simple Storage Service (S3). Dropbox is the poster-child for a business founded on S3 technology. Without the massive scale and worldwide availability of S3, Dropbox would not be able to provide its file synchronization service. Each file you drop into the Dropbox folder on your desktop gets copied as an S3 "object" onto AWS servers somewhere in the world.

 

Like all OpenStack software, Swift is designed to work in a datacenter where there are thousands of servers. Swift (like S3) is specifically designed to work with massive amounts of relatively inexpensive storage devices. To guarantee no data is lost, the data is (typically) stored on at least three separate storage devices. That way, if one device fails, there are at least two other devices with the same data, and the system can automatically work to establish a third copy as soon as possible.
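The three-copy idea can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions: real Swift decides placement with a consistent-hashing "ring" and repairs copies with background replicator processes, whereas this toy picks devices at random and repairs on request. The class and device names are invented for the example.

```python
import random

REPLICAS = 3  # typical replica count described above


class ToyCluster:
    """Toy model of replicated object storage (not how Swift is implemented)."""

    def __init__(self, device_names):
        # device name -> {object name: data}
        self.devices = {name: {} for name in device_names}

    def put(self, name, data):
        # Store the object on REPLICAS distinct devices chosen at random.
        for dev in random.sample(sorted(self.devices), REPLICAS):
            self.devices[dev][name] = data

    def holders(self, name):
        # Devices that currently hold a copy of the object.
        return [dev for dev, objs in self.devices.items() if name in objs]

    def fail(self, dev):
        # Simulate losing a device and everything stored on it.
        del self.devices[dev]

    def repair(self, name):
        # Copy the object from a surviving holder onto spare devices
        # until REPLICAS copies exist again (or spares run out).
        holders = self.holders(name)
        if not holders:
            raise KeyError(f"all copies of {name!r} are lost")
        data = self.devices[holders[0]][name]
        spares = [dev for dev in self.devices if dev not in holders]
        for dev in spares[: max(0, REPLICAS - len(holders))]:
            self.devices[dev][name] = data


cluster = ToyCluster(["d1", "d2", "d3", "d4", "d5"])
cluster.put("report.pdf", b"example bytes")
cluster.fail(cluster.holders("report.pdf")[0])   # one copy is lost
print(len(cluster.holders("report.pdf")))        # two copies survive
cluster.repair("report.pdf")
print(len(cluster.holders("report.pdf")))        # back to three copies
```

The point of the design is that a single device failure never removes the last copy, and the window with fewer than three copies is kept short.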

 

---

 

Volumes, not disks

 

Likewise, the terminology used by the Cinder project is technically correct but may not be intuitively obvious to a casual observer. Officially Cinder manages persistent data "volumes," … not virtual disks. But, conceptually, it won't hurt to use the term “disks.”

 

To understand Cinder, it helps to get a little history.

 

Originally, virtual disk services were provided within the Nova Project, i.e., the project that manages the guest computers that get allocated via OpenStack. If you have experience with creating a virtual machine, you will know that (along with other items) you specify how much disk space to give your new virtual machine. You probably also know that when you destroy the virtual machine, the disk storage you gave it is destroyed as well. It can be very inconvenient if the data on the virtual disk is lost when the virtual machine is destroyed. It would be quite handy if the data on a virtual disk could survive the crash of a virtual machine; or be disconnected and reused by other computers. Also, it can be handy to be able to clone virtual disks so the data can be used by other computers.

 

There are a lot of third-party hardware and software solutions that provide interfaces for managing virtual disks.

Keeping track of all the interfaces for all the different vendors within the original Nova Compute project was too complex. To simplify Nova, the code for managing virtual disks was spun out to a new project called Cinder. (At roughly the same time, the code for managing virtual network connections was spun out to the project called Neutron.) Cinder provides one set of commands (a façade front-end) that allows users to allocate and manage these virtual disks without worrying about the details of any specific vendor’s virtual disk storage solution. Cinder is essentially a web application: a web front end to a database that keeps track of all the virtual disks that have been created and which virtual machine each one is connected to.
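The façade idea can be sketched in a few lines of Python. This is an illustration of the pattern only, not Cinder's actual code or API: the driver classes, method names, and identifiers below are all hypothetical, and a plain dictionary stands in for the database.

```python
from abc import ABC, abstractmethod
from itertools import count


class BackendDriver(ABC):
    """What every (hypothetical) vendor driver must provide."""

    @abstractmethod
    def allocate(self, size_gb: int) -> str:
        """Reserve space on the backend and return a vendor-specific handle."""


class VendorADriver(BackendDriver):
    def allocate(self, size_gb: int) -> str:
        return f"vendor-a-lun-{size_gb}gb"


class VendorBDriver(BackendDriver):
    def allocate(self, size_gb: int) -> str:
        return f"vendor-b-vol-{size_gb}gb"


class VolumeService:
    """One set of commands in front of many vendor-specific drivers."""

    def __init__(self, drivers):
        self.drivers = drivers   # backend name -> driver
        self.volumes = {}        # stand-in for the volume database
        self._ids = count(1)

    def create(self, backend: str, size_gb: int) -> str:
        handle = self.drivers[backend].allocate(size_gb)
        volume_id = f"vol-{next(self._ids)}"
        self.volumes[volume_id] = {
            "backend": backend,
            "handle": handle,
            "size_gb": size_gb,
            "attached_to": None,
        }
        return volume_id

    def attach(self, volume_id: str, server_id: str) -> None:
        record = self.volumes[volume_id]
        if record["attached_to"] is not None:
            raise RuntimeError(f"{volume_id} is already attached")
        record["attached_to"] = server_id

    def detach(self, volume_id: str) -> None:
        # The volume record (and its data) outlives the attachment.
        self.volumes[volume_id]["attached_to"] = None


service = VolumeService({"vendor-a": VendorADriver(), "vendor-b": VendorBDriver()})
vol = service.create("vendor-a", 100)
service.attach(vol, "db-server-1")
service.detach(vol)                   # e.g. db-server-1 has died
service.attach(vol, "db-server-2")    # same volume, new server
print(service.volumes[vol]["attached_to"])
```

The caller uses the same `create`, `attach`, and `detach` calls regardless of which vendor's storage sits behind the volume, which is the simplification the spin-out was meant to achieve.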

 

---

 

The data stored on a Cinder volume is only usable after the volume is connected to a computer managed by Nova. The data stored in Swift can be retrieved by a web client, any time, using a unique URL.

 

In general, the data stored on a Cinder volume is accessed very fast by the connected computer. The data stored by Swift will usually take longer to download, but Swift's focus is on never losing data.

 

Swift is great for data backup. Swift is also great for hosting the static assets used by web pages, especially large static assets like pictures and movies. These assets can then be accessed directly by end-user clients, instead of routing these requests back through the application’s web server.

 

Cinder is great for application data, like hosting database data. For example, if a database server dies, the Cinder volume holding the data can easily be reconnected to a new server.

 

Cinder can work with Swift. For example, through the connected computer, you could back up any of the data files from a Cinder volume to Swift for safety. Or, you can use OpenStack API calls to back up an image of the entire Cinder volume to Swift.

 

---

 

I hope this information helps.

 

Swift and Cinder manage storage in very different ways.

 

“Swift is for worldwide-web (www) access.

Cinder is for access by computers managed by Nova.”

 

"Swift is for safety.

Cinder is for speed."

 

"Swift is for files.

Cinder is for disks."

 

Or to be technically correct, according to the official OpenStack documentation:

 

"Swift is for Objects.

Cinder is for Volumes."

 

The distinction between these technologies should be obvious.

Yet, I have not found it stated so simply anywhere else on the internet ... until now. ;-)

 

"Once you see it, you can't un-see it."

Senior Manager, Cloud Online Marketing
About the Author

Stephen_Spector

I manage the HPE Helion social media and website teams promoting the enterprise cloud solutions at HPE for hybrid, public, and private clouds. I was previously at Dell promoting their Cloud solutions and was the open source community manager for OpenStack and Xen.org at Rackspace and Citrix Systems. While at Citrix Systems, I founded the Citrix Developer Network, developed global alliance and licensing programs, and even once added audio to the DOS ICA client with assembler. Follow me at @SpectorID
