Around the Storage Block
StorageExperts

How to fight ransomware with intelligent data storage technology

As you have likely seen in the headlines over the past year, 2021 is the year of ransomware. This cyber threat is truly an international problem, and it has rocketed to the #1 position as the most dreaded and dangerous type of cyber-attack. While the USA was the number one target in 2019, the EU overtook it as the number one target in 2020.

This blog post shares highlights of the recently updated whitepaper on ‘Protecting Your Windows SMB File Infrastructure from Ransomware’. The paper includes step-by-step procedures, as well as scripts, to enhance malware detection. This post does not cover the use of intrusion detection systems, anti-virus software, or firewalls, or best practices for preventing lateral movement by a hacker. These are ALL things you should already be doing, but, as we all know, even if you do all of these things, an attacker may still get in.

The craft of gaining access

An attacker can try a practically unlimited number of ways to breach security, succeed in only 0.001% of those attempts, and still win. The defender must be perfect against every method of attack, or the environment is left vulnerable to cyber crime.

Once a system or environment is infected, the attacker may operate for months or years before being discovered, while the encryption stage could take weeks or months to fully run before being triggered. In fact, even when the encryption is complete, a server may not trigger an alarm until external backups have expired, or until other servers in the same infrastructure have also completed; then, they may trigger as a group.

The procedure outlined here is to help you detect a possible infection in progress, when other methods of defense have failed. Once you detect active ransomware, you can then determine, using the data, when the software started operating, and how to ultimately restore your data.

Step one: Detection

There are a number of anomalies in a data center that can be monitored, and a few combinations that can be good indicators of active ransomware. To better understand this, realize that ransomware will read files from the file system, encrypt them, and write them back by overwriting the original. This activity will alter the normal disk activity in the following measurable ways:

  1. The read/write ratio of the file system diverges from the historical norm, and approaches 50%/50%.
  2. The workload will increase slightly, but also become less burst-y as the valleys fill in with extra workload data, and differ from the historic norm.
  3. The extra workload from item 2 and the extra write operations from item 1 will NOT result in an equivalent decrease in file system free space. This indicates the workload is likely overwrite-heavy. This can be detected by monitoring the file system free space, as well as the size of the snapshot growth.
  4. The compression ratio and/or the deduplication ratio will decrease in accordance with the extra write workload from items 1 and 2.

These are not exact measurements, and some judgement on these results will need to occur. This assessment requires that you are able to gather historic data on your system to know what the true norm is.
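To make the first two indicators more concrete, here is a minimal Python sketch. It assumes you can export per-interval read and write operation counts from your monitoring tool. The function names, the seven-sample baseline, and the thresholds are hypothetical choices for illustration, not anything taken from HPE InfoSight or the whitepaper.

```python
from statistics import mean, pstdev


def read_ratio(reads: int, writes: int) -> float:
    """Fraction of I/O operations that are reads (0.0 to 1.0)."""
    total = reads + writes
    return reads / total if total else 0.0


def ratio_drift_alerts(samples, baseline_len=7, tolerance=3.0, min_sigma=0.01):
    """Return indices of samples whose read ratio departs from the baseline.

    samples: list of (reads, writes) tuples, one per interval, oldest first.
    The first baseline_len samples stand in for the historical norm
    (an assumption; a real baseline should cover a much longer period).
    """
    ratios = [read_ratio(r, w) for r, w in samples]
    base = ratios[:baseline_len]
    mu = mean(base)
    # A floor on sigma keeps a perfectly flat baseline from flagging noise.
    sigma = max(pstdev(base), min_sigma)
    return [i for i in range(baseline_len, len(ratios))
            if abs(ratios[i] - mu) > tolerance * sigma]


def burstiness(iops):
    """Coefficient of variation of the load; lower values mean a flatter load."""
    avg = mean(iops)
    return pstdev(iops) / avg if avg else 0.0
```

A read ratio that drifts toward 0.5 while `burstiness` drops compared with its historical value would match items 1 and 2 above. As with the indicators themselves, treat any alert as a prompt to investigate, not as proof.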

This is where HPE InfoSight really shines, and can bring you the detailed information you need to make management decisions.

HPE InfoSight is a cloud-based predictive analytics platform. Among the many things it can do, it produces highly customizable graphs that can collect and show all of the information just outlined. Let’s take a look at an example of the kind of historic data, or environmental snapshot, that HPE InfoSight can provide via its executive dashboard.

The figure below shows an example of a typical workload that transforms from an 80% read to a 50% read ratio. From this graph alone, you can detect that between October 20th and 27th the workload radically changes.

Figure 1: Typical workload that transforms from an 80% to 50% read ratio

Additionally, you can see that this workload has natural, cyclical decreases in load, which also seem to disappear in the same timeframe.

The next question we should ask is about the added workload: can you see an anomalous ramp-up that does not equate to any extra customer (front-end) load?

And, of course, the final question to ask: does this increased workload cause a drastic rise in the size of the snapshots? Snapshots don’t grow when we add files, only when we overwrite files. With HPE InfoSight you can map a drastic increase in snapshot space required around the same timeframe.

Figure 2: Anomalous ramp-up that may not equate to any extra customer front-end load

We also see from this that the volume usage is going up steadily as well. If that growth came from new data, we would expect to see less free space on the filesystem itself. In the case illustrated above, the free space on the file system has remained roughly unchanged over this timeframe. This indicates that the growth in volume usage is due to lost savings from compression or deduplication, since encrypted data neither compresses nor deduplicates well.
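The comparison between snapshot growth and file system free space can be expressed as a simple check. The sketch below is only an illustration: the function name, the GiB inputs, and both thresholds are assumptions of mine, and you would need to take the four measurements from your own monitoring data.

```python
def overwrite_heavy(free_start_gib, free_end_gib,
                    snap_start_gib, snap_end_gib,
                    ratio_threshold=5.0, min_snap_growth_gib=1.0):
    """Flag a window where snapshot space grows far faster than free space falls.

    New files reduce file system free space; overwrites mostly grow the
    snapshots instead. Both thresholds are hypothetical defaults.
    """
    free_drop = max(free_start_gib - free_end_gib, 0.0)
    snap_growth = max(snap_end_gib - snap_start_gib, 0.0)
    if snap_growth < min_snap_growth_gib:
        return False  # too little snapshot growth to say anything
    if free_drop == 0.0:
        return True   # snapshots grew while free space did not move
    return snap_growth / free_drop >= ratio_threshold


# Snapshots grew by 100 GiB while free space fell by only 2 GiB.
print(overwrite_heavy(500.0, 498.0, 20.0, 120.0))
```

A legitimate bulk update or a database reindex can also be overwrite-heavy, so this check is best read together with the read/write ratio and the data reduction ratios.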

Step two: Validation

None of this is proof that you have a ransomware infection actively encrypting a volume, but it’s enough to raise concerns and prompt you to start investigating more deeply. At this point, a prudent storage administrator would consider ensuring that older snapshots are not inadvertently deleted. One might also do a test-restore of the most current snapshot to an alternate test server. The test server could be used to look at the most recent snapshot of the file system to determine if those files are indeed encrypted. Since the test server isn’t infected with the ransomware software, it will have no way to decode them, and the file contents will appear to be corrupted.
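One way to check files on the test server without opening them one by one is to measure their byte entropy, since encrypted content looks close to random. This is a technique I am adding for illustration, not a step from the whitepaper. The function names, the 64 KiB sample size, and the 7.5 bits-per-byte threshold are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_encrypted(path, sample_bytes=65536, threshold=7.5):
    """Heuristic: True if the first sample_bytes of the file look random.

    Compressed formats (ZIP, JPEG, video, and similar) also score high,
    so a hit means "inspect this file", not "this file is encrypted".
    """
    with open(path, "rb") as handle:
        data = handle.read(sample_bytes)
    return shannon_entropy(data) >= threshold
```

Plain text and most office documents from before an attack should score well below the threshold, so a folder of text files that suddenly all score near 8.0 is a strong hint.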

Step three: Isolate and cleanse the infrastructure

The next step is outside of the scope of this document. It involves removing the ransomware and backdoors in all of the infected servers. These actions assume that credentials that have been used on the infected server are compromised. You would then expand the cleanse process to other machines that have access, as well. See the NIST documents online for how to properly respond to this type of attack.

While this is happening, remove access to all of the volumes hosted on your HPE Nimble Storage or HPE Alletra devices that those servers can see.

Step four: Identify the point of recovery (POR)

The next step is to determine roughly when the encryption started, and to restore the most recent snapshot taken before that date to a test server. The start date can commonly be determined from the massive growth in snapshot size.

Figure 3: Identify the point of recovery (POR)

In the above example, I would want to restore the snapshot highlighted in red, as it likely is prior to the encryption attack. Since a few files on that snapshot might be encrypted, to be doubly sure, I might also restore to that test server the snapshot from the day prior.
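If you export the snapshot list with sizes, the same judgement can be roughed out in a few lines of Python. This is a sketch under stated assumptions: the function name and the 3x growth factor are hypothetical, and I am assuming each reported size reflects the data changed in the interval ending at that snapshot. If your array attributes changes to the earlier snapshot instead, this sketch simply errs toward an older candidate.

```python
from statistics import median


def find_recovery_candidate(snapshots, growth_factor=3.0):
    """Return the name of the last snapshot before the first size jump.

    snapshots: list of (name, size_gib) tuples, oldest first.
    A jump is a snapshot larger than growth_factor times the median size
    of all earlier snapshots. Returns None if no jump is found.
    """
    for i in range(1, len(snapshots)):
        baseline = median(size for _, size in snapshots[:i])
        if baseline > 0 and snapshots[i][1] > growth_factor * baseline:
            return snapshots[i - 1][0]
    return None


daily = [("oct-18", 2.0), ("oct-19", 2.5), ("oct-20", 2.2),
         ("oct-21", 30.0), ("oct-22", 41.0)]
print(find_recovery_candidate(daily))  # oct-20
```

The result is only a starting point for the comparison described next; the candidate still has to be restored to a test server and checked.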

At this point, using WinDiff or another file system comparison utility, I could highlight the differences between these file systems, and interrogate those files to determine if any encrypted data is present. If there is encrypted data, I would repeat the process with the next older snapshot. Once you have found a snapshot that contains no overwritten files from the previous snapshot, it can be marked as the snapshot to be restored.
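If you would rather script the comparison than use WinDiff, Python's standard `filecmp` module can do a byte-for-byte comparison of two restored trees. The sketch below assumes both snapshots are mounted as ordinary directories on the test server; the function name is hypothetical.

```python
import filecmp
import os


def changed_files(older_root, newer_root):
    """List relative paths of files present in both trees whose bytes differ.

    Files that could not be compared are reported too, so they get a manual
    look. Files that exist in only one tree are not reported by this sketch.
    """
    changed = []
    for dirpath, _dirs, files in os.walk(older_root):
        rel_dir = os.path.relpath(dirpath, older_root)
        other_dir = os.path.join(newer_root, rel_dir)
        if not os.path.isdir(other_dir):
            continue
        common = [name for name in files
                  if os.path.isfile(os.path.join(other_dir, name))]
        # shallow=False forces a content comparison instead of trusting
        # size and modification time alone.
        _match, mismatch, errors = filecmp.cmpfiles(
            dirpath, other_dir, common, shallow=False)
        changed.extend(os.path.normpath(os.path.join(rel_dir, name))
                       for name in mismatch + errors)
    return sorted(changed)
```

Some ransomware renames files or appends a new extension, which would show up as missing files, not as changed ones, so a full review should also compare the file listings of the two trees.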

Step five: The moral hazard

Every ransomware payment fuels the criminal ransomware industry. This blog and related white paper are attempts to help you avoid ever paying a ransom to retrieve your own data.  

To survive a ransomware attack, your organization must plan ahead for all contingencies, and invest not only in enhanced detection, but also in a well thought-out response plan with clear actions detailed. These steps should include the ability to back up and restore compromised data.

Additionally, you should assume that some credentials in your environment have been leaked, and institute least privilege access, which includes limiting access to the HPE Alletra or HPE Nimble Storage management planes to prevent a bad actor from removing snapshots.

I hope this blog article was helpful, and I would love to hear your thoughts and comments. Please leave your questions or comments below.

BTW - I will be a guest speaker at the SNIA Storage Developer Virtual Conference, September 28th and 29th, 2021. To register or for more information on the event, please check out their website!

Meet HPE Storage Blogger, Chris Lionetti. Chris is a veteran of the storage industry who has been building complex systems and SANs for over 25 years. He has long been actively involved with the Storage Network Industry Association (SNIA), and is currently the Board Vice-chair. He is also a reference architect on the HPE Storage team.

Chris participates in many technical working groups, and holds 9 patents on topics related to data centers, networking, and storage.  Follow Chris on Twitter!

Storage Experts
Hewlett Packard Enterprise

twitter.com/HPE_Storage
linkedin.com/showcase/hpestorage/
hpe.com/storage

 

About the Author

StorageExperts

Our team of Hewlett Packard Enterprise storage experts helps you dive deep into relevant data storage and data protection topics.