Sunday 19 January 2014

Unique Deduplication

And then there comes the unique one. This approach qualifies under all three modes of deduplication, and it comes courtesy of Abhraya’s unique agentless backup methodology. Let us understand what is so unique here.
Abhraya picks up data from the source production system and brings it to the Abhraya client system. Deduplication happens on the Abhraya client, which ensures that no deduplication process consumes processor or memory on the source system. Compression and encryption also run here, keeping the production system completely free of that load. From there, the deduplicated data goes to the Abhraya server for final storage.
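As a rough sketch of what such client-side processing can look like (this is my own illustration in Python, not Abhraya’s actual code; encryption is noted but omitted for brevity):

```python
import hashlib
import zlib

def dedupe_and_compress(data, block_size=4096):
    """Client-side processing sketch: split the stream into fixed-size
    blocks, keep only the first copy of each block, then compress what
    remains. A real client would also encrypt before sending onward."""
    seen = {}            # block hash -> index in the unique-block store
    unique_blocks = []   # payload that would actually travel to the server
    recipe = []          # per-block indices needed to rebuild the stream
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in seen:
            seen[digest] = len(unique_blocks)
            unique_blocks.append(block)
        recipe.append(seen[digest])
    payload = zlib.compress(b"".join(unique_blocks))
    return payload, recipe

# Repeated content deduplicates well: four blocks shrink to two unique ones.
data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
payload, recipe = dedupe_and_compress(data)
print(len(recipe), "blocks reduced to", len(set(recipe)), "unique blocks")
```

The point of the sketch is simply that the hashing, matching, and compression cycles are all spent on the client machine, not on the production server the data came from.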
The reason I classify this under all three modes is that it deduplicates data before the data reaches the final backup destination, without needing any extra resources or disks to stage the full data the way a target-based approach does. On the source production system, meanwhile, no deduplication process runs at all: data simply goes out and gets deduplicated on another system.
And finally, it performs inline deduplication as well, since the target-side deduplication happens before the data is written to disk. Unlike a typical target-based deduplication, which first writes the data and then reads it back to delete duplicate blocks, Abhraya deduplicates before writing, and so offers the benefits of all three types to its users.
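The “before writing” part is the key difference. A minimal sketch of such an inline write path (my own illustration, with hypothetical names, assuming a simple hash index in front of the disk):

```python
import hashlib

class InlineDedupeStore:
    """Sketch of an inline-deduplicating target: duplicates are detected
    *before* the write, so no later read-back, delete, or garbage-collection
    pass is ever needed."""
    def __init__(self):
        self.index = {}   # block hash -> location of the stored block
        self.disk = []    # stand-in for the backup device's disk

    def write(self, block):
        digest = hashlib.sha256(block).digest()
        if digest in self.index:          # duplicate: never touches disk
            return self.index[digest]
        self.index[digest] = len(self.disk)
        self.disk.append(block)           # only unique data gets written
        return self.index[digest]

store = InlineDedupeStore()
for block in [b"alpha", b"beta", b"alpha", b"alpha"]:
    store.write(block)
print(len(store.disk))   # only the 2 unique blocks ever hit the disk
```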

Tuesday 7 January 2014

Target Based Deduplication


Target based deduplication was one of the initial ways of performing deduplication. It was mostly seen on VTLs when VTLs were launched as a technology, and over time it has fallen out of common use.
All it did was carry the data to the target backup device – mainly a VTL – and store it there. The device would then run a deduplication process to match blocks and delete the duplicate data. On a later schedule, the system would run a garbage-collection type process to finally remove the deleted content and free up space on the VTL.
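That write-everything-first, clean-up-later flow can be sketched as follows (a toy illustration in Python with names of my own choosing, not any vendor’s actual implementation):

```python
import hashlib

class PostProcessStore:
    """Sketch of target-based (post-process) deduplication: all data is
    written first, a later pass marks duplicate blocks as deleted, and a
    separate garbage-collection pass finally reclaims the space."""
    def __init__(self):
        self.disk = []          # every incoming block lands here first

    def write(self, block):
        self.disk.append(block)             # no dedup on the write path

    def dedupe_pass(self):
        seen = set()
        for i, block in enumerate(self.disk):
            digest = hashlib.sha256(block).digest()
            if digest in seen:
                self.disk[i] = None         # mark duplicate as deleted
            else:
                seen.add(digest)

    def garbage_collect(self):
        # The scheduled cleanup that finally frees space on the device.
        self.disk = [b for b in self.disk if b is not None]

store = PostProcessStore()
for block in [b"alpha", b"beta", b"alpha", b"alpha"]:
    store.write(block)
print(len(store.disk))      # 4 blocks sit on disk before any cleanup
store.dedupe_pass()
store.garbage_collect()
print(len(store.disk))      # 2 unique blocks remain after garbage collection
```

Notice that the disk briefly holds the full, undeduplicated data set, which is exactly why the target has to be oversized.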
When it was launched, it was the only process available, and a great one at that. With newer and better technologies now on offer, target based deduplication has lost its charm, since most applications prefer source based deduplication and reduce data before it travels over the network.
So target based deduplication has the advantage of using minimal source processing and memory cycles by performing deduplication at the target; the downside is having to slightly oversize the target to ensure enough space to accommodate the full data before the deletion process begins.

Saturday 4 January 2014

Inline Deduplication

Inline deduplication sounds like a very impressive term. You are made to believe that the magic happens on the wire, but at the end of the day there are many caveats.
We had a very recent experience where someone sized a solution on assumptions and commitments of reducing the backup and recovery windows tremendously, but the windows did not come down nearly as much as expected.
Inline deduplication starts working on the source server itself and is followed by some more processing on the network, with the leftovers handled in the media device’s memory. So if you have plenty of spare resources on the production servers, go for it. If you are low on resources, you should first upgrade the production servers; you are also expected to have at least two 10 Gig ports dedicated to the deduplication device and a well-sized media server.
Just replacing the backup device and expecting wonders will not help much. It will reduce the backup and recovery windows somewhat, especially if you move from tape to disk while getting deduplication. However, you should be extremely careful about your upgrade plans and the expectations you set for yourself.