Monday 30 December 2013

Source Based Deduplication


Source based deduplication selects the unique content at the source itself when a backup starts. It does consume some processing power and memory on the source system, so size it well.
Source based deduplication is also very effective at minimizing network bandwidth during backups. The backup application splits the data into blocks at the source, stores their hashes locally, and sends only the unique blocks over the network. This works well only if it is sized properly: the catalog that some applications build can grow large enough to degrade the performance of the source system, which is often a production system.
Source based deduplication also gives good results for file system backups. A traditional approach takes long on file systems with millions of small files, sometimes days for a full backup cycle. Source based deduplication picks up only the changed content of the changed files, reducing the amount of data travelling on the network irrespective of the backup level set.
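As a rough illustration, and not the workflow of any particular product, a source side agent could work roughly like this minimal sketch: chunk the data into blocks, keep a local hash catalog, and send only blocks it has not seen before.

    import hashlib

    BLOCK_SIZE = 4 * 1024 * 1024      # assumed 4 MB fixed-size blocks
    catalog = set()                   # hashes already sent from this source

    def backup_file(path, send_block):
        """Read a file in blocks; transmit only blocks not seen before."""
        with open(path, "rb") as f:
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                if digest not in catalog:      # unique block: send hash plus data
                    send_block(digest, block)
                    catalog.add(digest)
                else:                          # duplicate: send only the reference
                    send_block(digest, None)

The catalog in this sketch is the part that consumes memory on the source, which is exactly why sizing matters.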
Global deduplication on the target further reduces the amount of data stored.

Tuesday 24 December 2013

Deduplication on Storage


When deduplication was launched for storage, it seemed a difficult technology to handle. Like any other technology, deduplication needs processing power and memory, so deduplicating everything written to primary storage is not very effective. The base premise it started with was reducing the disk storage investment by reducing the content to be stored. In reality, it did not help much: while it reduces the number of disk spindles required to store data, fewer spindles mean fewer IOPS, so performance is compromised.
SSD based storage requires a large investment, and deduplication can help reduce the number of disks needed. Since each SSD is capable of a large number of IOPS, there is no compromise on IOPS while deduplicating. While working on one such customer requirement recently, I realized that it does not end there. Scale-out storage provides more processing power and memory with every upgrade, which keeps performance consistent. Deduplication can also happen inline, i.e. you write only what is unique, unlike post-process (at rest) deduplication, where you write everything, run a deduplication process to mark the duplicate content, and then run a cleanup process to remove it.
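To make the contrast concrete, here is a minimal, purely illustrative sketch of an inline write path (fixed content blocks and an in-memory index are assumptions; real arrays do this in controller firmware or memory):

    import hashlib

    class InlineDedupStore:
        """Inline deduplication: duplicates are caught before anything reaches disk."""
        def __init__(self):
            self.index = {}     # block hash -> position of the single stored copy
            self.blocks = []    # stands in for the physical disk

        def write(self, block):
            h = hashlib.sha256(block).hexdigest()
            if h not in self.index:          # only unique data is ever written
                self.index[h] = len(self.blocks)
                self.blocks.append(block)
            return self.index[h]             # caller keeps a reference, not a copy

    # Post-process (at rest) deduplication would instead write every block first,
    # then run a scan to mark duplicates and a cleanup pass to reclaim the space.
    store = InlineDedupStore()
    refs = [store.write(b) for b in (b"A", b"B", b"A", b"A")]   # only 2 blocks stored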
Choose deduplication on primary storage with caution; it may not be as attractive as it looks.

Saturday 21 December 2013

Store less, backup lesser – The Art of Deduplication


Deduplication is not a new concept, though the term is relatively new. It originated from the backup perspective, though it has now gained a good foothold in many IT components.
For over a decade I have known and used backup technologies that would back up only unique emails and files, and subsequently back up only delta changes. Being the only ones doing that, they used their own terminology and never coined the word deduplication. Now everyone does it and calls it file level deduplication. This approach served desktop and laptop backups well. The technology has since progressed, the current form being block level deduplication, which recognises unique blocks of data and ensures that each block gets backed up only once, thereby reducing the amount of data travelling and getting backed up.
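As a hedged, oversimplified illustration of the difference (the block size and sample data are arbitrary): file level deduplication hashes whole files, so a one byte change makes the entire file "new", while block level deduplication re-sends only the blocks whose hashes changed.

    import hashlib

    def file_level_signature(data):
        # One hash for the whole file: any change means the whole file is re-sent.
        return hashlib.sha256(data).hexdigest()

    def block_level_signatures(data, block_size=4096):
        # One hash per block: only blocks whose hash changed need to be backed up.
        return [hashlib.sha256(data[i:i + block_size]).hexdigest()
                for i in range(0, len(data), block_size)]

    old = b"x" * 10_000
    new = b"x" * 5_000 + b"y" + b"x" * 4_999    # a single byte changed

    changed = [i for i, (a, b) in
               enumerate(zip(block_level_signatures(old), block_level_signatures(new)))
               if a != b]
    # file_level_signature(old) != file_level_signature(new)  -> whole file re-sent
    # len(changed) == 1                                       -> only one block re-sent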
Like any other process, deduplication needs resources to do what it is meant for. Applications offer it in various forms, such as source based, target based and inline deduplication. Each has its own working mechanism and its own pros and cons. In the following series of blogs we will discuss these methods of deduplication along with their pros and cons.

Saturday 14 December 2013

Protecting Small Databases


A lot of SMEs are concerned about protecting their databases, typically SQL databases. The interesting challenge is that these are really small databases holding extremely critical data.

While a standalone tape based backup solution would traditionally be considered ideal, it is not that simple. A 50-60 GB database normally compresses down to 10-15 GB, which does not justify that kind of investment in tapes, each of which can hold terabytes. The tape ends up holding too little data and therefore costs more per GB stored.
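A rough back-of-the-envelope illustration (the cartridge capacity and price are assumptions, not quotes): a multi-terabyte tape holding only a 15 GB backup leaves most of its capacity unused, so the effective cost per GB actually stored is far above the tape's rated cost per GB.

    tape_capacity_gb = 2500      # assumed roughly 2.5 TB native cartridge
    tape_price = 40.0            # assumed price per cartridge, in any currency
    backup_size_gb = 15          # a 50-60 GB database compressed to 10-15 GB

    rated_cost_per_gb = tape_price / tape_capacity_gb       # about 0.016 per GB
    effective_cost_per_gb = tape_price / backup_size_gb     # about 2.67 per GB
    # With only 15 GB on the tape, the effective cost per GB stored is
    # more than 150 times the rated cost per GB of the cartridge.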

With the changing times, better options are available depending on what you want to achieve:

A simple backup can maintain multiple versions and copies and give you both old and new recoveries when required. While this offers the flexibility of versioning, the recovery process will take some time depending on the resources available.

Alternatively, if you are looking for quick access to your data even after a disaster, replication, especially mirroring, is the best option. To keep it economical, you can use the database's native mirroring capabilities rather than investing in third party tools.

Ace Data Abhraya offers both: cloud based backup for option 1 and cloud based infrastructure for option 2, with a committed recovery SLA. In fact, for SQL databases you can opt for recovery on cloud infrastructure, with the option of recovering only the database or the complete server on the cloud, while enjoying the flexibility of versioning and compliance and paying only for the backup size you actually use.

Monday 9 December 2013

Processing Big Data


While many think it is difficult to manage data beyond a certain volume, technologists do not agree. Newer technologies keep coming in to handle and process large volumes of data. Consider Google search: while you are typing your query, it starts autocompleting it and showing you results as well.
All this is done using clusters of servers at the back end. Incoming data is processed by these servers, so you get a large pool of processors and memory to absorb it. Beyond that, the storage used behind them to read and write this data offers a wide choice.
If it is file based data, you can go for a scale-out NAS. If you have to handle block level data, scale-out SAN options are available. For really heavy databases, pure flash based storage is now available.
Flash based storage can achieve up to a few million IOPS, especially when inline deduplication is done in memory. Scale-out architecture ensures that adding more capacity automatically gives you more memory to handle the new IOs, and deduplication helps control the storage requirement, which could otherwise blow the budget.

Monday 2 December 2013

How to back up Big Data?


The industry has been struggling with backing up its ever-growing data for a long time now. Traditional tape based backup solutions now seem suitable only for small environments. Though tape libraries have been growing in individual capacities and number of slots, better disk based options are pushing them towards being secondary rather than primary backup media.
Big data needs better care simply for being big, and is often more meaningful than databases of invoice or product records. New technologies such as deduplication and better compression algorithms such as LZO and zlib are making it more cost effective to back it up by bringing down its size.
What is also important is the cost of retaining this large volume of data and the varied sources of this unstructured data.
Ace Data's Abhraya cloud based backup offering resolves this challenge for its customers. Its flexible backup policies allow organizations to keep the latest data close to them locally and send the rest to the cloud. Being cloud based, they pay for what they back up rather than investing on large growth assumptions. Furthermore, as backups grow old they can be automatically archived to low cost disks, reducing the cost of long term retention while keeping the data available for a long time.
The solution can back up smartphones, mobile laptops and large file server volumes, apart from large servers and databases, ensuring that all sources of data can be backed up through a single solution.

Wednesday 13 November 2013

How to Store and Manage BIG Data?


While I mentioned in my previous blog that no size of data is a problem, I am often asked how to store and manage such huge volumes. This is a typical concern of an enterprise faced with increasing data size.
Storage vendors have watched this problem grow and have scaled up, or rather scaled out, to handle this massive growth. Both NAS and SAN vendors have gone beyond the traditional method of upgrading storage by adding extra shelves and disks. The challenge with the traditional method is that you upgrade capacity with shelves and disks but gain little additional processing power, which results in reduced performance.
The Scale-out method helps upgrade the storage by adding new nodes which include processing power, memory and capacity, thereby keeping the overall performance consistent with practically no dip in user experience. This is true for both SAN and NAS based storages. These storages can be expanded to PBs on a single storage, or even a single file system, by simply plugging in a new node. It is viable commercially also, as the cost per GB goes down as you keep adding more nodes.
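A simplified numeric sketch of the difference (all figures are illustrative assumptions, not vendor data): adding shelves grows capacity behind the same controllers, while adding nodes grows IOPS and capacity together.

    # Illustrative assumptions: each controller or node drives 50,000 IOPS,
    # and each shelf or node adds 100 TB of capacity.
    NODE_IOPS, NODE_TB = 50_000, 100

    def scale_up(shelves):
        # Capacity grows, but everything still funnels through one controller pair.
        return {"tb": shelves * NODE_TB,
                "iops_per_tb": NODE_IOPS / (shelves * NODE_TB)}

    def scale_out(nodes):
        # Every node brings its own CPU, memory and IOPS along with capacity.
        return {"tb": nodes * NODE_TB,
                "iops_per_tb": (nodes * NODE_IOPS) / (nodes * NODE_TB)}

    print(scale_up(8))    # IOPS per TB drops as shelves are added
    print(scale_out(8))   # IOPS per TB stays constant as nodes are added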
So don't worry about handling your Big Data; storage devices are now available to store it far more efficiently.

Tuesday 5 November 2013

How much Data is good for business?


When you talk of sources of data generation, the list is endless. Any business stream can show a long list of how, and how much, data is being generated. Businesses often get scared by so much data because they think handling it is a mammoth task. It is indeed a mammoth task, as it needs good investment and infrastructure, but if utilized properly the benefits are much higher. The way businesses are competing, it will soon become inevitable to handle it carefully.
The more data you have, the more opportunity you get to see how your products, services and customers behave. There are many examples of businesses analysing their data patterns to offer discounts or value added services that give their customers a delightful experience. The new databases handling this big data use massively parallel processing technologies, and new applications for unstructured data ensure that even with petabytes of data you can still analyse it and produce results in near real time.
Let us enjoy this new revolution in the way technology and businesses are being shaped, and reap its benefits.

Friday 18 October 2013

Developing an ILM strategy


Information Lifecycle Management, or ILM as it is popularly known, is perhaps the most important data management discipline for any organization today. Every organization needs to think about how it wants to handle its data, right from the time it is created to the time it loses its value.
With the kind of data growth we are witnessing, it is becoming even more important for organizations to understand how frequently they need to access their data and how long they need to retain it.
Compliance regulations are one of the driving factors in defining the overall retention period, while business practices help define the criticality of the data.
It is for this reason that I believe ILM is more of a business function than a pure IT function. In the Indian context, you can correlate this with the VAT authorities. If they have a query on current year data, they call you the same day or the next day. If it is a case a couple of years old, you get 15-20 days to respond to each query, and if it concerns data 5-6 years old, the case sometimes goes on for another year. Even they do not ask for data more than 10 years old.
The only difference is that the business owner used to store his sales files at different locations based on their age, and now stores them on different disks and storage tiers based on the same factor. The driving factor has always been the criticality of, and the compliance requirements for, that data.
By categorizing your data into active and non-active data, and based on how urgently it must be available, you can store it on tiered storage. This lets you keep the most recent and critical data on your fastest and most accessible devices (or cloud), and retire the rest to archival storage, saving both cost and resources.
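As a minimal sketch of how such a policy might be expressed (the age thresholds and tier names are arbitrary assumptions, not a product feature), classification can be driven by the last access time of each file:

    import os, time

    DAY = 86400

    def choose_tier(path):
        """Pick a storage tier based on how recently a file was accessed."""
        age_days = (time.time() - os.path.getatime(path)) / DAY
        if age_days <= 90:
            return "fast-primary"      # active data: fastest, most accessible tier
        elif age_days <= 365 * 3:
            return "nearline"          # less active: cheaper disk or cloud tier
        else:
            return "archive"           # retained mainly for compliance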

Monday 7 October 2013

Quick Go To Market


One of the key advantages of opting for a cloud based solution or application is the speed of deployment. You do not have to deploy much on your own, so the cloud brings minimal procurement and deployment cycles. A typical example is an ERP or mailing application where the application is already set up and ready to use. All you need to do is some basic customization to your needs, and you can start using it. As more people join, new users are created and become operational quickly.
Another example comes from Ace Data's flagship product, Abhraya. All you need to do is sign up with us on the cost and the SLA. When you add a new server or endpoint device, just create its backup set and you are ready. Even for large data sets, the initial manual import does not take more than 8 to 10 working hours, depending on the volume. In a comparative study at one of our customers, who in parallel started the procurement process for a traditional backup solution, we got very interesting results: while it took six weeks to complete the procurement and delivery cycles for licenses and tape media, we were in production in eight hours with a 250 GB MS Exchange database. That is the speed of the cloud.

Thursday 3 October 2013

Cloud Makes Enterprise Class Applications Affordable


Cloud computing has made it easy even for startup organizations to get the benefits of enterprise class products without having to implement huge infrastructure or bother about managing it.
Consider starting a new venture and needing an ERP for a 5 user organization. It would not be an easy decision, and for a long time the organization would run on spreadsheets and Word documents to take care of its sales planning, purchasing, invoicing and so on.
With cloud computing gaining momentum, leading enterprise class ERP software is available on the cloud. This means that even a small organization does not need to spend on the application or the hardware infrastructure behind it. No infrastructure also means no maintenance, and no technical team is required to take care of operations. Add to this the flexibility of using an application customized for your specific needs and paying on a per user basis, so your cost increases only when more users are added.
Apart from ERP, other professional applications such as mailing and backup are available on a similar basis. Startups and SMBs should consider evaluating cloud service providers for their basic needs to enjoy the benefits of enterprise class applications at a low cost.

Wednesday 2 October 2013

Cloud for Better Management

One big benefit that cloud computing brings is hands-off management for the technology user and the internal IT team. When you host applications in-house and build your own infrastructure, you need a lot of expertise to manage it, including a dedicated, well-qualified application and database manager to handle this portfolio.
For a large enterprise this may not be a problem, as it can absorb the staffing requirements. For SMBs, though, this can be a big concern, and it is for this reason that most of them are not able to leverage the benefits of enterprise applications.
Getting the same application on the cloud brings the management component with it. A good service provider will ensure that your applications are well taken care of and are up and running 24x7 with optimal performance. An exceptional service provider will even have a disaster recovery site to ensure smooth operations if its own cloud data center runs into trouble.

Thursday 26 September 2013

Using Cloud for Disaster Recovery

The base premise of disaster recovery is how you keep your business running if your primary business site meets a disaster. In my opinion, disaster recovery can be considered the IT subset of overall Business Continuity Planning, which also covers the non-IT aspects of the business.
When an organization chooses to host applications on the cloud, it automatically gets the first level of disaster recovery. The application, along with all its data, is already in a remote data center, and the loss of your primary business premises does not affect access to applications and data; they can still be accessed from any other computer and internet connection.
You can also opt for a disaster recovery site from your service provider. This adds to the budget but ensures data safety and continuity even if the provider's primary data center is in trouble.
As a low cost disaster recovery option, Ace Data Abhraya is the right choice for organizations for which investment in a full online disaster recovery site is still a long way off. Abhraya offers a unique backup proposition wherein your backed up data can be used remotely in the event of a primary site disaster, while it continues to offer all the benefits of daily backups that are compressed and deduplicated on-premise to minimize the load on the bandwidth.
In the coming days, we will explain how using Ace Data Abhraya in different deployment scenarios helps organizations achieve disaster recovery for their critical data.

Monday 23 September 2013

Cloud Eases out Infrastructure Deployment


Remember the old days when you needed your own computers and servers for whatever you wanted to do. I have seen many programmers unable to practice and enhance their skills simply because they could not get enough infrastructure.
Cloud computing has brought a paradigm shift here. Getting IT infrastructure has become easy thanks to back-end virtualization techniques. Sitting in your living room, you can order a new server online and pay by credit card on a periodic rental basis. You can choose a flexible configuration along with a licensed operating system. Once your task is over, you can discard the server and stop paying. This is not the same as renting a physical server, since you receive nothing physical and do not need a mini data center to host it.
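As a purely hypothetical sketch (the endpoint, fields and token below are invented for illustration and belong to no real provider), ordering such a server usually comes down to one authenticated API call to the provider's self-service portal:

    import json
    from urllib import request

    # Hypothetical provider API: URL, fields and token are illustrative only.
    order = {
        "cpu_cores": 4,
        "memory_gb": 16,
        "disk_gb": 200,
        "os_image": "ubuntu-lts",
        "billing": "monthly",
    }
    req = request.Request(
        "https://api.example-cloud.test/v1/servers",
        data=json.dumps(order).encode(),
        headers={"Authorization": "Bearer <your-api-token>",
                 "Content-Type": "application/json"},
        method="POST",
    )
    # response = request.urlopen(req)   # returns the new server's details
    # Deleting the same resource later stops the rental billing.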
The configuration is flexible and can be enhanced on the fly without waiting for a delivery cycle.
For developers, even development platforms and their licenses can be obtained the same way, making it easy to adopt a platform. This is good not only for SMBs and IT organizations but for students as well.

Friday 20 September 2013

Is Virtualization the driving factor for Cloud Computing?


Virtualization has existed for decades in various forms. In recent times, virtualization of servers and desktops on the x86 platform has significantly changed the way IT is used. Virtualization gives you the flexibility of running multiple virtual servers carved out of the dynamic resources of a few physical servers. Resource allocation and re-allocation is very convenient, most of it happens online, and much of it can be automated based on usage patterns.
For technologists looking for ways to share resources, virtualization came as a blessing. Cloud computing allows different organizations or departments to provision and use virtual servers for their own individual purposes while residing on the same physical hardware. The physical hardware, hosted at a service provider's data center, provides separate access to each virtual server.
Virtualization platforms have in-built auto-provisioning, security features and billing systems which charge based on periodic usage. Many service providers have developed web based self-service portals that allow users to create and use their own servers without any external intervention.
So sit back and create your own server with custom configuration and reap the dual benefits of virtualization and cloud computing.

Wednesday 18 September 2013

Backup or Archive – What suits your need best?


Many organizations equate long term retention of backups with archiving, whereas backup and archiving are two separate things and should not be treated as the same or interchangeable.
Archiving reduces the backup load by moving older data from the production systems to an archival system. This reduces the amount of data on the production systems, enhancing their performance and shrinking their backup window.
While moving the data, the archival application leaves a stub on the production system. For mailing applications, individual emails are moved off the production system, reducing the production mailbox size and load considerably. When the user accesses the data from the production system, he actually accesses the stub, which in turn fetches the data from the archival system. Archival applications offer flexibility in automatically choosing what to archive based on size, attachments, date and time of creation or last access, and so on. The other advantage is long term retention of data without any load on production systems, with the data available in its native format when you need it.
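A minimal file based sketch of the stub mechanism (the helper names and the JSON stub format are assumptions for illustration; real archival products implement this inside the mail store or file system driver):

    import json, shutil
    from pathlib import Path

    ARCHIVE_DIR = Path("/mnt/archive")        # assumed archival target

    def archive_with_stub(path):
        """Move a file to the archive and leave a small stub in its place."""
        target = ARCHIVE_DIR / path.name
        shutil.move(str(path), target)        # data now lives only on the archive tier
        path.write_text(json.dumps({"stub": True, "archived_to": str(target)}))

    def open_via_stub(path):
        """Fetch the real content from the archive when a stub is accessed."""
        meta = json.loads(path.read_text())   # the stub is a tiny JSON pointer
        return Path(meta["archived_to"]).read_bytes()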
Backup, on the other hand, preserves multiple copies of the production data on different media to help in the event of loss of production data. Backup does not move any data off the production systems and does not reduce any load there. Moreover, backups are not stored in the native format; you need to recover them back to the native format from the backup device.

Tuesday 17 September 2013

Does Agentless Backup make sense?


Traditionally, backup applications need agents on each desktop and server to be backed up, to run algorithms such as compression, deduplication and encryption (if applicable). When we position our Abhraya Backup Vault service with its agentless architecture, many people become doubtful. Abhraya Cloud Backup Vault has overcome this and works in its own unique way.
Being agentless brings some key benefits to the production systems:
1. Speed of deployment is enhanced, as you do not need to wait for off-peak hours or risk a reboot while deploying the solution.
2. Memory and processor utilization on the production system is limited, as it is TCP/IP or Unix SSH that delivers the data to the central Backup Vault Client; all other algorithms run on the Backup Vault Client, not the production server (a sketch of this pull model appears at the end of this post).
3. Upgrades are easier, as only the central Backup Vault Client is upgraded, not the production server. The transition is smooth enough not to disturb the working of the production system.
4. Troubleshooting and support are easier, as all logs and errors are generated on the central server only; you do not need to log in to production servers for these.
I believe this architecture makes it more convenient to deploy and manage a backup infrastructure while putting negligible load on the production servers and applications.
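As a rough sketch of the pull model mentioned above (the tar-over-SSH approach, host names and paths are illustrative assumptions, not Abhraya internals): the central client logs in over SSH, streams the data across, and performs the compression, deduplication and encryption on its own hardware.

    import subprocess, gzip

    def pull_over_ssh(host, remote_path, local_archive):
        """Stream a remote directory over SSH; compress on the central client."""
        # No agent on the production server: only sshd serves the data.
        proc = subprocess.Popen(
            ["ssh", host, "tar", "-cf", "-", remote_path],
            stdout=subprocess.PIPE,
        )
        with gzip.open(local_archive, "wb") as out:          # compression happens here,
            for chunk in iter(lambda: proc.stdout.read(1 << 20), b""):
                out.write(chunk)                             # not on the production box
        proc.wait()

    # pull_over_ssh("dbserver01", "/var/lib/mysql", "/vault/dbserver01.tar.gz")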

Saturday 14 September 2013

How Workforce Mobility is Threatening Your Precious Data

Laptop usage long ago overtook desktop usage, and tablets and smartphones have already overtaken laptops. What this means is that your data, from emails, documents and presentations to project plans and financial data, is spread all over the world. Theft, accidental damage or sheer negligence on someone's part can lead to loss of crucial data.
People are forgetful, lazy and unaware of the technical nuances of their devices. Expecting them to keep their devices backed up and secure all the time is not practical. Something better has to be devised to overcome this problem.
Using an agentless architecture, backup and recovery services like Ace Data Abhraya Cloud Backup Vault ensure that, transparently to users of static devices, data is properly backed up based on rules set up by the system administrator. Mobile users only need to install a small client; once it is set up, they can forget about their backup worries. The moment they are connected, the backup starts automatically and keeps the data up to date. In case of a device loss, a virtual device can be set up to get back to business urgently, and the data image can be restored to a dissimilar device for further work.
All this happens within a strict, universally compliant security framework.
To know more, visit our website www.ace-data.com.