Datastor
Software-Based Data Deduplication Technology
Data Storage Group, Inc., trading as dataStor™, has developed patent-pending, software-based data deduplication technology that removes redundant data simultaneously across multiple physical and virtual machines.
As redundant data is removed, backup storage requirements are reduced to 1/20th of the original capacity and, because correspondingly less data crosses the network, transfers complete up to 20 times faster. This dramatically streamlines and optimizes the data backup and recovery process. dataStor™'s efficient deduplication technology, using minimal system resources, allows our products to scale from large enterprise environments to single-user netbooks. Presently, we are the only vendor that can offer this level of scalability.
This innovative software application family is called dataStor Shield™. It was designed from the ground up to solve all of the problems associated with legacy tape-based systems, at a substantially lower cost to the customer!
Q. What is data deduplication?
A. Data deduplication is a technology that reduces storage requirements by identifying and removing redundant data.
Q. What data deduplication techniques does dataStor Shield™ use?
A. All dataStor Shield™ products are built around the same enterprise-class, patent-pending Adaptive Content Factoring™ engine. Whichever product best suits your environment, you can be confident it uses the same deduplication technology that currently protects multinational data centre sites.
dataStor Shield™ uses three data deduplication techniques to identify and remove redundant data. First, all data is compressed using advanced data compression. Second, global single-instance storage (G-SIS) removes redundant data at the file level, regardless of file name, path or even server. Third, active files are analysed at a sub-file level to remove redundancy, which helps with problem files such as PSTs, Exchange EDBs and SQL databases.
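To make the first two techniques concrete, here is a minimal Python sketch of compression combined with file-level single-instance storage. It is an illustration only and not the Adaptive Content Factoring™ engine, whose internals are not described here. The class name, the use of SHA-256 as the content fingerprint, zlib as the compressor, and the server and path names are all assumptions made for the example.

```python
import hashlib
import zlib


class SingleInstanceStore:
    """Toy content-addressed store: one compressed copy per unique file content."""

    def __init__(self):
        self.blobs = {}    # content hash -> compressed file bytes
        self.catalog = {}  # (server, path) -> content hash

    def put(self, server, path, data):
        """Record a file; return True only if its content had to be stored."""
        digest = hashlib.sha256(data).hexdigest()  # fingerprint ignores name, path and server
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = zlib.compress(data)  # compress before storing
        self.catalog[(server, path)] = digest
        return is_new

    def get(self, server, path):
        """Return the original bytes for a catalogued file."""
        return zlib.decompress(self.blobs[self.catalog[(server, path)]])


store = SingleInstanceStore()
report = b"quarterly report " * 1000

# The same content under different names, paths and (hypothetical) servers.
first = store.put("server-a", r"C:\docs\report.doc", report)
second = store.put("server-b", r"D:\copy\report-final.doc", report)

print(first, second, len(store.blobs))
```

Because the fingerprint is computed from the content alone, the second copy adds only a catalogue entry, which is the behaviour the answer describes as "regardless of file name, path or even server".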
Q. Does dataStor Shield™ replace my existing backup application?
A. Yes, dataStor Shield™ can protect data without a third-party backup application. This is just one more cost saving offered by the solution.
Q. Is data deduplicated before it touches the network?
A. Yes. Using what is often called "source-based" data deduplication, dataStor Shield™ removes redundant data on the protected server before any data is transferred across LAN or WAN network connections.
Q. Does dataStor Shield™ require proprietary hardware?
A. No, dataStor Shield™ is a software-only solution that runs on Microsoft Windows. The administrator has the freedom to choose the best hardware to fit their needs and budget.
Q. Can dataStor Shield™ protect data stored on non-Windows-based servers (Linux, Solaris, HP-UX, etc.)?
A. Yes, dataStor Shield™ can "post process" data that has been transferred from non-Windows-based servers. For example, a database backup on HP-UX is transferred to the dataStor Shield™ server and then deduplicated and stored efficiently by a local protection plan.
Currently, dataStor Shield™ only supports "source based" data deduplication on Microsoft Windows.
Q. What is the difference between "source based", "post process" and "in-line" data deduplication?
A. One important aspect of data deduplication is WHERE the redundant data is processed and removed. "Source based" products process and remove redundant data on the protected server, before it is transferred across the network. "Post process" and "in-line" products process data in a central location and only store unique data. "Post process" also requires extra disk space to cache the data before redundant data is removed.
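The difference in where redundancy is removed can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not dataStor Shield™'s protocol: `BackupServer`, `source_based_backup` and `post_process_backup` are hypothetical names, whole files stand in for units of data, and SHA-256 is an assumed fingerprint. An in-line product would receive the same bytes as the post-process case but deduplicate them as they arrive, without the staging area.

```python
import hashlib


class BackupServer:
    """Toy central store that counts how many bytes it receives over the 'network'."""

    def __init__(self):
        self.chunks = {}         # fingerprint -> unique data
        self.bytes_received = 0  # bytes that crossed the network

    def missing(self, digests):
        """Tell a client which fingerprints are not stored yet."""
        return [d for d in digests if d not in self.chunks]

    def upload(self, digest, data):
        self.bytes_received += len(data)
        self.chunks[digest] = data


def source_based_backup(server, files):
    # Fingerprint on the protected server, then send only what the store lacks.
    by_digest = {hashlib.sha256(data).hexdigest(): data for data in files}
    for digest in server.missing(list(by_digest)):
        server.upload(digest, by_digest[digest])


def post_process_backup(server, files):
    # Everything crosses the network and is staged before redundancy is removed.
    staged = []
    for data in files:
        server.bytes_received += len(data)
        staged.append(data)  # the extra disk space used as a cache
    for data in staged:
        server.chunks.setdefault(hashlib.sha256(data).hexdigest(), data)


files = [b"A" * 1000, b"A" * 1000, b"B" * 500]  # two identical files plus one unique

src = BackupServer()
source_based_backup(src, files)

post = BackupServer()
post_process_backup(post, files)

print(src.bytes_received, post.bytes_received)
```

Both servers end up holding the same two unique items; only the source-based run avoids sending the duplicate across the network.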
Q. Can iSCSI-connected storage be used by dataStor Shield™ to store deduplicated data?
A. Yes, dataStor Shield™ uses standard NTFS volumes to store deduplicated data. These NTFS volumes can be internal, iSCSI-connected or Fibre Channel-connected.
Q. Can NAS-connected storage be used by dataStor Shield™ to store deduplicated data?
A. Yes, dataStor Shield™ 3.0 and later versions support NAS-connected shares (CIFS/SMB, NFS) to store deduplicated data. These shares do not require NTFS.
Q. Does dataStor Shield™ install agent software on all the protected servers?
A. No, the only thing that dataStor Shield™ puts on the protected server is a scheduled task. This scheduled task remotely executes the deduplication process directly from the dataStor Shield™ server. This configuration simplifies future software upgrades because only the dataStor Shield™ server must be upgraded. Every scheduled task is still managed centrally through the dataStor Shield™ management interface.
Q. What is the overhead (CPU and memory) of the deduplication process running on the protected server?
A. The memory usage of the deduplication process running on the protected server is less than 20 MB. The CPU utilization varies based on the number and speed of the CPU(s). On most modern servers, CPU utilization ranges between 25% and 35% while the plan is running.
Q. Can the dataStor Shield™ server be a virtual machine?
A. Yes. Because dataStor Shield™ fully distributes the data deduplication process across the protected servers, the overhead on the dataStor Shield™ server is much lower. Note, however, that backend processes, such as data expiration and data verification, require more CPU and memory, and will take longer if the dataStor Shield™ server is running in a virtual machine.
Storage scalability should also be considered when dataStor Shield™ is running in a virtual machine. Determine how much storage capacity can be connected to the virtual machine and verify that this meets the needs of your environment.
Q. Can dataStor Shield™ deduplicate and store virtual machine images (VMDK, VHD, XVA)?
A. Our best practice is to run a protection plan within the VM as if it were a physical server. This has several advantages. If the VM is an application server, our Exchange and SQL support will quiesce the system during the protection plan run. G-SIS will store the files more efficiently than VM image processing would. You will also see a shorter backup window, allowing for additional plan runs per day. However, if you need to protect VM image files, dataStor Shield™ can process the large image files themselves. Simply store a copy of the virtual machine images directly on the dataStor Shield™ server and schedule a local protection plan to efficiently store these images. The original images can be overwritten every day with new images while dataStor Shield™ keeps a deduplicated backup history.
Q. Can dataStor Shield™ deduplicate and store Microsoft Exchange storage groups?
A. Yes, dataStor Shield™ supports plans for Exchange 2003 and Exchange 2007, integrating with the Exchange VSS Writer found in Windows 2003 or later to capture a consistent image of Exchange storage groups while they are mounted. After dataStor Shield™ has a consistent image, it uses sub-file data deduplication to remove redundant data found in the large EDB files. Every recovery point is a FULL backup, but the disk space used is far less.
Exchange plans automatically discover storage group file locations, perform integrity checks on all EDB databases, and truncate logs after a successful backup.
Q. Can dataStor Shield™ deduplicate and store Microsoft SQL databases?
A. Yes, dataStor Shield™ supports plans for SQL in both Simple Recovery mode and Full Recovery mode, integrating with the SQL VSS Writer found in Windows 2003 or later to capture a consistent image of SQL databases while they are mounted. After dataStor Shield™ has a consistent image, it uses sub-file data deduplication to remove redundant data found in the large MDF files. Every recovery point is a FULL backup, but the disk space used is far less.
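How every recovery point can be a full backup while using far less disk space can be illustrated with a small Python sketch of sub-file deduplication. It assumes fixed-size 4 KB blocks and SHA-256 fingerprints purely for illustration; the actual chunking method used by dataStor Shield™ is not described here, and `ChunkStore` is a hypothetical name.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed block size, chosen only for this example


class ChunkStore:
    """Toy sub-file store: each recovery point is a full list of block fingerprints."""

    def __init__(self):
        self.chunks = {}           # block fingerprint -> block bytes
        self.recovery_points = []  # one complete "recipe" per backup run

    def backup(self, data):
        """Store a new recovery point; return how many new bytes were written."""
        recipe = []
        new_bytes = 0
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # only unseen blocks consume space
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            recipe.append(digest)
        self.recovery_points.append(recipe)
        return new_bytes

    def restore(self, point):
        """Rebuild the complete file for a recovery point from its recipe."""
        return b"".join(self.chunks[d] for d in self.recovery_points[point])


store = ChunkStore()

# A stand-in for a large database file: ten distinct blocks.
day1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
# The next day, a single block in the middle has changed.
day2 = day1[:CHUNK_SIZE * 3] + b"\xff" * CHUNK_SIZE + day1[CHUNK_SIZE * 4:]

stored_day1 = store.backup(day1)
stored_day2 = store.backup(day2)

print(stored_day1, stored_day2)
```

In this sketch the second run writes only the one changed block, yet either day can be restored in full from its own recipe without replaying earlier backups.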