The following recommendations help achieve the maximum deduplication ratios with the Virtual Tape Library (VTL) software data deduplication feature:

  • Disable software data compression by the backup application
  • Enable hardware compression for the tape drive to maximize space savings
  • Disable multiplexing of backup streams
  • Disable encryption of backup streams
  • Disable data deduplication by the backup application itself

Tape Block Size

The block size used to write to tape plays an important part in the deduplication ratios achieved. The simple rule is that the larger the block size, the higher the deduplication ratio. The following sections list the default block size used by different backup applications. NOTE: For certain backup applications, a new block size is only picked up after the virtual tape is relabeled.

Symantec NetBackup

The default block size used for tape is 64K. This is not optimal for data deduplication and can be increased to 256K by setting SIZE_DATA_BUFFERS to 256K (262144 bytes).
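
As a minimal sketch of the SIZE_DATA_BUFFERS change on a UNIX or Linux media server, the helper below writes the size in bytes to the touch file. The function name write_size_data_buffers is hypothetical, and the default directory /usr/openv/netbackup/db/config is the usual location but is an assumption here; verify it against the NetBackup documentation for your release.

```shell
# Hypothetical helper: write the tape buffer size (given in KB) to the
# SIZE_DATA_BUFFERS touch file as a byte count.
# $1 = size in KB (for example 256)
# $2 = config directory (assumed default: /usr/openv/netbackup/db/config)
write_size_data_buffers() {
    size_kb=$1
    config_dir=${2:-/usr/openv/netbackup/db/config}
    # The file holds a single number: the buffer size in bytes.
    echo $((size_kb * 1024)) > "$config_dir/SIZE_DATA_BUFFERS"
}

# Example (run as root on the media server):
#   write_size_data_buffers 256
```

Remember that, as noted above, the virtual tape may need to be relabeled before the new block size takes effect.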

IBM Tivoli Storage Manager

The default block size used is 256K, which is optimal for data deduplication.

IBM iSeries BRMS

The deduplication engine is content-aware and tries to identify file data, database blocks, etc. within a backup stream. However, the deduplication engine is not aware of the backup format for iSeries backups. Splitting the backup stream into 32KB blocks can still achieve a decent deduplication ratio for iSeries backups.

To set the dedupe block size for the VTL:

/quadstorvtl/bin/vtconfig -m -v <vtl name> -b <dedupe block size> 

Dedupe block size can be one of 0, 4096, 8192, 16384, 32768, 65536, 131072 or 262144. Setting the block size to 0 will revert to the original behavior of trying to scan the data for dedupe points.

For example:

/quadstorvtl/bin/vtconfig -m -v VTL1 -b 32768 

A block size of 4096 gives the best possible deduplication of iSeries backups. On the downside, if the VTL is unable to find duplicate blocks or the deduplication ratio is low, this block size leads to a big drop in read/write performance. 32768 is the recommended setting.
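
The list of accepted dedupe block sizes can be checked before vtconfig is invoked. The following is a minimal sketch, not part of the VTL software: the function names is_valid_dedupe_block_size and set_dedupe_block_size are hypothetical, and the wrapper only prints the vtconfig command rather than running it, so it can be tried safely on any machine.

```shell
# Hypothetical helper: succeed only if $1 is one of the dedupe block sizes
# accepted by vtconfig -b as listed in this document.
is_valid_dedupe_block_size() {
    case "$1" in
        0|4096|8192|16384|32768|65536|131072|262144) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: validate the size, then print the vtconfig command.
# $1 = VTL name, $2 = dedupe block size in bytes
set_dedupe_block_size() {
    vtl_name=$1
    block_size=$2
    if ! is_valid_dedupe_block_size "$block_size"; then
        echo "invalid dedupe block size: $block_size" >&2
        return 1
    fi
    # Replace echo with the command itself to actually apply the setting.
    echo "/quadstorvtl/bin/vtconfig -m -v $vtl_name -b $block_size"
}

# Example:
#   set_dedupe_block_size VTL1 32768
```

Rejecting unsupported values up front avoids passing a mistyped size such as 32000 to vtconfig.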

Another point to consider is that the compression ratios of iSeries backups are usually high. It might be worthwhile to back up to vcartridges in non-dedupe storage pools, which gives consistent read/write performance along with very high space savings due to compression.

Veeam Backup & Replication

The following advanced properties for a backup repository are recommended

[Figure: Veeam Repository Properties]

However, decompressing backup data blocks before storing them on the backup repository can increase the storage requirements of the repository. If the underlying storage or file system supports compression, enabling compression at the target side will reduce storage utilization. See the Veeam Help Center for more information.

Computer Associates ARCserve

The default block size used for tape is 64K and is not optimal for data deduplication. This can be changed to 256K as described in https://support.ca.com/cadocs/0/CA%20ARCserve%20%20Backup%20r16-ENU/Bookshelf_Files/HTML/tapelibr/index.htm?toc.htm?ag_tl_cfg_lib_2_func_as_vtl.htm

CommVault Simpana

The default block size used for tape is 64K and is not optimal for data deduplication. This can be increased to 256K in the "Data Path Properties" of a library. For example, see the "Increasing Block Size" section in http://documentation.commvault.com/hds/v10/article?p=features/performance_tuning/tunable_parameters.htm

Symantec Backup Exec

The default block size used for tape is 64K and is not optimal for data deduplication. This can be changed for each drive by selecting its properties and changing the "Block Size" to 256K

HP Data Protector

The default block size used is 256K, which is optimal for data deduplication.

EMC NetWorker

The default block size used is 128K, which is good, but it may be increased to 256K for optimal data deduplication.

Dell/Quest NetVault

The default block size used is 32K, which is not optimal for data deduplication. This can be changed for each drive by selecting its properties and changing the "Block Size" to 256K.