Deduplication is applied when backing up multiple virtual machines that have identical blocks (for example, if the virtual machines were created from the same template), or when virtual machines have a large amount of free space on their logical disks (known as white space).
Veeam Backup & Replication does not store zero-byte blocks or space that has been pre-allocated but not used. With deduplication, identical blocks and blocks of free space are eliminated, which decreases the size of the resulting backup file. Veeam Backup & Replication also excludes the blocks used by the swap file, reducing the amount of data even further.
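The idea of block-level deduplication can be illustrated with a minimal sketch. This is not how Veeam Backup & Replication is implemented; the function names, the use of SHA-256 digests, and the tiny 4-byte block size are assumptions chosen only to keep the example readable.

```python
import hashlib


def deduplicate(data: bytes, block_size: int):
    """Split data into fixed-size blocks and keep each distinct block once.

    Illustrative only: the hashing scheme and data layout are assumptions,
    not the product's actual on-disk format.
    """
    store = {}   # digest -> block contents (each unique block stored once)
    layout = []  # one entry per block: a digest, or None for an all-zero block
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        if not any(block):
            # Zero-filled block (white space): remember its position, store nothing.
            layout.append(None)
            continue
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        layout.append(digest)
    return store, layout


def stored_bytes(store) -> int:
    """Total size of the unique blocks that were actually kept."""
    return sum(len(block) for block in store.values())


# Two hypothetical VM disks created from the same template: they share the
# first two blocks, both contain one block of white space, and differ in
# the last block.
vm_a = b"AAAA" + b"BBBB" + bytes(4) + b"CCCC"
vm_b = b"AAAA" + b"BBBB" + bytes(4) + b"DDDD"

store, layout = deduplicate(vm_a + vm_b, block_size=4)
print(len(vm_a + vm_b), "bytes in,", stored_bytes(store), "bytes stored")
```

In this toy input, 32 bytes of source data reduce to 16 stored bytes: the shared template blocks are kept once and the zero blocks are not stored at all.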
If you use small data blocks to deduplicate a large backup file, the backup file is cut into a great number of data blocks. As a result, Veeam Backup & Replication produces a very large deduplication metadata table, which can exhaust the memory and CPU resources of your backup repository. Large data blocks produce a smaller metadata table that requires less memory and CPU to process. For this reason, Veeam Backup & Replication uses data blocks of different sizes to process VMs, depending on the type of storage you select as a backup target. This balances backup file size against job performance.
There are several storage optimization options available to you when configuring a backup job (or a replication job):
•The Local target (16 TB + backup size) option is recommended for backup jobs that can produce very large full backup files (larger than 16 TB). With this option selected, Veeam Backup & Replication uses data blocks of 8 MB. Note, however, that this storage optimization option provides the lowest deduplication ratio and the largest incremental backup files.
•The Local target option is recommended for backup to SAN, DAS or local storage. With this option, data is processed in larger blocks (1024 KB), so large amounts of data can be handled at a time. This option provides the fastest job performance but reduces the deduplication ratio, because identical blocks are less likely to be found when the blocks are larger.
•The LAN target option is recommended for backup to NAS and for onsite backup/replication. It provides a better deduplication ratio and a smaller file size because it uses smaller data blocks (512 KB).
•The WAN target option is recommended if you are planning to use WAN for offsite backup/replication. Veeam Backup & Replication will use small data blocks (256 KB), which will result in the maximum deduplication ratio and the smallest file size, allowing you to reduce the amount of traffic over the WAN connection.
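To make the trade-off between block size and metadata table size concrete, the following sketch counts how many data blocks a backup of a given size is cut into under each option. It uses the block sizes listed above and assumes, for illustration only, one metadata entry per block; the actual metadata format is not described here, and the helper names are hypothetical.

```python
KB = 1024
MB = 1024 * KB
TB = 1024 * 1024 * MB

# Block sizes as listed for each storage optimization option.
BLOCK_SIZES = {
    "Local target (16 TB + backup size)": 8 * MB,
    "Local target": 1024 * KB,
    "LAN target": 512 * KB,
    "WAN target": 256 * KB,
}


def block_count(backup_bytes: int, block_size: int) -> int:
    """Number of blocks needed to cover the backup (ceiling division)."""
    return -(-backup_bytes // block_size)


def metadata_entries(backup_bytes: int) -> dict:
    """Blocks per option, assuming one metadata entry per block (an assumption)."""
    return {name: block_count(backup_bytes, size)
            for name, size in BLOCK_SIZES.items()}


for name, count in metadata_entries(1 * TB).items():
    print(f"{name:36s} {count:>10,d} blocks")
```

For a 1 TB backup, this gives 131,072 blocks at 8 MB and 4,194,304 blocks at 256 KB, a 32-fold difference in the number of entries to track.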
The recommended use cases for the targets above are general rules of thumb, and there may be situations where a mode makes sense outside of these scenarios. For example, a VM with a very high change rate may see significant savings from WAN target mode, even for local backup or replication, and you may be willing to accept the extra CPU load and overhead in exchange for this benefit.
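A simplified model shows why smaller blocks can help a VM with many small, scattered changes. The sketch below assumes that an incremental backup must copy every data block touched by at least one write, and it ignores compression and the way changes are actually tracked; the function name and the write pattern are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def incremental_bytes(writes, block_size: int) -> int:
    """Bytes to copy if every block touched by a write is copied in full.

    `writes` is a list of (offset, length) pairs. This is a simplified
    model for illustration, not the product's change-tracking logic.
    """
    touched = set()
    for offset, length in writes:
        first = offset // block_size
        last = (offset + length - 1) // block_size
        touched.update(range(first, last + 1))
    return len(touched) * block_size


# Hypothetical workload: 1,000 writes of 4 KB each, spaced 10 MB apart.
writes = [(i * 10 * MB, 4 * KB) for i in range(1000)]

for name, size in [("WAN target", 256 * KB),
                   ("LAN target", 512 * KB),
                   ("Local target", 1024 * KB)]:
    print(f"{name:12s} {incremental_bytes(writes, size) // MB:>5d} MB")
```

Under these assumptions, roughly 4 MB of changed data results in 250 MB copied with 256 KB blocks versus 1000 MB with 1024 KB blocks, because each small write drags its whole block into the incremental.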
You can also refer to the corresponding section of the Frequently Asked Questions list on Veeam community forums at http://forums.veeam.com/veeam-backup-replication-f2/read-this-first-frequently-asked-questions-v7-t17633.html#p95276.