Data Domain Solutions Design: Downloadable Content
Internal Use - Confidential
© Copyright 2020 Dell Inc.
Table of Contents
• Technology Review
• Performance Tuning
• Architecture Design
• Performance Tuning
• More Design Considerations
• Best Practices
Data integrity
The DD OS Data Invulnerability Architecture protects against data loss from
hardware and software failures.
Data Compression
Using Global Compression, a Data Domain system eliminates redundant data from
each backup image and stores only unique data.
Restore Operations.
Use the Data Domain Enterprise Manager to perform initial system configuration,
make configuration changes afterward, display system and component status, and
generate reports and charts.
Note: For more information about the basics, check the Data Domain
Fundamentals course.
Performance Tuning 1
Number of Shelves: For high-end DDRs, maximum performance can be reduced by
having too few shelves in the solution.
Encryption options: The default AES 256-bit Cipher Block Chaining (CBC) mode is
not the most secure option. AES 256-bit Galois/Counter Mode (GCM) is the most
secure algorithm, but it is slower than CBC mode.
Cleaning: A default schedule runs the cleaning operation every Tuesday at 6 a.m.
(Tuesday @ 0600). You can change the schedule, or you can run the operation
manually. Data Domain recommends running the cleaning operation once a week.
During the cleaning operation, the file system is available for all normal operations
including backup (write) and restore (read). Although cleaning uses a significant
amount of system resources, cleaning is self-throttling and gives up system
resources in the presence of user traffic.
Performance Tuning 2
RAID Rebuilds: Data Domain Systems that are undergoing a RAID rebuild
experience degradation in maximum performance.
Reading Old Data: As data ages on the DD system and goes through cleaning
cycles, reading it may get slower.
Number of MTrees: Different Data Domain models have different MTree limits.
The table for these specifications is shown on the next slide.
Number of Files: Data Domain recommends storing no more than 1 billion files on
a system. The overall performance of the Data Domain system falls to
unacceptable levels if the system is required to support the maximum number of
files and the workload from the client machines is not carefully controlled.
When the file system passes the 1 billion file limit, several processes or
operations might be adversely affected. For example, cleaning may take several
days to complete, and AutoSupport operations, along with any process or command
that must enumerate all the files, may take more time.
Data Domain:
– Supports Data Domain’s released products
– No new Data Domain equipment required
– Requires DD Boost License
Workload:
– Avamar Data Store manages data for the Avamar solution and stores the
  metadata
– Data Domain system is the backup target.
Most companies use high-performance storage array systems such as Dell EMC
VMAX3 and XtremIO to store critical business data. ProtectPoint integrates with
these arrays to identify and capture data reliably, and then moves this data to
Data Domain protection storage, providing fast, efficient, and flexible data
protection.
The method that is used to capture the data depends on the array and solution type
being used. In essence, a point-in-time copy is created using a snapshot on
the primary storage. This snapshot is then moved over a Storage Area Network
(SAN) to the Data Domain protection storage. This process can be repeated to
accommodate multiple versions, and it can also be reversed to provide restore
capabilities.
There are essentially four types of devices you can configure in NetWorker to use a
Data Domain system for backup storage. These devices are:
1. file (NFS)
2. adv_file (NFS or CIFS)
3. tape drives (Data Domain VTL)
4. Data Domain (DD Boost)
In the first three types of devices (file, adv_file, and VTL), NetWorker is
unaware of the fact that backup data is deduplicated in the Data Domain system.
The Data Domain storage system can also be configured as a virtual tape library
(VTL) to emulate one or more tape libraries. NetWorker can use the Data Domain
storage device as a VTL in an existing SAN environment. NDMP over the LAN
uses the integrated NDMP Tape Server. With the Data Domain device and DD
Boost, NetWorker shares in the deduplication processing with the Data Domain
system by using Distributed Segment Processing (DSP).
These considerations are for sizing. Refer to the EBSS and associated
questionnaires to gather such information.
The baseline information that is needed is the same as would be used to calculate
the requirements for any storage solution.
The effects of deduplication get factored in later. Essentially the size, frequency,
and type of the backups must be known as well as the retention and replication
requirements.
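The baseline inputs listed above can be captured in a small record before any deduplication factor is applied. This is only a sketch: the class name, field names, and sample values are hypothetical and are not taken from the EBSS or its questionnaires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupSizingInputs:
    """Baseline sizing inputs, before deduplication is factored in."""
    full_backup_gb: float          # size of one full backup
    incremental_backup_gb: float   # size of one incremental backup
    fulls_per_week: int            # frequency of full backups
    incrementals_per_week: int     # frequency of incremental backups
    retention_weeks: int           # how long backups are retained
    replicated: bool               # whether backups are replicated offsite

    def weekly_logical_gb(self) -> float:
        """Pre-deduplication data written per week."""
        return (self.fulls_per_week * self.full_backup_gb
                + self.incrementals_per_week * self.incremental_backup_gb)

    def retained_logical_gb(self) -> float:
        """Pre-deduplication data held across the retention period."""
        return self.weekly_logical_gb() * self.retention_weeks


# Hypothetical sample values, for illustration only.
inputs = BackupSizingInputs(
    full_backup_gb=1000, incremental_backup_gb=100,
    fulls_per_week=1, incrementals_per_week=4,
    retention_weeks=8, replicated=True,
)
print(inputs.weekly_logical_gb(), inputs.retained_logical_gb())
```

Keeping the logical (pre-deduplication) totals separate makes it easier to apply different compression factors later without re-gathering the inputs.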
From the data you gathered, the first calculation is to determine the amount of
storage capacity needed. This calculation requires a sense of what to expect
from the effects of deduplication.
Compression rates vary depending on the characteristics of the data and how
many backup sets are retained. They can range from small values of around 5x
to much larger values of 20x or more. It is helpful to have a realistic estimate.
Some types of data, clients, and situations do not yield the best deduplication
results. Examples include encrypted and compressed files; rich media and email
messages also yield little initial commonality. The compression factor used for
sizing may come from tests or experience with the actual data set, or it may
come from some general compression examples.
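As a rough illustration of how the compression factor feeds the capacity estimate, the sketch below divides the logical backup size by the factor. The function name and the 1000 GB sample size are assumptions made for illustration; only the 5x and 20x factors come from the text.

```python
def physical_gb(logical_gb: float, compression_factor: float) -> float:
    """Estimate stored size after compression, given a factor such as 5 for 5x."""
    if compression_factor <= 0:
        raise ValueError("compression_factor must be positive")
    return logical_gb / compression_factor


# Hypothetical 1000 GB backup at the low and high ends of the range above.
low_end = physical_gb(1000, 5)     # 200.0 GB stored at 5x
high_end = physical_gb(1000, 20)   # 50.0 GB stored at 20x
print(low_end, high_end)
```

The fourfold spread between the two results is why a realistic estimate of the factor matters before a model is selected.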
Compression Examples
The highest rates are seen when many full backups are stored. General average
rates can be used as a starting point for the calculations and then the numbers can
be refined after real data is available.
Capacity calculations are a matter of adding up the space that is needed for each
backup that is retained.
When subsequent full backups run, they are likely to compress at a much higher
rate, so in this case 25x is used. Four daily incrementals needing 10 GB each
and one weekly full backup needing 40 GB yield a burn rate of 80 GB per week.
Running the 80 GB weekly burn rate over the full 8-week retention period means
that an estimated 640 GB is needed to store the daily incrementals and the
weekly fulls.
Adding this amount to the space used by the original full backup results in
840 GB needed. Use a Data Domain system with 1 TB of usable capacity for this
scenario. That configuration would mean that the unit would operate at about
84% of capacity. This configuration may be acceptable. However, a system with a
larger capacity, or one that can have more storage added, might be a better
choice to allow for data growth.
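The arithmetic in this scenario can be reproduced with a short sketch. The function names are hypothetical. The 200 GB figure for the first full backup is not stated in the text; it is inferred here as the difference between the 840 GB total and the 640 GB of retained weekly cycles.

```python
def weekly_burn_gb(incrementals_per_week: int, incremental_gb: float,
                   fulls_per_week: int, full_gb: float) -> float:
    """Post-compression space consumed by one week of backups."""
    return incrementals_per_week * incremental_gb + fulls_per_week * full_gb


def required_capacity_gb(first_full_gb: float, burn_gb_per_week: float,
                         retention_weeks: int) -> float:
    """Space for the first full backup plus the retained weekly cycles."""
    return first_full_gb + burn_gb_per_week * retention_weeks


def utilization(required_gb: float, usable_gb: float) -> float:
    """Fraction of usable capacity consumed (0.84 means 84%)."""
    return required_gb / usable_gb


burn = weekly_burn_gb(4, 10, 1, 40)           # 80 GB per week
# ASSUMPTION: 200 GB for the first full is inferred as 840 - 640.
total = required_capacity_gb(200, burn, 8)    # 840 GB
load = utilization(total, 1000)               # 0.84 on 1 TB (1000 GB) usable
print(f"burn={burn} GB/week, total={total} GB, load={load:.0%}")
```

Changing the retention period or the weekly burn rate in the calls above shows how quickly the 1 TB system would run out of headroom as data grows.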
Burn Rate
Performance measurements that are based on tests or real backup cycles can be
found and compared to original calculations.
A good source of burn rate information is the data available on the Data
Domain system itself. The filesys compression report is available from the
command line and is recorded in the daily AutoSupport message.
Throughput Requirements
In this sample calculation, the full backup set of 6500 GB is required to be backed
up within a 10-hour window.
This yields a raw throughput requirement of at least 650 GB per hour.
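The sample calculation above is a single division, sketched here with a hypothetical function name and the figures from the text.

```python
def required_throughput_gb_per_hour(backup_gb: float, window_hours: float) -> float:
    """Minimum sustained ingest rate needed to finish within the backup window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return backup_gb / window_hours


# 6500 GB full backup set in a 10-hour window.
rate = required_throughput_gb_per_hour(6500, 10)   # 650.0 GB per hour
print(rate)
```

This is a raw figure only; it does not include any allowance for headroom.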
Performance Buffer
Good practice is to select a unit that could operate at no more than 75% to 85% of
its rated capacity.
In this example, the throughput requirement of 3.6 TB per hour would load the
DD2200 to about 95% of capacity. A better choice would be a model with higher
throughput capability such as the DD2500.
Again, results vary according to the environment. Consult the Dell EMC Support
website and consider proof-of-concept testing to confirm the achievable
performance.
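The buffer guideline can be expressed as a simple check, sketched below. The function names are hypothetical. The rated throughput used for the DD2200 is not a published specification; it is back-calculated from the text's own statement that 3.6 TB per hour is about 95% of capacity.

```python
def load_fraction(required_tb_per_hour: float, rated_tb_per_hour: float) -> float:
    """Fraction of the rated throughput that the requirement consumes."""
    return required_tb_per_hour / rated_tb_per_hour


def within_buffer(required_tb_per_hour: float, rated_tb_per_hour: float,
                  max_load: float = 0.85) -> bool:
    """True if the unit would run at or below the chosen maximum load."""
    return load_fraction(required_tb_per_hour, rated_tb_per_hour) <= max_load


def min_rated_throughput(required_tb_per_hour: float,
                         max_load: float = 0.85) -> float:
    """Smallest rated throughput that keeps the load at or below max_load."""
    return required_tb_per_hour / max_load


required = 3.6                       # TB per hour, from the example above
# ASSUMPTION: rating implied by "about 95% of capacity", not a spec-sheet value.
implied_rating = required / 0.95     # roughly 3.79 TB per hour
print(within_buffer(required, implied_rating))    # False: about 95% load
print(round(min_rated_throughput(required), 2))   # about 4.24 TB per hour at 85%
```

Any candidate model can be compared the same way once its rated throughput for the relevant protocol is taken from the current specifications.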
Sizing Considerations/Assumptions
Here are some additional assumptions and considerations to keep in mind when
selecting the DD model to implement.
All data that is backed up during the first week has a minimal compression rate.
A minimum of two weeks of retention is required.
DPS Solution Builder uses preprogrammed compression ratios. The default values
are a starting point and not the definitive values to be used in all sizing.
You must change the default Daily Change % to match that of your customer’s
environment. Data Domain default change rates may be changed when in Detail
Level “High.”
Data Protection Design Scenarios
This scenario is the same as the previous scenario, except that data
goes from the Disaster Recovery Data Domain System to the Public Cloud.
This scenario shows a direct connection from the Primary Data Domain System to the
Public Cloud and directly to the Disaster Recovery Data Domain System.
This scenario shows data going from the Primary Data Domain System to the DD
VE Cloud and then to the Public Cloud.
In this scenario, Primary Storage sends data to the Backup appliance and then to
the Isolated Recovery System. The system is isolated because its connection is
online only to periodically update the Isolated Recovery System.
Data coming from a Remote Office/Branch Office to the Primary DD system can
then be sent to any of the other scenario configurations.
Case studies
ABC Corp. currently uses NetBackup to protect data within their environment.
They are experiencing capacity issues and face new requirements from
corporate management.
ABC Corp. has a moderately sized NetBackup environment that has been in place
for more than five years.
During the past two years, the environment has seen moderate growth and the
infrastructure has not grown to support it.
ABC Corp. is looking to upgrade their existing backup environment to resolve the
issues currently causing backup failures and better position themselves for future
growth and requirements.
ABC Corp. wants to address the current backup deficiencies in their environment.
They want to address new corporate guidelines requiring data be copied offsite
within 48 hours of initial backup. They are currently happy with their backup
software and are not looking to replace it.
They are interested in supplementing it and/or modifying their design to better
meet their requirements.
Slow tape speeds and an insufficient number of drives and media cause
frequent failures and missed backup windows.
Only 10% of clients have more than one copy of backup data due to media
limitations. There is overhead associated with driving tapes frequently to the
DR site.
They would prefer to improve the capability for offsite backups of a greater
number of clients.
A solution that keeps their current infrastructure while addressing all concerns
includes:
* Replicate Data Domain devices between primary and DR locations to ensure that
data for all clients is maintained offsite.
Summary:
While addressing all their concerns, a solution that best uses their current
infrastructure is to replace their tape targets with DD Boost devices. These devices
use replication between the primary and DR locations.
Also, they could repurpose existing tape library/drives for use only for data
requiring long-term retention.
Document required hardware and/or software
GH Corp. has been using Dell EMC NetWorker for many years as their sole backup
management and recovery (BMR) solution. They have recently invested heavily in
virtualizing their data centers. They are increasing the number of servers running
on VMs from 10% to 80%.
They are currently backing up all servers from a guest level and treating VMs the
same as physical hosts. With the new push to virtualize most of their environment,
they are interested in applying more advanced backup strategies for VMware. GH
Corp. is not opposed to changing their BMR product; however, they are cost
conscious and would prefer to use as much of their existing environment as possible.
Also, they are interested in potential tape reduction given the high costs they
currently experience with tape management overhead.
All clients are backed up with a weekly full and daily incrementals. Once per
month, full backups are cloned to tape and sent offsite for long-term storage.
Backup retention varies by client type, but ranges from 45 through 90 days onsite
and one year offsite.
A solution that best uses their current infrastructure while addressing all
concerns includes:
Direct all backups to Data Domain targets and use tape only for long-term retention
of the monthly backups via cloning.
The most direct solution is to upgrade their existing NetWorker implementation to
support NetWorker VMware Protection for image backups.
After this is completed, physical or virtual storage nodes can be used to perform
image and file level backups of clients. Employ NetWorker Module for Microsoft
(NMM) for Microsoft Exchange and Microsoft SQL backups.
Also, use a Data Domain system to replace tape as the primary backup target. The
tape library could then be repurposed to provide a clone target only for long-term
retention.
CAB Inc. is a current Backup Exec customer that is expecting significant growth
within the environment. They are looking to move to an enterprise solution that can
address specific issues that they are facing. One issue is slow backups of some
dense file systems.
Use the information that is provided to: Design a solution that addresses the
customer’s requirements.
for all backups. While their existing server count is small, they are expecting
significant growth in the next 12 months.
They are currently experiencing issues with their existing backup infrastructure and
are looking to make a change to an enterprise solution. CAB Inc. has a small
VMware footprint and is looking to expand soon.
CAB Inc. is experiencing some specific issues with their existing BMR solution that
any new solution must address. They have three servers that are
used as dense file servers. While the data sizes of these servers are all between
400 GB and 1 TB, they each contain between 4 and 7 million files.
Backups of these file systems are slow, at less than 1 MB/s, and often
take 20+ hours to complete, causing frequently missed backup windows.
Offsite backups are stored at the DR site three states away from the primary data
center.
Tape costs are growing significantly. CAB Inc. is interested in alternatives; however,
corporate policy will continue to require full offsite backups on tape.
Getting tapes offsite is problematic. They would be interested in any solution that
would help with this issue.
They would prefer to be able to use VMware with new backup servers, if possible.
They currently have 100 Linux and Windows clients with expected 100% growth in the
next 12 months.
All clients are backed up daily, with full backups on the weekend and incrementals
on the other days.
All backups are currently sent to LTO3 tape drives.
– Libraries are maintained at both the primary and DR locations.
Corporate policy dictates that a copy of all full backups is kept offsite on tape.
Backup retention varies by type and server role: 30 days onsite and 1 through 3
years offsite.
Three 2-node Windows file server clusters are present, hosting dense file systems.
Each server has 4 through 7 million files, though data size is only between 400 GB
and 1 TB.
Write data to Data Domain devices, and replicate to the remote site. Data can then
be cloned to tape. Implement block-based backups (BBB) to address slow backups
of dense file servers.
The suggested solution includes a NetWorker 9.X server on a VM and two physical
storage nodes (one local and one remote at the DR site). The storage nodes would
do all the data movement leaving the NetWorker server just as a Management
Server.
Backup data would then be replicated to the DR site. After data is replicated, clone
jobs could be run to write full backups to tape at the DR site on a weekly basis.
Because data at the remote site is tracked by NetWorker with clone-controlled
replication, data at the remote Data Domain system can be written to tape. This
process does not have to traverse the WAN, except for replication.