_Chapter02HDFS—HadoopDistributedFileSyste (3)
_Chapter02HDFS—HadoopDistributedFileSyste (3)
_Chapter02HDFS—HadoopDistributedFileSyste (3)
& ZooKeeper
Foreword
⚫ This chapter describes the big data distributed storage system HDFS and
the ZooKeeper distributed service framework that resolves some
frequently-encountered data management problems in distributed
applications. This chapter lays a solid foundation for subsequent
component learning.
1 Huawei Confidential
Objectives
2 Huawei Confidential
Contents
1. HDFS
◼ HDFS Overview
HDFS-related Concepts
HDFS Architecture
HDFS Key Features
HDFS Data Read/Write Process
3 Huawei Confidential
Dictionary and File System
Xinhua
Dictionary
4 Huawei Confidential
Distributed File System
5 Huawei Confidential
HDFS Overview
⚫ Hadoop Distributed File System (HDFS) is a distributed file system designed to run on
commodity hardware.
⚫ HDFS was originally built as the infrastructure for the Apache Nutch web crawler software
project.
⚫ HDFS is part of the Apache Hadoop Core project.
⚫ HDFS is highly fault-tolerant and is deployed on cost-effective hardware.
⚫ HDFS provides high-throughput access to application data and is suitable for applications
with large data sets.
⚫ HDFS relaxes Portable Operating System Interface (POSIX) requirements to enable
streaming access to file system data.
6 Huawei Confidential
Use Cases of HDFS
8 Huawei Confidential
Contents
1. HDFS
HDFS Overview
◼ HDFS-related Concepts
HDFS Architecture
HDFS Key Features
HDFS Data Read/Write Process
9 Huawei Confidential
Basic HDFS System Architecture
Metadata (Name, replicas, ...):
NameNode /home/foo/data,3...
Block ops
Client
Replication
Blocks Blocks
Client
Rack 1 Rack 2
10 Huawei Confidential
Client
⚫ Clients are the most common way of using HDFS. HDFS provides a client during
deployment.
⚫ It is a library that contains HDFS interfaces that hide most of the complexity in
HDFS implementation.
⚫ Strictly speaking, it is not part of HDFS.
⚫ It supports common operations such as opening, reading, and writing, and provides
a shell-like command line mode to access data in HDFS.
⚫ HDFS also provides Java APIs that serve as client programming interfaces for
applications to access the file system.
11 Huawei Confidential
NameNode
⚫ In HDFS, the NameNode manages the namespace of the distributed file system and stores two core
data structures: FsImage and EditLog.
FsImage maintains the metadata of the file system tree and all the files and folders in that file tree.
EditLog records all the operations on files, such as creation, deletion, and renaming.
Root directory
NameNode
EditLog File
12 Huawei Confidential
DataNode
⚫ A DataNode is a worker node of HDFS. It stores and retrieves data based on the
scheduling of Clients or NameNodes, and periodically sends the list of blocks stored
on it to NameNodes.
⚫ Data on each DataNode is stored in the local Linux file system of the node.
13 Huawei Confidential
NameNodes vs. DataNodes
NameNode DataNode
14 Huawei Confidential
Block
⚫ The default size of a block is 128 MB. A file is divided into multiple blocks. A block is
the storage unit.
⚫ The block size is much larger than that of a common file system, minimizing the
addressing overhead.
⚫ A block has the following benefits:
Large-scale file storage
Simplified system design
Data backup
15 Huawei Confidential
Contents
1. HDFS
HDFS Overview
HDFS-related Concepts
◼ HDFS Architecture
HDFS Key Features
HDFS Data Read/Write Process
16 Huawei Confidential
HDFS Architecture
File name or block ID
Client NameNode
Local Linux file Local Linux file Local Linux file Local Linux file
system system system system
Backup
Rack 1 Rack N
17 Huawei Confidential
HDFS Namespace Management
⚫ An HDFS namespace contains directories, files, and blocks.
⚫ HDFS uses the traditional hierarchical file system. Therefore, users can create and
delete directories and files, move files between directories, and rename files in the
same way as with common file systems.
⚫ NameNodes maintain file system namespaces. Any changes to the file system
namespace or its attributes are recorded by the NameNode.
18 Huawei Confidential
Communication Protocol
⚫ HDFS is a distributed file system deployed on a cluster. Therefore, a large amount of data
needs to be transmitted over the network.
All HDFS communication protocols are based on TCP/IP.
The Client initiates a TCP connection to the NameNodes through a configurable port and uses the
Client protocol to interact with the NameNodes.
The NameNode and the DataNodes interact with each other through the DataNode protocol.
Interaction between the Client and the DataNodes is implemented through the Remote Procedure
Call (RPC). In design, the NameNode does not initiate an RPC request, but responds to RPC requests
from the Client and DataNodes.
19 Huawei Confidential
Disadvantages of the HDFS Single-NameNode Architecture
⚫ Only one NameNode is configured for HDFS, which significantly simplifies the system's
design but also has the following disadvantages:
Namespace limitation: NameNodes are stored in the memory. Therefore, the number of objects
(files and blocks) that can be contained in a NameNode is limited by memory size.
Performance bottleneck: The throughput of the entire distributed file system is limited by the
throughput of a single NameNode.
Isolation: Because there is only one NameNode and one namespace in the cluster, different
applications cannot be isolated.
Cluster availability: Once the only NameNode is faulty, the entire cluster becomes unavailable.
20 Huawei Confidential
Contents
1. HDFS
HDFS Overview
HDFS-related Concepts
HDFS Architecture
◼ HDFS Key Features
HDFS Data Read/Write Process
21 Huawei Confidential
HDFS High Availability
ZooKeeper ZooKeeper ZooKeeper
Heartbeat Heartbeat
EditLog
ZKFC ZKFC
JN JN JN
Reads logs.
Writes logs.
22 Huawei Confidential
Metadata Persistence
Active NameNode Secondary NameNode
2. Obtain Editlog and FsImage from the
Editlog FsImage active NameNode.
(FsImage is downloaded only during
NameNode initialization and used as a
local file later.)
1. Send notifications.
Editlog
FsImage
.new Editlog
3. Merge the
FsImage and
EditLog files.
FsImage FsImage
.ckpt .ckpt
5. Roll back FsImage. 4. Upload the newly generated
FsImage file to the active NameNode.
Editlog FsImage
23 Huawei Confidential
HDFS Federation
App Client-1 Client-K Client-N
More More
NS1 NS-K
NS-N
Pool
Pool 1 Pool N
Block storage
Block pools
Common storage
DataNode 1 DataNode 2 DataNode N
More More More
24 Huawei Confidential
Data Replica Mechanism
Distance = 4
Distance = 0 Distance = 4
Distance = 2
Node 2 Node 2 Node 2
25 Huawei Confidential
HDFS Data Integrity Assurance
⚫ HDFS ensures component reliability and the integrity of stored data.
⚫ Rebuilding the replica data of failed data disks:
When the DataNode fails to report data to the NameNode, the NameNode initiates replica rebuilding to restore
lost replicas.
⚫ Cluster data balancing:
HDFS incorporates the data balancing mechanism, which ensures that data is evenly distributed on DataNodes.
⚫ Metadata reliability:
The log mechanism records operations on metadata, and metadata is stored on the active and standby
NameNodes.
The snapshot mechanism of file systems ensures that data can be restored promptly in the case of
misoperations.
⚫ Security mode:
HDFS provides a unique security mode to prevent faults from spreading when DataNodes or disks are faulty.
26 Huawei Confidential
Other Key Design Points of the HDFS Architecture
⚫ Space reclamation mechanism:
HDFS supports the recycle bin mechanism and dynamic setting of replicas.
⚫ Data organization:
Data is stored by block in HDFS of the operating system.
⚫ Access mode:
HDFS allows data access through Java APIs, HTTP, or Shell scripts.
27 Huawei Confidential
Common Shell Commands
Type Command Description
-cat Displays file content.
-ls Displays the directory list.
-rm Deletes a file.
-put Uploads directories or files to HDFS.
dfs Downloads directories or files to the
-get
local host from HDFS.
-mkdir Creates a directory.
-chmod/-chown Changes the group of a file.
... ...
Performs operations in security
-safemode
mode.
dfsadmin
-report Reports service status.
28 Huawei Confidential
Contents
1. HDFS
HDFS Overview
HDFS-related Concepts
HDFS Architecture
HDFS Key Features
◼ HDFS Data Read/Write Process
29 Huawei Confidential
HDFS Data Write Process
FSData NameNode
6. Closes files. OutputStream
Client node
4 4
DataNode DataNode DataNode
5 5
30 Huawei Confidential
HDFS Data Read Process
Client node
5. Reads data.
4. Reads data.
31 Huawei Confidential
Contents
1. HDFS
32 Huawei Confidential
ZooKeeper Overview
⚫ The ZooKeeper distributed service framework is used to solve some data management
problems that are frequently encountered in distributed applications and provide distributed
and highly available coordination service capabilities.
⚫ As an underlying component, ZooKeeper is widely used and depended upon by upper-layer
components, such as Kafka, HDFS, HBase, and Storm. It provides functions such as
configuration management, naming service, distributed lock, and cluster management.
⚫ ZooKeeper encapsulates complex and error-prone key services and provides users with easy-
to-use APIs and efficient and stable systems.
⚫ In security mode, ZooKeeper depends on Kerberos and LdapServer for security
authentication.
33 Huawei Confidential
Contents
1. HDFS
34 Huawei Confidential
ZooKeeper Architecture
⚫ A ZooKeeper cluster consists of a group of servers. In this group, there is only one leader node, with the
other nodes being followers.
⚫ The leader is elected during startup.
⚫ ZooKeeper uses the custom atomic message protocol to ensure data consistency among nodes in the
entire system.
⚫ After receiving a data change request, the leader node writes data to the disk and then to the memory.
ZooKeeper service
Leader
35 Huawei Confidential
ZooKeeper Disaster Recovery
⚫ ZooKeeper can provide services for external systems after the election is complete.
During ZooKeeper election, if an instance obtains more than half of the votes, the
instance becomes the leader.
⚫ For a service with N instances, N may be an odd or even number.
When N is an odd number, assume that n = 2x + 1, the node needs to obtain x+1 votes
to become the leader, and the DR capability is x.
When N is an even number, assume that n = 2x + 2, the node needs to obtain x+2
(more than half) votes to become the leader, and the DR capability is x.
36 Huawei Confidential
Key Features of ZooKeeper
⚫ Eventual consistency: All servers are displayed in the same view.
⚫ Real-time: Clients can obtain server updates and failures within a specified period of
time.
⚫ Reliability: A message will be received by all servers.
⚫ Wait-free: Slow or faulty clients cannot intervene in the requests of rapid clients so
that the requests of each client can be processed effectively.
⚫ Atomicity: Data transfer either succeeds or fails, but no transaction is partial.
⚫ Sequence consistency: Updates sent by the client are applied in the sequence in
which they are sent.
37 Huawei Confidential
Read Function of ZooKeeper
⚫ According to the consistency of ZooKeeper, the client obtains the same view regardless of
the server connected to the client. Therefore, read operations can be performed between the
client and any node.
Client
38 Huawei Confidential
Write Function of ZooKeeper
1. Write request
Client
6. Write response
39 Huawei Confidential
Common Commands for the ZooKeeper Client
⚫ Invoking a ZooKeeper client:
zkCli.sh –server 172.16.0.1:24002
40 Huawei Confidential
Quiz
41 Huawei Confidential
Quiz
3. Why is the size of an HDFS data block larger than that of a disk block?
42 Huawei Confidential
Summary
⚫ The distributed file system is an effective solution for large-scale data storage in the era of
big data. The open source HDFS implements GFS and distributed storage of massive data by
using a computer cluster formed by inexpensive hardware.
⚫ HDFS is compatible with inexpensive hardware devices, stream data read and write, large
data sets, simple file models, and powerful cross-platform compatibility. However, HDFS has
its own limitations. For example, it is not suitable for low-latency data access, cannot
efficiently store a large number of small files, and does not support multi-user write and
arbitrary file modification.
⚫ "Block" is the core concept of HDFS. A large file is split into multiple blocks. HDFS adopts
the abstract block concept, supports large-scale file storage, simplifies system design, and is
suitable for data backup.
⚫ The ZooKeeper distributed service framework is used to solve some data management
problems that are frequently encountered in distributed applications and provide distributed
and highly available coordination service capabilities.
43 Huawei Confidential
Acronyms and Abbreviations
⚫ POSIX: Portable Operating System Interface
⚫ NN: NameNode
⚫ DN: DataNode
⚫ NS: Namespace
⚫ RPC: remote procedure call
⚫ ZKFC: ZooKeeper Failover Controller
⚫ HA: High Availability
⚫ JN: JournalNode, which stores the Editlog generated by the active NameNode.
44 Huawei Confidential
Recommendations
⚫ Huawei Talent
https://e.huawei.com/en/talent
⚫ Huawei Enterprise Product & Service Support
https://support.huawei.com/enterprise/en/index.html
⚫ Huawei Cloud
https://www.huaweicloud.com/intl/en-us/
45 Huawei Confidential
Thank you. 把数字世界带入每个人、每个家庭、
每个组织,构建万物互联的智能世界。
Bring digital to every person, home, and
organization for a fully connected,
intelligent world.