Commit bd75ebe

Update the introductory notes
1 parent 7d237c5 commit bd75ebe

File tree

5 files changed

+99
-286
lines changed


CMakeLists.txt

Lines changed: 4 additions & 0 deletions

@@ -7,9 +7,11 @@ cmake_minimum_required(VERSION 3.9)
 project(leveldb VERSION 1.23.0 LANGUAGES C CXX)
 
 # C standard can be overridden when this is used as a sub-project.
+# Check whether CMAKE_C_STANDARD has already been set.
 if (NOT CMAKE_C_STANDARD)
   # This project can use C11, but will gracefully decay down to C89.
   set(CMAKE_C_STANDARD 11)
+  # Controls whether non-standard C language extensions are allowed; setting CMAKE_C_EXTENSIONS to OFF improves portability by ensuring the code is compiled against the standard C specification across compilers and platforms.
   set(CMAKE_C_STANDARD_REQUIRED OFF)
   set(CMAKE_C_EXTENSIONS OFF)
 endif (NOT CMAKE_C_STANDARD)
@@ -18,7 +20,9 @@ endif (NOT CMAKE_C_STANDARD)
 if (NOT CMAKE_CXX_STANDARD)
   # This project requires C++11.
   set(CMAKE_CXX_STANDARD 11)
+
   set(CMAKE_CXX_STANDARD_REQUIRED ON)
+  # Controls whether non-standard C++ language extensions are allowed.
   set(CMAKE_CXX_EXTENSIONS OFF)
 endif (NOT CMAKE_CXX_STANDARD)
 
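As an aside, the standard/extension settings documented by the new comments can be tried in isolation. The following is an illustrative sketch of a hypothetical standalone project, not code from this repository; the flag effects noted in the comments apply to GCC and Clang:

```cmake
# Hypothetical standalone CMakeLists.txt (illustrative only).
cmake_minimum_required(VERSION 3.9)
project(extensions_demo C CXX)

set(CMAKE_C_STANDARD 11)
set(CMAKE_C_STANDARD_REQUIRED OFF)   # allow graceful fallback if C11 is unavailable
set(CMAKE_C_EXTENSIONS OFF)          # GCC/Clang: -std=c11 instead of -std=gnu11

set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)  # fail configuration if C++11 is unavailable
set(CMAKE_CXX_EXTENSIONS OFF)        # GCC/Clang: -std=c++11 instead of -std=gnu++11

add_executable(demo main.cpp)
```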

demo/test.cpp

Lines changed: 19 additions & 10 deletions

@@ -10,26 +10,35 @@
 
 #include <type_traits>
 #include <utility>
+#include <thread>
+#include <set>
 
-using namespace std;
 
+using namespace std;
 
 
 const uint64_t CREATE_SQL_STR_LEN = 1024 * 1024;
 
 int main(int argc, char *argv[]) {
 
 
-  struct sql {
-    sql() { std::cout << "==========" << std::endl; }
-    uint64_t data;
-    int sql_;
-    char name[CREATE_SQL_STR_LEN];
-    char tea;
-  };
+  std::set<std::string> newSet{};
+  newSet.insert("3");
+  newSet.insert("4");
+  newSet.insert("5");
+  newSet.insert("7");
+  newSet.insert("1");
+  newSet.insert("2");
+  newSet.insert("2");
+
+
+  for (auto& s : newSet) {
+    std::cout << s << std::endl;
+  }
+
+
+
 
-  typename std::aligned_storage<sizeof(sql),
-      alignof(sql)>::type instance_storage_;
 
 
   return 0;

doc/instruction/impl.adoc

Lines changed: 4 additions & 132 deletions

@@ -23,174 +23,46 @@ endif::rootpath[]
 
 ## Files
 
-The implementation of leveldb is similar in spirit to the representation of a
-single [Bigtable tablet (section 5.3)](https://research.google/pubs/pub27898/).
-However the organization of the files that make up the representation is
-somewhat different and is explained below.
-
-Each database is represented by a set of files stored in a directory. There are
-several different types of files as documented below:
 
 ### Log files
 
-A log file (*.log) stores a sequence of recent updates. Each update is appended
-to the current log file. When the log file reaches a pre-determined size
-(approximately 4MB by default), it is converted to a sorted table (see below)
-and a new log file is created for future updates.
 
-A copy of the current log file is kept in an in-memory structure (the
-`memtable`). This copy is consulted on every read so that read operations
-reflect all logged updates.
 
 ## Sorted tables
 
-A sorted table (*.ldb) stores a sequence of entries sorted by key. Each entry is
-either a value for the key, or a deletion marker for the key. (Deletion markers
-are kept around to hide obsolete values present in older sorted tables).
-
-The set of sorted tables are organized into a sequence of levels. The sorted
-table generated from a log file is placed in a special **young** level (also
-called level-0). When the number of young files exceeds a certain threshold
-(currently four), all of the young files are merged together with all of the
-overlapping level-1 files to produce a sequence of new level-1 files (we create
-a new level-1 file for every 2MB of data.)
-
-Files in the young level may contain overlapping keys. However files in other
-levels have distinct non-overlapping key ranges. Consider level number L where
-L >= 1. When the combined size of files in level-L exceeds (10^L) MB (i.e., 10MB
-for level-1, 100MB for level-2, ...), one file in level-L, and all of the
-overlapping files in level-(L+1) are merged to form a set of new files for
-level-(L+1). These merges have the effect of gradually migrating new updates
-from the young level to the largest level using only bulk reads and writes
-(i.e., minimizing expensive seeks).
+
 
 ### Manifest
 
-A MANIFEST file lists the set of sorted tables that make up each level, the
-corresponding key ranges, and other important metadata. A new MANIFEST file
-(with a new number embedded in the file name) is created whenever the database
-is reopened. The MANIFEST file is formatted as a log, and changes made to the
-serving state (as files are added or removed) are appended to this log.
 
 ### Current
 
-CURRENT is a simple text file that contains the name of the latest MANIFEST
-file.
 
 ### Info logs
 
-Informational messages are printed to files named LOG and LOG.old.
 
 ### Others
 
-Other files used for miscellaneous purposes may also be present (LOCK, *.dbtmp).
 
 ## Level 0
 
-When the log file grows above a certain size (4MB by default):
-Create a brand new memtable and log file and direct future updates here.
 
-In the background:
-
-1. Write the contents of the previous memtable to an sstable.
-2. Discard the memtable.
-3. Delete the old log file and the old memtable.
-4. Add the new sstable to the young (level-0) level.
 
 ## Compactions
 
-When the size of level L exceeds its limit, we compact it in a background
-thread. The compaction picks a file from level L and all overlapping files from
-the next level L+1. Note that if a level-L file overlaps only part of a
-level-(L+1) file, the entire file at level-(L+1) is used as an input to the
-compaction and will be discarded after the compaction. Aside: because level-0
-is special (files in it may overlap each other), we treat compactions from
-level-0 to level-1 specially: a level-0 compaction may pick more than one
-level-0 file in case some of these files overlap each other.
-
-A compaction merges the contents of the picked files to produce a sequence of
-level-(L+1) files. We switch to producing a new level-(L+1) file after the
-current output file has reached the target file size (2MB). We also switch to a
-new output file when the key range of the current output file has grown enough
-to overlap more than ten level-(L+2) files. This last rule ensures that a later
-compaction of a level-(L+1) file will not pick up too much data from
-level-(L+2).
-
-The old files are discarded and the new files are added to the serving state.
-
-Compactions for a particular level rotate through the key space. In more detail,
-for each level L, we remember the ending key of the last compaction at level L.
-The next compaction for level L will pick the first file that starts after this
-key (wrapping around to the beginning of the key space if there is no such
-file).
-
-Compactions drop overwritten values. They also drop deletion markers if there
-are no higher numbered levels that contain a file whose range overlaps the
-current key.
-
-### Timing
-
-Level-0 compactions will read up to four 1MB files from level-0, and at worst
-all the level-1 files (10MB). I.e., we will read 14MB and write 14MB.
-
-Other than the special level-0 compactions, we will pick one 2MB file from level
-L. In the worst case, this will overlap ~ 12 files from level L+1 (10 because
-level-(L+1) is ten times the size of level-L, and another two at the boundaries
-since the file ranges at level-L will usually not be aligned with the file
-ranges at level-L+1). The compaction will therefore read 26MB and write 26MB.
-Assuming a disk IO rate of 100MB/s (ballpark range for modern drives), the worst
-compaction cost will be approximately 0.5 second.
-
-If we throttle the background writing to something small, say 10% of the full
-100MB/s speed, a compaction may take up to 5 seconds. If the user is writing at
-10MB/s, we might build up lots of level-0 files (~50 to hold the 5*10MB). This
-may significantly increase the cost of reads due to the overhead of merging more
-files together on every read.
 
-Solution 1: To reduce this problem, we might want to increase the log switching
-threshold when the number of level-0 files is large. Though the downside is that
-the larger this threshold, the more memory we will need to hold the
-corresponding memtable.
 
-Solution 2: We might want to decrease write rate artificially when the number of
-level-0 files goes up.
+### Timing
 
-Solution 3: We work on reducing the cost of very wide merges. Perhaps most of
-the level-0 files will have their blocks sitting uncompressed in the cache and
-we will only need to worry about the O(N) complexity in the merging iterator.
 
 ### Number of files
 
-Instead of always making 2MB files, we could make larger files for larger levels
-to reduce the total file count, though at the expense of more bursty
-compactions. Alternatively, we could shard the set of files into multiple
-directories.
-
-An experiment on an ext3 filesystem on Feb 04, 2011 shows the following timings
-to do 100K file opens in directories with varying number of files:
-
-
-| Files in directory | Microseconds to open a file |
-|-------------------:|----------------------------:|
-| 1000               | 9                           |
-| 10000              | 10                          |
-| 100000             | 16                          |
-
-So maybe even the sharding is not necessary on modern filesystems?
 
 ## Recovery
 
-* Read CURRENT to find name of the latest committed MANIFEST
-* Read the named MANIFEST file
-* Clean up stale files
-* We could open all sstables here, but it is probably better to be lazy...
-* Convert log chunk to a new level-0 sstable
-* Start directing new writes to a new log file with recovered sequence#
+
 
 ## Garbage collection of files
 
-`RemoveObsoleteFiles()` is called at the end of every compaction and at the end
-of recovery. It finds the names of all files in the database. It deletes all log
-files that are not the current log file. It deletes all table files that are not
-referenced from some level and are not the output of an active compaction.
+
 

0 commit comments
