2009 First International Workshop on Database Technology and Applications
978-0-7695-3604-0/09 $25.00 © 2009 IEEE
DOI 10.1109/DBTA.2009.80
Authorized licensed use limited to: Universitas Brawijaya. Downloaded on June 21,2025 at 15:01:22 UTC from IEEE Xplore. Restrictions apply.

Using Flash Memory as Storage for Read-intensive Database

Ming Du
Glorious Sun School of Business and Management
Donghua University
Shanghai, China
duming@dhu.edu.cn

Yan Zhao, Jiajin Le
College of Information Science and Technology
Donghua University
Shanghai, China
zyambition@163.com, lejiajin@dhu.edu.cn

Abstract- Flash memory is now widely deployed as data storage for mobile devices such as mobile phones, digital cameras and PDAs. Since the gap between the access times of main memory and disk continues to grow, flash memory is becoming more attractive as a faster non-volatile storage for speeding up data analysis applications in laptops and personal computers. We may expect that flash memory will finally take the place of magnetic disk in the next few years. Therefore, it is worth considering running a database system on flash computing platforms. However, disk-based database technology cannot be used on flash memory directly because the characteristics of flash memory differ from those of magnetic disk. In this paper, we analyze the characteristics of flash memory and propose that a read-intensive database system should be run on the flash platform. This makes effective use of the characteristics of flash memory and gains more benefit from applying traditional database technology to flash memory.

Keywords- Read-intensive Database; Flash-Based DBMS; NAND Flash; Non-volatile storage; Write-intensive Database

I. INTRODUCTION

With the rapid progress of computer hardware technology, the gap between the access times of main memory and disk continues to grow, and faster non-volatile storage has drawn more attention. Flash memory is therefore now widely deployed as data storage for mobile devices such as mobile phones, digital cameras and PDAs. As its capacity is constantly growing while its price is proportionally dropping, flash memory, especially NAND flash, is becoming more attractive in a wider spectrum of computing devices such as laptops, desktop computers and large servers.

At present, many computer manufacturers are launching new lines of portable computers with gigabytes of flash disk instead of magnetic disk drives. This year, Samsung Electronics announced a 128 gigabyte solid-state disk (SSD), a NAND flash-based replacement for the hard disk drive [1]. It is expected that flash disk as commodity hardware will fully take the place of traditional magnetic disk in the near future. It is therefore necessary to consider running a full database on the flash computing platform.

However, flash memory has some distinct characteristics that make current disk-based database technology unsuitable. A hard disk is mechanical; data are read and written by moving an arm over rotating platters [2]. Flash memory, having no mechanically moving parts, is a purely electronic device. It has many unique characteristics, such as asymmetric read and write speed, no in-place update, and a limited number of erase/write cycles.

Much research has recently been started to make better use of the characteristics of flash memory. [3] proposed a Flash Translation Layer (FTL), which makes flash memory look like a magnetic disk to the host. [4] designed a hashing directory structure to reduce the data file mount time on flash memory. [5] proposed using flash and magnetic disk at the same level of the memory hierarchy.

In this paper, we focus on the read-intensive database, because a large quantity of data in real life, such as web pages and search engine data, is used for reading and querying. We first analyze the characteristics of flash memory, and then conduct an experiment that runs a business database on a flash-based disk (SSD), which utilizes the characteristics of flash memory effectively and gains more benefit than a traditional magnetic disk. Our main contributions are as follows.
• We suggest using the flash disk as storage for a read-intensive database, which makes effective use of the characteristics of flash memory. Our experiment shows that better performance can be gained by using flash instead of magnetic disk.
• We used a read-intensive database and executed only limited write and update operations on the flash memory, which avoids frequent erase work and garbage collection. The lifetime of the flash memory is thereby prolonged, which is very helpful when large amounts of information are accessed from flash memory.

II. FLASH MEMORY

Flash memory offers many advantages such as light weight, small size, physical stability and low power consumption. We introduce flash memory in detail below.

A. NOR flash and NAND flash

Flash memory is a type of nonvolatile, electrically-erasable programmable read-only memory (EEPROM). It can be divided into two types, NOR flash and NAND flash, depending on the logic gate type. The major difference
between the two types of flash is related to the addressing mode [6]. NOR flash, like main memory, is directly addressable by the processor. NAND flash is indirectly addressable: it must be accessed through a controller, like a disk I/O interface, so it can easily be used like a magnetic disk. NAND flash is available in considerably higher storage densities at lower cost and is thus appropriate for large data storage. In this paper, we focus on NAND flash memory.

B. Architecture of NAND flash memory

A NAND flash memory chip is composed of blocks, and each block has a fixed number of sectors. The sector is the basic unit of read and write. In order to replace the hard disk, NAND flash is designed in the same manner as magnetic disk, and thus a sector is 512 bytes. As shown in Figure 1, one block consists of 32 sectors, and its size is about 16 Kbytes [1]. Recently, large-block NAND flash has been developed for high-end applications, with 128 sectors per block and 4 Kbytes per sector [7]. These flash chips are combined into a flash-based disk such as a solid state disk (SSD).

C. Characteristics of Flash Memory

Flash memory has many unique I/O characteristics, which need to be taken into account when designing applications for such a disk. The most important characteristics of flash memory are as follows:

1) No In-Place Updates
Flash memory, unlike magnetic disk, cannot overwrite sectors. In order to overwrite existing data, the entire block (which usually spans many sectors) must be erased first, and only then can the new data be written. This means that all the other sectors in the block need to be read out and then written back. This becomes a bottleneck when running a database on a flash memory disk. To make better use of flash memory, update operations need to be reduced. Running a read-intensive database avoids frequent update work.

2) No Mechanical Latency
Flash memory is a purely electronic device and has no mechanically moving parts, so it has no seek or rotational latency as magnetic disk does. As shown in Table I [7], reading 2 Kbytes from flash memory takes 80µs, while reading the same amount from magnetic disk takes 12.7ms. Flash memory thus has a faster access speed than magnetic disk. This characteristic is very important for using flash memory as storage for a read-intensive database.

3) Asymmetric Speed of Read/Write
Flash memory has asymmetric read and write speed. As shown in Table I, the read time for 2 Kbytes is typically 80µs, while the write time is 200µs, since it takes longer to inject a charge into a memory cell than to read its status. For most flash memory, reads are roughly twice as fast as writes. So running a read-intensive database on flash memory is more favorable than running a write-intensive database.

4) Limited Number of Writes
Flash memory has a limited number of erase/write cycles, typically around 10 000 to 100 000 for each block. After the cycle limit has been exceeded, the block becomes unreliable. Although most flash memory devices adopt wear leveling to distribute erase cycles evenly across the entire memory segment, the cycle limit remains a disadvantage. Using flash memory as storage for a read-intensive database avoids frequent erase/write operations and prolongs the lifetime of the whole flash-based disk.

III. RUNNING READ-INTENSIVE DATABASE ON SSD

In order to overcome the problems caused by the characteristics of flash memory, we propose using it as storage for a read-intensive database. With an FTL, a flash-based disk (SSD) can be used directly in place of a conventional magnetic disk [8]. Our experiment shows that the access efficiency of a read-intensive database on SSD is much higher than that on magnetic disk.

A. Experimental setting

We ran a business database on two computer systems, each with a 1.8GHz Intel Pentium dual-core processor and 1 GB RAM. The operating system is Linux. The two computer systems are identical except that one is equipped with a magnetic disk and the other with a flash memory SSD. The magnetic disk is a Seagate ST3160815AS with 160GB capacity, 7200 rpm and SATA interface. The flash memory SSD is a SoliWare S100 with 32 GB capacity and a 2.5 inch SATA interface, which internally deploys 16 Samsung K9GAG08U0M MLC NAND flash chips with 2 GB capacity each. The K9GAG08U0M is composed of 4096 erase blocks. Every block consists of 128 sectors and each sector is 4KB.

The business database is set to access the storage as a raw device in order to minimize interference from data caching by the operating system, and the size of the buffer pool is limited to 20MB [7]. The sample table is about 400 MB and consists of 640,000 records of 650 bytes each. Because each sector is 4KB, a sector stores 5 records. The table spans about 1,000 erase blocks.
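
The layout figures in this setting can be cross-checked with a little arithmetic. The following is only an illustrative sketch, not part of the original experiment: the function name `table_layout` and its parameter names are hypothetical, and the value of 5 records per sector is taken from the text as stated rather than derived from the record size.

```python
def table_layout(records=640_000, records_per_sector=5,
                 sectors_per_block=128, sector_bytes=4 * 1024,
                 blocks_per_chip=4096, buffer_bytes=20 * 1024 * 1024):
    """Derive block-level quantities from the figures quoted in the text.

    All defaults are the values stated in the experimental setting;
    the names themselves are illustrative assumptions.
    """
    block_bytes = sectors_per_block * sector_bytes
    records_per_block = records_per_sector * sectors_per_block
    # Ceiling division: a partly filled block still occupies a whole block.
    table_blocks = -(-records // records_per_block)
    return {
        "block_bytes": block_bytes,
        "records_per_block": records_per_block,
        "table_blocks": table_blocks,
        "chip_bytes": blocks_per_chip * block_bytes,
        "buffer_blocks": buffer_bytes // block_bytes,
    }


if __name__ == "__main__":
    for name, value in table_layout().items():
        print(f"{name}: {value}")
```

Under these assumptions the sketch yields 640 records per block, 1,000 blocks for the table, 2 GB per chip and 40 blocks for the 20MB buffer pool, which is consistent with the figures quoted in this paper.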
TABLE I. ACCESS SPEED: MAGNETIC DISK VS. NAND FLASH

Media            Access time
                 Read            Write           Erase
Magnetic Disk    12.7 ms (2K)    13.7 ms (2K)    N/A
NAND flash       80 µs (2K)      200 µs (2K)     1.5 ms (128K)
Figure 1 Architecture of flash block
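
To make the access times in Table I concrete, the sketch below derives a few ratios from them. It is a rough illustration under stated assumptions, not a model used in the paper: the constant and function names are hypothetical, the erase unit is assumed to be the 128K unit of Table I holding 64 pages of 2K, and the update estimate assumes a naive in-place rewrite in which no FTL remapping hides the cost.

```python
# Access times taken from Table I, in microseconds.
DISK_READ_US = 12_700    # 2K read from magnetic disk
DISK_WRITE_US = 13_700   # 2K write to magnetic disk
FLASH_READ_US = 80       # 2K read from NAND flash
FLASH_WRITE_US = 200     # 2K write to NAND flash
FLASH_ERASE_US = 1_500   # erase of one 128K unit

# Assumption: a 128K erase unit holds 64 pages of 2K each.
PAGES_PER_ERASE_UNIT = 128 // 2


def read_speedup():
    """How many times faster a 2K read is on flash than on disk."""
    return DISK_READ_US / FLASH_READ_US


def write_to_read_ratio():
    """How many times slower a 2K flash write is than a 2K flash read."""
    return FLASH_WRITE_US / FLASH_READ_US


def naive_in_place_update_us(pages=PAGES_PER_ERASE_UNIT):
    """Assumed worst case for updating one page in place:
    read the other pages out, erase the unit, write every page back."""
    read_back = (pages - 1) * FLASH_READ_US
    write_back = pages * FLASH_WRITE_US
    return read_back + FLASH_ERASE_US + write_back


if __name__ == "__main__":
    print(read_speedup(), write_to_read_ratio(), naive_in_place_update_us())
```

With the Table I values this gives a read speedup of 158.75 over disk, a write-to-read ratio of 2.5, and a naive update cost of 19,340 µs, which exceeds the 13.7 ms disk write. The last figure is an upper-bound illustration only, since an FTL typically writes the updated page elsewhere instead of rewriting the whole unit.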
B. Read Performance

In this section, we compare the read performance of the flash-based SSD and the conventional magnetic disk.

We ran four read queries, R1, R2, R3 and R4, against the business database on each of the two computers described above. The four read query patterns are described below.

R1: scanned the entire table sequentially.

R2: read every other record in the table, in the order 0, 2, 4, . . ., 1, 3, 5, . . . .

R3: read five consecutive records, and then read another five consecutive records one block (640 records) further on in the table. This was repeated until each record had been read exactly once. The order was as follows: 0, 1, 2, 3, 4, 640, 641, 642, 643, 644, . . ., 5, 6, 7, 8, 9, 645, 646, 647, 648, 649, . . . .

R4: read one record at a time, with successive records 2560 records apart in the table, because the 20MB buffer contains 40 blocks with 2560 records. This was repeated until each record had been read exactly once. The read query was in the following order: 0, 2560, 5120, . . . , 1, 2561, 5121, . . . .

The results are presented in Figure 2. The response time of the disk increased greatly as the read query pattern changed from sequential (R1) to semi-random (R2, R3 and R4). In contrast, the response time of the flash-based SSD increased only slightly. This result shows that flash memory has the characteristic of no mechanical latency. As access becomes more random, a conventional magnetic disk has to perform a large number of seeks and the disk arm moves frequently. The magnetic disk executes more I/O operations, which take more read time. As a result, the cost of random reads on magnetic disk is much higher than on the flash-based SSD.

[Bar chart: response time in seconds (axis 0 to 300) for read query patterns R1 to R4; series: Disk, Flash]
Figure 2 Read performances of flash and disk

C. Write Performance

In order to learn about the write performance of flash memory, we ran another four update queries against the same business database on the two computer systems. The queries W1, W2, W3 and W4 are described below.

W1: updated each record of the table sequentially.

W2: updated every other record in the table, in the order 0, 2, 4, . . ., 1, 3, 5, . . . .

W3: updated five consecutive records, and then updated another five consecutive records one block (640 records) further on in the table. This was repeated until each record had been updated exactly once. The order was as follows: 0, 1, 2, 3, 4, 640, 641, 642, 643, 644, . . ., 5, 6, 7, 8, 9, 645, 646, 647, 648, 649, . . . .

W4: updated one record at a time, with successive records 2560 records apart in the table. This was repeated until each record had been updated exactly once. The update query was in the following order: 0, 2560, 5120, . . . , 1, 2561, 5121, . . . .

The results are presented in Figure 3. The response time of the magnetic disk is similar to that in the read experiment. However, the write performance of flash memory varied greatly across the patterns, which is completely different from its read performance. This result shows that flash memory has the characteristics of asymmetric read/write speed and no in-place updates. Although an SSD performs wear leveling [6], once all the clean blocks have been used up, erase operations eventually have to be done. This is the most time-consuming operation.

[Bar chart: response time in seconds (axis 0 to 600) for write query patterns W1 to W4; series: Disk, Flash]
Figure 3 Write performances of flash and disk

D. Performance of read-intensive database

As mentioned above, flash memory has a clear advantage over conventional magnetic disk in read operations. However, it performs very poorly in random write operations. We therefore propose that it is appropriate to run a read-intensive database on a flash-based SSD.

IV. CONCLUSION

In this paper, we analyzed the characteristics of flash memory and ran a business database on two computer systems that are identical except that one is equipped with a magnetic disk drive and the other with a flash-based SSD drive.

From the read and write performance experiments, we conclude that flash memory has a clear advantage over conventional magnetic disk in read operations, but performs very poorly in random write operations.

Considering the many requirements for information access, we propose using flash memory as storage for a read-intensive
database. Our experiments show that this greatly improves read access efficiency. However, it is sometimes difficult to determine whether a database is read-intensive or not. How to improve the performance of a write-intensive database is another issue we should focus on. In the future, we will pay more attention to the overall performance of databases on the flash platform.

REFERENCES

[1] Sang-Won Lee, Won Kim, “On Flash-Based DBMSs: Issues for Architectural Re-Examination”, Journal of Object Technology, Vol.6, No.8, September-October 2007, pp.39-49.
[2] Hans Olav Norheim, “How Flash Memory Changes the DBMS World”, http://hansolav.net/blog/content/binary/HowFlashMemory.pdf. April 2008.
[3] Intel Corporation, “Understanding the Flash Translation Layer (FTL) Specification”, Application Note AP-684, Intel Corporation, December 1998.
[4] Seung-Ho Lim, Chul Lee and Kyu-Ho Park, “Hashing Directory Scheme for NAND Flash File System”, http://www-core.kaist.ac.kr/paper_list/2007_ICACT_DMFFS.pdf. February 2007.
[5] Ioannis Koltsidas, Stratis D. Viglas, “Flashing Up the Storage Layer”, in VLDB 2008.
[6] Daniel Myers, “On the Use of NAND Flash Memory in High-Performance Relational Databases”, Master's thesis, MIT, February 2008.
[7] Mendel Rosenblum and John K. Ousterhout, “The Design and Implementation of a Log-Structured File System”, Berkeley, CA: ACM, 1991.
[8] Sang-Won Lee, Bongki Moon, Chanik Park, Jae-Myung Kim and Sang-Woo Kim, “A Case for Flash Memory SSD in Enterprise Database Application”, in ACM SIGMOD, 2008.