Compression Algorithms For Efficient Big Data Storage
I. INTRODUCTION
The digital age's massive data generation has transformed business, government, and relationships. IoT devices, social media, and advanced analytics have increased data production to unprecedented levels. This trend, called "big data," involves massive, complex datasets that are difficult to store, process, and analyse. Big data helps identify patterns, improve decision-making, and drive innovation in medicine, economics, retail, and science [1]. Despite its many benefits, big data is difficult to manage and store. Its sheer size and complexity make storage difficult, and infrastructure costs are high because structured and unstructured data require a lot of hardware. Keeping such data accessible, reliable, and latency-free complicates storage further. Because data is growing faster than most storage solutions can handle, businesses are always looking for ways to maximise their limited resources. Storage solutions must therefore be efficient, scalable, and affordable while protecting data integrity and accessibility.

Compression algorithms make big data storage easier. They allow organisations to store more data in less space while retaining essential information. Real-time applications need this optimisation to lower storage costs and boost processing and data transfer speeds. Lossless compression is the optimal choice for mission-critical data, while lossy compression allows certain applications to make acceptable quality and size compromises [2]. Together, these algorithms underpin current data management methods. This article examines the role and complexity of compression algorithms in big data storage: it discusses their principles, evaluates popular methods, and shows how they are integrated into big data frameworks. It also covers recent and future compression algorithm developments to demonstrate their importance in big data storage and management.

II. FUNDAMENTALS OF DATA COMPRESSION
Compressing data reduces its size without compromising its quality.
Data compression reduces the space and bandwidth needed to transmit and store data, maximising resource use. Data-driven industries and the need to efficiently manage massive amounts of data have made compression crucial. Lossless and lossy compression are the main data compression methods [3].

The original data can be perfectly reconstructed from compressed data using lossless compression. This method is often used for high-fidelity data files such as PNG images or text archived in formats like ZIP. Lossless compression preserves all data, making it essential for applications where small errors can have big effects. Lossy compression reduces file size while maintaining acceptable quality by discarding some of the original data. Multimedia files like images, audio, and video use this method because even a small data loss does not noticeably affect the user experience. JPEG and MP3 use lossy compression to balance file size and quality [4]. Lossy methods help streaming services save bandwidth. A compression algorithm's effectiveness is measured by compression ratio, speed, and efficiency. The compression ratio is often expressed as a percentage of the original size. For real-time applications, compression and decompression times matter. An algorithm is efficient if it compresses data with little processing power. These metrics help decide if a compression method is right for a task.
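To make these metrics concrete, the following minimal Python sketch measures the compression ratio and the compression/decompression time of a block of data with the standard-library zlib codec; the sample payload and the chosen compression level are illustrative assumptions rather than values taken from this article.

```python
import time
import zlib

# Illustrative payload: repetitive log-like text compresses well.
data = b"2025-05-01 INFO request served in 12 ms\n" * 50_000

t0 = time.perf_counter()
compressed = zlib.compress(data, 6)          # level 6 is zlib's usual speed/ratio trade-off
t1 = time.perf_counter()
restored = zlib.decompress(compressed)
t2 = time.perf_counter()

assert restored == data                      # lossless: original fully reconstructed
ratio = len(compressed) / len(data)          # compressed size as a fraction of the original

print(f"compression ratio : {ratio:.2%} of original size")
print(f"compression time  : {t1 - t0:.3f} s")
print(f"decompression time: {t2 - t1:.3f} s")
```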
Big data requires compression to overcome processing and storage limitations. Big data is too large to send or store unprocessed. Data compression saves computing resources, speeds network data transfer, and lowers storage costs. Compression algorithms built into Hadoop and Spark help organisations manage large datasets. Therefore, big data management strategies must include compression.

III. OVERVIEW OF BIG DATA STORAGE CHALLENGES
Big data is defined by its 5Vs: volume, velocity, variety, veracity, and value. Volume refers to the terabytes to exabytes of data generated daily; this massive data flood comes from enterprise apps, social media, and IoT devices. The velocity of data generation and processing demands real-time or near-real-time analytics systems. Variety covers structured databases, unstructured text, photos, videos, and sensor data. Veracity highlights the challenges of data accuracy and reliability in the face of noise and inconsistencies. Finally, value refers to the practical insights and benefits that big data analysis and use yield [5]. Traditional storage methods struggle with these features. Traditional database and file storage systems cannot handle big data's volume and velocity, and they are not flexible or scalable enough to handle all data formats. This causes inefficient storage use, latency, and higher costs.

Because managing large datasets is complicated, data accessibility, security, and integrity are difficult to ensure. Big data companies aim to reduce storage costs and increase data accessibility [6]. Servers, data centres, and other storage infrastructure are expensive upfront and over time. Effective data management can reduce these expenses and improve system performance and data retrieval speeds. Improved accessibility ensures quick data retrieval and processing, enabling timely decision-making and preserving competitive advantage.

FIGURE 1 Data Flow Before and After Compression (Source: Self-Created)

Effective compression algorithms are needed to address these issues. These algorithms greatly reduce data file sizes, allowing businesses to store more data with fewer resources. By reducing the storage footprint, compression lowers infrastructure costs and the need for hardware upgrades [7]. Compression also speeds up data processing and transfers, which is crucial for real-time applications and analytics. Strong compression techniques are needed to make storage strategies scalable, cost-effective, and operationally efficient as big data grows.

IV. TYPES OF COMPRESSION ALGORITHMS FOR BIG DATA
Massive data sets require compression algorithms for data management. These algorithms fall mainly into lossless and lossy categories. The two have different approaches, benefits, and drawbacks, making each suited to specific applications; hybrid compression methods combine the strengths of both.

A. LOSSLESS COMPRESSION ALGORITHMS
Huffman Coding stands out among the early lossless compression algorithms. It assigns shorter binary codes to frequently appearing symbols and longer codes to rarely appearing ones. Huffman Coding uses a binary tree built from symbol frequencies to ensure that no code is a prefix of another, enabling efficient decoding [8]. The ZIP and GZIP file formats use this method. Its main benefit is that the original data can be reproduced exactly. Due to low compression ratios, Huffman Coding may not benefit datasets with near-uniform symbol frequency distributions.
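As a rough illustration of this idea, the sketch below builds a Huffman code table for a short byte string using Python's heapq module; it is a teaching-oriented simplification of the algorithm described above, not a production encoder.

```python
import heapq
from collections import Counter

def huffman_code_table(data: bytes) -> dict[int, str]:
    """Return a prefix-free bit-string code for every byte in `data`."""
    # Each heap entry: (frequency, tie-breaker, [(symbol, code), ...])
    heap = [(freq, sym, [(sym, "")]) for sym, freq in Counter(data).items()]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate case: only one distinct symbol
        return {heap[0][1]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        f2, t, right = heapq.heappop(heap)
        merged = [(s, "0" + c) for s, c in left] + [(s, "1" + c) for s, c in right]
        heapq.heappush(heap, (f1 + f2, t, merged))
    return dict(heap[0][2])

codes = huffman_code_table(b"abracadabra")
# Frequent symbols such as 'a' receive shorter codes than rare ones such as 'c' or 'd'.
print({chr(s): c for s, c in codes.items()})
```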
Arithmetic Coding, another lossless method, converts an entire message into a single fractional number between zero and one. Unlike Huffman Coding, which codes discrete symbols, Arithmetic Coding represents the probability of the entire data stream, making it efficient for datasets with skewed symbol distributions [9]. This algorithm is used in text and multimedia codecs. Although it outperforms Huffman Coding in compression ratio, it is computationally heavy and may not work well in real time.

Dictionary-based compression algorithms like LZ77 and LZW (Lempel-Ziv) replace repetitive patterns with shorter references. LZ77 replaces repeating strings with references to earlier occurrences, while LZW builds a dictionary of patterns from the data stream. GIF and ZIP use these methods extensively. Their adaptability to different data types is one of their many advantages.
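To show the dictionary idea in miniature, here is a hedged Python sketch of an LZW-style encoder: it starts from a 256-entry byte dictionary and emits integer codes for the longest previously seen phrase. It is a simplification for illustration (no code-width management or dictionary reset), not the exact scheme used by GIF or ZIP.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Encode bytes as a list of dictionary codes (simplified LZW)."""
    dictionary = {bytes([i]): i for i in range(256)}     # single-byte phrases
    phrase = b""
    codes = []
    for byte in data:
        candidate = phrase + bytes([byte])
        if candidate in dictionary:
            phrase = candidate                           # keep extending the current match
        else:
            codes.append(dictionary[phrase])             # emit code for the longest known phrase
            dictionary[candidate] = len(dictionary)      # learn the new phrase
            phrase = bytes([byte])
    if phrase:
        codes.append(dictionary[phrase])
    return codes

sample = b"TOBEORNOTTOBEORTOBEORNOT"
print(lzw_encode(sample))  # repeated substrings collapse to single codes
```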
Snappy and Zstandard are newer lossless compression algorithms built for speed and efficiency [10]. Google's Snappy prioritises fast compression and decompression for real-time applications like log management. Facebook's Zstandard balances speed and compression ratio and offers adjustable parameters. These algorithms are becoming more popular in fast-processing big data frameworks like Spark and Hadoop. Though efficient, they may not compress as well as more complex algorithms like Arithmetic Coding.
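The third-party python-snappy and zstandard packages expose these codecs through one-call APIs; the short sketch below (the package choice and the sample payload are assumptions about a typical setup, not details from this article) shows how either library would round-trip a block of log-like data.

```python
# pip install python-snappy zstandard   (assumed third-party packages)
import snappy
import zstandard as zstd

payload = b'{"user": 42, "event": "click"}\n' * 10_000

# Snappy: optimised for speed rather than ratio.
snappy_blob = snappy.compress(payload)
assert snappy.decompress(snappy_blob) == payload

# Zstandard: tunable level trades speed for a better ratio.
cctx = zstd.ZstdCompressor(level=3)
zstd_blob = cctx.compress(payload)
assert zstd.ZstdDecompressor().decompress(zstd_blob) == payload

print(len(payload), len(snappy_blob), len(zstd_blob))
```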
B. LOSSY COMPRESSION ALGORITHMS
Lossy compression relies on transform coding, which includes the Discrete Cosine Transform (DCT). Data is converted from the spatial to the frequency domain so that the low-frequency components, which carry the important features, can be kept while the high-frequency components (fine details) are discarded or coarsely represented. JPEG and MPEG use this method to compress images and videos [11]. Its main benefit is size reduction, but it may lose fine details, making it unsuitable for some uses.
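A minimal transform-coding sketch, assuming SciPy is available: it applies a DCT to an 8-sample block, zeroes the high-frequency coefficients, and reconstructs an approximation. Real JPEG/MPEG pipelines add 2-D blocks, quantisation matrices, and entropy coding on top of this idea.

```python
import numpy as np
from scipy.fft import dct, idct

block = np.array([52, 55, 61, 66, 70, 61, 64, 73], dtype=float)  # one 8-sample block

coeffs = dct(block, norm="ortho")        # spatial domain -> frequency domain
kept = coeffs.copy()
kept[4:] = 0                             # crude "quantisation": drop high-frequency detail

approx = idct(kept, norm="ortho")        # reconstruct from the surviving coefficients
print(np.round(approx, 1))               # close to the original, using half the coefficients
```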
Wavelet transforms apply transform coding at multiple resolutions to improve data representation. Progressive transmission and scalable storage make this method ideal for audio and image compression. Wavelet-based compression allows JPEG 2000 to compress at high ratios without sacrificing quality. However, these methods may be too computationally intensive for real-time big data processing.

Run-Length Encoding (RLE) is a simple and efficient compression method that replaces runs of repeated values with a single value and a count [12]. A run of ten "A"s becomes "A10". Bitmaps and other data with long runs of repeated values are good RLE candidates. Processing is fast and computational overhead is low due to its simplicity. However, RLE does not help with highly variable data, because repetitions are rare and prevent compression gains.
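Because RLE is so small, it fits in a few lines; the sketch below encodes runs as (value, count) pairs rather than the "A10" string form used in the example above, an equivalent representation chosen here for clarity.

```python
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse runs of identical bytes into (value, run_length) pairs."""
    runs = []
    for byte in data:
        if runs and runs[-1][0] == byte:
            runs[-1] = (byte, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((byte, 1))               # start a new run
    return runs

def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    return b"".join(bytes([value]) * count for value, count in runs)

bitmap_row = b"\x00" * 100 + b"\xff" * 20 + b"\x00" * 80
runs = rle_encode(bitmap_row)
assert rle_decode(runs) == bitmap_row            # round-trip is exact
print(runs)                                      # [(0, 100), (255, 20), (0, 80)]
```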
C. HYBRID COMPRESSION TECHNIQUES
Hybrid compression algorithms combine lossy and lossless techniques for optimal results. Multimedia codecs like HEVC and H.264 use transform coding for lossy video frame compression and lossless entropy coding, such as Arithmetic Coding, for metadata and headers [13]. This combination preserves important data while achieving high compression ratios. These methods excel in big data applications like streaming and video analytics, where efficiency and quality are both crucial.

D. ADVANTAGES AND LIMITATIONS OF EACH METHOD
Each compression algorithm suits certain tasks because of its particular strengths and weaknesses. LZW, Arithmetic Coding, or Huffman Coding are appropriate when text files, databases, or important logs must be recovered losslessly; however, their compression ratios are lower than those of lossy methods. Lossy algorithms like DCT and wavelet-based methods compress multimedia files well, although they may cause artefacts or quality loss. Hybrid approaches can optimise both size and quality, but they are complex and computationally intensive.

Choosing a big data compression algorithm depends on the data type, the use case, and the trade-off between storage efficiency and processing demands. Lossy algorithms work better for photos and videos, while lossless algorithms are preferred for structured and critical data. For storage and resource efficiency, big data ecosystems will always need compression algorithms.

V. ROLE OF COMPRESSION IN BIG DATA FRAMEWORKS
Large amounts of data are created and processed in the big data era, making compression essential for storage management and performance optimisation. Cloud storage platforms and big data ecosystems like Hadoop and Spark manage massive amounts of diverse data [14]. These frameworks reduce storage costs and improve data transfer speeds and processing efficiency by using efficient compression algorithms. Modern big data applications use compression to scale, speed up, and reduce storage.

A. INTEGRATION OF COMPRESSION ALGORITHMS IN BIG DATA ECOSYSTEMS
Big data frameworks are used for datasets too large for conventional systems. Since these systems are scalable, data compression algorithms are integrated into their processing and storage layers to optimise performance. Compressing data before storage allows these systems to fit datasets into distributed storage clusters [15].
Data compression during processing speeds up analysis and reduces network bandwidth by cutting the amount of data read and transferred. Big data frameworks like Hadoop and Spark compress and decompress large datasets before processing them to reduce storage costs. These frameworks allow real-time data analytics with high compression ratios by balancing processing speed against use-case-specific compression methods.

FIGURE 2 Big Data Storage Architecture

B. USE CASES OF COMPRESSION ALGORITHMS IN BIG DATA FRAMEWORKS
1. HADOOP: SNAPPY AND LZO
Hadoop, a popular distributed processing and storage framework, optimises storage and performance with compression. In the Hadoop Distributed File System (HDFS), which distributes data across many nodes, Snappy and LZO are widely used to compress data [16].

Snappy's speed-focused design, which trades compression ratio for throughput, attracts Hadoop users. Its efficient compression and decompression mechanisms make it ideal for real-time or near-real-time applications.

Despite its low compression ratios, Snappy speeds up data processing, which is essential for Hadoop's batch processing model. Many workflows use it, including log management, data ingestion, and streaming analytics. Another lightweight Hadoop algorithm is LZO, which prioritises speed with moderate compression ratios. Applications that need real-time data compression and decompression without delays benefit from this. For time-sensitive large-scale data processing, LZO and Snappy are faster than more computationally intensive algorithms despite having lower compression ratios.
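As one hedged illustration of how this looks in practice, a PySpark job writing to HDFS can request Hadoop's Snappy codec when saving an RDD; the paths and session setup below are placeholders, and the codec class named is the standard Hadoop SnappyCodec.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("snappy-output-example").getOrCreate()
sc = spark.sparkContext

# Hypothetical input/output paths on HDFS.
lines = sc.textFile("hdfs:///logs/raw/2025-05-01")
cleaned = lines.filter(lambda line: "ERROR" not in line)

# Ask Hadoop to compress each output part-file with Snappy.
cleaned.saveAsTextFile(
    "hdfs:///logs/cleaned/2025-05-01",
    compressionCodecClass="org.apache.hadoop.io.compress.SnappyCodec",
)
```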
2. SPARK: ZSTD AND BROTLI
Apache Spark's in-memory big data processing is quick thanks to efficient compression methods. Spark often uses ZSTD and Brotli compression to boost performance.

ZSTD, an advanced compression method, balances processing speed and compression ratio [17]. ZSTD improves read/write speeds and lowers transmission and storage costs in Spark, making it well suited to remote computing and large datasets. Spark's distributed nature and the algorithm's design enable efficient decompression and reduce I/O bottlenecks.

Spark also uses Brotli, originally designed for web compression, for data storage and transmission. Spark tasks that handle log files, JSON, and other textual data may benefit from its ability to compress large volumes of text. Brotli is a suitable alternative to Gzip for Spark workload optimisation, especially in cloud environments, because it offers greater compression ratios at comparable speeds.
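A hedged configuration sketch: recent Spark releases expose the internal shuffle/spill codec through spark.io.compression.codec and the Parquet writer codec through spark.sql.parquet.compression.codec, both of which accept zstd; the paths below are illustrative placeholders rather than details from the article.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("zstd-example")
    # Codec for Spark's internal shuffle, broadcast, and spill data.
    .config("spark.io.compression.codec", "zstd")
    # Codec used when writing Parquet files from Spark SQL.
    .config("spark.sql.parquet.compression.codec", "zstd")
    .getOrCreate()
)

events = spark.read.json("hdfs:///events/raw/")          # illustrative input path
events.write.mode("overwrite").parquet("hdfs:///events/zstd_parquet/")
```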
3. CLOUD STORAGE: GZIP AND PARQUET
Modern big data architectures use Amazon S3, Google Cloud Storage, and Azure Blob Storage. Optimising storage and data transport requires compression, and cloud deployments commonly use Gzip and Parquet for massive data compression. Gzip is a popular data compression method supported by various cloud storage services and big data frameworks because of its efficiency and simplicity. Gzip is ideal for compressing CSV files, logs, and other text-based data because of its excellent compression ratios [18]. Cloud systems often compress data with Gzip before storage to reduce storage costs and network bandwidth.

Parquet, a columnar storage file format, is well suited to Hadoop and Spark. Parquet supports Snappy compression out of the box and lets users choose alternatives such as Gzip and LZO. Because it compresses data column by column, Parquet is great for analytical queries: it decreases storage costs and improves query performance. For big data applications that need scalable, fast data processing, cloud-stored Parquet files are ideal for holding enormous datasets for analytics.
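The sketch below, which assumes the pandas and pyarrow packages and an illustrative local file name, shows both patterns: gzip-compressing a CSV before it is uploaded to object storage, and writing the same data as a Parquet file with a chosen codec.

```python
import gzip
import shutil
import pandas as pd

# Gzip a CSV so the object placed in cloud storage is smaller.
with open("sales.csv", "rb") as src, gzip.open("sales.csv.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)

# Write the same data as Parquet, letting the columnar format compress each column.
df = pd.read_csv("sales.csv")
df.to_parquet("sales.snappy.parquet", compression="snappy")  # fast, moderate ratio
df.to_parquet("sales.gzip.parquet", compression="gzip")      # smaller, slower alternative
```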
C. CASE STUDIES AND EXAMPLES OF COMPRESSION IN REAL-WORLD BIG DATA FRAMEWORKS
1. Netflix: Compression for Streaming Video Analytics
Industry leaders like Netflix employ Spark and Hadoop to process large amounts of user and streaming data. Video storage, transport, and user data analysis all depend on compression. Snappy and Zstandard compression help Netflix save money on storage and improve its recommendation systems [19]. Netflix compresses logs and user behaviour data to process billions of user interactions daily with low latency and high throughput.

2. LinkedIn: Real-time Data Processing with Kafka
LinkedIn uses the distributed streaming platform Apache Kafka for real-time data analytics.
Snappy and Gzip compress Kafka messages to save space and bandwidth. LinkedIn reduces its data storage demands by compressing streaming data before storage without affecting its ability to quickly process huge volumes of incoming data, so it can provide real-time metrics on user interactions and platform performance [21].
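Kafka lets the producer choose a per-batch compression codec; the hedged sketch below uses the third-party kafka-python client with Snappy, and the broker address and topic name are placeholders rather than details of LinkedIn's deployment.

```python
# pip install kafka-python python-snappy   (assumed third-party packages)
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker:9092",            # placeholder broker address
    compression_type="snappy",                  # batches are compressed before sending
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send("page-views", {"member": 42, "page": "/jobs"})  # placeholder topic/event
producer.flush()
```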
Compression strategies boost the efficiency of massive data frameworks. These methods reduce storage needs, speed up data transport, and improve the performance of data processing tasks, ensuring that huge data systems remain scalable. Data and application requirements should determine whether lossless or lossy compression is used. Modern compression techniques are needed to solve the storage and processing problems of large-scale data systems, which will only expand as big data grows.

VI. COMPARISON OF POPULAR COMPRESSION ALGORITHMS
Different compression algorithms work best with different data formats and differ in speed, compression ratio, and efficacy. Knowing these differences is crucial when picking an algorithm for big data, since they directly affect processing speed and storage efficiency. The table below compares popular big data compression methods: Snappy, ZSTD, LZ77/LZW, Gzip, and Brotli.
TABLE 1 COMPARISON OF POPULAR COMPRESSION ALGORITHMS

Compression Algorithm | Compression Ratio | Speed (Compression/Decompression) | Suitability for Different Types of Data | Use Cases
Snappy | Moderate | Very fast (compression: fast, decompression: very fast) | Ideal for structured and semi-structured data such as logs, CSV, or simple datasets | Hadoop, Spark, real-time processing, log data
Zstandard (ZSTD) | High | Moderate (compression: fast, decompression: very fast) | Suitable for large, high-volume datasets such as transactional logs and analytical data | Hadoop, Spark, cloud storage, file systems
LZ77/LZW | Moderate to high | Fast to moderate | Works well with text data, codebooks, and simple data | General-purpose compression, text-based files
Gzip | High | Moderate to slow | Optimal for compressing text-based data, including logs, CSV, and XML | Cloud storage, file systems, web applications
Brotli | Very high | Moderate | Effective for compressing text and web data, especially HTTP compression | Web applications, cloud storage, static file serving
compresses static things like scripts and images faster.

D. CHOOSING THE RIGHT ALGORITHM
Thanks to their very fast compression and decompression, Snappy and ZSTD surpass the competition for real-time processing in Spark and Hadoop. Gzip and ZSTD reduce storage and processing time, making them ideal for cloud storage or archive data where compression ratio matters most. Brotli optimises web traffic and HTTP compression for websites that use CDNs or need fast data transfer [24]. Storage efficiency, data type, and processing speed determine which compression method is ideal for a given big data application: Gzip and Brotli are good for high-compression applications, whereas Snappy and ZSTD are good for speed and real-time performance. Understanding these compression-ratio and speed trade-offs helps big data frameworks optimise storage and performance.
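One way to ground this choice is to benchmark the candidate codecs on a sample of the actual data. The sketch below compares the standard-library gzip module against the third-party zstandard and brotli bindings (assumed to be installed) on an illustrative text payload and reports the ratio and timing of each.

```python
import gzip
import time

import brotli               # assumed third-party binding: pip install brotli
import zstandard as zstd    # assumed third-party binding: pip install zstandard

sample = b'{"ts": 1714560000, "level": "INFO", "msg": "ok"}\n' * 20_000

codecs = {
    "gzip": lambda d: gzip.compress(d, compresslevel=6),
    "zstd": lambda d: zstd.ZstdCompressor(level=3).compress(d),
    "brotli": lambda d: brotli.compress(d, quality=5),
}

for name, compress in codecs.items():
    start = time.perf_counter()
    blob = compress(sample)
    elapsed = time.perf_counter() - start
    print(f"{name:6s} ratio={len(blob) / len(sample):6.2%}  time={elapsed:.3f}s")
```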
VII. FUTURE TRENDS AND INNOVATIONS
AI and machine learning could dramatically improve compression methods in the future. These methods are promising for context-aware and adaptive compression, which optimises processing speed and storage economy by making real-time adaptations based on data attributes. AI-driven compression improves compression efficiency without quality loss by studying data patterns. Quantum computing could use quantum states to compress enormous amounts of data at record rates, which would revolutionise data compression; quantum compression research is therefore intriguing. However, AI-driven approaches are computationally complex, large-scale systems must still process data in real time, and quantum computing's practical uses remain unclear. Compression techniques are constantly changing, but if these challenges are overcome they can improve data storage, processing efficiency, and the possibilities of big data applications.

VIII. CONCLUSION
Compression techniques can aid with enormous data storage, as discussed in this article. We covered data compression basics, including lossless and lossy compression and their big data applications. Compression algorithms include Huffman coding, Zstandard, Snappy, and Gzip, each with strengths in compression ratio, speed, and data type compatibility. We examined how Hadoop and Spark use these algorithms to improve storage optimisation and data accessibility. For optimal compression, the right algorithm must be chosen for the required storage efficiency, processing speed, and data type. Data storage and processing may change soon due to AI-driven and quantum compression. Though the field's constant evolution presents challenges and opportunities, big data storage's future is bright with better and more adaptable compression approaches.

REFERENCES
[1] M. Pandey, S. Shrivastava, S. Pandey, and S. Shridevi, "An enhanced data compression algorithm," in 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), 2020, pp. 1–4.
[2] J. Latif, P. Mehryar, L. Hou, and Z. Ali, "An efficient data compression algorithm for real-time monitoring applications in healthcare," in 2020 5th International Conference on Computer and Communication Systems (ICCCS), 2020, pp. 71–75.
[3] T. A. S. Srinivas, S. Ramasubbareddy, G. Kannayaram, and C. P. Kumar, "Storage optimization using file compression techniques for big data," in FICTA (2), 2020, pp. 409–416.
[4] A. N. Kahdim and M. E. Manaa, "Design an efficient Internet of Things data compression for healthcare applications," Bulletin of Electrical Engineering and Informatics, vol. 11, no. 3, pp. 1678–1686, 2022.
[5] H. Astsatryan, A. Lalayan, A. Kocharyan, and D. Hagimont, "Performance-efficient recommendation and prediction service for big data frameworks focusing on data compression and in-memory data storage indicators," Scalable Computing: Practice and Experience, vol. 22, no. 4, pp. 401–412, 2021.
[6] S. Kalaivani, C. Tharini, K. Saranya, and K. Priyanka, "Design and implementation of hybrid compression algorithm for personal health care big data applications," Wireless Personal Communications, vol. 113, no. 1, pp. 599–615, 2020.
[7] K. Meena and J. Sujatha, "Reduced time compression in big data using MapReduce approach and Hadoop," Journal of Medical Systems, vol. 43, no. 8, p. 239, 2019.
[8] J. Song, S. Hu, Y. Bao, and G. Yu, "Compress blocks or not: Tradeoffs for energy consumption of a big data processing system," IEEE Transactions on Sustainable Computing, vol. 7, no. 1, pp. 112–124, 2020.
[9] Bakir, "New blockchain based special keys security model with path compression algorithm for big data," IEEE Access, vol. 10, pp. 94738–94753, 2022.
[10] Yu, S. Lu, T. Wang, X. Zhang, and S. Wan, "Towards higher efficiency in a distributed memory storage system using data compression," International Journal of Bio-Inspired Computation, vol. 20, no. 4, pp. 232–240, 2022.
[11] S. Qi, J. Wang, M. Miao, M. Zhang, and X. Chen, "Tinyenc: Enabling compressed and encrypted big data stores with rich query support," IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 1, pp. 176–192, 2021.
[12] Carpentieri, "Data compression in massive data storage systems," in 2024 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA), 2024, pp. 1–6.
[13] Hu, F. Wang, W. Li, J. Li, and H. Guan, "QZFS: QAT accelerated compression in file system for application agnostic and cost efficient data storage," in 2019 USENIX Annual Technical Conference (USENIX ATC 19), 2019, pp. 163–176.
[14] U. Narayanan, V. Paul, and S. Joseph, "A novel system architecture for secure authentication and data sharing in cloud enabled big data environment," Journal of King Saud University - Computer and Information Sciences, vol. 34, no. 6, pp. 3121–3135, 2022.
[15] H. Yao, Y. Ji, K. Li, S. Liu, J. He, and R. Wang, "HRCM: An efficient hybrid referential compression method for genomic big data," BioMed Research International, vol. 2019, no. 1, p. 3108950, 2019.
[16] J. Chen, M. Daverveldt, and Z. Al-Ars, "FPGA acceleration of ZSTD compression algorithm," in 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2021, pp. 188–191.
[17] S. Vatedka and A. Tchamkerten, "Local decode and update for big data compression," IEEE Transactions on Information Theory, vol. 66, no. 9, pp. 5790–5805, 2020.
[18] G. Xiong, "Research on big data compression algorithm based on BIM," in 2021 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS), 2021, pp. 97–100.
[19] S. Pal, S. Mondal, G. Das, S. Khatua, and Z. Ghosh, "Big data in biology: The hope and present-day challenges in it," Gene Reports, vol. 21, p. 100869, 2020.
[20] K. Sansanwal, G. Shrivastava, R. Anand, and K. Sharma, "Big data analysis and compression for indoor air quality," in Handbook of IoT and Big Data, CRC Press, 2019, pp. 1–21.
[21] S. A. Abdulzahra, A. K. M. Al-Qurabat, and A. K. Idrees, "Data reduction based on compression technique for big data in IoT," in 2020 International Conference on Emerging Smart Computing and Informatics (ESCI), 2020, pp. 103–108.
[22] Zhang et al., "CompressDB: Enabling efficient compressed data direct processing for various databases," in Proceedings of the 2022 International Conference on Management of Data, 2022, pp. 1655–1669.
[23] A. Abdo, T. Salem Karamany, and A. Yakoub, "Enhanced data security and storage efficiency in cloud computing: A survey of data compression and encryption techniques," vol. 6, no. 2, pp. 81–88, 2024.
[24] R. Pratap, K. Revanuru, R. Anirudh, and R. Kulkarni, "Efficient compression algorithm for multimedia data," in 2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM), 2020, pp. 245–250.