V3i205 PDF
Abstract Among data-leak cases, human mistakes are one of the main causes of data loss. Deliberately planned attacks and
inadvertent human mistakes account for most data-leak incidents. This work addresses the detection of inadvertent sensitive
data leaks caused by human mistakes and provides alerts for organizations. A common approach is to screen content in
storage and transmission for exposed sensitive information. Such an approach requires the detection operation to be
conducted in secrecy. The privacy-preserving data-leak detection (DLD) solution addresses this by using a special set of
sensitive data digests in detection. Its advantage is that the data owner can safely delegate the detection operation to a
semihonest provider without revealing the sensitive data to the provider. Internet service providers can offer their
customers DLD as an add-on service with strong privacy guarantees. Evaluation results support accurate detection with a
very small number of false alarms under various data-leak scenarios. A host-assisted mechanism is proposed for complete
data-leak detection in large-scale organizations; it is designed using data signatures and fuzzy fingerprints.
Keywords Data leak, network security, privacy, fuzzy fingerprint, data-leak detection.
noises and real leaks. It is the data owner who post-processes the potential leaks sent back by the DLD provider and determines whether there is any real data leak. This model supports delegation of the detection operation: ISPs can provide data-leak detection as an add-on service to their customers using this model. The work designs, implements, and evaluates an efficient technique, the fuzzy fingerprint, for privacy-preserving data-leak detection.

Fuzzy fingerprints are special sensitive data digests prepared by the data owner for release to the DLD provider. The results indicate that the underlying scheme achieves high accuracy with a very low false positive rate. The filtering steps and data preparation can take a considerable amount of processing time, but once preprocessing is done the data become more reliable and robust results are achieved. Extensive experiments were conducted to validate the accuracy, efficiency, and privacy of these solutions. The results provided by the host logs are used to detect sensitive data leaks. A host-assisted mechanism provides complete data-leak detection for large-scale organizations; it is designed using data signatures and fuzzy fingerprints.

2. LITERATURE REVIEW

Xiaokui Shu, Danfeng Yao and Elisa Bertino (2015) [1] observed that among data-leak cases, human mistakes are one of the main causes of data loss. Their work detects inadvertent sensitive data leaks caused by human mistakes and provides alerts for organizations. They present a privacy-preserving data-leak detection (DLD) solution in which a special set of sensitive data digests is used in detection. The advantage of the method is that it enables the data owner to safely delegate the detection operation to a semihonest provider without revealing the sensitive data to the provider. Internet service providers can offer their customers DLD as an add-on service with strong privacy guarantees. The evaluation results show that the method can support accurate detection with a very small number of false alarms under various data-leak scenarios.

X. Shu and D. Yao (2012) [2] focus on the latter kind of services, where location information is essentially used to determine membership in one or more geographic sets. They address the problem using Bloom filters (BF), a compact data structure for representing sets. In particular, they present an extension of the original Bloom filter idea: the Spatial Bloom Filter (SBF). SBFs are designed to manage spatial and geographical information in a space-efficient way, and are well suited for enabling privacy in location-aware applications. This is shown by providing two multi-party protocols for the privacy-preserving computation of location information, based on the known homomorphic properties of public key encryption schemes.

K. Borders and A. Prakash (2009) [3] note that the routes of information leakage are various, for example, humans, paper, the Internet, and USB flash memory. It is difficult to find information leakage by counting the characters of HTTP requests in cases where the number of leaked characters is not large. If the approximate entropy is calculated, the value is small on the whole because a lot of repeated information is ignored.

H. Yin, D. Song, M. Egele, C. Kruegel, and E. Kirda (2007) [4] note that malware has become a significant, complex, and widespread problem within the computer industry. The classification model is based on an examination of eight malware samples, and it identifies four malware commonalities and classifications based on the dimensions of persistence and stealth. The goal of the article is to provide a better understanding of when cyber-conflict will happen and to help defenders better mitigate the potential damage.

K. Borders, E. V. Weele, B. Lau, and A. Prakash (2009) [5] present a practical and powerful device-based isolation approach for information security and demonstrate its application in preserving the confidentiality of cryptographic keys. Device-based isolation is defined as isolating the storage and operations related to data with different security requirements across multiple computing devices. The isolation should not hinder the use of and access to the data for practical applications.

A. Nadkarni and W. Enck (2013) [6] note that the exposure of sensitive data in storage and transmission poses a serious threat to organizational and personal security. Auto-FBI guarantees secure access to sensitive data on the web. It achieves this guarantee by automatically generating a new browser instance for sensitive content. Aquifer is a policy framework and system that helps prevent accidental information disclosure in the OS.

G. Karjoth and M. Schunter (2002) [7] observe that privacy policy specification and enforcement has become a hotbed of research activity over the past few years as Internet use has been on the rise around the globe. As the number of consumers participating in online activities grows, it becomes increasingly imperative for organizations to express their privacy practices in an accurate, accessible, and useful way. The quality criteria used in software requirements specification can be used to evaluate privacy policies specified using P3P and EPAL.

Y. Jang, S. P. Chung, B. D. Payne, and W. Lee (2014) [8] have proposed a way to capture richer semantics of the user's intent. The method is based on the observation that for most text-based applications, the user's intent will be displayed entirely on screen as text, and the user will make modifications. Based on this idea, they have implemented a prototype called Gyrus, which enforces correct behavior of applications by capturing user intent. Since this is attack agnostic, it will scale better than traditional security systems.

of digests or fingerprints from the sensitive data and then discloses only a small amount of them to the DLD provider. The detection system is implemented, and an extensive experimental evaluation is performed on the 2.6 GB Enron dataset, the Internet surfing traffic of 20 users, and 5 simulated real-world data-leak scenarios to measure its privacy guarantee, efficiency, and detection rate.

2. Data Preprocessing: Sentiment or emotion analysis of social networking data involves a lot of data preprocessing. The data preparation and filtering steps can take a considerable amount of processing time, but once preprocessing is done the data become reliable and robust results are achieved. Data preprocessing is done to eliminate incomplete, noisy, and inconsistent data.
6. Host Logs: This describes how the server is checked using its logs to decide which model to send in the system.

assisted mechanism for complete data-leak detection for large-scale organizations.
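
The digest-based workflow described in this paper (the data owner computes digests of the sensitive data, releases only a reduced form of them to the DLD provider, and post-processes the potential leaks the provider sends back) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 8-character shingles, the truncated SHA-256 hash standing in for the fingerprint function, the choice to hide the low 8 bits of each digest, and all function names.

```python
import hashlib

SHINGLE_LEN = 8   # assumed n-gram length (illustrative)
FUZZY_BITS = 8    # assumed number of low bits hidden from the provider


def shingles(text, n=SHINGLE_LEN):
    """Return the set of all n-character sliding-window substrings."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def digest(shingle):
    """32-bit digest of one shingle (truncated SHA-256, a stand-in hash)."""
    raw = hashlib.sha256(shingle.encode("utf-8")).digest()
    return int.from_bytes(raw[:4], "big")


def fuzzy(d, bits=FUZZY_BITS):
    """Drop the low bits so that many digests map to one released value."""
    return d >> bits


def owner_prepare(sensitive):
    """Data owner: keep full digests private, release only fuzzy ones."""
    full = {digest(s) for s in shingles(sensitive)}
    released = {fuzzy(d) for d in full}
    return full, released


def provider_screen(traffic, released):
    """DLD provider: report traffic digests whose fuzzy form matches.

    The result mixes real leaks with noise, because the provider
    cannot tell which exact digest behind a fuzzy value is sensitive.
    """
    traffic_digests = {digest(s) for s in shingles(traffic)}
    return {d for d in traffic_digests if fuzzy(d) in released}


def owner_postprocess(candidates, full):
    """Data owner: keep only candidates that match a full digest."""
    return candidates & full


if __name__ == "__main__":
    full, released = owner_prepare("SSN 078-05-1120 belongs to J. Doe")
    traffic = "POST /form body=SSN 078-05-1120 belongs to J. Doe&x=1"
    candidates = provider_screen(traffic, released)
    confirmed = owner_postprocess(candidates, full)
    print(len(candidates), "candidates,", len(confirmed), "confirmed")
```

In this sketch the provider only ever sees the shortened values, so separating noise from real leaks remains a step that only the data owner can perform.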
AUTHOR PROFILE