Data Grid
9 Followers
Recent papers in Data Grid
In this paper, medical applications on a Grid infrastructure, the MAGIC-5 Project, are presented and discussed. MAGIC-5 aims at developing Computer Aided Detection (CADe) software for the analysis of medical images on distributed databases... more
A high throughput Basic Local Alignment Search Tool (BLAST) system based on Web services is implemented. It provides an alternative BLAST service and allows users to perform multiple BLAST queries at one run in a distributed, parallel... more
- by jiren wang
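A minimal sketch of how multiple BLAST queries could be submitted in one run through a Web-service interface, in the spirit of the entry above. The endpoint URL and the program/database/sequence parameter names are assumptions for illustration, not the actual service API described in the paper; the queries are simply issued concurrently with a thread pool.

# Hypothetical illustration: batch BLAST queries against an assumed Web-service endpoint.
from concurrent.futures import ThreadPoolExecutor
import requests

BLAST_URL = "http://example.org/blast/submit"   # assumed endpoint, not the paper's service

def run_blast(sequence: str) -> str:
    """Submit one query sequence and return the raw result text."""
    payload = {"program": "blastn", "database": "nt", "sequence": sequence}
    resp = requests.post(BLAST_URL, data=payload, timeout=600)
    resp.raise_for_status()
    return resp.text

def run_batch(sequences):
    """Run many queries in one go, in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(run_blast, sequences))

if __name__ == "__main__":
    results = run_batch([">q1\nACGTACGT", ">q2\nGGGTTTAA"])
    print(len(results), "results returned")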
The Network Common Data Form (netCDF) is one of the primary methods of self-documenting data in its community, and recent evolution has been toward communication via messages in the de facto standard XML language. XML is a text-based language while netCDF is based on a... more
- by Ben Domenico
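Since the entry above contrasts binary netCDF with text-based XML, the sketch below writes a toy netCDF file and emits its structure (dimensions, variables, attributes) as XML. The file name, variable, and loosely NcML-like element names are illustrative assumptions rather than a layout prescribed by the paper, and the example requires the netCDF4 Python package.

# Sketch: dump a netCDF file's structure as XML (loosely NcML-like; illustrative only).
import xml.etree.ElementTree as ET
from netCDF4 import Dataset
import numpy as np

# Create a tiny example file.
with Dataset("example.nc", "w") as ds:
    ds.createDimension("time", 3)
    t = ds.createVariable("temperature", "f4", ("time",))
    t.units = "K"
    t[:] = np.array([280.0, 281.5, 279.9], dtype="f4")

# Read it back and describe it in XML.
with Dataset("example.nc", "r") as ds:
    root = ET.Element("netcdf", {"location": "example.nc"})
    for name, dim in ds.dimensions.items():
        ET.SubElement(root, "dimension", {"name": name, "length": str(len(dim))})
    for name, var in ds.variables.items():
        v = ET.SubElement(root, "variable",
                          {"name": name, "type": str(var.dtype),
                           "shape": " ".join(var.dimensions)})
        for att in var.ncattrs():
            ET.SubElement(v, "attribute", {"name": att, "value": str(getattr(var, att))})

print(ET.tostring(root, encoding="unicode"))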
Data replication is the creation and maintenance of multiple copies of the same data. Replication is used in Data Grid to enhance data availability and fault tolerance. One of the main objectives of replication strategies is reducing... more
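One way to make the "reducing access cost" objective above concrete is to pick, for each request, the replica site with the lowest estimated transfer time. The sketch below is a generic illustration (site names, bandwidths, and latencies are made up), not the strategy proposed in the paper.

# Illustrative replica selection: choose the site minimizing estimated transfer time.
def transfer_time(size_mb, bandwidth_mbps, latency_s):
    """Crude estimate: startup latency plus size divided by bandwidth."""
    return latency_s + (size_mb * 8.0) / bandwidth_mbps

def best_replica(size_mb, replicas):
    """replicas: dict site -> (bandwidth_mbps, latency_s); returns the cheapest site."""
    return min(replicas, key=lambda s: transfer_time(size_mb, *replicas[s]))

replicas = {                      # made-up sites and link characteristics
    "site-A": (1000.0, 0.050),
    "site-B": (100.0, 0.010),
    "site-C": (300.0, 0.120),
}
print(best_replica(500.0, replicas))   # -> the site with the lowest estimated cost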
Abstract. The Social Informatics Data Grid is a new infrastructure designed to transform how social and behavioral scientists collect and annotate data, collaborate and share data, and analyze and mine large data repositories. An... more
High resolution climatology-towards climate change services
Large distributed systems such as Computational/Data Grids require large amounts of data to be colocated with the computing facilities for processing. Ensuring that the data is there in time for the computation in today's Internet is a... more
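The entry above is about getting data co-located with the computing facilities "in time for the computation"; a simple way to reason about that is to work backwards from the job's start time using an estimated transfer rate. The numbers and the slack factor below are assumptions for illustration, not a mechanism from the paper.

# Illustrative pre-staging calculation: when must a transfer start so the data
# arrives before the job does?
def latest_start(job_start_s, size_gb, rate_gbps, slack=1.5):
    """Return the latest transfer start time (seconds, same clock as job_start_s).
    slack > 1 pads the estimate to absorb congestion and restarts."""
    transfer_s = (size_gb * 8.0) / rate_gbps          # size in gigabits / rate
    return job_start_s - slack * transfer_s

# Example: a 200 GB dataset over an effective 2 Gbit/s path for a job at t = 10000 s.
print(latest_start(10000.0, 200.0, 2.0))              # -> 8800.0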
Selecting optimal resources for submitting jobs on a computational Grid or accessing data from a data grid is one of the most important tasks of any Grid middleware. Most modern Grid software today satisfies this responsibility and gives... more
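To make the "selecting optimal resources" task above tangible, the sketch below ranks candidate sites by a weighted score over CPU load, queue length, and data proximity. The attributes and weights are invented for illustration and do not reflect the brokering logic of any particular middleware.

# Illustrative resource broker: score candidate sites and pick the best one.
def score(site, w_load=0.4, w_queue=0.3, w_data=0.3):
    """Lower is better: combine CPU load, queued jobs, and a data transfer estimate."""
    return (w_load * site["cpu_load"]
            + w_queue * site["queued_jobs"] / 100.0
            + w_data * site["data_transfer_s"] / 3600.0)

candidates = [   # made-up monitoring snapshot
    {"name": "ce-1", "cpu_load": 0.80, "queued_jobs": 12, "data_transfer_s": 300},
    {"name": "ce-2", "cpu_load": 0.35, "queued_jobs": 40, "data_transfer_s": 1800},
    {"name": "ce-3", "cpu_load": 0.55, "queued_jobs": 5,  "data_transfer_s": 900},
]
best = min(candidates, key=score)
print(best["name"], round(score(best), 3))   # -> ce-3 0.31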
Grid computing has recently gained in popularity. Grid applications can be very demanding of the data storage facilities in the Grid. The existing data grid services are often insufficient and additional optimization of the data access is... more
The Sloan Digital Sky Surveys (SDSS) have been collecting imaging and spectroscopic data since 1998. These data as well as their derived data products are made publicly available through regular data releases, of which the 13th took place... more
- by Ani Thakar
- Physics, QC, Sky
In data grids, data replication on various nodes can mitigate problems such as response time and availability. Also, in data replication, there are challenges in finding the best replica efficiently with respect to performance and... more
We propose a novel approach for joint denoising and interpolation of noisy Bayer-patterned data acquired from a digital imaging sensor (e.g., CMOS, CCD). The aim is to obtain a full-resolution RGB noiseless image. The proposed technique... more
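The entry above concerns joint denoising and demosaicking of Bayer-patterned data. The sketch below shows only the conventional baseline such work improves on, namely plain bilinear demosaicking of an assumed RGGB mosaic followed by separate Gaussian smoothing; it is not the joint technique proposed in the paper.

# Baseline sketch: bilinear demosaicking of an RGGB Bayer mosaic, then mild smoothing.
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

def demosaic_rggb(mosaic):
    """mosaic: 2-D float array sampled on an RGGB pattern -> H x W x 3 RGB."""
    h, w = mosaic.shape
    masks = np.zeros((3, h, w))
    masks[0, 0::2, 0::2] = 1                      # R
    masks[1, 0::2, 1::2] = 1                      # G on red rows
    masks[1, 1::2, 0::2] = 1                      # G on blue rows
    masks[2, 1::2, 1::2] = 1                      # B
    k = np.array([[0.25, 0.5, 0.25],
                  [0.50, 1.0, 0.50],
                  [0.25, 0.5, 0.25]])
    rgb = np.empty((h, w, 3))
    for c in range(3):
        # Interpolate each channel from its sampled positions only.
        num = convolve(mosaic * masks[c], k, mode="mirror")
        den = convolve(masks[c], k, mode="mirror")
        rgb[..., c] = num / den
    return rgb

noisy = np.random.rand(64, 64)                    # stand-in for raw sensor data
rgb = demosaic_rggb(noisy)
denoised = gaussian_filter(rgb, sigma=(1.0, 1.0, 0.0))   # smooth spatially, not across channels
print(denoised.shape)                             # (64, 64, 3)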
Abstract: Recently, there have been many efforts to develop middleware supporting applications in managing distributed data in Grid computing environments. Although these research activities address various requirements in Grid data... more
In different scientific disciplines, large-scale data are generated with enormous storage requirements. Therefore, effective data management is a critical issue in distributed systems such as the cloud. As tasks can access a nearby site... more
GriPhyN (Grid Physics Network) is a large US collaboration to build grid services for large physics experiments, one of which is LIGO, a gravitational-wave observatory. This paper explains the physics and computing challenges of LIGO, and... more
The success of grid computing depends on the existence of grid middleware that provides core services such as security, data management, resource information, and resource brokering and scheduling. Current general-purpose grid resource... more
The Grid provides mechanisms to share dynamic, heterogeneous, distributed resources spanned across multiple administrative domains. Resources required to execute a job are identified from the resource pool based on the desired set of... more
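As a concrete reading of "identified from the resource pool based on the desired set of requirements" above, the snippet filters a pool of resources against a job's hard requirements. The attribute names and values are illustrative only, not a real information-system schema.

# Illustrative matchmaking: keep only resources that satisfy every job requirement.
pool = [   # made-up resource descriptions
    {"name": "wn-01", "os": "linux",   "cpus": 16, "memory_gb": 64,  "site": "site-1"},
    {"name": "wn-02", "os": "linux",   "cpus": 4,  "memory_gb": 8,   "site": "site-2"},
    {"name": "wn-03", "os": "windows", "cpus": 32, "memory_gb": 128, "site": "site-2"},
]
job = {"os": "linux", "min_cpus": 8, "min_memory_gb": 32}

def matches(res, req):
    return (res["os"] == req["os"]
            and res["cpus"] >= req["min_cpus"]
            and res["memory_gb"] >= req["min_memory_gb"])

eligible = [r["name"] for r in pool if matches(r, job)]
print(eligible)    # ['wn-01']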
Current trends in network-based multimedia storage, distributed scientific simulations, and distributed geographic information system applications are both compute-intensive and data-intensive. This is unavoidable since grid and... more
The SHAMAN project targets a framework integrating advances in the data grid, digital library, and persistent archives communities in order to achieve a long-term preservation environment. Within the project we identified several... more
Data warehouse view maintenance is an important issue due to the growing use of warehouse technology for information integration and data analysis. Given the dynamic nature of modern distributed environments, both data updates and schema... more
Some of the most recently proposed algorithms for the incremental maintenance of materialized data warehouses (DW), such as SWEEP and PSWEEP, offer several significant advantages over previous solutions, such as high-performance, no... more
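A toy illustration of the incremental-maintenance idea behind algorithms such as SWEEP: instead of recomputing a join view from scratch, propagate the delta of one base table through the join. The table contents and schema are invented, and no concurrency control (the hard part those algorithms actually address) is shown.

# Toy incremental maintenance of a materialized join view V = R join S on key.
R = {1: "r1", 2: "r2"}                 # base table R: key -> payload
S = {1: "s1", 3: "s3"}                 # base table S: key -> payload
view = {k: (R[k], S[k]) for k in R.keys() & S.keys()}   # materialized join

def apply_delta_R(inserted, deleted):
    """Propagate an R-delta to the view without recomputing the full join."""
    for k, payload in inserted.items():
        R[k] = payload
        if k in S:                      # only joining keys reach the view
            view[k] = (payload, S[k])
    for k in deleted:
        R.pop(k, None)
        view.pop(k, None)

apply_delta_R(inserted={3: "r3"}, deleted=[1])
print(view)                             # {3: ('r3', 's3')}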
We describe a Grid market for exchanging data mining services based on the Catallactic market mechanism proposed by von Hayek. This market mechanism allows selection between multiple instances of services based on operations required in a... more
Nowadays, the process of data mining is one of the most important topics in scientific and business problems. There is a huge amount of data that can help to solve many of these problems. However, data is geographically distributed in... more
The Italica Project is the implementation of an Electronic Health Record system at the Italian Hospital of Buenos Aires. The present work shows the implementation of a Medical Signal Grid Repository module and its integration to the... more
Data integration over multiple heterogeneous data sources has become increasingly important for modern applications. The integrated data is usually stored in materialized views for better access, performance and high availability. Such... more
Data integration over multiple heterogeneous data sources has become increasingly important for modern applications. The integrated data is usually stored in materialized views (MV) to allow better access, performance and high... more
Data integration over multiple heterogeneous data sources has become increasingly important for modern applications. The integrated data is usually stored as materialized views to allow better access, performance, and high availability.... more
The Grid-based Virtual Laboratory AMsterdam (VLAM-G) provides a science portal for distributed analysis in applied scientific research. By facilitating access to distributed compute and information resources held by multiple... more
A Data Grid environment is a geographically distributed system that deals with data-intensive applications in scientific and enterprise computing. Dealing with large amounts of data makes the requirement for efficient data access more critical.... more
Recently, Grid computing activities in the Asia-Pacific region have drawn attention, including at higher education institutions in Malaysia. Many universities and institutes in Malaysia have started to build their campus grid infrastructure.... more
Summary: Timely worldwide distribution of biosequence and bioinformatics data depends on high performance networking and advances in Internet transport methods. The Bio-Mirror project focuses on providing up-to-date distribution of this... more
Data Grids deal with geographically-distributed large-scale data-intensive applications. Scheduling schemes for data grids attempt not only to improve data access time, but also to improve the ratio of data availability to a node,... more
An SDDS-2000 server currently manages only buckets in its RAM storage [C01]. Several buckets can coexist. When many files are created, however, RAM storage space may not be sufficient for all the buckets simultaneously. When an application... more
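The entry above notes that an SDDS-2000 server keeps buckets in RAM and can run short of space when many files coexist. The sketch below shows one generic way such pressure can be handled, spilling least-recently-used buckets to disk; it is an assumption-laden illustration, not the mechanism described in the cited work [C01].

# Generic illustration: keep at most MAX_RAM_BUCKETS buckets in memory,
# spilling the least-recently-used ones to disk (not the SDDS-2000 design itself).
import os, pickle
from collections import OrderedDict

MAX_RAM_BUCKETS = 2
SPILL_DIR = "buckets"          # assumed scratch directory
os.makedirs(SPILL_DIR, exist_ok=True)
ram = OrderedDict()            # bucket_id -> dict of records, kept in LRU order

def _spill_one():
    bucket_id, records = ram.popitem(last=False)             # evict the LRU bucket
    with open(os.path.join(SPILL_DIR, f"{bucket_id}.pkl"), "wb") as f:
        pickle.dump(records, f)

def get_bucket(bucket_id):
    """Return the bucket, loading it from disk (or creating it) if needed."""
    if bucket_id not in ram:
        path = os.path.join(SPILL_DIR, f"{bucket_id}.pkl")
        if os.path.exists(path):
            with open(path, "rb") as f:
                ram[bucket_id] = pickle.load(f)
            os.remove(path)
        else:
            ram[bucket_id] = {}
        if len(ram) > MAX_RAM_BUCKETS:
            _spill_one()
    ram.move_to_end(bucket_id)                                # mark as recently used
    return ram[bucket_id]

get_bucket("b1")["k"] = 1
get_bucket("b2")["k"] = 2
get_bucket("b3")["k"] = 3      # exceeds the RAM budget, so b1 is spilled to disk
print(get_bucket("b1"))        # {'k': 1}  -- transparently reloaded from disk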
The CMS experiment at CERN is setting up a Grid infrastructure required to fulfil the needs imposed by Terabyte scale productions for the next few years. The goal is to automate the production and at the same time allow the users to... more