Abstract Data parallel languages like High Performance Fortran (HPF) are emerging as the architecture independent mode of programming distributed memory parallel machines. In this paper, we present the interprocedural optimizations required for compiling applications having irregular data access patterns, when coded in such data parallel languages.
ABSTRACT Driven by cost-effectiveness and power efficiency, GPUs are being increasingly used to accelerate computations in many domains. However, developing highly efficient GPU implementations requires considerable expertise and effort. Thus, tool support for tuning GPU programs is urgently needed, and more specifically, low-overhead mechanisms for collecting fine-grained runtime information are critically required.
Abstract A large part of the data on the World Wide Web resides in the deep web. Executing structured, high-level queries on deep web data sources involves a number of challenges, several of which arise because query execution engines have a very limited access to data. In this paper, we consider the problem of executing aggregation queries involving data enumeration on these data sources, which requires sampling. The existing work in this area (HDSampler and its variants) is based on simple random sampling.
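As a point of reference for the simple random sampling baseline mentioned above (HDSampler and its variants), the following is a minimal sketch of estimating an AVG aggregate from a uniform sample. The data source and attribute values are hypothetical; this is the textbook estimator, not the paper's method.

```python
import random

def estimate_avg(population, sample_size, seed=0):
    """Estimate the mean of a hidden numeric attribute via simple random
    sampling (without replacement), the baseline used by HDSampler-style
    approaches. `population` stands in for a deep-web data source."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical data source: 10,000 records with a numeric attribute.
population = [float(i % 100) for i in range(10_000)]
true_avg = sum(population) / len(population)  # 49.5
estimate = estimate_avg(population, 500)
```

The standard error of such an estimate shrinks only as the square root of the sample size, which is what motivates the smarter sampling strategies this line of work explores.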
Abstract Many stream-based applications share a common set of characteristics, which makes grid-based and adaptive processing desirable or even a necessity. We view the problem of flexible and adaptive processing of distributed data streams as a grid computing problem. We have been developing a middleware for processing of distributed data streams. Our system is referred to as GATES (grid-based adaptive execution on streams). It is designed to use the existing grid standards and tools to the extent possible.
There is an increasing trend towards distributed and shared repositories for storing scientific datasets. Developing applications that retrieve and process data from such repositories involves a number of challenges. First, these data repositories store data in complex, low-level layouts, which should be abstracted from application developers. Second, as data repositories are shared resources, part of the computations on the data must be performed at a different set of machines than the ones hosting the data.
Abstract The recent emergence of cloud computing is making the vision of utility computing realizable, i.e., computing resources and services from a cloud can be delivered, utilized, and paid for in the same fashion as utilities like water or electricity.
ABSTRACT Driven by the emergence of GPUs as a major player in high performance computing and the rapidly growing popularity of cloud environments, GPU instances are now being offered by cloud providers. The use of GPUs in a cloud environment, however, is still at initial stages, and the challenge of making GPU a true shared resource in the cloud has not yet been addressed. This paper presents a framework to enable applications executing within virtual machines to transparently share one or more GPUs.
Abstract Data cube construction is a commonly used operation in data warehouses. Because of the volume of data that is stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. This paper addresses a number of algorithmic issues in parallel data cube construction. First, we present an aggregation tree for sequential (and parallel) data cube construction, which has minimally bounded memory requirements.
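To make the data cube operation concrete, here is a naive sketch that computes SUM over every group-by subset of the dimensions. The rows and dimension names are invented for illustration; the paper's contribution (the aggregation tree) shares work between related group-bys rather than scanning the data once per subset as this sketch does.

```python
from itertools import combinations
from collections import defaultdict

def data_cube(rows, dims, measure):
    """Compute SUM(measure) for every group-by over every subset of dims,
    i.e., the full data cube. One pass over the data per subset (naive)."""
    cube = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):
            agg = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in subset)
                agg[key] += row[measure]
            cube[subset] = dict(agg)
    return cube

# Hypothetical fact table with two dimensions and one measure.
rows = [
    {"product": "a", "region": "east", "sales": 10.0},
    {"product": "a", "region": "west", "sales": 5.0},
    {"product": "b", "region": "east", "sales": 7.0},
]
cube = data_cube(rows, ("product", "region"), "sales")
# cube[()] holds the grand total; cube[("product",)] the per-product sums.
```

With d dimensions this materializes 2^d group-bys, which is why both memory bounds and parallelization matter for realistic warehouses.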
Abstract Computing as a utility, that is, on-demand access to computing and storage infrastructure, has emerged in the form of the Cloud. In this model of computing, elastic resource allocation, i.e., the ability to scale resource allocation for specific applications, should be optimized to manage cost versus performance.
Abstract Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. Typical examples of very large scientific datasets include long running simulations of time-dependent phenomena that periodically generate snapshots of their state, archives of raw and processed remote sensing data, and archives of medical images. High-level language and compiler support for developing applications that analyze and process such datasets has, however, been lacking so far.
Abstract For many organizations, one attractive use of cloud resources can be through what is referred to as cloud bursting or the hybrid cloud. These refer to scenarios where an organization acquires and manages in-house resources to meet its base need, but can use additional resources from a cloud provider to maintain an acceptable response time during workload peaks. Cloud bursting has so far been discussed in the context of using additional computing resources from a cloud provider.
Abstract This paper considers the problem of supporting and efficiently implementing fault-tolerance for tightly-coupled and pipelined applications, especially streaming applications, in a grid environment. We provide an alternative to basic checkpointing and use the notion of light-weight summary structure (LSS) to enable efficient failure-recovery. The idea behind LSS is that at certain points during the execution of a processing stage, the state of the program can be summarized by a small amount of memory.
Abstract One important way in which sampling for approximate query processing in a database environment differs from traditional applications of sampling is that in a database, it is feasible to collect accurate summary statistics from the data in addition to the sample. This paper describes a set of sampling-based estimators for approximate query processing that make use of simple summary statistics to greatly increase the accuracy of sampling-based estimators.
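A classical example of the idea described above is a ratio estimator: if the database keeps the exact total of a correlated attribute as a summary statistic, a sample-based estimate of another aggregate can be sharpened considerably. The table and attributes below are synthetic, and this is the textbook ratio estimator rather than the paper's specific estimators.

```python
import random

def ratio_estimate_sum(pairs, known_total_x, sample_size, seed=1):
    """Estimate sum(y) from a sample of (x, y) pairs, exploiting the exactly
    known total of the correlated attribute x (the 'summary statistic')."""
    rng = random.Random(seed)
    sample = rng.sample(pairs, sample_size)
    ratio = sum(y for _, y in sample) / sum(x for x, _ in sample)
    return ratio * known_total_x

# Synthetic table in which y is strongly correlated with x.
pairs = [(x, 2.0 * x + (x % 7)) for x in range(1, 5_001)]
total_x = sum(x for x, _ in pairs)   # known exactly, as a summary statistic
total_y = sum(y for _, y in pairs)   # the quantity to be estimated
estimate = ratio_estimate_sum(pairs, total_x, 200)
```

Because the estimator only has to learn the (nearly constant) ratio y/x from the sample, its error is far smaller than scaling up the sample sum of y directly.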
Abstract This paper gives an overview of a framework for making existing MPI applications elastic, and executing them with user-specified time and cost constraints in a cloud framework. Considering the limitations of the MPI implementations currently available, we support adaptation by terminating one execution and restarting a new program on a different number of instances. The key component of our system is a decision layer.
Abstract We investigate runtime strategies for data-intensive applications that involve generalized reductions on large, distributed datasets. Our set of strategies includes replicated filter state, partitioned filter state, and hybrid options between these two extremes. We evaluate these strategies using emulators of three real applications, different query and output sizes, and a number of configurations. We consider execution in a homogeneous cluster and in a distributed environment where only a subset of nodes host the data.
Abstract In compiling applications for distributed memory machines, runtime analysis is required when data to be communicated cannot be determined at compile-time. One such class of applications requiring runtime analysis is block structured codes. These codes employ multiple structured meshes, which may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems).
Abstract The availability of a coding-based replication scheme where simple voting is used to maintain correctness of replicated data is evaluated. It is shown that the storage requirement for maintaining the data with a given availability is reduced significantly. The ways that some of the extensions of the voting scheme can be modified to manage this coding-based replication are also described. The availability of these is evaluated, and the reduction in the storage space requirements achieved is studied.
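For context on the baseline this abstract compares against, here is a sketch of the standard availability formula for simple majority voting over full replicas: the data is available when a strict majority of n replicas is up, each up independently with probability p. The replica counts and p value below are illustrative, not from the paper.

```python
from math import comb

def voting_availability(n, p):
    """Probability that a strict majority of n independent replicas is up,
    i.e., that a majority read/write quorum can be assembled."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# With per-replica availability 0.9:
a3 = voting_availability(3, 0.9)  # three full copies
a5 = voting_availability(5, 0.9)  # five full copies
```

The storage saving of a coding-based scheme comes from replacing full copies with smaller coded fragments while still meeting a target availability computed along these lines.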
Abstract Overlapping memory accesses with computations is a standard technique for improving performance on modern architectures, which have deep memory hierarchies. In this paper, we present a compiler technique for overlapping accesses to secondary memory (disks) with computation.
In scientific data-intensive computing, data from multiple sources may have to be analyzed using a variety of analysis tools or services. This can introduce a number of challenges. In recent years, several research groups have initiated work addressing some of these challenges. For example, Chimera is a system for supporting virtual data views and demand-driven data derivation [3, 11]. Similarly, CoDIMS-G is a system providing grid services for data and program integration [2].
Papers by Gagan Agrawal