Abstract Data parallel languages like High Performance Fortran (HPF) are emerging as the architecture independent mode of programming distributed memory parallel machines. In this paper, we present the interprocedural optimizations required for compiling applications having irregular data access patterns, when coded in such data parallel languages.
ABSTRACT Driven by cost-effectiveness and power efficiency, GPUs are being increasingly used to accelerate computations in many domains. However, developing highly efficient GPU implementations requires considerable expertise and effort. Thus, tool support for tuning GPU programs is urgently needed, and more specifically, low-overhead mechanisms for collecting fine-grained runtime information are critically required.
Abstract A large part of the data on the World Wide Web resides in the deep web. Executing structured, high-level queries on deep web data sources involves a number of challenges, several of which arise because query execution engines have a very limited access to data. In this paper, we consider the problem of executing aggregation queries involving data enumeration on these data sources, which requires sampling. The existing work in this area (HDSampler and its variants) is based on simple random sampling.
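As a point of reference for the simple random sampling baseline mentioned above (HDSampler and its variants), the following is a minimal sketch of estimating an AVG aggregate from a uniform sample. The data source and attribute values are hypothetical; this is the textbook estimator, not the paper's method.

```python
import random

def estimate_avg(population, sample_size, seed=0):
    """Estimate the mean of a hidden numeric attribute via simple random
    sampling (without replacement), the baseline used by HDSampler-style
    approaches. `population` stands in for a deep-web data source."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Hypothetical data source: 10,000 records with a numeric attribute.
population = [float(i % 100) for i in range(10_000)]
true_avg = sum(population) / len(population)  # 49.5
estimate = estimate_avg(population, 500)
```

The standard error of such an estimate shrinks only as the square root of the sample size, which is what motivates the smarter sampling strategies this line of work explores.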
Abstract Many stream-based applications share a common set of characteristics, which makes grid-based and adaptive processing desirable or even a necessity. We view the problem of flexible and adaptive processing of distributed data streams as a grid computing problem. We have been developing a middleware for processing of distributed data streams. Our system is referred to as GATES (grid-based adaptive execution on streams). It is designed to use the existing grid standards and tools to the extent possible.
There is an increasing trend towards distributed and shared repositories for storing scientific datasets. Developing applications that retrieve and process data from such repositories involves a number of challenges. First, these data repositories store data in complex, low-level layouts, which should be abstracted from application developers. Second, as data repositories are shared resources, part of the computations on the data must be performed at a different set of machines than the ones hosting the data.
Abstract The recent emergence of cloud computing is making the vision of utility computing realizable, i.e., computing resources and services from a cloud can be delivered, utilized, and paid for in the same fashion as utilities like water or electricity.
ABSTRACT Driven by the emergence of GPUs as a major player in high performance computing and the rapidly growing popularity of cloud environments, GPU instances are now being offered by cloud providers. The use of GPUs in a cloud environment, however, is still at initial stages, and the challenge of making GPU a true shared resource in the cloud has not yet been addressed. This paper presents a framework to enable applications executing within virtual machines to transparently share one or more GPUs.
Abstract Data cube construction is a commonly used operation in data warehouses. Because of the volume of data that is stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. This paper addresses a number of algorithmic issues in parallel data cube construction. First, we present an aggregation tree for sequential (and parallel) data cube construction, which has minimally bounded memory requirements.
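To make the data cube operation concrete, here is a naive sketch that computes SUM over every group-by subset of the dimensions. The rows and dimension names are invented for illustration; the paper's contribution (the aggregation tree) shares work between related group-bys rather than scanning the data once per subset as this sketch does.

```python
from itertools import combinations
from collections import defaultdict

def data_cube(rows, dims, measure):
    """Compute SUM(measure) for every group-by over every subset of dims,
    i.e., the full data cube. One pass over the data per subset (naive)."""
    cube = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):
            agg = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in subset)
                agg[key] += row[measure]
            cube[subset] = dict(agg)
    return cube

# Hypothetical fact table with two dimensions and one measure.
rows = [
    {"product": "a", "region": "east", "sales": 10.0},
    {"product": "a", "region": "west", "sales": 5.0},
    {"product": "b", "region": "east", "sales": 7.0},
]
cube = data_cube(rows, ("product", "region"), "sales")
# cube[()] holds the grand total; cube[("product",)] the per-product sums.
```

With d dimensions this materializes 2^d group-bys, which is why both memory bounds and parallelization matter for realistic warehouses.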
Abstract Computing as a utility, that is, on-demand access to computing and storage infrastructure, has emerged in the form of the Cloud. In this model of computing, elastic resource allocation, i.e., the ability to scale resource allocation for specific applications, should be optimized to manage cost versus performance.
Abstract Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. Typical examples of very large scientific datasets include long running simulations of time-dependent phenomena that periodically generate snapshots of their state, archives of raw and processed remote sensing data, and archives of medical images. High-level language and compiler support for developing applications that analyze and process such datasets has, however, been lacking so far.
Abstract For many organizations, one attractive use of cloud resources can be through what is referred to as cloud bursting or the hybrid cloud. These refer to scenarios where an organization acquires and manages in-house resources to meet its base need, but can use additional resources from a cloud provider to maintain an acceptable response time during workload peaks. Cloud bursting has so far been discussed in the context of using additional computing resources from a cloud provider.
Abstract This paper considers the problem of supporting and efficiently implementing fault-tolerance for tightly-coupled and pipelined applications, especially streaming applications, in a grid environment. We provide an alternative to basic checkpointing and use the notion of light-weight summary structure (LSS) to enable efficient failure-recovery. The idea behind LSS is that at certain points during the execution of a processing stage, the state of the program can be summarized by a small amount of memory.
Abstract One important way in which sampling for approximate query processing in a database environment differs from traditional applications of sampling is that in a database, it is feasible to collect accurate summary statistics from the data in addition to the sample. This paper describes a set of sampling-based estimators for approximate query processing that make use of simple summary statistics to greatly increase the accuracy of sampling-based estimators.
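A classical example of the idea described above is a ratio estimator: if the database keeps the exact total of a correlated attribute as a summary statistic, a sample-based estimate of another aggregate can be sharpened considerably. The table and attributes below are synthetic, and this is the textbook ratio estimator rather than the paper's specific estimators.

```python
import random

def ratio_estimate_sum(pairs, known_total_x, sample_size, seed=1):
    """Estimate sum(y) from a sample of (x, y) pairs, exploiting the exactly
    known total of the correlated attribute x (the 'summary statistic')."""
    rng = random.Random(seed)
    sample = rng.sample(pairs, sample_size)
    ratio = sum(y for _, y in sample) / sum(x for x, _ in sample)
    return ratio * known_total_x

# Synthetic table in which y is strongly correlated with x.
pairs = [(x, 2.0 * x + (x % 7)) for x in range(1, 5_001)]
total_x = sum(x for x, _ in pairs)   # known exactly, as a summary statistic
total_y = sum(y for _, y in pairs)   # the quantity to be estimated
estimate = ratio_estimate_sum(pairs, total_x, 200)
```

Because the estimator only has to learn the (nearly constant) ratio y/x from the sample, its error is far smaller than scaling up the sample sum of y directly.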
Abstract This paper gives an overview of a framework for making existing MPI applications elastic, and executing them with user-specified time and cost constraints in a cloud framework. Considering the limitations of the MPI implementations currently available, we support adaptation by terminating one execution and restarting a new program on a different number of instances. The key component of our system is a decision layer.
Abstract We investigate runtime strategies for data-intensive applications that involve generalized reductions on large, distributed datasets. Our set of strategies includes replicated filter state, partitioned filter state, and hybrid options between these two extremes. We evaluate these strategies using emulators of three real applications, different query and output sizes, and a number of configurations. We consider execution in a homogeneous cluster and in a distributed environment where only a subset of nodes host the data.
Abstract In compiling applications for distributed memory machines, runtime analysis is required when data to be communicated cannot be determined at compile-time. One such class of applications requiring runtime analysis is block structured codes. These codes employ multiple structured meshes, which may be nested (for multigrid codes) and/or irregularly coupled (called multiblock or irregularly coupled regular mesh problems).
Abstract The availability of a coding-based replication scheme where simple voting is used to maintain correctness of replicated data is evaluated. It is shown that the storage requirement for maintaining the data with a given availability is reduced significantly. The ways that some of the extensions of the voting scheme can be modified to manage this coding-based replication are also described. The availability of these is evaluated, and the reduction in the storage space requirements achieved is studied.
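For context on the baseline this abstract compares against, here is a sketch of the standard availability formula for simple majority voting over full replicas: the data is available when a strict majority of n replicas is up, each up independently with probability p. The replica counts and p value below are illustrative, not from the paper.

```python
from math import comb

def voting_availability(n, p):
    """Probability that a strict majority of n independent replicas is up,
    i.e., that a majority read/write quorum can be assembled."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# With per-replica availability 0.9:
a3 = voting_availability(3, 0.9)  # three full copies
a5 = voting_availability(5, 0.9)  # five full copies
```

The storage saving of a coding-based scheme comes from replacing full copies with smaller coded fragments while still meeting a target availability computed along these lines.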
Abstract Overlapping memory accesses with computations is a standard technique for improving performance on modern architectures, which have deep memory hierarchies. In this paper, we present a compiler technique for overlapping accesses to secondary memory (disks) with computation.
In scientific data-intensive computing, data from multiple sources may have to be analyzed using a variety of analysis tools or services. This can introduce a number of challenges. In recent years, several research groups have initiated work addressing some of these challenges. For example, Chimera is a system for supporting virtual data views and demand-driven data derivation [3, 11]. Similarly, CoDIMS-G is a system providing grid services for data and program integration [2].
Papers by Gagan Agrawal