Recent technological advances allow profiling of tumor samples to an unparalleled level with respect to molecular and spatial composition as well as treatment response. We describe a prospective, observational clinical study performed within the Tumor Profiler (TuPro) Consortium that aims to show the extent to which such comprehensive information leads to advanced mechanistic insights of a patient’s tumor, enables prognostic and predictive biomarker discovery, and has the potential to support clinical decision making. For this study of melanoma, ovarian carcinoma, and acute myeloid leukemia tumors, in addition to the emerging standard diagnostic approaches of targeted NGS panel sequencing and digital pathology, we perform extensive characterization using the following exploratory technologies: single-cell genomics and transcriptomics, proteotyping, CyTOF, imaging CyTOF, pharmacoscopy, and 4i drug response profiling (4i DRP). In this work, we outline the aims of the TuPro study and p...
We describe a new algorithm for Gaussian Elimination suitable for general (unsymmetric and possibly singular) sparse matrices, of any entry type, which has a natural parallel and distributed-memory formulation but degrades gracefully to sequential execution. We present a sample MPI implementation of a program computing the rank of a sparse integer matrix using the proposed algorithm. Some preliminary performance measurements are presented and discussed, and the performance of the algorithm is compared to corresponding state-of-the-art algorithms for floating-point and integer matrices.
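As a point of reference for the rank computation mentioned above, here is a minimal sequential sketch in Python. It is not the parallel algorithm of the paper: it is plain Gaussian elimination over exact rationals, and the function name `sparse_rank` and the list-of-dicts row format are assumptions made for illustration only.

```python
from fractions import Fraction


def sparse_rank(rows):
    """Rank of a sparse matrix given as a list of {column: value} dicts.

    Hypothetical helper for illustration; plain sequential elimination,
    not the distributed-memory algorithm described in the abstract.
    """
    # Work on exact rational copies and drop explicitly stored zeros.
    work = [{c: Fraction(v) for c, v in r.items() if v != 0} for r in rows]
    rank = 0
    while work:
        row = work.pop()
        if not row:
            continue  # a zero row contributes nothing to the rank
        rank += 1
        pivot_col = min(row)       # leftmost nonzero entry is the pivot
        pivot_val = row[pivot_col]
        # Eliminate the pivot column from every remaining row.
        for other in work:
            if pivot_col not in other:
                continue
            factor = other.pop(pivot_col) / pivot_val
            for c, v in row.items():
                if c == pivot_col:
                    continue
                new = other.get(c, 0) - factor * v
                if new == 0:
                    other.pop(c, None)  # keep the row sparse
                else:
                    other[c] = new
    return rank
```

Using `Fraction` avoids rounding, so the computed rank is exact for integer input, at the cost of growing numerators and denominators on large matrices.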
We review some relations occurring between the combinatorial intersection theory on the moduli spaces of stable curves and the asymptotic behavior of the 't Hooft-Kontsevich matrix integrals. In particular, we give an alternative proof of the Witten-Di Francesco-Itzykson-Zuber theorem (which expresses derivatives of the partition function of intersection numbers as matrix integrals) using techniques based on diagrammatic calculus and combinatorial relations among intersection numbers. These techniques extend to a more general interaction potential.
This paper is an introduction to the language of Feynman Diagrams. We use Reshetikhin-Turaev graphical calculus to define Feynman diagrams and prove that asymptotic expansions of Gaussian integrals can be written as a sum over a suitable family of graphs. We discuss how different kinds of interactions give rise to different families of graphs. In particular, we show how symmetric and cyclic interactions lead to "ordinary" and "ribbon" graphs respectively. As an example, the 't Hooft-Kontsevich model for 2D quantum gravity is treated in some detail.
This paper presents AppPot, a system for creating Linux software appliances. AppPot can be run as a regular batch or grid job and executed in user space, and requires no special virtualization support in the infrastructure. The main design goal of AppPot is to bring the benefits of a virtualization-based IaaS cloud to existing batch-oriented computing infrastructures. In particular, AppPot addresses application deployment and configuration on large heterogeneous computing infrastructures: users can prepare their own customized virtual appliance that provides a safe execution environment for their applications. These appliances can then be executed on virtually any computing infrastructure, be it a private or public cloud or any batch-controlled computing cluster the user may have access to. We give an overview of AppPot and its features, the technology that makes it possible, and report on experiences running it in production use within the Swiss National G...
Fatgraphs are multigraphs enriched with a cyclic order of the edges incident to a vertex. This paper presents algorithms to: (1) generate the set of all fatgraphs having a given genus and number of boundary cycles; (2) compute automorphisms of any given fatgraph; (3) compute the homology of the fatgraph complex. The algorithms are suitable for effective computer implementation. In particular, this allows us to compute the rational homology of the moduli space of Riemann surfaces with marked points. We thus compute the Betti numbers of $M_{g,n}$ with $(2g + n) \leq 6$, corroborating known results.
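The definition of a fatgraph above can be made concrete with a small Python sketch. It assumes a connected fatgraph and an encoding chosen here for illustration only (not taken from the paper or from FatGHoL): half-edges are the integers 0..2E-1, half-edges 2k and 2k+1 form edge k, and each vertex is the list of its half-edges in cyclic order. Boundary cycles are then the orbits of "cross the edge, then turn to the next half-edge at the vertex", and the genus follows from Euler's formula V - E + n = 2 - 2g.

```python
def boundary_cycles(vertices):
    """Count boundary cycles of a fatgraph.

    `vertices` is a list of lists; each inner list holds the half-edges
    at one vertex in cyclic order. Half-edges 2k and 2k+1 are the two
    ends of edge k (an encoding assumed for this sketch).
    """
    # succ[h] is the half-edge following h in the cyclic order at its vertex.
    succ = {}
    for v in vertices:
        for i, h in enumerate(v):
            succ[h] = v[(i + 1) % len(v)]
    seen = set()
    cycles = 0
    for start in succ:
        if start in seen:
            continue
        cycles += 1
        h = start
        while h not in seen:
            seen.add(h)
            h = succ[h ^ 1]  # cross the edge (h ^ 1), then turn at the vertex
    return cycles


def genus(vertices):
    """Genus of a connected fatgraph via V - E + n = 2 - 2g."""
    num_vertices = len(vertices)
    num_edges = sum(len(v) for v in vertices) // 2
    n = boundary_cycles(vertices)
    return (2 - num_vertices + num_edges - n) // 2
```

For example, the single vertex `[0, 2, 1, 3]` (two interleaved loops) has one boundary cycle and genus 1, while `[0, 1, 2, 3]` (two adjacent loops) has three boundary cycles and genus 0.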
The Python library FatGHoL [FatGHoL] used in [Murri2012] to reckon the rational homology of the moduli space of Riemann surfaces is an example of a non-numeric scientific code: most of the processing it does is generating graphs (represented by complex Python objects) and computing their isomorphisms (a triple of Python lists; again a nested data structure). These operations are repeated many times over: for example, the spaces $M_{0,6}$ and $M_{1,4}$ are triangulated by 4’583’322 and 747’664 graphs, respectively. This is an opportunity for every Python runtime to prove its strength in optimization. The purpose of this experiment was to assess the maturity of alternative Python runtimes, in terms of: compatibility with the language as implemented in CPython 2.7, and performance speedup. This paper compares the results and experiences from running FatGHoL with
Fatgraphs are multigraphs enriched with a cyclic order of the edges incident to a vertex. This paper presents algorithms to: (1) generate the set $R_{g,n}$ of fatgraphs, given the genus $g$ and the number of boundary cycles $n$; (2) compute automorphisms of any given fatgraph; (3) compute the homology of the fatgraph complex $R_{g,n}$. The algorithms are suitable for effective computer implementation. In particular, this allows us to compute the rational homology of the moduli space of Riemann surfaces with marked points. We thus compute the Betti numbers of $M_{g,n}$ with $(2g + n) \leq 6$, corroborating known results.
Studies in health technology and informatics, 2012
One of the important questions in biological evolution is to know if certain changes along protein coding genes have contributed to the adaptation of species. This problem is known to be biologically complex and computationally very expensive. It, therefore, requires efficient Grid or cluster solutions to overcome the computational challenge. We have developed a Grid-enabled tool (gcodeml) that relies on the PAML (codeml) package to help analyse large phylogenetic datasets on both Grids and computational clusters. Although we report on results for gcodeml, our approach is applicable and customisable to related problems in biology or other scientific domains.
Proceedings of EGI Community Forum 2012 / EMI Second Technical Conference — PoS(EGICF12-EMITC2)
This paper presents AppPot, a system for creating Linux software appliances. AppPot appliances can be run as a regular batch or grid job and executed in user space, and require no virtualization support in the infrastructure. The main design goal of AppPot is to bring the benefits of a virtualization-based IaaS cloud to existing batch-oriented computing infrastructures. In particular, AppPot addresses the application deployment and configuration on large heterogeneous computing infrastructures: users are able to prepare their own customized virtual appliance to provide a safe execution environment for their applications. These appliances can then be executed on virtually any computing infrastructure, be it a private or public cloud, as well as any batch-queueing compute cluster. We give an overview of AppPot and its features, the technology that makes it possible, and briefly report on experiences running it in production use within the Swiss national grid infrastructure SMSCG.
A set of quantum mechanics scattering programs from computational chemistry has been ported to and run on distributed systems within a collaboration between the Grid Computing Competence Centre (Zurich), the Computational Dynamics and Kinetics Group (Perugia) and the Italian Grid Infrastructure (Bologna). For this purpose the high-throughput execution framework GC3Pie and the AppPot cloud/grid virtual machines have been used to implement a grid-based workflow dealing with the central blocks of an ab initio molecular simulator. These blocks carry out both the fitting of the potential energy values computed using ab initio methods and the evaluation of some atom-diatom quantum reactive scattering properties. The resulting performance is discussed and compared with other approaches.
Currently, more than 40 sequence tandem repeat detectors are published, providing heterogeneous, partly complementary, partly conflicting results. We present TRAL, a tandem repeat annotation library that allows running and parsing of various detection outputs, clustering of redundant or overlapping annotations, several statistical frameworks for filtering false positive annotations, and, importantly, a tandem repeat annotation and refinement module based on circular profile hidden Markov models (cpHMMs). Using TRAL, we evaluated the performance of a multi-step tandem repeat annotation workflow on 547,085 sequences in UniProtKB/Swiss-Prot. The researcher can use these results to predict run-times for specific datasets, and to choose annotation complexity accordingly. TRAL is an open-source Python3 library and is available, together with documentation and tutorials, via http://www.vital-it.ch/software/tral. Contact: elke.schaper@isb-sib.ch.
The availability of powerful computing hardware in IaaS clouds makes cloud computing attractive also for computational workloads that were up to now almost exclusively run on HPC clusters. In this paper we present the VM-MAD Orchestrator software: an open source framework for cloudbursting Linux-based HPC clusters into IaaS clouds as well as computational grids. The Orchestrator is completely modular, allowing flexible configurations of cloudbursting policies. It can be used with any batch system or cloud infrastructure, dynamically extending the cluster when needed. A distinctive feature of our framework is that the policies can be tested and tuned in a simulation mode based on historical or synthetic cluster accounting data. In the paper we also describe how the VM-MAD Orchestrator was used in a production environment at the Functional Genomics Center Zurich to speed up the analysis of mass spectrometry-based protein data by cloudbursting to the Amazon Elastic Compute Cloud. The a...
The Python library FatGHoL (FatGHoL) used in (Murri2012) to reckon the rational homology of the moduli space of Riemann surfaces is an example of a non-numeric scientific code: most of the processing it does is generating graphs (represented by complex Python objects) and computing their isomorphisms (a triple of Python lists; again a nested data structure). These operations are repeated many times over: for example, the spaces $M_{0,6}$ and $M_{1,4}$ are triangulated by 4'583'322 and 747'664 graphs, respectively. This is an opportunity for every Python runtime to prove its strength in optimization. The purpose of this experiment was to assess the maturity of alternative Python runtimes, in terms of: compatibility with the language as implemented in CPython 2.7, and performance speedup. This paper compares the results and experiences from running FatGHoL with different Python runtimes: CPython 2.7.5, PyPy 2.1, Cython 0.19, Numba 0.11, Nuitka 0.4.4 and Falcon.
Fatgraphs are multigraphs enriched with a cyclic order of the edges incident to a vertex. This paper presents algorithms to: (1) generate the set of all fatgraphs having a given genus and number of boundary cycles; (2) compute automorphisms of any given fatgraph; (3) compute the homology of the fatgraph complex. The algorithms are suitable for effective computer implementation. In particular, this allows us to compute the rational homology of the moduli space of Riemann surfaces with marked points. We thus compute the Betti numbers of $M_{g,n}$ with $(2g + n) \leq 6$, corroborating known results.
The Tumor Profiler Study: Integrated, multi-omic, functional tumor profiling for clinical decision support, 2020
Recent technological advances allow profiling of tumor samples to an unparalleled level with respect to molecular and spatial composition as well as treatment response. We describe a prospective, observational clinical study performed within the Tumor Profiler (TuPro) Consortium that aims to show the extent to which such comprehensive information leads to advanced mechanistic insights of a patient's tumor, enables prognostic and predictive biomarker discovery, and has the potential to support clinical decision making. For this study of melanoma, ovarian carcinoma, and acute myeloid leukemia tumors, in addition to the emerging standard diagnostic approaches of targeted NGS panel sequencing and digital pathology, we perform extensive characterization using the following exploratory technologies: single-cell genomics and transcriptomics, proteotyping, CyTOF, imaging CyTOF, pharmacoscopy, and 4i drug response profiling (4i DRP). In this work, we outline the aims of the TuPro study and present preliminary results on the feasibility of using these technologies in clinical practice, showcasing the power of an integrative multi-modal and functional approach for understanding a tumor's underlying biology and for clinical decision support.
Papers by Riccardo Murri