Abstract
Because of ever-increasing throughput requirements of sequencing data, most existing short-read aligners have been designed to focus on speed at the expense of accuracy. The Genome Multitool (GEM) mapper can leverage string matching by filtration to search the alignment space more efficiently, simultaneously delivering precision (performing fully tunable exhaustive searches that return all existing matches, including gapped ones) and speed (being several times faster than comparable state-of-the-art tools).
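The filtration idea mentioned above can be illustrated with a minimal sketch. It is not the GEM implementation, which this abstract does not describe. It assumes the classic pigeonhole argument: if a read matches the reference with at most k edits, then at least one of k + 1 non-overlapping pieces of the read must occur exactly. Exact hits of those pieces select candidate windows, and only those windows are verified with a full edit-distance computation. All function names are hypothetical, and plain `str.find` stands in for an indexed search.

```python
def split_into_seeds(read, k):
    """Split `read` into k + 1 non-overlapping pieces; return (offset, piece) pairs."""
    n, parts = len(read), k + 1
    bounds = [i * n // parts for i in range(parts + 1)]
    return [(bounds[i], read[bounds[i]:bounds[i + 1]]) for i in range(parts)]


def best_substring_distance(pattern, window):
    """Minimum edit distance between `pattern` and any substring of `window`.

    Column-wise dynamic programming with a free start in the text
    (Sellers-style approximate matching).
    """
    prev = list(range(len(pattern) + 1))
    best = prev[-1]
    for ch in window:
        cur = [0]  # a match may start at any position of the window
        for i, p in enumerate(pattern, 1):
            cur.append(min(prev[i - 1] + (p != ch),  # match or substitution
                           prev[i] + 1,              # extra character in window
                           cur[i - 1] + 1))          # extra character in pattern
        best = min(best, cur[-1])
        prev = cur
    return best


def filtration_search(read, reference, k):
    """Return sorted (window_start, distance) pairs for windows within k edits."""
    hits = {}
    for offset, seed in split_into_seeds(read, k):
        if not seed:
            continue
        pos = reference.find(seed)
        while pos != -1:
            # Window the read would occupy if this seed is exact, padded by k for gaps.
            start = max(0, pos - offset - k)
            end = min(len(reference), pos - offset + len(read) + k)
            if start not in hits:
                distance = best_substring_distance(read, reference[start:end])
                if distance <= k:
                    hits[start] = distance
            pos = reference.find(seed, pos + 1)
    return sorted(hits.items())


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTA"
    read = "TAGCCGTTAGG"  # differs from the reference by one substitution
    print(filtration_search(read, reference, 1))
```

The point of the design is that the expensive verification step runs only where an exact seed hit makes a match possible, rather than at every reference position. Raising k makes the seeds shorter and less selective, which is the trade-off a tunable filtration scheme has to manage.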
Acknowledgements
This work was supported by grant CSD2007-00050 from the Ministerio de Educación y Ciencia (Spain) and grant 1R01MH090941-01 from the US National Institutes of Health/National Human Genome Research Institute. Additional funding was provided by the European Union 7th Framework integrating project Revolutionary Approaches and Devices for Nucleic Acid Analysis (READNA, funded under grant agreement Health-F4-2008-201418) and the European Union 7th Framework project European Sequencing and Genotyping Infrastructure (ESGI, funded under grant agreement 262055). We thank S. Heath for his thorough revision of the original manuscript. S.M.-S. thanks J. Campos-Laclaustra for his advice. On behalf of the GEM project, P.R. also thanks T. Alioto, J. Camps-Puchades, T. Derrien, S. Djebali, P. Ferreira, I. Gut, S. Heath, D. Gonzalez-Knowles, R. Kofler, V. Lacroix, J. Lagarde and A. Merkel for their continued support and insights.
Author information
Contributions
S.M.-S. designed and implemented algorithms and contributed material to the manuscript. M.S. contributed through fruitful discussions. R.G. initiated the project and contributed through fruitful discussions. P.R. designed and implemented algorithms, was the main architect of the GEM project and wrote the manuscript. All the authors read and approved the manuscript.
Ethics declarations
Competing interests
Our institutions have decided on a dual-licensing scheme for the GEM tools; they will be free for academic noncommercial use, but a fee will be required for commercial use.
Supplementary information
Supplementary Text and Figures
Supplementary Figure 1, Supplementary Tables 1–6, Supplementary Discussion, Supplementary Protocol and Supplementary Data. (PDF 813 kb)
About this article
Cite this article
Marco-Sola, S., Sammeth, M., Guigó, R. et al. The GEM mapper: fast, accurate and versatile alignment by filtration. Nat Methods 9, 1185–1188 (2012). https://doi.org/10.1038/nmeth.2221
DOI: https://doi.org/10.1038/nmeth.2221