An integrated software system for analyzing ChIP-chip and ChIP-seq data

Ji, Hongkai; Jiang, Hui; Ma, Wenxiu; Johnson, David S; Myers, Richard M; Wong, Wing H

doi:10.1038/nbt.1505

Article
Published: 02 November 2008

Nature Biotechnology volume 26, pages 1293–1300 (2008)

Abstract

We present CisGenome, a software system for analyzing genome-wide chromatin immunoprecipitation (ChIP) data. CisGenome is designed to meet all basic needs of ChIP data analyses, including visualization, data normalization, peak detection, false discovery rate computation, gene-peak association, and sequence and motif analysis. In addition to implementing previously published ChIP–microarray (ChIP-chip) analysis methods, the software contains statistical methods designed specifically for ChIP sequencing (ChIP-seq) data obtained by coupling ChIP with massively parallel sequencing. The modular design of CisGenome enables it to support interactive analyses through a graphical user interface as well as customized batch-mode computation for advanced data mining. A built-in browser allows visualization of array images, signals, gene structure, conservation, and DNA sequence and motif information. We demonstrate the use of these tools by a comparative analysis of ChIP-chip and ChIP-seq data for the transcription factor NRSF/REST, a study of ChIP-seq analysis with or without a negative control sample, and an analysis of a new motif in Nanog- and Sox2-binding regions.
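As a rough illustration of what peak detection and false discovery rate computation against a negative control sample can look like, the sketch below counts reads in fixed-width windows and estimates an FDR as the ratio of control windows to ChIP windows that pass the same threshold. This is not CisGenome's algorithm or interface: the function names, the window size, the count threshold and the depth-scaling step are all assumptions made for this example.

```python
from collections import Counter


def window_counts(read_positions, window_size):
    """Count reads per fixed-width, non-overlapping window (keyed by window index)."""
    return Counter(pos // window_size for pos in read_positions)


def call_windows(chip_reads, control_reads, window_size=100, min_count=5):
    """Return (enriched_window_indices, estimated_fdr).

    Hypothetical sketch: a window is called enriched when its ChIP read
    count reaches min_count. The FDR estimate is the number of control
    windows passing the same threshold (after scaling the control to the
    ChIP sequencing depth) divided by the number of enriched ChIP windows.
    """
    chip = window_counts(chip_reads, window_size)
    control = window_counts(control_reads, window_size)

    # Scale control counts so both samples have comparable total depth.
    scale = len(chip_reads) / len(control_reads) if control_reads else 0.0

    chip_hits = sorted(w for w, c in chip.items() if c >= min_count)
    control_hits = sum(1 for c in control.values() if c * scale >= min_count)

    fdr = min(1.0, control_hits / len(chip_hits)) if chip_hits else 0.0
    return chip_hits, fdr


# Toy read start positions on a single chromosome (made-up data).
chip_reads = [10, 20, 30, 40, 50, 60, 250, 1010, 1020, 1030, 1040, 1050]
control_reads = [5, 300, 700, 1500, 2000, 2500]
hits, fdr = call_windows(chip_reads, control_reads)
print(hits, fdr)  # [0, 10] 0.0
```

With the toy data, windows 0 and 10 hold six and five ChIP reads respectively, while no control window reaches the threshold after scaling, so the estimated FDR is zero. A real analysis would use a statistical background model rather than a fixed count cutoff.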

**Figure 1: The basic framework of CisGenome.**

**Figure 3: Comparisons between NRSF ChIP-seq and ChIP-chip data.**

**Figure 4: Analysis of a novel motif in Sox2 and Nanog binding regions.**

Accession codes

Accessions

Gene Expression Omnibus

Acknowledgements

We thank W. Li for assistance with analyzing the ChIP-chip spike-in data. This research was supported by National Institutes of Health grant HG003903 (to W.H.W.) and the National Human Genome Research Institute's ENCODE project (to R.M.M.). H. Ji is partially supported by the Johns Hopkins Bloomberg School of Public Health Richard L. Gelb Cancer Research Fund.

Author information

David S Johnson
Present address: Gene Security Network, Inc., 1442 Cortland Avenue, San Francisco, California 94110, USA.

Authors and Affiliations

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, Maryland 21205, USA
Hongkai Ji
Institute for Computational and Mathematical Engineering, Stanford University, Durand Building, 496 Lomita Mall, Stanford, California 94305, USA
Hui Jiang
Department of Computer Science, Stanford University, 353 Serra Mall, Stanford, California 94305, USA
Wenxiu Ma
Department of Genetics, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, California 94305, USA
David S Johnson
HudsonAlpha Institute for Biotechnology, 601 Genome Way, Huntsville, Alabama 35806, USA
Richard M Myers
Department of Statistics, Stanford University, Sequoia Hall, 390 Serra Mall, Stanford, California 94305, USA
Wing H Wong
Department of Health Research and Policy, Stanford University, Sequoia Hall, 390 Serra Mall, Stanford, California 94305, USA
Wing H Wong

Authors

Hongkai Ji
Hui Jiang
Wenxiu Ma
David S Johnson
Richard M Myers
Wing H Wong

Contributions

H. Ji conceived the study, developed the CisGenome GUI and data analysis algorithms, carried out data analyses and drafted the manuscript. H. Jiang developed the CisGenome browser. W.M. participated in algorithm development and carried out data analyses. D.S.J. and R.M.M. generated NRSF ChIP-chip data. W.H.W. conceived the study and drafted the manuscript. All authors read and revised the manuscript.

Corresponding author

Correspondence to Wing H Wong.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–17, Supplementary Tables 1–15, Supplementary Notes, Supplementary Methods, Supplementary Data (PDF 2354 kb)

About this article

Cite this article

Ji, H., Jiang, H., Ma, W. et al. An integrated software system for analyzing ChIP-chip and ChIP-seq data. Nat Biotechnol 26, 1293–1300 (2008). https://doi.org/10.1038/nbt.1505

Received: 11 June 2008
Accepted: 03 October 2008
Published: 02 November 2008
Issue Date: November 2008
DOI: https://doi.org/10.1038/nbt.1505
