International Journal of Biomedical and Pharmaceutical Sciences ©2007 Global Science Books
The Genetics and Genomics of Trypanosoma cruzi
Martin P. Vazquez
INGEBI-CONICET, Facultad de Ciencias Exactas y Naturales, University of Buenos Aires, Vuelta de Obligado 2490 2P, 1428 Buenos Aires, Argentina
Correspondence: * mvazquez@dna.uba.ar
ABSTRACT
Trypanosoma cruzi is a kinetoplastid parasite that causes Chagas disease. Trypanosomes are unusual organisms in many aspects of their
genetics and molecular and cellular biology, and are considered a paradigm of the exception to the rule in the eukaryotic kingdom. The
complete genome sequence of T. cruzi was published in 2005, providing a major tool for understanding several of these unusual
aspects. However, despite the many mechanisms that differ between the parasite and its mammalian host, effective antiparasitic drugs
or disease treatments are still lacking, especially for the chronic phase. This review highlights the fundamentals of the fascinating genetics and genomics of T. cruzi, with emphasis on the differential mechanisms that could provide interesting therapeutic targets.
_____________________________________________________________________________________________________________
Keywords: trans-splicing, polycistronic transcription, post-transcriptional regulation
Abbreviations: IR, intergenic region; PTU, polycistronic transcription unit; UTR, untranslated region
CONTENTS
INTRODUCTION: AN OVERVIEW OF TRYPANOSOMA CRUZI AND CHAGAS DISEASE
How is Chagas disease treated in patients?
A GENETIC OVERVIEW
Gene organization and transcription: The exception to the rule in the eukaryotic kingdom
TRANS-SPLICING mRNA PROCESSING AND POST-TRANSCRIPTIONAL REGULATION OF GENE EXPRESSION: MORE SURPRISES ON THE GO
Trans-splicing processing of RNAs
Events of post-transcriptional regulation after trans-splicing
TOOLS FOR GENETIC MANIPULATION OF T. CRUZI
A GENOMIC OVERVIEW
The genome sequence of T. cruzi
Repetitive elements and retrotransposons modeled the T. cruzi genome
Surface protein families are very large
COMPARATIVE GENOMICS OF THE TRITRYPS
CONCLUDING REMARKS: THE FUTURES OF T. CRUZI GENETICS AND DRUG DEVELOPMENT ARE ON A COLLISION PATH
ACKNOWLEDGEMENTS
REFERENCES
_____________________________________________________________________________________________________________
INTRODUCTION: AN OVERVIEW OF
TRYPANOSOMA CRUZI AND CHAGAS DISEASE
Trypanosoma cruzi causes Chagas disease in humans. The
disease is endemic in Latin American countries, where 16-18 million people are affected and more than 20,000
deaths are reported each year (Prata 2001; Dias et al. 2002).
The acute infection can be lethal, but the disease usually
evolves to a chronic asymptomatic phase. However, 25-30% of cases develop symptoms such as cardiomyopathy or
lesions of the gastrointestinal tract, which ultimately lead to
death. The chronic phase is characterized by low parasitemia, but parasites persist inside cells and are often associated with the sites of lesions (Brener and Gazzinelli 1997;
Levin 1996; Tarleton and Zhang 1999; Schijman et al.
2004).
T. cruzi belongs to the subkingdom Protozoa, order
Kinetoplastida, a group that also includes Trypanosoma
brucei (sleeping sickness) and Leishmania major (leishmaniasis). The three model organisms are collectively known
as the Tritryps. The T. cruzi life cycle includes two hosts,
one invertebrate and one vertebrate, as the parasite is transmitted by
triatomine insects. The parasite exists in different forms at
different stages of its life cycle. It is ingested by the triatomine as a trypomastigote in the blood meal, rapidly transforms into an epimastigote in the midgut, and transforms back
into a trypomastigote in the hindgut. In the vertebrate host,
the parasite travels in the bloodstream as a trypomastigote but
transforms into an amastigote inside cells (Fig. 1) (Tyler and
Engman 2001).
Received: 1 March, 2007. Accepted: 13 April, 2007.
Two different ecosystems exist for T. cruzi: the sylvatic
cycle, occurring in wild hemiptera and generally involving
mammals, and the domestic cycle, dependent on home-dwelling hemiptera, humans and household animals.
The connection between the two is made by infected rats,
mice, bats and marsupials. It is postulated that the parasite
emerged over 150 million years ago, originally infecting primitive mammals in the regions that gave rise to North and
South America, but that contact with humans occurred more
recently, in the late Pleistocene, 15,000-20,000 years ago
(Briones et al. 1999). The presence of T.
cruzi has been demonstrated in infected mummies from Northern Chile and Sou
Special Feature
International Journal of Biomedical and Pharmaceutical Sciences 1(1), 1-11 ©2007 Global Science Books
Fig. 1 Life cycle of Trypanosoma cruzi. The vector Triatoma infestans takes a blood meal and leaves metacyclic trypomastigotes in the feces. Trypomastigotes enter the bite wound and travel in the bloodstream. Later they are able to infect various cell types and transform into amastigotes, which multiply by binary fission. In the end, the cells with amastigote nests burst and the parasites transform back into trypomastigotes, which continue the cycle in the vertebrate host. Eventually, a triatomine bug takes a new blood meal and the parasites transform into epimastigotes in the midgut, where they multiply by binary fission. They transform into infective metacyclics in the hindgut.
phase but very limited antiparasitic activity in the chronic
form of the disease. Side effects include anorexia, vomiting
and allergic dermopathy (Urbina and Docampo 2003).
New approaches to specific chemotherapy are under
development. Targets such as the de novo sterol
biosynthesis pathway, cysteine proteases and pyrophosphate metabolism have been chemically validated and
are poised for clinical trials in the near future. Other promising approaches include interference with trypanothione
synthesis, redox metabolism or protein complexes (proteasome, spliceosome, etc.) that may differ from
those of the host (Urbina and Docampo 2003; Vazquez et al.
2003).
The role of T. cruzi in the aetiology of chronic Chagas
disease has been the subject of much debate. Several studies strongly implicated autoimmune phenomena as the primary factor leading to the pathogenesis of the disease. This
hypothesis is based on the apparent absence of parasites in
the inflammatory lesions and the presence of anti-self antibodies. These antibodies were postulated to result from
“molecular mimicry” between parasite and host antigens
(Levin 1996).
However, more recent studies found that the severity of the
disease correlates strongly with the persistence of T. cruzi
antigens, which, coupled with an unbalanced immune response
in some individuals, can lead to a sustained inflammatory
response in infected tissues (Tarleton 2001).
Overall, elimination of T. cruzi from infected chronic
Chagas patients would be a prerequisite to arrest the evolution of the disease and avoid its long-term consequences.
thern Peru dated 9,000 years before present day (Aufderheide et al. 2004)
Initially, two major evolutionary lineages were
identified and named T. cruzi I and T. cruzi II, associated
with the sylvatic and domestic cycles, respectively. T. cruzi
II was also associated with severe manifestations of the
disease (Briones et al. 1999; Zingales et al. 1999). However,
further inspection of genomic data combined with new analytical techniques showed that T. cruzi II could be subdivided into five phylogenetic sublineages (IIa-e) and that there
were also hybrid strains belonging to groups 1/2 (Brisse et
al. 2000; Macedo et al. 2004). The occurrence of hybrid
strains in natural populations suggests that sexual events
have taken place in the past and have shaped the
current genetic structure of T. cruzi populations. At present,
it is accepted that there were three ancestral lineages, T. cruzi
I, II and III. At least two hybridization events involving T.
cruzi II and T. cruzi III produced evolutionarily viable progeny, in which the former was identified as the recipient and
the latter as the donor on the basis of the mitochondrial clade of the
hybrid strains (de Freitas et al. 2006). It is assumed that a
complete understanding of the population structure of T.
cruzi will be indispensable for effective control of the
disease, especially control of the sylvatic cycle.
How is Chagas disease treated in patients?
Specific treatment of Chagas disease with chemotherapy
has been controversial; prior control of the parasite reservoirs may instead be essential. Currently available chemotherapy based on a nitrofuran (nifurtimox, Lampit®
Bayer) and a nitroimidazole (benznidazole, Radanil® Roche)
is unsatisfactory because of the limited efficacy of these drugs in the prevalent chronic stage of the disease and their toxic side effects (Docampo 1990). Their activities were discovered
empirically over three decades ago. Nifurtimox acts via the
reduction of a nitro group to produce highly toxic, reduced
oxygen metabolites. T. cruzi has been shown to be deficient
in detoxification mechanisms for oxygen metabolites and is
thus more sensitive to oxidative stress than are vertebrate
cells. Benznidazole seems to act via a reductive stress mechanism, which involves covalent modification of macromolecules by nitroreduction intermediates. Both nifurtimox
and benznidazole have significant activity in the acute
A GENETIC OVERVIEW
Gene organization and transcription: The
exception to the rule in the eukaryotic kingdom
Trypanosomes are intriguing and amazing organisms in
many aspects of their biology. In fact, over the last decade they
emerged as the paradigm of “the exception to the rule” in the
eukaryotic lineage. However, some of what
was considered “rare and exceptional” in these organisms,
such as mRNA trans-splicing or RNA editing, was later shown to be more common
in the eukaryotic kingdom than previously thought.
[Fig. 2, panel A: diagram of a DNA locus (promoter unknown) with ORF1 and ORF2 separated by a pPY tract; the pre-mRNA with its AG acceptor and (A) polyadenylation sites; and the SL-capped, polyadenylated mRNAs produced by coupled trans-splicing and polyadenylation.]
[Fig. 2, panel B: anatomy of a typical T. cruzi intergenic region]
TAACGAGTTTCTTCAAAATATGCAGCGGATTCACTAAGAAACATTTTCACGCACGAAAGCGAAATTATTA
TGATTGTTATTATAATACTTTTTCTTTGTTGTTTTATCCACTTATTATTGTTGTGTTAAATTGTTTTTACCTTTT
TTCTTTTCCAACTTCTTTTATGATGTCTTTTCTTTTTTTTTTTTTTGCTCTATAAGTTGTCTTGTCAGATG
Fig. 2 Summary of polycistronic transcription and RNA processing events in trypanosomes. (A) Highlighted are several important features: the
absence of defined RNA Pol II promoters, generation of polycistrons and coupled trans-splicing/polyadenylation coordinated by a polypyrimidine tract
(pPY) in the intergenic region. The trans-spliceosome complex is identified by a shaded balloon with question marks to indicate that the majority of its
components are yet to be determined. The AG is the splice acceptor site for the SL RNA and the (A) represents an adenine residue that is usually the
acceptor of the poly A tail. (B) Anatomy of a typical intergenic region in T. cruzi represented by the HX1 sequence used in the expression vector pTREX.
The pPY tract is highlighted in italics, the splice acceptor site in underlined italics, and the stop codon of the upstream gene and the start
codon of the downstream gene in rectangles.
One possible explanation is that, in trypanosomes, transcription by RNA polymerase II is not directed by sequence
elements but by structural elements such as an unwound
chromatin region. This is known as the “landing pad” theory
and proposes that polymerase II lands wherever it finds a
more relaxed state of the chromatin to start transcription.
Despite these peculiarities, trypanosomes have highly
conserved copies of the three eukaryotic RNA polymerases
(Pol I, Pol II and Pol III).
Surprisingly, two protein-coding genes are transcribed
by Pol I and not Pol II in Trypanosoma brucei. These two
genes encode the major surface proteins, PARP and VSG,
which coat the complete parasite surface in procyclic and
trypomastigote forms, respectively (Clayton 2002). These
genes are also embedded in polycistronically transcribed units
but have a well-defined promoter that is exchangeable
with the Pol I ribosomal promoter. Remarkably, T. brucei is
the only known organism that transcribes protein-coding
genes using Pol I, which is used exclusively for ribosomal
RNA transcription in all other eukaryotes. The problem is that Pol
I does not have an associated capping activity to protect the
mRNA 5′ end. For this reason, transcription of protein-coding genes is
normally restricted to Pol II and its associated capping enzyme.
How did T. brucei solve this apparent problem? The answer lies in the RNA processing machinery,
namely the trans-splicing reaction, which will be discussed below.
In contrast, T. cruzi cells do not have any Pol I-transcribed protein-coding gene unless they are genetically engineered to do so.
Why did T. brucei adopt Pol I for the production of
PARP and VSG coat proteins?
The Pol I promoter is very strong and sustains high
transcription rates, which is ideal for producing
the massive amounts of these surface proteins needed by the
parasite.
Since T. cruzi and L. major developed intracellular
forms and do not use Pol I to transcribe protein-coding genes, this could be a relatively new trick adopted by T. brucei to adjust to its extracellular life-style.
As a consequence of this mode of polycistronic transcription, the genes are densely packed in the genome sepa-
The key to understanding the peculiar genetics of trypanosomatids is the absence of typical RNA polymerase II
promoters for protein-coding genes in their genomes.
The question raised by this statement is challenging: how can the trypanosome genes be transcribed and
expressed? The answer is more challenging: we do not
know.
The fact is that mRNAs are expressed as large polycistronic transcription units (PTUs) composed of unrelated
genes. These PTUs can be as large as a whole chromosome. In the end, the PTUs are processed to monocistronic
mRNAs by two coupled reactions, trans-splicing and polyadenylation, and exported to the cytoplasm for translation
(Clayton 2002; Monnerat et al. 2004; Martinez-Calvillo et
al. 2004). Thus, mRNA maturation in trypanosomes differs
from the process in most eukaryotes (Fig. 2A).
It is postulated that one or a few transcription initiation
sites are present per chromosome. Why is it so difficult to
find polymerase II promoters in trypanosomes? Mainly because the 5′ start site is lost during mRNA processing and
the large polycistronic intermediates are very short-lived.
Chromosome I of Leishmania major is a very good
example. It is 270 kb long and harbors 79 genes, fifty of
which are transcribed towards one telomere while the remaining 29 are transcribed towards the other, as two single opposite
PTUs (Martinez-Calvillo et al. 2003). The two units are
separated by a small 1.6 kb-long region in which several
transcription start sites could be mapped in both directions.
The minimal region with no transcription was mapped to a
73 nucleotide-long sequence that resembles none of the
known eukaryotic promoter sequences and has a C-rich
tract as its only recognizable feature (Martinez-Calvillo et
al. 2003). It was named the strand switch region because
it somehow manages to direct transcription in both directions. How does it work? It is not possible to answer that at
present.
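The chromosome I layout described above can be pictured with a small Python sketch that groups consecutive genes by coding strand and reports where the strand changes. This is only an illustration: the function names are hypothetical, and the gene order and the assignment of each block to the '+' or '-' strand are assumptions. Only the gene counts (79 genes, 50 in one direction) come from the text.

```python
from itertools import groupby

def directional_clusters(strands):
    """Group consecutive genes that share a coding strand.

    `strands` holds one '+' or '-' symbol per gene, in chromosomal order.
    Returns one (strand, gene_count) tuple per putative PTU.
    """
    return [(strand, len(list(run))) for strand, run in groupby(strands)]

def strand_switch_points(strands):
    """Return each index i where gene i and gene i+1 lie on opposite strands."""
    return [i for i in range(len(strands) - 1) if strands[i] != strands[i + 1]]

# Hypothetical layout mimicking L. major chromosome I: 79 genes, 50 pointing
# to one telomere and the remaining 29 to the other. Which strand carries
# which block is an arbitrary choice made for this illustration.
chr1 = ['-'] * 29 + ['+'] * 50

print(directional_clusters(chr1))   # [('-', 29), ('+', 50)]
print(strand_switch_points(chr1))   # [28]
```

With an annotated genome, the same grouping would give a first approximation of PTU boundaries, although it says nothing about where transcription actually starts.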
More intriguing is the fact that chromosome III is organized and transcribed in a very similar fashion, but the
strand switch regions of these two chromosomes
share no sequence homology (Worthey et al. 2003; Martinez-Calvillo et al. 2004).
rated by short Intergenic Regions (IRs). These IRs could be
as short as 150 bp and they are always pyrimidine-rich
(Fig. 2B). The IRs play a central role in mRNA processing
and gene regulation, assuming part of the responsibility that
was left vacant by the absence of Pol II promoters (Clayton
2002).
During the course of evolution of this genome organization, trypanosomes lost the vast majority of their introns. In
fact, only two intron-containing genes have been identified
out of the 12,000 genes present in the T. cruzi genome: a
poly A polymerase (PAP) gene and a DEAD-box type helicase gene (Mair et al. 2000; Ivens et al. 2005).
The absence of typical Pol II promoters poses another
challenging question: how is gene expression
regulated in the context of the complex life cycle of
trypanosomes?
nic transcripts, and it provides the cap to the individual
mRNAs.
Although trans-splicing was first discovered in trypanosomes, the process was later found in nematodes, trematodes, euglenoids and chordates (Hastings 2005). However,
in these organisms, only a minority of genes are processed in this
way.
Surprisingly, after almost a decade of searching for cis-splicing in trypanosomes, two genes carrying a single cis-spliced intron each were found (Mair et al. 2000; Ivens et al.
2005), demonstrating that the two splicing processes coexist
in trypanosomes as in every other organism capable of
trans-splicing.
After years of experiments, much has been learned about
SL RNA biogenesis and its interactions with the small
nuclear RNAs (snRNAs) to form the trans-spliceosome
complex (Liang et al. 2003). In contrast, very little is known
about the splicing factors involved in the trans- and cis-spliceosomes or about the polyadenylation factors. Preliminary results suggest that this complex has unique features with respect to other eukaryotic organisms (Liang et al. 2003; Vazquez et al. 2003). Unraveling the trans-splicing machinery, which does not exist in
the host of this parasite, gives hope for a therapeutic intervention directed toward specific components of the trans-spliceosome.
Since trypanosomes lack control at transcription initiation sites, regulation of gene expression depends entirely on
post-transcriptional processes.
Trans-splicing is the first step that could be subject
to post-transcriptional regulation. It has been shown that
trans-splicing efficiency can be modulated
(Vazquez and Levin 1999; Hummel et al. 2000; Ben-Dov et
al. 2005). Exonic enhancers have been described in both T.
brucei and T. cruzi (Lopez-Estranio et al. 1998; Ben-Dov et
al. 2005). Moreover, modulation of splicing efficiency in T.
cruzi has been shown to occur by the insertion of a short
interspersed repetitive element (SIRE) carrying its own
trans-splicing signals (Ben-Dov et al. 2005).
Alternative trans-splicing was also demonstrated in T.
cruzi. In fact, the Lyt1 gene, coding for a product of the lytic
pathway, produces three different mRNA variants by
alternative trans-splicing. One of these mRNAs generates a
short variant of Lyt1 lacking an N-terminal signal sequence
(Manning-Cela et al. 2002).
Recently, it has been shown that splice site skipping is
responsible for the regulation of some protein-coding genes
in T. cruzi (Jager et al. 2007). The skipping process generates a bicistronic transcript, which prevents translation of the
downstream gene.
TRANS-SPLICING mRNA PROCESSING AND
POST-TRANSCRIPTIONAL REGULATION OF
GENE EXPRESSION: MORE SURPRISES ON THE
GO
Promoters and enhancers are the key elements that
eukaryotic cells use to fine-tune gene
expression in complex situations such as development, differentiation and multicellularity. Even deep-branching eukaryotes such as Giardia and Trichomonas have
basal Pol II promoters to direct transcription (Vancova et al.
2003). However, trypanosomes are a unique exception in
this matter.
In these parasites, regulation of gene expression for
protein-coding genes is entirely post-transcriptional (Clayton 2002).
The first event coordinated by the IR sequence elements is the processing of the polycistronic transcripts. Two reactions are needed to generate monocistronic
mRNAs: trans-splicing and polyadenylation. The two
events are coupled and coordinated by the same polypyrimidine-rich sequence present in the IR (Matthews et al.
1994; Lebowitz et al. 2003). In this way, poly A addition to
the upstream gene is directed by the trans-splicing
of the downstream gene in the PTU. There is no specific signal for polyadenylation such as the AAUAAA of
higher eukaryotes (Fig. 2A).
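To make the role of these intergenic signals concrete, the following Python sketch scans a DNA sequence for its longest uninterrupted pyrimidine run and then for the first AG dinucleotide downstream of it. It is a simplified illustration rather than a published prediction method: the function names, the `min_len` threshold and the toy sequence are assumptions, and real pPY tracts may be interrupted by purines.

```python
import re

def longest_pyrimidine_tract(seq, min_len=8):
    """Return (start, end) of the longest uninterrupted C/T run, or None.

    `min_len` is an arbitrary illustrative threshold, not a value from the text.
    """
    runs = [m.span() for m in re.finditer(r"[CT]+", seq.upper())]
    runs = [r for r in runs if r[1] - r[0] >= min_len]
    return max(runs, key=lambda r: r[1] - r[0]) if runs else None

def first_ag_after(seq, pos):
    """Return the index of the first 'AG' found at or after `pos`, or -1."""
    return seq.upper().find("AG", pos)

# Hypothetical toy intergenic region: stop codon, pPY tract, AG acceptor, ATG.
toy_ir = "TAAGGATTCTTTTCTTTCCGACAGATG"

tract = longest_pyrimidine_tract(toy_ir)
print(tract)                             # (6, 19)
print(first_ag_after(toy_ir, tract[1]))  # 22
```

Searching for the AG only downstream of the tract mirrors the idea that the acceptor site is selected relative to the pPY signal, so the AG at positions 2-3 of the toy sequence is ignored.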
Trans-splicing processing of RNAs
Trans-splicing was discovered almost 20 years ago. It involves two different molecules, the polycistronic RNA and a
39 nucleotide sequence named the Spliced Leader (SL). It was
later established that all trypanosome mRNAs undergo
trans-splicing and thus acquire the common SL sequence
at the 5′ end (Liang et al. 2003).
The source of the SL sequence was found to be a small
capped RNA transcribed by Pol II from thousands of SL
RNA genes arranged in head-to-tail tandem arrays. Surprisingly,
the SL RNA genes are the only genes known to be transcribed
from a well-defined Pol II promoter. However, they are not
protein-coding genes. Trans-splicing proceeds through a
two-step trans-esterification reaction, analogous to cis-splicing but forming a Y structure instead of a lariat intermediate. The SL RNA precursor is a 140 nt long molecule that
is capped at the 5′ end and carries a GU 5′ splice site (ss)
donor downstream of the first 39 bases. The capping modification involves the first four bases of the SL sequence to
form the so-called “cap 4 structure” that is unique among
eukaryotes.
The reaction proceeds as follows: the GU 5′ ss is branched to an adenosine in the IR of the pre-mRNA precursor
and the SL sequence is free to react with the first available
AG 3′ ss downstream. The Y structure intermediate is degraded and the SL is joined to the mature mRNA (Liang et
al. 2003).
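The outcome of the reaction just described can be mimicked at the sequence level with a short Python sketch. It only reproduces the final joining of the SL exon to the RNA downstream of the first AG. It does not model the chemistry, the Y-branched intermediate or the coupled polyadenylation. The function name, the `branch_pos` argument and the toy sequences are assumptions made for illustration.

```python
SL_EXON_LEN = 39  # length of the SL sequence as given in the text

def trans_splice(sl_precursor, pre_mrna, branch_pos=0, exon_len=SL_EXON_LEN):
    """Join the SL exon to the pre-mRNA at the first AG 3' splice site.

    sl_precursor: SL RNA precursor, expected to carry GU right after the exon.
    pre_mrna:     RNA containing the intergenic region and downstream ORF.
    branch_pos:   position of the branch adenosine; the AG search starts here.
    """
    sl_precursor = sl_precursor.upper()
    pre_mrna = pre_mrna.upper()
    if sl_precursor[exon_len:exon_len + 2] != "GU":
        raise ValueError("no GU 5' splice site right after the SL exon")
    ag = pre_mrna.find("AG", branch_pos)
    if ag == -1:
        raise ValueError("no AG 3' splice site downstream of the branch point")
    # The SL exon replaces everything up to and including the AG acceptor.
    return sl_precursor[:exon_len] + pre_mrna[ag + 2:]

# Toy example with a hypothetical 4-nt "SL exon" to keep the strings short.
print(trans_splice("ACGUGUAAAA", "UUUUCUUUCAGAUGGCC", exon_len=4))
# ACGUAUGGCC
```

Because every mRNA receives the same SL exon, the 5′ ends of all mature transcripts in this toy model are identical, which is the property exploited experimentally to map splice sites.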
Thus, the SL addition serves two purposes: it functions
together with polyadenylation in dissecting the polycistro-
Events of post-transcriptional regulation after
trans-splicing
The next major step for post-transcriptional regulation is
mRNA export and turnover in the cytoplasm. Several reports indicate that this is one of the two most important
steps in the control of gene expression in trypanosomes. The other
is the control of mRNA translation (Clayton 2002; D’Orso
et al. 2003).
The control of mRNA half-life in the cytoplasm requires
at least three actors: two to indicate specifically which
mRNA needs to be stabilized or destabilized and another
one as the effector (i.e., the exosome complex). To specify
one particular type of mRNA, cis-acting signals in the 5′ or 3′ untranslated regions (UTRs) of the mRNA
and protein factors that recognize them are required (Fig. 3).
These proteins harbor RNA binding domains such as
RRMs (RNA Recognition Motifs), zinc fingers (CCCH or
CCHC types) or PUF (Pumilio/FBF homology) domains.
Consistent with their reliance on post-transcriptional regulation, the
genomes of trypanosomes contain several expanded families of these types of proteins. Indeed, their genomes contain
more than 100 proteins with RRM domains, more than 40
proteins with CCCH zinc finger motifs, around 20 proteins
[Fig. 3 diagram. Labels: mRNA with SL, ORF, 3′ UTR and poly A tail (AAAAn); cis-acting sequences; RRM, zinc finger and PUF proteins; exosome; degradation; stabilization; enhanced translation; trans-splicing efficiency.]
Fig. 3 Summary of the post-transcriptional events of gene regulation documented in T. cruzi. Protein factors with RRM, zinc finger (CCCH) or PUF domains bind cis-acting sequences in the 3′ untranslated regions (UTRs) and trigger two events: stabilization or degradation of mRNAs. Cis-acting sequences in the 5′ UTRs also enhance translation of the mRNA. In general, the 3′ UTRs are larger than the 5′ UTRs. Trans-splicing efficiency also affects the abundance of mRNAs.
nine imperfect repeats of 36-40 amino acids that binds to
the core nucleotide sequence UGUR. On the basis of different studies, a common function established for these proteins is the maintenance of stem cells by promoting proliferation and repressing differentiation (Wickens et al.
2002). It is therefore expected that Pumilio proteins
have a central role in the programmed control of gene
expression events during differentiation in trypanosomes.
The number of PUF proteins reported for T. cruzi is double the number present in yeasts, five times the number
in mammals and in other parasites such as Plasmodium falciparum, and comparable with the number present in nematodes (Caro et al. 2006).
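As a minimal illustration of how candidate PUF binding sites might be located, the Python sketch below searches a UTR for the UGUR core (R = A or G) mentioned above. This is not the bioinformatic pipeline of Caro et al. (2006), whose details are not given here. The function name and the toy sequence are hypothetical, and a four-base core alone would yield many false positives in real 3′ UTRs.

```python
import re

# UGUR core, R = purine (A or G). The lookahead keeps overlapping hits,
# e.g. UGUGUA contains both UGUG and UGUA.
PUF_CORE = re.compile(r"(?=(UGU[AG]))")

def find_puf_cores(utr):
    """Return (position, motif) for every UGUR core in an RNA or DNA string."""
    rna = utr.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in PUF_CORE.finditer(rna)]

# Hypothetical toy 3' UTR fragment.
print(find_puf_cores("AAUGUAUAUUGUGUACC"))
# [(2, 'UGUA'), (9, 'UGUG'), (11, 'UGUA')]
```

In practice such hits would only be a starting point, to be filtered by flanking sequence context and confirmed experimentally.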
A bioinformatic analysis in T. cruzi has provided some
putative important mRNA targets for some of the PUF proteins (Caro et al. 2006). One of them is the mRNA for the
nuclear-encoded Cox5 subunit of the mitochondrial cytochrome oxidase, whose protein product is developmentally regulated, but the results await in vivo confirmation (Caro et al.
2006). Depletion of PUF1 in T. brucei has no effect on parasite viability, indicating that redundancy may occur in the
trypanosome PUF protein family (Luu et al. 2006).
Besides all these examples of regulation by cis-acting
sequences in the 3′ UTR of mRNA, a few examples also indicate that signals in the short 5′ UTRs can modulate translation, as in the case of the TcP2 ribosomal protein
mRNA (Ben-Dov et al. 2005).
The different events of post-transcriptional regulation that have been documented in T. cruzi are summarized in Fig. 3.
with CCHC zinc knuckle motifs and 10 PUF proteins (de
Gaudenzi et al. 2005; Ivens et al. 2005; Caro et al. 2006).
A subset of the RRM-containing proteins fulfill housekeeping functions, such as the U2AF35 splicing factor (Vazquez et al. 2003) and the poly A binding protein (PABP),
but it was estimated that at least 50 RRM proteins perform
trypanosome-specific functions (de Gaudenzi et al. 2005).
One of the first to be identified was TcUBP-1 in T. cruzi, a
single RRM-containing protein involved in the in vivo destabilization of SMUG (mucin) mRNA in a stage-specific
manner (D’Orso and Frasch 2002). It was later shown that
TcUBP-1 is a member of a larger family with five additional members (TcUBP-2, TcRBP-3, TcRBP-4, TcRBP-5 and
TcRBP-6). The proteins TcUBP-1 and TcUBP-2 act together in the epimastigote stage to bind AU-rich elements
(AREs) present in the 3′ UTR of SMUG mRNA and stabilize
the complex (D’Orso and Frasch 2001). Conversely, in the
trypomastigote stage, TcUBP-2 is not expressed and
TcUBP-1 alone binds the ARE and the poly A binding protein TcPABP to destabilize the mRNA. The mRNA
degradation pathway seems to proceed through the exosome complex (D’Orso et al. 2003). This complex was fully characterized in T. brucei and the orthologous proteins are
present in the T. cruzi genome (Estevez et al. 2001; Haile et
al. 2003).
One of the most interesting examples of CCCH zinc
finger proteins involved in post-transcriptional regulation is
that of ZFP1 and ZFP2. They are two small proteins of
101 and 139 residues, respectively, that were first identified
in T. brucei. They were implicated in regulated morphogenesis and differentiation of the parasite (Hendriks et al.
2001; Caro et al. 2005). Indeed, overexpression of ZFP2
generated a posterior extension of the microtubule corset, a
mechanism responsible for kinetoplast repositioning during
differentiation. On the other hand, depletion of ZFP2
severely compromised differentiation from bloodstream to
procyclic forms, and ZFP1 is enriched in vivo during differentiation to procyclics. Four homologous proteins were
found in the genome of T. cruzi, two ZFP1s (a and b) and
two ZFP2s (a and b). It was demonstrated that the T. cruzi
ZFP1 and ZFP2 families interact with each other via a WW
protein interaction domain present in ZFP2 and the corresponding binding site (a proline-rich sequence) present in
ZFP1 (Caro et al. 2005). It is postulated that ZFP1/ZFP2 heterodimers
bind RNA targets via the CCCH zinc finger
and sequester the mRNA through proteasome degradation
(Hendriks et al. 2001; Caro et al. 2005). The CCCH motif
of T. cruzi ZFP1 was shown to bind C-rich sequences in
RNA in vitro (Morking et al. 2004), but the in vivo mRNA targets
of these proteins are still unknown.
The Pumilio protein family (PUF) is evolutionarily conserved and found exclusively in eukaryotes. These proteins
bind 3′ UTR elements of their target mRNAs to reduce expression either by repressing translation or by causing mRNA
instability. All Pumilio proteins share a domain of eight to
TOOLS FOR GENETIC MANIPULATION OF T.
CRUZI
During the last decade, various genetic tools have been introduced that allow manipulation of trypanosomatid genomes. Many of these are adapted from other eukaryotic
model organisms, and the task for molecular parasitologists
has been to get them to work in parasites that pose unique
challenges. The genetic “toolkit” includes techniques that
allow researchers to investigate gene function by both
gain- and loss-of-function strategies and to study the localization of protein products using in vivo tagging strategies
such as Green Fluorescent Protein (GFP) fusions.
The main efforts to set up these techniques have been made
in T. brucei and Leishmania, while T. cruzi was left far
behind for many years.
It is clear that the differences in the genetic tools that are
available for the different parasites have had a marked impact
on the kinds of questions that researchers are pursuing in
these organisms.
Moreover, one of the most powerful tools available to
study gene function is knockdown by RNA interference, or
RNAi (Beverley 2003). RNAi was successfully established
and widely used in various model organisms mainly because
of its simplicity and efficiency. Gene silencing by RNAi is
International Journal of Biomedical and Pharmaceutical Sciences 1(1), 1-11 ©2007 Global Science Books
(da Rocha et al. 2004b). Development of DNA transfection
vectors was challenging since trypanosomes have no well-defined
promoters or other structural elements such as origins of replication or centromeres. The initial constructs
took into account the importance of the intergenic regions
in RNA processing and stability. Thus, the first transfection vector available for T. cruzi was pTEX (Kelly et al.
1992), which harbors no promoter sequence but contains the
IRs of the gapdh gene surrounding a polylinker cloning site
and a selectable marker (NEO) for G418 resistance. This
vector allowed stable transfection of T. cruzi and unregulated overexpression of protein products. pTEX is
maintained as a circular episome of multiple head-to-tail
units of the original transfected plasmid. To obtain higher
overexpression levels, the amount of G418 is raised and,
concomitantly, more plasmid units are added to the multimeric episome. The results also demonstrated that there is
no need for Pol II promoters and that the elements in the
IRs take over control of gene expression.
Subsequently, a second generation of expression vectors was developed using the pTEX backbone, such as
pRIBOTEX (Martinez-Calvillo et al. 1997) and pTREX
(Vazquez and Levin 1999). These vectors introduced two
new features: a strong ribosomal promoter for Pol I transcription (pRIBOTEX) and a strong trans-splicing signal
named HX1 placed downstream of it (pTREX) (Figs. 2B,
4A). The presence of HX1 in pTREX highlights the impact
of the RNA processing signals on gene expression in T.
cruzi. Because RNA processing in pRIBOTEX is directed by a cryptic and
weak trans-splicing signal, the difference in
the levels of gene expression between the two vectors is
huge, almost 100-fold (Fig. 4B, 4C).
These vectors integrate spontaneously into the ribosomal locus as a single copy instead of being maintained as
episomes. This high recombinogenic activity is stimulated
by a short 86-nucleotide sequence within the ribosomal promoter region (Lorenzi et al. 2003). As a result,
pTREX achieves very high levels of overexpression, is
more stable than pTEX and has been used successfully in several genetic analyses (Vazquez et al. 2003; da Rocha et al.
2004a; Guevara et al. 2005).
However, more complex analyses of gene function require the use of an inducible expression system. Such a system has been available for years in T. brucei (Wirtz and Clayton
1995) but was developed only recently in T. cruzi by
two different groups (da Rocha et al. 2004a; Taylor and
Kelly 2006). One of these tetracycline-regulated expression
vectors, pTcINDEX, was developed using elements from
the T. brucei system and pTREX. Expression is under
the control of a tetracycline-regulatable T7 promoter and
was tested using two markers, luciferase and Red Fluorescent Protein (RFP). Induction was both time- and dose-dependent,
and the system could be induced at least 100-fold
within 24 h of the addition of tetracycline (Taylor and
Kelly 2006). The vector pTcINDEX represents a valuable
addition to the genetic tools available for T. cruzi, and the
system is ready for use in dominant negative approaches
in the near future.
well established in T. brucei and several successful reports
are available in the literature including a chromosome-wide
analysis of gene function (Motika and Englund 2004; Ullu
et al. 2004; Subramaniam et al. 2006). However, efforts to
use this tool in Leishmania (Robinson and Beverley 2003)
and T. cruzi (da Rocha et al. 2004a) were completely unsuccessful. Recently, with the genome sequence finished and
annotated, it was discovered that T. cruzi lacks some essential components of the RNAi pathway machinery, such as
the argonaute 1 (Berriman et al. 2005) and dicer-like
(Shi et al. 2006) proteins present in T. brucei.
The absence of RNAi in T. cruzi introduces a great
limitation to functional genetics in this parasite.
Thus, analysis of gene function relies exclusively on
more difficult and time-consuming techniques such as gene
deletion or dominant negative approaches. A disadvantage is that gene deletion can be applied only when the
gene copy number in the genome allows it; an advantage is that
homologous recombination works particularly well in trypanosomatids (Beverley 2003). A few reports have described the successful creation of null mutants for non-essential genes such as
the T. cruzi surface glycoprotein GP72, a single copy gene
(Cooper et al. 1993). In contrast, attempts to disrupt essential
genes resulted in a remarkable emergence of aneuploid or
polyploid parasites, an unusual outcome that is now used
as a criterion for whether the targeted gene is essential or
not (Beverley 2003).
Besides, in vitro culture models and transient and stable
DNA transfection systems for T. cruzi are well established
A GENOMIC OVERVIEW
The genome sequence of T. cruzi
Fig. 4 Expression vectors. (A) Schematic representation of two of the
most widely used gene expression vectors in T. cruzi carrying a green
fluorescent protein (GFP) gene. High rates of transcription are directed by
the RNA Pol I ribosomal promoter (Rib. Prom.). Vertical arrows indicate
the splice acceptor site for SL RNA addition in each case. HX1 is an intergenic region that harbors a strong trans-splicing signal. Neo is the G418
resistance gene that is surrounded by intergenic regions from the T. cruzi
gapdh gene. (B) Estimation of GFP fluorescence in transgenic parasites
transfected either with pTREX-GFP or pRIBOTEX-GFP using a fluorometer. (C) Estimation of fluorescence of the same parasites in B but as
observed under fluorescence microscopy. There is almost a 100x difference between pTREX and pRIBOTEX due exclusively to the intergenic
region HX1.
Trypanosoma cruzi is diploid and presents different-sized
homologous chromosome pairs (Pedroso et al. 2003). Since
the chromosomes do not condense in metaphase, direct karyotypic analysis is not possible. Thus, their number has been
estimated by the use of Pulsed Field Gel Electrophoresis
(PFGE). Current estimates indicate the presence of 28 chromosomes per haploid genome (Branche et al. 2006). However, the exact number is not known because homologs can
differ substantially in size complicating the PFGE analysis.
The T. cruzi genome project was challenging. The genome was sequenced as part of a project to
obtain the sequence of the three model kinetoplastids along
with T. brucei and L. major, the Tritryps (Berriman et al.
2005; El-Sayed et al. 2005a, 2005b; Ivens et al. 2005).
The T. cruzi strain CL Brener is a member of the subgroup IIe and was chosen for sequencing because it is well
characterized experimentally. However, it was later discovered that it is a hybrid strain between T. cruzi I and T. cruzi
II lineages. This fact complicated the efforts to obtain the
whole genome sequence and its assembly and it was later
necessary to generate a 2.5X sequence coverage of the Esmeraldo strain from the progenitor subgroup IIb to allow
distinguishing the two haplotypes. Moreover, the sequence
coverage for CL Brener strain was 19X with more than 768
Mb obtained in single reads, one of the largest for any eukaryotic genome sequenced to date.
The genome was finally published in July 2005 in a special issue of Science together with the genomes of the other Tritryps (El-Sayed et al. 2005b).
The sequence was obtained by using the whole-genome
shotgun (WGS) technique because the high repeat content
limited the initial “map-as-you-go” bacterial artificial chromosome (BAC) clone-based approach. The assembly parameters were modified to contend with the high allelic variation of the genome.
The current T. cruzi assembly contains 5489 scaffolds
totaling 67 Mb. On the basis of the assembly, the T. cruzi
diploid genome size was estimated between 106.4 and
110.7 Mb. A total of 60.5 Mb comprised the annotated dataset. The current estimate indicates that the haploid genome
of T. cruzi contains about 12,000 protein-coding genes. A
total of 594 RNA genes were identified in this dataset and
another 1400 RNA genes in the unannotated contigs (El-Sayed et al. 2005b).
A putative function could be assigned to 50.8% of the
predicted protein-coding genes on the basis of significant
similarity to previously characterized proteins or known
functional domains.
always inserted in polypyrimidine tracts in the intergenic
regions and, as a consequence, it does not interrupt protein-coding genes. In fact, SIRE was found transcribed, always in
sense orientation, in 2.2% of the mRNAs as part of the 3′
UTR (Vazquez et al. 2000) and, in at least one case, as part
of the 5′ UTR (Vazquez et al. 1994; Ben-Dov et al. 2005).
It was demonstrated that SIRE harbors a weak but functional trans-splicing signal that modulates expression of a
ribosomal protein gene (Ben-Dov et al. 2005). Moreover, it
was suggested that transcription as part of the 3′ UTRs
could have a role in programmed gene regulation events, but
protein factors that bind SIRE sequences await identification (Vazquez et al. 2000; D’Orso and Frasch 2001).
VIPER was initially described as a 2326 bp long LTR-like retroelement associated with SIRE (Fig. 5). VIPER begins
with the first 182 bp of SIRE, whereas its 3′ end is formed
by the last 220 bp of SIRE. Both SIRE moieties are connected by a 1924 bp segment that harbors an open reading
frame coding for a complete reverse transcriptase-RNAse H
protein with a 15-amino-acid C-terminal sequence derived
from the SIRE element. It was later found that this element
is a truncated version of a larger VIPER (Fig. 5). The complete VIPER is 4480 bp long and harbors three non-overlapping domains encoding a GAG-like protein, a tyrosine recombinase and a reverse transcriptase-RNAse H protein. On the
basis of its reverse transcriptase phylogeny, it was established that VIPER constitutes a novel group of tyrosine recombinase-encoding retrotransposons (Lorenzi et al. 2006).
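The segment lengths quoted above for the truncated VIPER can be checked with a short calculation. The sketch below is only an arithmetic illustration; the variable names are ours and are not taken from Lorenzi et al. (2006).

```python
# Segment lengths (bp) of the truncated VIPER element, as quoted in the text.
# Variable names are illustrative only.
SIRE_HEAD_BP = 182   # first 182 bp of SIRE at the 5' end of VIPER
INTERNAL_BP = 1924   # connecting segment carrying the RT-RNase H reading frame
SIRE_TAIL_BP = 220   # last 220 bp of SIRE at the 3' end of VIPER

truncated_viper_bp = SIRE_HEAD_BP + INTERNAL_BP + SIRE_TAIL_BP
sire_derived_bp = SIRE_HEAD_BP + SIRE_TAIL_BP

print(truncated_viper_bp)  # 2326, the length quoted for the truncated element
print(sire_derived_bp)     # 402 bp of the element derive from SIRE
```

The three segments add up to the 2326 bp reported for the truncated element, of which 402 bp are contributed by the two SIRE moieties.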
Interestingly, VIPER was found associated with strand-switch regions of transcription in several chromosomes (Lorenzi et al. 2006). Thus, the VIPER-SIRE pair was
shown to have several consequences for the modeling,
plasticity and expression of the T. cruzi genome.
The non-LTR retrotransposons are represented by
CZAR, a site-specific element inserted in the
SL RNA loci (8 copies); L1Tc, an active retroelement with
15 intact copies; and the non-autonomous NARTc, which
seems to use the L1Tc machinery for retrotransposition (El-Sayed et al. 2005b). Thus, VIPER-SIRE and L1Tc-NARTc
seem to have the same relationship pattern in the genome
(Fig. 5).
It is worth noting that expansion of these repetitive elements in T. cruzi could be the result of the absence of active
RNAi machinery in its genome since T. brucei uses RNAi to
control the expression of the retroelements (Ullu et al.
2004).
Repetitive elements and retrotransposons modeled the T. cruzi genome
One of the more challenging problems in assembling the T.
cruzi genome was its repetitive nature. In fact, the genome
was annotated as individual large scaffolds but whole chromosomes could not be reconstructed as in T. brucei and
Leishmania. At least 50% of the genome is repetitive sequences, consisting mostly of large gene families of surface
proteins, retrotransposons and subtelomeric repeats.
Long terminal repeat (LTR) and non-LTR retroelements
are abundant. The LTR retrotransposons are represented by
SIRE and its associated element VIPER (Vazquez et al.
2000), which have 480 and 275 highly conserved copies, respectively. However, when degenerate copies are included, SIRE could account for 1500 to 2000 copies.
The element SIRE is 430 bp long and dispersed throughout
the genome, with copies present on all chromosomes. SIRE is
Surface protein families are very large
Besides the repetitive elements, the T. cruzi genome presents very large gene families such as the mucin and trans-sialidase (TS) families, with 863 and 1430 copies, respectively.
Most notable is the newly discovered MASP family of mucin-associated proteins, with 1377 members. Other expanded
families are the surface protein GP63 and the retroposon
Fig. 5 Schematic representation of the two more relevant retrotransposons in the T. cruzi genome: SIRE-VIPER (LTR retrotransposon; VIPER, 4.5 kb; SIRE, 0.43 kb) and NARTc-L1Tc (non-LTR retrotransposon; L1Tc, 4.9 kb; NARTc, 0.25 kb). The numbers 100% and 50% indicate sequence homology between the non-autonomous NARTc and the autonomous L1Tc.
Fig. 6 Frontpage of the TcruziDB website. TcruziDB is an integrated Trypanosoma cruzi genome resource that combines several search tools to perform
data mining. It provides gene classifications and gene expression data such as ESTs and proteomic profiles in the three different life cycle stages of the
parasite. http://tcruzidb.org (with kind permission).
with variable degrees of homology to the active ones. The
significant sequence variability suggests a strong selective
pressure on the TS family to diversify, maybe in part due to
the mammalian immune response (Buscaglia et al. 2006).
The other superfamily, MASP, was previously unknown
and was discovered as part of the genome project. Most members of this family are located downstream of TcMUC II
mucin genes, which is why they are named mucin-associated surface proteins (MASP). However, they resemble mucins
structurally but not at the sequence level. The central region
of these proteins is highly variable and often contains repeated sequences. An interesting observation is the existence of
chimeras that contain the N- or C-terminal conserved domains of MASP combined with the N- or C-terminal domain of mucin or the C-terminal domain from TS (El-Sayed
et al. 2005b; Buscaglia et al. 2006).
A common feature of these superfamilies is the presence
of a large number of pseudogenes, which may contribute
to the diversity of the sequence repertoire through recombination events.
Finally, all the information regarding T. cruzi genomics
and post-genomics has been assembled in an integrated database named TcruziDB (http://tcruzidb.org). TcruziDB
combines the annotation data with expression data (proteomic and EST) and several search features in a relational
architecture (Fig. 6). The database is growing constantly
with the support of the research community, which deposits
functional genomic datasets (Aguero et al. 2006).
hot-spot protein (RHS) with 425 and 752 members respectively. The RHS protein function is unknown (El-Sayed et
al. 2005b).
The massive expansion of surface protein genes in T.
cruzi is interesting. The parasite is covered with mucins,
which contribute to parasite protection and to the establishment of a persistent infection. Mucins are glycoproteins
that bear a dense array of O-linked oligosaccharides. It is
suggested that they provide protection against the vector
and/or vertebrate-host-derived defense mechanisms and ensure the targeting and invasion of specific cells (Buscaglia
et al. 2006).
The T. cruzi mucin repertoire stands out among the protozoan parasites for its complexity and versatility. Mucins account for ~15% of all the predicted T. cruzi genes together
with TS and MASPs, with which they are physically, and
probably functionally, related. This fact highlights the importance of these protein families in the parasite biology.
Interestingly, most of them are telomeric.
The transcription of mucin families is differential according to the life cycle. The TcSMUG family of 30-50 kDa is
preferentially expressed in the epimastigotes in the insect
vector, and the TcMUC family of 60-200 kDa is expressed
in the vertebrate host forms, amastigote and trypomastigote
(Buscaglia et al. 2006).
The largest superfamily is TS, which is divided into two
subfamilies. One subfamily includes 12 genes that encode
enzymatically active TSs. The remaining TS superfamily
members consist of enzymatically inactive TS-like proteins
COMPARATIVE GENOMICS OF THE TRITRYPS
Despite having diverged 200-500 million years ago,
predating the emergence of mammals, the genomes of the
Tritryps are highly syntenic (i.e. show conservation of gene
order). Moreover, almost all of the 3-way COGs (94%) fall
within regions of conserved synteny. The analysis of synteny breakpoints showed that 40% were associated with
family expansions, structural RNAs or retroelement insertions (El-Sayed et al. 2005a).
The high degree of synteny is most likely the reflection
of their mode of polycistronic transcription and RNA processing mechanisms. Since transcription initiation is postulated to start at only a few sites per chromosome, there may
be a selective pressure against synteny breaks within polycistronic gene clusters.
Synteny always decreases towards the telomeric
and subtelomeric regions as a result of specific adaptations
of these parasites to their survival strategies. Antigenic
variation and diversity are characteristics of T. brucei and T.
cruzi, respectively, and the presence of a large array of genes
encoding surface proteins in or near telomeres is not accidental. Moreover, the co-occurrence of these genes with several insertions of
retrotransposons in these regions may enhance recombination frequency and provide the rapid sequence variation
needed for survival. Recombination at these sites is preferential because it prevents synteny breaks in the middle of a
chromosome.
Finally, the comparative genomics of the Tritryps also addressed the long-standing claim that these species
are descended from an ancestor that contained a photosynthetic endosymbiont. The authors found that the
protein domain content of the Tritryps is not consistent with
large-scale horizontal transfer of genetic material from
plants (El-Sayed et al. 2005a).
Although the Tritryps share many general characteristics,
each is transmitted by a different insect, has its own life
cycle features, different target tissues and distinct disease
pathogenesis in their mammalian hosts. The availability of
their genomes allowed a better understanding of the genetic and evolutionary bases of these pathogens through
comparative genomics.
The genome of T. cruzi is the largest (67 Mb and
12,000 genes) compared to T. brucei (35 Mb and 9068
genes) and Leishmania (33 Mb and 8311 genes), and its larger size is
also reflected in the larger number of protein-coding genes.
The three genome contents were compared using the
algorithm BlastP (basic local alignment search tool) and
the mutual best hits between the three were grouped as
clusters of orthologous groups (3-way COG). This defined
the Tritryp core proteome, which consists of 6158 members,
including several hundred hypothetical proteins of unknown function (Fig. 7). Several other proteins of the core
proteome serve functions conserved in all eukaryotes such
as DNA replication, transcription, RNA processing, translation, DNA repair and many structural proteins (El-Sayed
et al. 2005a). Of special importance are proteins of unknown function shared by the three, which could provide clues
for developing a single anti-parasitic drug able to kill any of
the Tritryp pathogens.
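The grouping of mutual best hits into 3-way COGs can be illustrated with a minimal sketch. It assumes that BlastP scores are already available as dictionaries keyed by (query, subject); the gene identifiers and scores below are hypothetical, and the exact thresholds and clustering rules used by El-Sayed et al. (2005a) may differ from this simplified version.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict {(query, subject): score}, e.g. BlastP bit scores.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(a_vs_b, b_vs_a):
    """Return the set of (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


def three_way_cogs(mbh_ab, mbh_bc, mbh_ac):
    """Triples (a, b, c) in which every pair is a mutual best hit."""
    b_to_c = dict(mbh_bc)
    return {
        (a, b, b_to_c[b])
        for a, b in mbh_ab
        if b in b_to_c and (a, b_to_c[b]) in mbh_ac
    }


# Hypothetical toy scores: tc = T. cruzi, tb = T. brucei, lm = L. major.
tc_vs_tb = {("tc1", "tb1"): 300, ("tc1", "tb2"): 80, ("tc2", "tb2"): 150}
tb_vs_tc = {("tb1", "tc1"): 295, ("tb2", "tc2"): 140}
tb_vs_lm = {("tb1", "lm1"): 250, ("tb2", "lm9"): 60}
lm_vs_tb = {("lm1", "tb1"): 240, ("lm9", "tb3"): 90}  # lm9 prefers tb3
tc_vs_lm = {("tc1", "lm1"): 260, ("tc2", "lm9"): 70}
lm_vs_tc = {("lm1", "tc1"): 255, ("lm9", "tc2"): 65}

cogs = three_way_cogs(
    mutual_best_hits(tc_vs_tb, tb_vs_tc),
    mutual_best_hits(tb_vs_lm, lm_vs_tb),
    mutual_best_hits(tc_vs_lm, lm_vs_tc),
)
print(cogs)
```

In this toy example only tc1, tb1 and lm1 are reciprocal best hits in all three pairwise comparisons, so they form the single 3-way COG; tc2 and tb2 pair with each other but not consistently with an L. major gene, which is the situation that gives rise to a 2-way COG.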
A total of 1014 2-way COGs were also defined (COGs
shared by only two of the three Tritryps), with T. brucei and
Leishmania sharing the fewest 2-way COGs (74),
and T. cruzi/T. brucei and T. cruzi/Leishmania sharing similar numbers (458 and 482, respectively) (Fig. 7). The
largest number of species-specific members, or 1-way COGs,
is present in T. cruzi (3736) compared to T. brucei (1392)
and Leishmania (910) and is contributed mainly by the
surface protein families (Fig. 7).
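The COG counts quoted above can be tallied per species with a few lines of code. This is a bookkeeping sketch only: the dictionary layout and the function name are ours, and the totals are COG counts, which need not equal the gene numbers given for each genome.

```python
# COG counts as quoted in the text (El-Sayed et al. 2005a; Fig. 7).
THREE_WAY = 6158
TWO_WAY = {
    ("T. brucei", "Leishmania"): 74,
    ("T. cruzi", "T. brucei"): 458,
    ("T. cruzi", "Leishmania"): 482,
}
ONE_WAY = {"T. cruzi": 3736, "T. brucei": 1392, "Leishmania": 910}


def cogs_for(species):
    """Total COGs containing the species: core + shared 2-way + species-specific."""
    shared = sum(n for pair, n in TWO_WAY.items() if species in pair)
    return THREE_WAY + shared + ONE_WAY[species]


total_two_way = sum(TWO_WAY.values())
print(total_two_way)  # 1014, matching the total given in the text

for name in ONE_WAY:
    total = cogs_for(name)
    specific = ONE_WAY[name] / total
    print(f"{name}: {total} COGs, {specific:.1%} species-specific")
```

Under these figures the three pairwise counts add up to the stated 1014 2-way COGs, and T. cruzi has both the largest COG total and the largest species-specific share.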
Several examples of protein domain expansion and
loss were also revealed. Many of these proteins appear to
be involved in host interactions. T. cruzi has expanded bacterial neuraminidase, mucin-like glycoprotein, retroposon
hot spot protein (RHS) and several RNA-binding
protein domains. Leishmania has reduced several protein-protein interaction domains such as leucine-rich repeats
(LRR) or tetratricopeptide repeats (TPR); this does not
happen in T. cruzi and T. brucei.
CONCLUDING REMARKS: THE FUTURES OF T. CRUZI GENETICS AND DRUG DEVELOPMENT ARE ON A COLLISION PATH
I have presented various examples that put trypanosomes at
the edge of eukaryotic evolution, with several unusual aspects of their molecular and cellular biology as the general
rule, while these features are, at best, the exception to the
rule in other organisms. With such a plethora of potential
drug targets derived from these aspects and a finished genome sequence, why do we not have a number of effective
drugs in clinical trials for Chagas disease?
I can think of two possible explanations.
On the one side is the limitation of tools to perform
post-genomics and functional genomic studies in T. cruzi.
As I mentioned before, T. cruzi lacks RNAi and this is a
major limitation to analyzing gene function on a wide scale.
Moreover, other tools such as a powerful inducible system
to use with dominant negative approaches were not developed until recently. For that reason, this useful approach
has yet to be tested consistently in the years to come. Without powerful functional genomic tools to enter the post-genome era, the analysis of putative drug targets will suffer an
important delay until we can completely decipher these
unusual mechanisms in T. cruzi. T. brucei is a step ahead
in this respect, but it may not always be a model for T. cruzi
genetics.
On the other side is the pharmaceutical industry and its
lack of interest in investing in Chagas drug development because it is “a disease of the poor”. The governments of the
affected developing countries need to move the disease
higher up their priority list to find a solution to this conundrum.
As Carolyn Ash and Barbara Jasny wisely stated in the
introduction to the trypanosomatid genomes issue of Science, “let’s hope the genomes will fuel this process”.
Fig. 7 Clusters of orthologous groups (COGs) classification in comparative genomics of the Tritryps. The T. cruzi COGs are shaded in gray. The number of products in the core proteome (the 3-way COG) is underlined. Values shown in the diagram: 3-way COGs, 6158; T. cruzi/T. brucei, 458; T. cruzi/L. major, 482; T. brucei/L. major, 74; T. cruzi only, 3736; T. brucei only, 1392; L. major only, 910.
ACKNOWLEDGEMENTS
de Gaudenzi J, Frasch AC, Clayton C (2005) RNA-binding domain proteins
in Kinetoplastids: a comparative analysis. Eukaryotic Cell 4, 2106-2114
Dias JC, Silveira AC, Schofield CJ (2002) The impact of Chagas disease control in Latin America: A review. Memórias do Instituto Oswaldo Cruz 97,
603-612
Docampo R (1990) Sensitivity of parasites to free radical damage by antiparasitic drugs. Chemico-Biological Interactions 73, 1-27
D’Orso I, Frasch AC (2001) Functionally different AU- and G-rich cis-elements confer developmentally regulated mRNA stability in Trypanosoma
cruzi by interaction with specific RNA-binding proteins. The Journal of Biological Chemistry 276, 15783-15793
D’Orso I, Frasch AC (2002) TcUBP-1, an mRNA destabilizing factor from trypanosomes, homodimerizes and interacts with novel AU-rich element and
Poly(A)-binding proteins forming a ribonucleoprotein complex. Journal of
Biological Chemistry 277, 50520-50528
D’Orso I, De Gaudenzi JG, Frasch AC (2003) RNA-binding proteins and
mRNA turnover in trypanosomes. Trends in Parasitology 19, 151-155
El-Sayed NM, Myler PJ, Blandin G, Berriman M, Crabtree J, Aggarwal G,
Caler E, Renauld H, Worthey EA, Hertz-Fowler C, Ghedin E, Peacock C,
Bartholomeu DC, Haas BJ, Tran AN, Wortman JR, Alsmark UC, Angiuoli S, Anupama A, Badger J, Bringaud F, Cadag E, Carlton JM, Cerqueira GC, Creasy T, Delcher AL, Djikeng A, Embley TM, Hauser C,
Ivens AC, Kummerfeld SK, Pereira-Leal JB, Nilsson D, Peterson J, Salzberg SL, Shallom J, Silva JC, Sundaram J, Westenberger S, White O,
Melville SE, Donelson JE, Andersson B, Stuart KD, Hall N (2005a) Comparative genomics of trypanosomatid parasitic protozoa. Science 309, 404-409
El-Sayed NM, Myler PJ, Bartholomeu DC, Nilsson D, Aggarwal G, Tran AN,
Ghedin E, Worthey EA, Delcher AL, Blandin G, Westenberger SJ, Caler
E, Cerqueira GC, Branche C, Haas B, Anupama A, Arner E, Aslund L,
Attipoe P, Bontempi E, Bringaud F, Burton P, Cadag E, Campbell DA,
Carrington M, Crabtree J, Darban H, da Silveira JF, de Jong P, Edwards
K, Englund PT, Fazelina G, Feldblyum T, Ferella M, Frasch AC, Gull K,
Horn D, Hou L, Huang Y, Kindlund E, Klingbeil M, Kluge S, Koo H,
Lacerda D, Levin MJ, Lorenzi H, Louie T, Machado CR, McCulloch R,
McKenna A, Mizuno Y, Mottram JC, Nelson S, Ochaya S, Osoegawa K,
Pai G, Parsons M, Pentony M, Pettersson U, Pop M, Ramirez JL, Rinta J,
Robertson L, Salzberg SL, Sanchez DO, Seyler A, Sharma R, Shetty J,
Simpson AJ, Sisk E, Tammi MT, Tarleton R, Teixeira S, van Aken S, Vogt
C, Ward PN, Wickstead B, Wortman J, White O, Fraser CM, Stuart KD,
Andersson B (2005b) The genome sequence of Trypanosoma cruzi, etiologic
agent of Chagas disease. Science 309, 409-415
Estevez AM, Kempf T, Clayton C (2001) The exosome of Trypanosoma brucei.
The EMBO Journal 16, 3831-3839
Guevara P, Dias M, Rojas A, Crisante G, Abreu-Blanco MT, Umezawa E,
Vazquez MP, Levin M, Anez N, Ramirez JL (2005) Expression of fluorescent genes in Trypanosoma cruzi and Trypanosoma rangeli (Kinetoplastida:
Trypanosomatidae): its application to parasite-vector biology. Journal of Medical Entomology 42, 48-56
Haile S, Estevez AM, Clayton C (2003) A role for the exosome in the in vivo
degradation of unstable mRNAs. RNA 9, 1491-1501
Hastings KE (2005) SL trans-splicing: easy come or easy go? Trends in Genetics 21, 240-247
Hendriks EF, Robinson DR, Hinkins M, Matthews KR (2001) A novel
CCCH protein which modulates differentiation of Trypanosoma brucei to its
procyclic form. The EMBO Journal 3, 6700-6711
Hummel HS, Gilleespie DR, Swindle J (2000) Mutational analyses of 3′ splice
site selection during trans-splicing. The Journal of Biological Chemistry 275,
35522-35531
Ivens AC, Peacock CS, Worthey EA, Murphy L, Aggarwal G, Berriman M,
Sisk E, Rajandream MA, Adlem E, Aert R, Anupama A, Apostolou Z,
Attipoe P, Bason N, Bauser C, Beck A, Beverley SM, Bianchettin G, Borzym K, Bothe G, Bruschi CV, Collins M, Cadag E, Ciarloni L, Clayton C,
Coulson RM, Cronin A, Cruz AK, Davies RM, De Gaudenzi J, Dobson
DE, Duesterhoeft A, Fazelina G, Fosker N, Frasch AC, Fraser A, Fuchs M,
Gabel C, Goble A, Goffeau A, Harris D, Hertz-Fowler C, Hilbert H,
Horn D, Huang Y, Klages S, Knights A, Kube M, Larke N, Litvin L, Lord
A, Louie T, Marra M, Masuy D, Matthews K, Michaeli S, Mottram JC,
Muller-Auer S, Munden H, Nelson S, Norbertczak H, Oliver K, O'Neil S,
Pentony M, Pohl TM, Price C, Purnelle B, Quail MA, Rabbinowitsch E,
Reinhardt R, Rieger M, Rinta J, Robben J, Robertson L, Ruiz JC, Rutter
S, Saunders D, Schafer M, Schein J, Schwartz DC, Seeger K, Seyler A,
Sharp S, Shin H, Sivam D, Squares R, Squares S, Tosato V, Vogt C, Volckaert G, Wambutt R, Warren T, Wedler H, Woodward J, Zhou S, Zimmermann W, Smith DF, Blackwell JM, Stuart KD, Barrell B, Myler PJ
(2005) The genome of the kinetoplastid parasite, Leishmania major. Science
309, 436-442
Jager AV, de Gaudenzi JG, Cassola A, D’Orso I, Frasch AC (2007) mRNA
maturation by two-step trans-splicing/polyadenylation processing in trypanosomes. Proceedings of the National Academy of Sciences USA 104, 2035-2042
Kelly JM, Ward HM, Miles MA, Kendall G (1992) A shuttle vector which facilitates the expression of transfected genes in Trypanosoma cruzi and Leish-
I thank Jessica Kissinger for permission to reproduce the tcruzidb
frontpage in this article. This work was supported by grants from
FONCYT – PICT REDES 2003-00300, UBACyT X-153 (University of Buenos Aires) and PIP-CONICET 5492. MV is a member
of the career of scientific investigator of CONICET, Argentina.
REFERENCES
Aguero F, Zheng W, Weatherly DB, Mendes P, Kissinger JC (2006)
TcruziDB: an integrated, post-genomics community resource for Trypanosoma cruzi. Nucleic Acids Research 1, D428-431
Aufderheide AC, Salo W, Madden M, Streitz J, Buikstra J (2004) A 9,000-year record of Chagas’ disease. Proceedings of the National Academy of Sciences USA 101, 2034-2039
Ben-Dov C, Levin MJ, Vazquez MP (2005) Analysis of the highly efficient
pre-mRNA processing region HX1 of Trypanosoma cruzi. Molecular and
Biochemical Parasitology 140, 96-105
Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H, Bartholomeu DC, Lennard NJ, Caler E, Hamlin NE, Haas B, Bohme U, Hannick
L, Aslett MA, Shallom J, Marcello L, Hou L, Wickstead B, Alsmark UC,
Arrowsmith C, Atkin RJ, Barron AJ, Bringaud F, Brooks K, Carrington
M, Cherevach I, Chillingworth TJ, Churcher C, Clark LN, Corton CH,
Cronin A, Davies RM, Doggett J, Djikeng A, Feldblyum T, Field MC,
Fraser A, Goodhead I, Hance Z, Harper D, Harris BR, Hauser H, Hostetler J, Ivens A, Jagels K, Johnson D, Johnson J, Jones K, Kerhornou
AX, Koo H, Larke N, Landfear S, Larkin C, Leech V, Line A, Lord A,
Macleod A, Mooney PJ, Moule S, Martin DM, Morgan GW, Mungall K,
Norbertczak H, Ormond D, Pai G, Peacock CS, Peterson J, Quail MA,
Rabbinowitsch E, Rajandream MA, Reitter C, Salzberg SL, Sanders M,
Schobel S, Sharp S, Simmonds M, Simpson AJ, Tallon L, Turner CM,
Tait A, Tivey AR, Van Aken S, Walker D, Wanless D, Wang S, White B,
White O, Whitehead S, Woodward J, Wortman J, Adams MD, Embley
TM, Gull K, Ullu E, Barry JD, Fairlamb AH, Opperdoes F, Barrell BG,
Donelson JE, Hall N, Fraser CM, Melville SE, El-Sayed NM (2005) The
genome of the African trypanosome Trypanosoma brucei. Science 309, 416-422
Beverley SM (2003) Protozomics: trypanosomatid parasite genetics comes of
age. Nature Reviews Genetics 4, 11-19
Branche C, Ochaya S, Aslund L, Andersson B (2006) Comparative karyotyping as a tool for genome structure analysis of Trypanosoma cruzi. Molecular and Biochemical Parasitology 147, 30-38
Brener Z, Gazzinelli RT (1997) Immunological control of Trypanosoma cruzi
infection and pathogenesis of Chagas’ disease. International Archives of Allergy and Immunology 114, 103-110
Briones MR, Souto RP, Stolf BS, Zingales B (1999) The evolution of two Trypanosoma cruzi subgroups inferred from rRNA genes can be correlated with
the interchange of American mammalian faunas in the Cenozoic and has implications to pathogenicity and host specificity. Molecular and Biochemical
Parasitology 104, 219-232
Brisse S, Dujardin JC, Tibayrenc M (2000) Identification of six Trypanosoma cruzi phylogenetic lineages by random amplified polymorphic DNA
and multilocus enzyme electrophoresis. International Journal for Parasitology
30, 35-44
Buscaglia CA, Campo VA, Frasch AC, Di Noia JM (2006) Trypanosoma
cruzi surface mucins: host-dependent coat diversity. Nature Reviews Microbiology 4, 229-236
Caro F, Bercovich N, Atorrasagasti C, Levin MJ, Vazquez MP (2005) Protein interactions within the TcZFP zinc finger family members of Trypanosoma cruzi: implications for their functions. Biochemical and Biophysical
Research Communications 333, 1017-1025
Caro F, Bercovich N, Atorrasagasti C, Levin MJ, Vazquez MP (2006) Trypanosoma cruzi: the Pumilio RNA-binding proteins family is composed by
ten members. Experimental Parasitology 113, 112-124
Clayton CE (2002) Life without transcriptional control? From fly to man and
back again. The EMBO Journal 21, 1881-1888
Cooper R, de Jesus AR, Cross GA (1993) Deletion of an immunodominant
Trypanosoma cruzi surface glycoprotein disrupts flagellum-cell adhesion.
The Journal of Cell Biology 122, 149-156
da Rocha WD, Otsu K, Teixeira SM, Donelson JE (2004a) Tests of cytoplasmic RNA interference (RNAi) and construction of a tetracycline-inducible
T7 promoter system in Trypanosoma cruzi. Molecular and Biochemical Parasitology 133, 175-186
da Rocha WD, Silva RA, Bartholomeu DC, Pires SF, Freitas JM, Macedo
AM, Vazquez MP, Levin MJ, Teixeira SM (2004b) Expression of exogenous genes in Trypanosoma cruzi: improving vectors and electroporation
protocols. Parasitology Research 92, 113-120
de Freitas JM, Augusto-Pinto L, Pimenta JR, Bastos-Rodrigues L, Gonçalves VF, Teixeira SM, Chiari E, Junqueira AC, Fernandes O, Macedo
AM, Machado CR, Pena SD (2006) Ancestral genomes, sex, and the population structure of Trypanosoma cruzi. PLoS Pathogens 2, 226-235
mania. Nucleic Acids Research 20, 3963-3969
Lebowitz JH, Smith HQ, Rusche L, Beverley SM (1993) Coupling of poly(A)
site selection and trans-splicing in Leishmania. Genes and Development 7,
996-1007
Levin MJ (1996) In chronic Chagas heart disease, don’t forget the parasite.
Parasitology Today 12, 415-416
Liang XH, Haritan A, Uliel S, Michaeli S (2003) trans and cis splicing in trypanosomatids: mechanism, factors, and regulation. Eukaryotic Cell 2, 830-840
Lopez-Estranio C, Tschudi C, Ullu E (1998) Exonic sequences in the 5' untranslated region of tubulin mRNA modulate trans-splicing in Trypanosoma
brucei. Molecular and Cellular Biology 18, 4620-4628
Lorenzi HA, Vazquez MP, Levin MJ (2003) Integration of expression vectors
into the ribosomal locus of Trypanosoma cruzi. Gene 310, 91-99
Lorenzi HA, Robledo G, Levin MJ (2006) The VIPER elements of trypanosomes constitute a novel group of tyrosine recombinase-encoding retrotransposons. Molecular and Biochemical Parasitology 145, 184-194
Luu VD, Brems S, Hoheisel JD, Burchmore R, Guilbride DL, Clayton C
(2006) Functional analysis of Trypanosoma brucei PUF1. Molecular and
Biochemical Parasitology 150, 340-349
Macedo AM, Machado CR, Oliveira RP, Pena SDJ (2004) Trypanosoma
cruzi: Genetic structure of populations and relevance of genetic variability to
the pathogenesis of Chagas disease. Memórias do Instituto Oswaldo Cruz 99,
1-12
Mair G, Shi H, Li H, Djikeng A, Aviles HO, Bishop JR, Falcone FH, Gavrilescu C, Montgomery JL, Santori MI, Stern LS, Wang Z, Ullu E, Tschudi C (2000) A new twist in trypanosome RNA metabolism: cis-splicing
of pre-mRNA. RNA 6, 163-169
Manning-Cela R, Gonzalez A, Swindle J (2002) Alternative splicing of LYT1
transcripts in Trypanosoma cruzi. Infection and Immunity 70, 4726-4728
Martinez-Calvillo S, Lopez I, Hernandez R (1997) pRIBOTEX expression
vector: a pTEX derivative for a rapid selection of Trypanosoma cruzi transfectants. Gene 199, 71-76
Martinez-Calvillo S, Yan S, Nguyen D, Fox M, Stuart K, Myler PJ (2003)
Transcription of Leishmania major Friedlin chromosome 1 initiates in both
directions within a single region. Molecular Cell 11, 1291-1299
Martinez-Calvillo S, Nguyen D, Stuart K, Myler PJ (2004) Transcription initiation and termination on Leishmania major chromosome 3. Eukaryotic Cell
3, 506-517
Matthews KR, Tschudi C, Ullu E (1994) A common pyrimidine-rich motif
governs trans-splicing and polyadenylation of tubulin polycistronic pre-mRNA in trypanosomes. Genes and Development 8, 491-501
Monnerat S, Martinez-Calvillo S, Worthey E, Myler PJ, Stuart KD, Fasel
N (2004) Genomic organization and gene expression in a chromosomal region of Leishmania major. Molecular and Biochemical Parasitology 134,
233-243
Morking PA, Dallagiovanna BM, Foti L, Garat B, Picchi GF, Umaki AC,
Probst CM, Krieger MA, Goldenberg S, Fragoso SP (2004) TcZFP1: a
CCCH zinc finger protein of Trypanosoma cruzi that binds poly-C oligoribonucleotides in vitro. Biochemical and Biophysical Research Communications
319, 169-177
Motyka SA, Englund PT (2004) RNA interference for analysis of gene function in trypanosomatids. Current Opinion in Microbiology 7, 362-368
Pedroso A, Cupolillo E, Zingales B (2003) Evaluation of Trypanosoma cruzi
hybrid stocks based on chromosomal size variation. Molecular and Biochemical Parasitology 129, 79-90
Prata A (2001) Clinical and epidemiological aspects of Chagas disease. Lancet
Infectious Diseases 1, 92-100
Robinson KA, Beverley SM (2003) Improvements in transfection efficiency
and tests of RNA interference (RNAi) approaches in the protozoan parasite
Leishmania. Molecular and Biochemical Parasitology 128, 217-228
Schijman AG, Vigliano CA, Viotti RJ, Burgos JM, Brandariz S, Lococo B,
Leze MI, Armenti HA, Levin MJ (2004) Trypanosoma cruzi DNA in cardiac lesions of Argentinean patients with end-stage chronic Chagas heart disease. American Journal of Tropical Medicine and Hygiene 70, 210-220
Shi H, Tschudi C, Ullu E (2006) An unusual Dicer-like1 protein fuels the RNA
interference pathway in Trypanosoma brucei. RNA 12, 2063-2072
Subramaniam C, Veazey P, Redmond S, Hayes-Sinclair J, Chambers E,
Carrington M, Gull K, Matthews K, Horn D, Field MC (2006) Chromosome-wide analysis of gene function by RNA interference in the African trypanosome. Eukaryotic Cell 5, 1539-1549
Tarleton RL, Zhang L (1999) Chagas disease etiology: autoimmunity or parasite persistence? Parasitology Today 15, 94-99
Tarleton RL (2001) Parasite persistence in the aetiology of Chagas disease. International Journal for Parasitology 31, 550-554
Taylor MC, Kelly JM (2006) pTcINDEX: a stable tetracycline-regulated expression vector for Trypanosoma cruzi. BMC Biotechnology 6, 6-32
Tyler KM, Engman DM (2001) The life cycle of Trypanosoma cruzi revisited.
International Journal for Parasitology 31, 472-481
Ullu E, Tschudi C, Chakraborty T (2004) RNA interference in protozoan parasites. Cellular Microbiology 6, 509-519
Urbina JA, Docampo R (2003) Specific chemotherapy of Chagas disease: controversies and advances. Trends in Parasitology 19, 495-501
Vanacova S, Liston DR, Tachezy J, Johnson PJ (2003) Molecular biology of
the amitochondriate parasites, Giardia intestinalis, Entamoeba histolytica and
Trichomonas vaginalis. International Journal for Parasitology 33, 235-255
Vazquez MP, Schijman AG, Levin MJ (1994) A short interspersed repetitive
element provides a new 3' acceptor site for trans-splicing in certain ribosomal
P2 protein genes of Trypanosoma cruzi. Molecular and Biochemical Parasitology 64, 327-336
Vazquez MP, Levin MJ (1999) Functional analysis of the intergenic regions of
TcP2β gene loci allowed the construction of an improved Trypanosoma cruzi
vector. Gene 239, 217-225
Vazquez MP, Ben-Dov C, Lorenzi H, Moore T, Schijman A, Levin MJ
(2000) The short interspersed repetitive element of Trypanosoma cruzi, SIRE,
is part of VIPER, an unusual retroelement related to long terminal repeat
retrotransposons. Proceedings of the National Academy of Sciences USA 97,
2128-2133
Vazquez MP, Atorrasagasti C, Bercovich N, Volcovich R, Levin MJ (2003)
Unique features of the Trypanosoma cruzi U2AF35 splicing factors. Molecular and Biochemical Parasitology 128, 77-81
Wickens M, Bernstein DS, Kimble J, Parker R (2002) A PUF family portrait:
3'UTR regulation as a way of life. Trends in Genetics 18, 150-157
Wirtz E, Clayton C (1995) Inducible gene expression in trypanosomes mediated by a prokaryotic repressor. Science 268, 1179-1183
Worthey EA, Martinez-Calvillo S, Schnaufer A, Aggarwal G, Cawthra J,
Fazelinia G, Fong C, Fu G, Hassebrock M, Hixson G, Ivens AC, Kiser P,
Marsolini F, Rickel E, Salavati R, Sisk E, Sunkin SM, Stuart KD, Myler
PJ (2003) Leishmania major chromosome 3 contains two long convergent
polycistronic gene clusters separated by a tRNA gene. Nucleic Acids Research 31, 4201-4210
Zingales B, Stolf BS, Souto RP, Fernandes O, Briones MR (1999) Epidemiology, biochemistry and evolution of Trypanosoma cruzi lineages based on
ribosomal RNA sequences. Memórias do Instituto Oswaldo Cruz 94, 159-164