Mar Biotechnol

DOI 10.1007/s10126-008-9107-8

ORIGINAL ARTICLE

Preparation and Analysis of an Expressed Sequence Tag Library from the Toxic Dinoflagellate Alexandrium catenella

Paulina Uribe & Daniela Fuentes & Jorge Valdés & Amir Shmaryahu & Alicia Zúñiga & David Holmes & Pablo D. T. Valenzuela

Received: 12 June 2007 / Accepted: 10 April 2008


© Springer Science + Business Media, LLC 2008

Abstract Dinoflagellates of the genus Alexandrium are photosynthetic microalgae of great importance because of the impact of some toxic species on the shellfish aquaculture industry. Alexandrium catenella is the species responsible for the production of paralytic shellfish poisoning in Chile and other geographical areas. We have constructed a cDNA library from midexponential cells of A. catenella grown in culture free of associated bacteria and sequenced 10,850 expressed sequence tags (ESTs) that were assembled into 1,021 contigs and 5,475 singletons for a total of 6,496 unigenes. Approximately 41.6% of the unigenes showed similarity to genes with predicted function. A significant number of unigenes showed similarity with genes from other dinoflagellates, plants, and other protists. Among the identified genes, the most expressed correspond to those coding for proteins of luminescence, carbohydrate metabolism, and photosynthesis. The sequences of 9,847 ESTs have been deposited in GenBank (accession numbers EX454357–EX464203).

Keywords Alexandrium catenella · cDNA sequencing · Red tide microalgae · Toxic dinoflagellate

P. Uribe · D. Fuentes · A. Zúñiga · P. D. T. Valenzuela (*)
Fundación Ciencia para la Vida, Av. Zañartu 1482, Ñuñoa, Santiago, Chile
e-mail: pvalenzu@bionova.cl

J. Valdés · A. Shmaryahu · D. Holmes · P. D. T. Valenzuela
Instituto MIFAB, Zañartu 1482, Ñuñoa, Santiago, Chile

J. Valdés · A. Shmaryahu · D. Holmes
Center for Bioinformatics and Genome Biology, Zañartu 1482, Ñuñoa, Santiago, Chile

Introduction

Dinoflagellates represent a unique and important group of organisms in the marine environment in terms of their numbers and diversity, as well as their ecological and physiological significance. They commonly occur as free-living, photosynthetic, and marine unicellular algae, but also include endosymbiotic, parasitic, heterotrophic, and freshwater taxa. Some species are responsible for the production of potent toxins that can be accumulated by shellfish and affect humans and marine mammals. They form harmful blooms or “red tides,” in which cell numbers reach more than one million cells per liter of seawater, producing a significant economic impact and public health concern in different geographical areas throughout the world (Scholin et al. 1995; Hallegraeff 1993). Dinoflagellates are the only photosynthetic organisms capable of bioluminescence (Sweeney 1987). Dinoflagellates are also unique among eukaryotes in many other biological and morphological characteristics. Their DNA content is higher than that of other eukaryotes [from 3 to 250 pg/cell, or approximately 3,000–215,000 megabases (Spector 1984; Triplett et al. 1993; Santos and Coffroth 2003)]. This can exceed 80 times the size of the human genome (Lin 2006). Their chromosomes consist of permanently condensed, genetically inactive central regions with peripheral loops of B-DNA that protrude from this core and comprise the actively transcribed DNA (Sigee 1984; Anderson et al. 1992; Bhaud et al. 1999).

Dinoflagellates of the genus Alexandrium such as Alexandrium catenella cause paralytic shellfish poisoning through saxitoxin production in Chile and in many geographical
areas of the world. In Chile, the first documented toxic bloom was reported in 1972 in Magallanes (Guzmán and Campodónico 1975). Since then, the dominant toxic dinoflagellate species in Chilean southern coastal areas and estuaries during toxic bloom events has been identified as A. catenella (Guzmán and Campodónico 1978).

The study of the molecular mechanisms that regulate growth, toxicity, photosynthesis, luminescence, and other circadian-controlled genes of A. catenella is of critical importance for understanding the physiological mechanisms and bloom formation capacity of Alexandrium species. However, there have been few studies regarding the molecular biology of A. catenella. Sequencing of complementary DNA libraries to generate expressed sequence tags (ESTs) is a reasonable approach for discovering expressed genes. ESTs can be used as markers for genes expressed under specific conditions, for predicting protein families, and for the development of expression systems for new proteins and their functions. Here, we have developed an EST library of A. catenella strain ACC07, isolated from Chilean waters, and have carried out large-scale sequencing to yield an EST database containing 10,850 ESTs and 6,496 unique genes. This database provides an important genomic resource for scientists working on the genus Alexandrium and related dinoflagellates.

Materials and Methods

Strains and Media

Alexandrium catenella clone ACC07, isolated in Aysén, Chile, in 1994, was used. Cells were grown in f/2 medium (Guillard 1995) at 16°C in a 10:14 light/dark photoperiod. Axenic cultures were obtained according to Uribe and Espejo (2003). Briefly, cells were subjected to sequential washing and filtration through an 11-μm pore-size nylon mesh with 0.05 mg/ml gentamicin and 0.2 mg/ml penicillin G. Bacterial presence was determined by direct or epifluorescence microscopy after staining with acridine orange (Imai 1987; Kuwae and Hosokawa 1999). This procedure allowed the detection of less than one bacterium per dinoflagellate. The cell extracts of A. catenella strain ACC07 contained approximately 5–8 femtomoles of saxitoxin equivalents/cell.

cDNA Library Preparation

Approximately 6×10^6 cells from an exponential phase culture were collected by centrifugation at 1,000×g for 5 min during the light phase and were broken by four successive cycles of freezing, grinding, and thawing. Approximately 400 μg of total RNA was extracted using Trizol (Gibco BRL, Life Technologies, Gaithersburg, MD, USA), according to the manufacturer’s directions, and quantified spectrophotometrically. Poly A+ mRNA was isolated with the Poly (A) Quick mRNA isolation kit (Stratagene, La Jolla, CA, USA). cDNA was prepared from approximately 5 μg of polyA+ mRNA and cloned using the vector pExpress 1, exploiting the NotI and EcoRV restriction endonuclease sites. Double-strand cDNA synthesis was performed according to the manufacturer’s directions and quantified spectrophotometrically. The cDNA libraries were not normalized. Sequencing reactions were carried out from the 5′ end of the cDNA insert using the universal primers M13FWD (5′-gtaaaacgacggccagt-3′) and M13REV (5′-caggaaacagctatgac-3′).

Computational Sequence Analysis and ESTs Assembly

Vector-derived, ribosomal, and ambiguous sequences were removed from the collected EST sequences. EST sequences were assembled in clusters with a minimum value of 95% identity for at least a 50-bp region of overlap using the CAP3 program (Huang and Madan 1999). Clusters and singletons generated were designated as unigenes and were then subjected to similarity searches against the National Center for Biotechnology Information nonredundant protein database, using the BLASTX algorithm (Altschul et al. 1990). Initially, sequence similarities were considered to be significant when the E value was below e−5 at the amino acid sequence level. However, a stricter criterion with a cut-off E value of e−20 or less was also used in the analysis. The InterProScan (Mulder et al. 2007), gene ontology (the gene ontology consortium 2007), and clusters of orthologous groups (COGs) (Tatusov et al. 2003) databases were used to infer the functional classification of the predicted proteins.

Results and Discussion

Characteristics of the A. catenella cDNA Library

The cDNA library obtained had a titer of 1.1×10^6 colony forming units per milliliter for a total of 1×10^7 primary recombinants. Blue/white plaque identification following plating of an aliquot of the library revealed 99% recombinant plaques. The quality of the library was assessed by examining the insert size of 768 (2×384 well plates) randomly selected recombinant plaques. The average insert size was 1.7 kb, a value similar to that of a recent cDNA library of Karenia brevis (Lidie et al. 2005). The average size of the sequenced clones was 763 base pairs, and about 83% of the sequenced cDNA clones contained inserts that were longer than the single sequence read. The global G+C content for these ESTs was 56.8%. This value is similar to that obtained for the coding regions of Alexandrium tamarense (60.8%) (Hackett et al. 2005) and in the range of the values obtained for other dinoflagellates such as K. brevis (51%), Lingulodinium polyedrum (59.0%), Amphidinium carterae (50.4%), and Crypthecodinium cohnii (50%) (Lidie et al. 2005 and references therein).

Analysis of the codon usage revealed a major use of G (35.1%) and C (44.8%) at the third position, similar to that obtained for A. tamarense (37.2% and 40.7%, respectively). The most frequent stop codon was TGA (72.7%), compared to TAA and TAG (6.5% and 20.7%, respectively).

Generation and Annotation of Expressed Sequence Tags

EST sequences were produced from the cDNA library and scanned visually to confirm overall quality of peak shape and correspondence with base identification. After the cleaning process, the average length per EST of the remaining sequences (9,847) was 736 base pairs and the Phred quality value was larger than 20. The sequences were assembled into 1,021 contigs (clusters of assembled ESTs) and 5,475 singletons (sequences found only once) (Table 1). The sequences of 9,847 ESTs have been deposited in GenBank with accession numbers EX454357–EX464203.

Table 1 Overview of the results from the A. catenella genomic library
                              Number of sequences
Total ESTs sequenced          10,859
Total valid ESTs              9,847
Average length per EST        736 bp
Number of contigs             1,021
Number of singletons          5,475
Total unigenes                6,496
Percentage known unigenes     41.6

Contigs were composed of multiple ESTs ranging from 2 to 438. The percentage of unigene sequences with similarity to the GenBank database was 41.6%. This EST collection constitutes one of the largest dinoflagellate libraries deposited (Lidie et al. 2005; Hackett et al. 2005; Tanikawa et al. 2004). The total number of unigenes was 6,496, corresponding to less than half of the total sequences obtained (Table 1). The ratio of sequenced ESTs to the number of unigenes is similar to that reported for other dinoflagellate EST libraries.

Using a cut-off E value of e−5 or less, a total of 5,443 ESTs corresponding to 2,700 unigenes were found to have similarity to previously identified genes from a wide variety of organisms. Alexandrium catenella sequences were classified according to the organism with the best protein sequence hit. A significant proportion of the ESTs show similarity to genes of dinoflagellates (32%), plants (15%), and other protista (protozoa, ciliates, and other microalgae, 13%). Different percentages were found in a recent EST library analysis of the dinoflagellate L. polyedrum, where the groups most frequently found were land plants and animals (21% and 16%, respectively), whereas similarities with prokaryotes, flagellates, and protozoa were 14%, 14%, and 13%, respectively (Tanikawa et al. 2004). A similar analysis was carried out using an E value of e−20 or less. In this case, a total of 3,460 ESTs corresponding to 1,546 unigenes were found to have similarity to previously identified genes. As shown in Fig. 1, a large proportion of the ESTs show a high level of similarity to genes of dinoflagellates (47%), plants (18.1%), and other protista (protozoa, ciliates, and other microalgae, 13.4%).

The unigenes could be assigned to known COGs. The most represented groups of proteins in ESTs corresponding to cellular processes are those related to luminescence (14.5%), carbohydrate metabolism (13.6%), amino acid metabolism (12.6%), protein modification (10.9%), and photosynthesis (8.3%). Using an E value of e−20 or less, the most represented groups of proteins in ESTs corresponding to cellular processes are those related to luminescence (18.4%), carbohydrate metabolism (14.5%), amino acid metabolism (15%), protein modification (11.7%), and photosynthesis (8.4%) (Fig. 2).

Among the first two categories, the majority of the predicted proteins corresponded to those from dinoflagellates. Proteins from plants were the most represented in categories such as amino acid and nucleic acid metabolism. On the other hand, proteins from protozoa were the most represented in translation and cellular cycle categories. Some categories of proteins such as transport and those with noncharacterized function were similarly distributed among different taxonomic groups of eukaryotes and prokaryotes. This distribution of categories of proteins among different taxonomic groups was similar when E values of e−30 or less were considered. In summary, by using an E value of e−20 as cut-off, a higher degree of specificity was obtained, resulting in an increased percentage of proteins from dinoflagellates, protozoa, plants, and other microalgae.

Highly Represented Genes

The contigs containing the highest number of ESTs (analyzed with an E value of e−20) are listed in Table 2. The sequence coding for luciferin-binding protein (LBP) was the most abundant transcript in the library with 80 unigenes (3%) representing 539 ESTs (15.6% of the total ESTs). This gene was also highly expressed in L. polyedrum (Machabée et al. 1994) and also highly expressed (4%) in a previous study of a normalized EST library of this dinoflagellate during the night phase (Tanikawa et al. 2004). Similar results were reported in A. tamarense (Hackett et al. 2005). Recently, in a global transcriptional profiling of the toxic dinoflagellate Alexandrium fundyense, four of the 15 signature sequences matched with the LBP
Fig. 1 Taxonomic group distribution of targets with the best hit by A. catenella ESTs considering an E value of less than 10^−20 to the National Center for Biotechnology Information protein nonredundant database
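
The effect of the E-value cut-off on a best-hit taxonomic distribution of the kind summarized in Fig. 1 can be sketched as follows. This is an illustration in Python, not the pipeline used for the library: the hit records and EST identifiers are invented, the function name `group_distribution` is hypothetical, and the cut-offs e−5 and e−20 are assumed to denote 10^−5 and 10^−20.

```python
from collections import Counter

# Hypothetical best-hit records: (EST id, taxonomic group of best hit, E value).
# These rows are invented for illustration; they are not data from the library.
best_hits = [
    ("est_0001", "dinoflagellates", 1e-121),
    ("est_0002", "plants", 3e-40),
    ("est_0003", "plants", 2e-8),
    ("est_0004", "other protista", 5e-25),
    ("est_0005", "prokaryotes", 4e-6),
    ("est_0006", "dinoflagellates", 6e-64),
]

def group_distribution(hits, cutoff):
    """Return {group: percent of retained hits} for hits with E value <= cutoff."""
    kept = [group for _, group, evalue in hits if evalue <= cutoff]
    counts = Counter(kept)
    total = len(kept)
    if total == 0:
        return {}
    return {group: 100.0 * n / total for group, n in counts.items()}

# Assumption: "e-5" and "e-20" in the text are read as 1e-5 and 1e-20.
relaxed = group_distribution(best_hits, 1e-5)
strict = group_distribution(best_hits, 1e-20)

print(relaxed)
print(strict)
```

In this toy data set the stricter cut-off discards the weakest hits, so the share of dinoflagellate best hits rises from one third to one half; the direction of the change, not the numbers, is what mirrors the comparison reported for the library.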

Fig. 2 Distribution of A. catenella ESTs into the GO categories of cellular processes
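
The summary figures quoted in the text and in Tables 1 and 2 can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes that the percentages in Table 2 are expressed relative to the 3,460 ESTs with a hit at an E value of e−20 or less, and that the 41.6% of known unigenes refers to the 2,700 unigenes with a hit at e−5; both denominators are inferred from the reported numbers and are not stated explicitly by the authors. The helper `percent` is a hypothetical name.

```python
# Counts copied from the text and from Tables 1 and 2.
contigs, singletons = 1021, 5475
unigenes_with_hit_e5 = 2700   # unigenes with a hit at an E value of e-5 or less
ests_with_hit_e20 = 3460      # ESTs with a hit at e-20 or less (assumed denominator)

def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the tables."""
    return round(100.0 * part / whole, 1)

total_unigenes = contigs + singletons
known_unigenes_pct = percent(unigenes_with_hit_e5, total_unigenes)

# Three rows of Table 2: protein -> number of ESTs.
table2_counts = {
    "LBP": 539,
    "S-adenosyl-L-homocysteine hydrolase": 169,
    "LCF": 35,
}
table2_pct = {name: percent(n, ests_with_hit_e20) for name, n in table2_counts.items()}

print(total_unigenes, known_unigenes_pct, table2_pct)
```

With these inputs the sketch gives 6,496 unigenes, 41.6% known unigenes, and 15.6%, 4.9%, and 1.0% for the three Table 2 rows, which agrees with the reported values under the stated assumptions.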
gene (Erdner and Anderson 2006). In the present study, the sequences of the two luminescence proteins of A. catenella were subjected to a more detailed analysis.

Table 2 Most highly represented ESTs in A. catenella cDNA library
Protein                                                      Number of ESTs   %
LBP                                                          539              15.6
S-adenosyl-L-homocysteine hydrolase                          169              4.9
Glyceraldehyde-3-phosphate dehydrogenase isoform 2 (GPDH)    128              3.7
S-adenosylmethionine synthetase 2                            105              3.0
Actin                                                        87               2.5
EF-1 alpha-like protein                                      80               2.3
Fumarate reductase                                           71               2.1
Peridinin chl a binding protein                              62               1.8
Hsp90                                                        62               1.8
Phosphoglycerate kinase                                      59               1.7
Ribonucleoside-diphosphate reductase R2                      56               1.6
Hsp70                                                        54               1.6
Chloroplast phosphoribulokinase                              49               1.4
Polyubiquitin                                                42               1.2
Chloroplast light harvesting complex protein                 40               1.2
LCF                                                          35               1.0
Light-harvesting polyprotein precursor                       30               0.9

Luciferin-binding Protein

The complete sequence of the LBP coding region of A. catenella ACC07 was obtained. It comprises 2,194 nucleotides, corresponding to 663 amino acids. Sequencing the genomic coding region indicated the lack of introns, and after expression in bacteria, an 80-kDa protein was obtained (data not shown). The LBP contains four domains with low identity (15%) between them. The highest similarity in the EST database was found with A. tamarense (Table 3). At the amino acid level, the highest similarity (76%) was found with L. polyedrum. As found previously in Lingulodinium, the amino terminal region of approximately 100 amino acids of LBP of A. catenella is similar (50%) to the equivalent region of luciferase (LCF). This is the first complete sequence of LBP reported in a toxic strain of the genus Alexandrium (accession number EU236684).

Luciferase

Another highly expressed luminescence protein was the LCF. Complete sequence analysis of the 3,476 nucleotides coding for the A. catenella enzyme showed that the most closely related were those from A. tamarense and A. affine (94% identity) (Liu et al. 2004). The sequence contains no introns and presents three domains with an identity of 76% between them, a significantly lower value than the identity obtained when these domains are compared to other dinoflagellate species (Liu et al. 2004). Internal regions of each domain are the most conserved, corresponding to the probable catalytic site of this enzyme. Four conserved histidines are present, at the following positions within each domain: first domain (D1): H138, H148, H163, and H169; second domain (D2): H512, H525, H540, and H546; and third domain (D3): H891, H901, H916, and H922. These histidines were previously reported in L. polyedrum and are probably related to the pH regulation of the activity of this enzyme (Li et al. 1997). The first and the third domains of the LCF were expressed in bacteria and the products were 60 and 45 kDa, respectively. The three domains of this protein have been shown to be functional in L. polyedrum (Li et al. 1997).

The synthesis of the two luminescence proteins LCF and LBP of Lingulodinium is regulated translationally; their mRNA and protein levels remain constant over the circadian cycle (Machabée et al. 1994). Remarkably, both LCF and LBP in L. polyedrum are destroyed at the end of the night phase and then resynthesized in the next cycle. Moreover, the scintillons themselves are broken down and reformed each day (Machabée et al. 1994).

Although the ecological function of the luminescence in dinoflagellates has not been determined, it is probably related to predation avoidance and communication (Esaias and Curl 1972; Abrahams and Townsend 1993). Taking into account that the luminescence proteins are among the most expressed in A. catenella, a high proportion of the energy of this dinoflagellate is probably dedicated to this particular physiological response. Taken together, the specific characteristics of the luminescent proteins and their expression patterns in a paralytic shellfish poisoning producing dinoflagellate such as A. catenella are of special relevance for unveiling the mechanisms of bloom formation in a toxin-producing species. These specific features could be useful for the development of new tools for the detection and localization of this toxic species using bio-optical instruments (Seliger et al. 1961; Widder et al. 1993).

Also highly expressed are transcripts that show a very high similarity to the enzymes S-adenosyl-L-homocysteine hydrolase and S-adenosylmethionine synthetase 2 (E values of e−129 and e−112, respectively). These enzymes are involved in methylation reactions that play a major role in the modification of a large variety of acceptor molecules, such as lipids, polysaccharides, nucleic acids, proteins, and secondary plant products (reviewed by Giovanelli 1987). In eukaryotes, DNA methylation has been implicated in the control of several cellular processes, including differentiation, gene regulation, and embryonic development (Cheng 1995). The high expression level of genes that matched with the two heat shock proteins HSP90 and HSP70 sequences was also remarkable. These proteins participate in various cellular processes including signal transduction, protein
Table 3 Photosynthesis and light harvesting proteins of A. catenella EST library

GenBank accession number Function E value % Identity Organism

Photosynthesis
EX456598 Chloroplast photosystem II 12 kDa extrinsic protein (PsbU) 6.00E-64 6.00E-64 Alexandrium tamarense
EX455192 Photosystem II 23 kDa polypeptide (PsbP) E-121 86 Phakopsora pachyrhizi
EX458868 PSII cytochrome c550 oxygen-evolving (PsbV) E-108 91 Alexandrium tamarense
EX455275 Plastid oxygen evolving enhancer 1 precursor (PsbO) 4.00E-71 94 Alexandrium tamarense
EX456053 Chloroplast cytochrome f (PetA) E-116 90 Alexandrium tamarense
EX457854 Chloroplast ferredoxin (PetF) 9.00E-90 83 Alexandrium tamarense
EX455467 Chloroplast ferredoxin-NADP(+) reductase (PetH) E-127 90 Heterocapsa triquetra
EX455749 Rieske iron–sulfur protein precursor (PetC) E-125 94 Alexandrium tamarense
EX463406 Photosystem I iron–sulfur center (PsaC) 3.00E-87 98 Alexandrium tamarense
EX456301 Chloroplast photosystem I subunit XI (PsaL) 3.00E-40 74 Heterocapsa triquetra
EX456206 PSI, ferredoxin-binding protein II (PsaD) 3.00E-51 90 Symbiodinium sp.
EX462386 Chloroplast photosystem I, subunit III (PsaF) E-101 93 Alexandrium tamarense
EX459236 Chloroplast ATP synthase gamma subunit (AtpC) 1.00E-91 89 Alexandrium tamarense
EX462908 Chloroplast ATP synthase subunit C (AtpH) 4.00E-76 76 Alexandrium tamarense
EX462123 Chloroplast light harvesting complex protein 2.00E-84 85 Lingulodinium polyedrum
EX460350 Peridinin-chlorophyll a-binding protein (PCP) 4.00E-75 89 Lingulodinium polyedrum
EX463946 Chloroplast phosphoribulokinase E-133 89 Amphidinium carterae
EX463746 Chloroplast transketolase E-114 80 Euglena gracilis
EX462931 Cytosolic class II fructose bisphosphate aldolase E-130 94 Heterocapsa triquetra
EX462518 Glyceraldehyde-3-phosphate dehydrogenase isoform 2 E-159 91 Symbiodinium sp.
EX455341 Phosphoglycerate kinase E-142 88 Karenia brevis
EX455303 Ribose-5-phosphate isomerase 5.00E-73 90 Phaeodactylum tricornutum
EX461668 RuBisCO form II E-153 96 Amphidinium carterae
EX461810 Triose-phosphate isomerase E-116 87 Isochrysis galbana
Chlorophyll synthesis
EX458222 NADPH protochlorophyllide reductase E-132 77 Phaeodactylum tricornutum
EX455862 Magnesium chelatase H-subunit E-114 88 Ostreococcus lucimarinus
EX455291 Mg-protoporphyrin IX (ChlI) 2.00E-68 88 Amphidinium carterae
EX455321 NADPH-protochlorophyllide oxidoreductase E-127 74 Phaeodactylum tricornutum
EX462775 Chloroplast geranylgeranyl reductase/hydrogenase 7.00E-50 81 Heterocapsa triquetra
EX456318 Glutamate 1-semialdehyde 2,1-aminomutase 7.00E-83 89 Amphidinium carterae
Luminescence
EX458318 LCF E-115 81 Lingulodinium polyedrum
EU236684 LBP 0 97 Alexandrium tamarense
Light receptors
EX456649 Cryptochrome dash 9.00E-47 65 Euglena gracilis
EX462564 Rhodopsin E-71 78 Branchiostoma floridae
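
Table 3 mixes several notations for E values ("E-121", "6.00E-64", "0"). As a sketch only, the Python snippet below shows one way to turn these strings into numbers and compare them with the stricter cut-off used in the analysis. It assumes that "E-121" stands for 1E-121 and that the cut-off e−20 denotes 10^−20; the function name `parse_evalue` is hypothetical, and the four rows are copied from the table.

```python
def parse_evalue(text):
    """Convert an E value written as in Table 3 ('E-121', '6.00E-64', '0') to a float."""
    text = text.strip()
    if text[:1] in ("E", "e"):
        text = "1" + text  # assumption: 'E-121' is shorthand for 1E-121
    return float(text)

# A few (accession, E value string) pairs copied from Table 3.
rows = [
    ("EX455192", "E-121"),
    ("EX456301", "3.00E-40"),
    ("EX456649", "9.00E-47"),
    ("EU236684", "0"),
]

CUTOFF = 1e-20  # assumption: the text's "e-20" is read as 10^-20

# Accessions whose E value is at or below the cut-off.
passing = [accession for accession, evalue in rows if parse_evalue(evalue) <= CUTOFF]
print(passing)
```

All four example rows pass the cut-off, as expected for entries selected with the stricter criterion.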

folding, protein degradation, and morphological evolution (Lindquist and Craig 1988). HSP70 proteins can be found in different cellular compartments and have a role in the disassembly of clathrin cages and also participate in the post-translational transmembrane targeting of proteins to cellular organelles (Craig 1989). The sequences coding for these proteins have also been found in high frequency in the EST library of the dinoflagellate A. tamarense (Hackett et al. 2005).

Photosynthesis and Light Harvesting Genes

None of the 15 known plastid-encoded genes from peridinin-producing dinoflagellates were represented among the ESTs of the library (Zhang et al. 1999). Thus, the 30 photosynthesis unigenes represented in the A. catenella cDNA library are probably encoded in the nucleus (Table 3). All these plastid protein sequences contain tripartite N-terminal targeting signals that are shown to direct the trafficking of these proteins through the different membranes of the dinoflagellate secondary plastids. The distribution of these signal elements in A. catenella plastid protein sequences was equivalent to those observed in the dinoflagellate Heterocapsa triquetra (Patron et al. 2005).

The origin of these nuclear encoded plastid protein sequences is suggested by the relatively high similarity with those present in other peridinin-pigmented dinoflagellates (Table 3). The nuclear location of these genes can be verified
by using the spliced leader sequence recently found in the nuclear-encoded mRNAs of dinoflagellates (Zhang et al. 2007). As expected, within this group, the majority (accession numbers: EX455192, EX455275, EX455467, EX455749, EX456053, EX456206, EX456301, EX456598, EX457854, EX458868, EX459236, EX462386, EX462908, and EX463406) belong to the related species A. tamarense (from the A. catenella–tamarense–fundyense species complex) (Scholin et al. 1995), followed by those from A. carterae, H. triquetra, L. polyedrum, and Symbiodinium sp. Only a few ESTs were similar to chloroplast sequences from the fucoxanthin-pigmented K. brevis, or from other organisms such as euglenoids, green algae, and stramenopiles, which have a different but parallel origin of the plastid proteins.

The most expressed transcripts with a high similarity to photosynthesis genes were those predicted to encode the light harvesting complex, composed of a chlorophyll a-/c- and peridinin-binding protein, and those corresponding to a number of proteins of the light phase of photosynthesis, such as photosystems I and II, cytochrome b6f, and ATP synthase (Patron et al. 2005) (Table 3). Highly expressed are the unigenes that are highly similar to the carbon fixation enzyme glyceraldehyde-3-phosphate dehydrogenase isoform 2, which was 86% identical to the one from L. polyedrum. This enzyme participates in the aldehyde formation during the Calvin cycle in the dark phase of photosynthesis. Sequences coding for this enzyme were also found among the highest expressed in other EST libraries of different dinoflagellates such as L. polyedrum (Bachvaroff et al. 2004), A. tamarense (Hackett et al. 2005), K. brevis (Lidie et al. 2005), and A. fundyense (Taroncher-Oldenburg and Anderson 2000). Other highly expressed genes similar to sequences encoding carbon fixation proteins were the phosphoglycerate kinase and the chloroplast phosphoribulokinase. The library contains the coding sequences of six enzymes related to chlorophyll synthesis and two enzymes involved in the synthesis of photoprotective pigments (Table 3).

We have also found sequences with high similarity to light receptors. One has 77% identity to the green light receptor (450 and 500 nm) type 1 rhodopsin described in Pyrocystis lunula (Okamoto and Hastings 2003) and to those from the marine cryptophyte Guillardia theta (Sineshchekov et al. 2005) and Cryptomonas spp. (29% and 26%, respectively). Type 1 rhodopsins have recently been described in the green alga Chlamydomonas reinhardtii, where they function as receptors for phototaxis responses (Sineshchekov et al. 2002). This photosensitive protein is similar to γ-proteobacterial rhodopsins and more abundantly expressed during the early day hours (Okamoto and Hastings 2003). Sequences that correspond to a second photosensory receptor, the cryptochrome DASH protein, a blue light (400–500 nm) and UV-A (320–400 nm) receptor, were found. This protein, which is involved in the light regulation of growth and development in plants and other cellular processes such as growth and the induction of sexual reproduction in algae (Liscum et al. 2003), shows 30% identity to those from K. brevis and Arabidopsis thaliana. We consider that these light receptors are an interesting subject of study in relation to the high level of expression of blue light luminescence proteins in A. catenella, considering the probable role of the luminescence in the cellular communication of dinoflagellates.

Other Proteins

Two A. catenella unigenes show 100% identity with a toxic strain-specific sequence of A. tamarense (AT-T1), previously identified as a biomarker of toxicity by Chan et al. (2006). Both A. catenella sequences also show similarity to unknown proteins of the nontoxic dinoflagellate H. triquetra. These sequences contain signal peptide sequences, suggestive of a plastid targeting protein (Patron et al. 2005). We have also found an A. catenella unigene coding for a protein with a high level of similarity to two interesting conjugation-induced proteins, SPS19 from Saccharomyces cerevisiae and eIF-4A, a eukaryotic initiation factor that was found recently to be induced during conjugation in the dinoflagellates A. catenella and A. tamarense (Hosoi-Tanabe et al. 2005).

Fig. 3 Venn diagram of the comparison between the A. catenella ESTs with the genomes of A. thaliana; T. pseudonana; C. merolae; Entamoeba histolytica; and Plasmodium falciparum. The number and percentage of the homologous sequences of A. catenella with each organism is indicated in the intersection
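
The counts and percentages shown in a Venn diagram such as Fig. 3 amount to set operations on the unigenes that have a significant hit in each genome. The Python sketch below illustrates the idea under stated assumptions: the unigene identifiers, the per-genome hit sets, and the library size of 10 are all invented and do not reproduce the values in the figure.

```python
# Hypothetical sets of A. catenella unigene IDs with a significant hit in each genome.
hits = {
    "A. thaliana": {"u1", "u2", "u3", "u5"},
    "T. pseudonana": {"u2", "u3", "u4"},
    "C. merolae": {"u3", "u5", "u6"},
}
total_unigenes = 10  # hypothetical library size, for illustration only

# Percentage of unigenes with a homolog in each genome.
percent_with_hit = {
    organism: 100.0 * len(ids) / total_unigenes for organism, ids in hits.items()
}

# Central region of the diagram: unigenes with a hit in every genome.
shared_by_all = set.intersection(*hits.values())

# One outer region: unigenes with a hit only in A. thaliana.
only_thaliana = hits["A. thaliana"] - hits["T. pseudonana"] - hits["C. merolae"]

print(percent_with_hit, shared_by_all, only_thaliana)
```

The same pattern extends to five genomes; only the number of intersections to enumerate grows.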
Two sequences code for a protein with a cysteine-rich region, which has similarity to the EhV_307 protein from the Emiliania huxleyi virus (Wilson et al. 2005). Other viral sequences from the Paramecium bursaria Chlorella virus were also found but with a lower similarity.

Genes predicted to encode a diversity of proteins involved in transport processes were detected; among them were Na, K, Ca, phosphate, and ammonium channels, and also antiporters; ABC-transporters; amino acid transporters; and the Sec61 and SecY translocases, involved in secretion pathways in eukaryotes. Thirteen sequences that correspond to transposable elements previously described by Armbrust et al. (2004) in the Thalassiosira pseudonana genome were found with a relatively low similarity.

Comparative Genomics

The A. catenella protein database was compared with genomes of the plant A. thaliana and with genomes of unicellular eukaryotes of the protista kingdom, such as T. pseudonana, Entamoeba histolytica, Cryptosporidium hominis, and the red alga Cyanidioschyzon merolae (Fig. 3). The Venn diagram shows the highest similarity with A. thaliana (19.3%), the diatom T. pseudonana (19.1%), and C. merolae (18.3%) (Fig. 3). We observed a similar distribution of functional groups (COGs) among the sequences in common with those five organisms (not shown).

When the unigenes of this library were compared with 10,886 ESTs of the closely related species A. tamarense (Hackett et al. 2005) present in the public database, we found 3,045 (46.9%) hits. From them, the 1,236 common unigenes were classified into COGs, and the most represented categories corresponded to carbohydrate metabolism (11.8%), posttranslational modification and chaperones (9.1%), energy production (7.4%), and luminescence (6.6%).

Acknowledgments This research has been partially funded by CONICYT-FONDEF Project MR02I1003 and by a Microsoft Research Joint R&D Program.

References

Abrahams MV, Townsend LD (1993) Bioluminescence in dinoflagellates: a test of the burglar alarm hypothesis. Ecology 74:258–260
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410
Anderson DM, Grabher A, Herzog M (1992) Separation of coding sequences from structural DNA in the dinoflagellate Crypthecodinium cohnii. Mol Mar Biol Biotechnol 1:89–96
Armbrust EV, Berges JA, Bowler C, Green BR, Martinez D, Putnam NH, Parker MS, Palenik B, Pazour GT, Richardson PM, Rynearson TA, Saito MA, Schwartz DC, Thamatrakoln K, Valentin K, Vardi A, Wilkerson FP, Rokhsar DS (2004) The genome of the diatom Thalassiosira pseudonana: ecology, evolution, and metabolism. Science 306:79–86
Bachvaroff TR, Concepcion GT, Rogers CR, Herman EM, Delwiche CF (2004) Dinoflagellate expressed sequence tag data indicate massive transfer of chloroplast genes to the nuclear genome. Protist 155:65–78
Bhaud Y, Geraud M, Ausseil J, Soyer-Gobillard MO, Moreau H (1999) Cyclic expression of a nuclear protein in a dinoflagellate. J Eukaryot Microbiol 46:259–267
Chan LL, Sit WH, Lam PK, Hsieh DP, Hodgkiss IJ, Wan JM, Ho AY, Choi NM, Wang DZ, Dudgeon D (2006) Identification and characterization of a “biomarker of toxicity” from the proteome of the paralytic shellfish toxin-producing dinoflagellate Alexandrium tamarense (Dinophyceae). Proteomics 6:654–666
Cheng X (1995) DNA modification by methyltransferases. Curr Opin Struct Biol 5:4–10
Craig EA (1989) Essential roles of 70 kDa heat inducible proteins. Bioessays 11:48–52
Erdner DL, Anderson DM (2006) Global transcriptional profiling of the toxic dinoflagellate Alexandrium fundyense using massively parallel signature sequencing. BMC Genomics 7:88
Esaias WE, Curl HC Jr (1972) Effect of dinoflagellate bioluminescence on copepod ingestion rates. Limnol Oceanogr 17:901–906
Giovanelli J (1987) Sulfur amino acids of plants: an overview. Methods Enzymol 143:419–426
Guillard R (1995) Culture methods. In: Hallegraeff GM, Anderson DM, Cembella AD (eds) IOC manuals and guides: manual on harmful marine microalgae. Intergovernmental Oceanographic Commission of UNESCO, Paris, pp 45–62
Guzmán L, Campodónico I (1975) Marea Roja en la región de Magallanes. Publ Inst Pat Ser Monogr Punta Arenas (Chile) 9:44
Guzmán L, Campodónico I (1978) Mareas Rojas en Chile. Interciencia 3:144–151
Hackett JD, Scheetz TE, Yoon HS, Soares MB, Bonaldo MF, Casavant TL, Bhattacharya D (2005) Insight into a dinoflagellate genome through expressed sequence tag analysis. BMC Genomics 6:80
Hallegraeff G (1993) A review of harmful algal blooms and their apparent global increase. Phycologia 32:79–99
Hosoi-Tanabe S, Tomishima S, Nagai S, Sako Y (2005) Identification of a gene induced in conjugation-promoted cells of toxic marine dinoflagellate Alexandrium tamarense and Alexandrium catenella using differential display analysis. FEMS Microbiol Lett 251:161–168
Huang X, Madan A (1999) CAP3: a DNA sequence assembly program. Genome Res 9:868–877
Imai I (1987) Size distribution, number and biomass of bacteria in intertidal sediments and seawater of Ohmi Bay, Japan. Bull Jpn Soc Microb Ecol 2:1–11
Kuwae T, Hosokawa Y (1999) Determination of abundance and biovolume of bacteria in sediments by dual staining with 4′,6-diamidino-2-phenylindole and acridine orange: relationship to dispersion treatment and sediment characteristics. Appl Environ Microbiol 65:3407–3412
Li L, Hong R, Hastings JW (1997) Three functional luciferase domains in a single polypeptide chain. Proc Natl Acad Sci U S A 94:8954–8958
Lidie KB, Ryan JC, Barbier M, Van Dolah FM (2005) Gene expression
Zhou S, Allen AF, Apt KE, Bechner M, Brzezinski MA, Chaal BK, in Florida Red Tide Dinoflagellate Karenia brevis: Analysis of an
Chiovitti A, Davis AK, Demarest MS, Detter JC, Glavina T, expressed sequence tag library and development of a DNA
Goodstein D, Hadi MZ, Hellsten U, Hildebrand M, Jenkins BD, microarray. Mar Biotechnol 7:481–493
Jurka J, Kapitonov VV, Kroger N, Lau WW, Lane T, Larimer FW, Lin S (2006) The smallest dinoflagellate genome is yet to be found: A
Lippmeier JC, Lucas S, Medina M, Montsant A, Obornik M, comment on LaJeunesse et al. “Simbiodinium (Pyrrophyta)
genome sizes (DNA content) are smallest among dinoflagellates”. J Phycol 42:746–748
Lindquist S, Craig EA (1988) The heat-shock proteins. Annu Rev Genet 22:631–677
Liscum E, Hodgson DW, Campbell TJ (2003) Blue light signaling through the cryptochromes and phototropins. So that’s what the blues is all about. Plant Physiol 133:1429–1436
Liu L, Wilson T, Hastings JW (2004) Molecular evolution of dinoflagellate luciferases, enzymes with three catalytic domains in a single polypeptide. Proc Natl Acad Sci U S A 101(47):16555–16560
Machabee S, Wall L, Morse D (1994) Expression and genomic organization of a dinoflagellate gene family. Plant Mol Biol 25:23–31
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJ, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C (2007) New developments in the InterPro database. Nucleic Acids Res 35:D224–D228
Okamoto OK, Hastings JW (2003) Novel dinoflagellate clock-related genes identified through microarray analysis. J Phycol 39:519–526
Patron NJ, Waller RF, Archibald JM, Keeling PJ (2005) Complex protein targeting to dinoflagellate plastids. J Mol Biol 348:1015–1024
Santos SR, Coffroth MA (2003) Molecular genetic evidence that dinoflagellates belonging to the genus Symbiodinium Freudenthal are haploid. Biol Bull 204:10–20
Scholin CA, Hallegraeff GM, Anderson DM (1995) Molecular evolution of the Alexandrium tamarense species complex (Dinophyceae): dispersal in the North American and west Pacific regions. Phycologia 34:472–485
Seliger HH, Fastie WG, McElroy WD (1961) Bioluminescence in Chesapeake Bay. Science 133:699–700
Sigee DC (1984) Structural DNA and genetically active DNA in dinoflagellate chromosomes. Biosystems 16:203–210
Sineshchekov OA, Jung KH, Spudich JL (2002) Two rhodopsins mediate phototaxis to low- and high-intensity light in Chlamydomonas reinhardtii. Proc Natl Acad Sci U S A 99:8689–8694
Sineshchekov OA, Govorunova EG, Jung KH, Zauner S, Maier US, Spudich JL (2005) Rhodopsin-mediated photoreception in cryptophyte flagellates. Biophys J 89:4310–4319
Spector D (1984) Dinoflagellate nuclei. In: Spector DL (ed) Dinoflagellates. Academic, Orlando, pp 107–147
Sweeney B (1987) Bioluminescence and circadian rhythms. In: Taylor FJR (ed) The biology of dinoflagellates, botanical monographs, vol 21. Blackwell Scientific, Oxford
Tanikawa N, Akimoto H, Ogoh K, Chun W, Ohmiya Y (2004) Expressed sequence tag analysis of the dinoflagellate Lingulodinium polyedrum during dark phase. Photochem Photobiol 80:31–35
Taroncher-Oldenburg G, Anderson DM (2000) Identification and characterization of three differentially expressed genes, encoding S-adenosylhomocysteine hydrolase, methionine aminopeptidase, and a histone-like protein, in the toxic dinoflagellate Alexandrium fundyense. Appl Environ Microbiol 66:2105–2112
Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4:41
The Gene Ontology Consortium (2007) The Gene Ontology project in 2008. Nucleic Acids Res 34:D322–D326
Triplett EL, Govind NS, Roman SI, Jovine RVM, Prézelin BB (1993) Characterization of the sequence organization of DNA from the dinoflagellate Heterocapsa pygmaea (Glenodinium sp.). Mol Mar Biol Biotechnol 2:239–245
Uribe P, Espejo RT (2003) Effect of associated bacterial microflora in the growth and toxin production of Alexandrium catenella. Appl Environ Microbiol 69:659–662
Weinmaster G, Roberts VJ, Lemke G (1992) Notch2: a second mammalian Notch gene. Development 116:931–941
Widder EA, Case JF, Bernstein SA, MacIntyre S, Lowenstine MR, Bowlby MR, Cook DP (1993) A new large volume bioluminescence bathyphotometer with defined turbulence excitation. Deep-Sea Res 40:607–627
Wilson WH, Schroeder DC, Allen MJ, Holden MT, Parkhill J, Barrell BG, Churcher C, Hamlin N, Mungall K, Norbertczak H, Quail MA, Price C, Rabbinowitsch E, Walker D, Craigon M, Roy D, Ghazal P (2005) Complete genome sequence and lytic phase transcription profile of a Coccolithovirus. Science 309:1090–1092
Zhang Z, Green BR, Cavalier-Smith T (1999) Single gene circles in dinoflagellate chloroplast genomes. Nature 400:155–159
Zhang H, Hou Y, Miranda L, Campbell DA, Sturm NR, Gaasterland T, Lin S (2007) Spliced leader RNA trans-splicing in dinoflagellates. Proc Natl Acad Sci U S A 104:4618–4623
