Article info
Article history:
Received 1 May 2015
Received in revised form 12 August 2015
Accepted 24 August 2015
Available online 29 August 2015
Keywords:
RUNX1–RUNX1T1 fusion gene
Alternative splicing
Data mining
Exons-hubs
Power-law behavior
Abstract
The t(8;21) translocation is the most widespread genetic defect found in human acute myeloid leukemia.
This translocation results in the RUNX1–RUNX1T1 fusion gene, which produces a wide variety of alternative
transcripts and influences the course of the disease. The rules of combinatorics and splicing of exons in the
RUNX1–RUNX1T1 transcripts are not known. To address this issue, we developed an exon graph model of
the fusion gene organization and evaluated its local exon combinatorics by the exon combinatorial index
(ECI). Here we show that the local exon combinatorics of the RUNX1–RUNX1T1 gene follows a power-law
behavior: (i) the vast majority of exons have a low ECI, (ii) only a small part is represented by exons-hubs
of splicing with very high ECI values, and (iii) the distribution is scale-free and very sensitive to targeted
skipping of exons-hubs. Stochasticity of the splicing machinery and preferred usage of exons in alternative
splicing can explain such behavior of the system. Stochasticity may explain up to 12% of the ECI variance
and results in a number of non-coding and unproductive transcripts that can be considered as noise.
The half-life of these transcripts is increased due to the deregulation of some key genes of the
nonsense-mediated decay system in leukemia cells. On the other hand, preferred usage of exons may explain up
to 75% of the ECI variability. Our analysis revealed a set of splicing-related cis-regulatory motifs that
can explain the attractiveness of exons in alternative splicing, but only when they are considered together.
Cis-regulatory motifs are guides for splicing trans-factors, and we observed a leukemia-specific profile of
expression of the splicing genes in t(8;21)-positive blasts. Altogether, our results show that alternative
splicing of the RUNX1–RUNX1T1 transcripts follows strict rules and that the power-law component of
the fusion gene organization confers a high flexibility to this process.
© 2015 Elsevier Ltd. All rights reserved.
1. Introduction
The t(8;21) translocation occurs in 4–12% of adult and 12–30% of
pediatric cases of acute myeloid leukemia (AML) and represents the
most common genetic abnormality in human leukemias (Müller
et al., 2008). The main outcome of the translocation is the fusion
gene RUNX1–RUNX1T1, which produces a wide range of different
transcripts (Era et al., 1995; Erickson et al., 1992; Kozu et al., 1993,
2005; LaFiura et al., 2008; Lasa et al., 2002; Mannari et al., 2010;
Miyoshi et al., 1993; Nisson et al., 1992; Saunders et al., 1996; Tighe
and Calabi, 1994; Van de Locht et al., 1994; Yan et al., 2006; Zhang
et al., 1997). Some of these transcripts are protein-coding, the
others are non-coding. Both full-length and truncated isoforms of the
fusion protein have also been found experimentally. These isoforms are
transcriptional regulators with different activity (Kozu et al., 2005;
LaFiura et al., 2008; Mannari et al., 2010; Yan et al., 2006). It is
believed that RUNX1–RUNX1T1 proteins play a critical role in
the initiation and persistence of the t(8;21)-positive AML (Hatlen
et al., 2012).
V.V. Grinev et al. / The International Journal of Biochemistry & Cell Biology 68 (2015) 48–58
The large diversity of the RUNX1–RUNX1T1 transcripts raises the
question of whether there is any rule of exon combination and splicing. To
date, only some elements of this puzzle are known. For example, Tighe and
Calabi (1994) showed that the structure of the breakpoint region of
the fusion gene influences the variety of its transcripts. LaFiura et al.
(2008) found a connection between inclusion of cassette exons
from this region and formation of premature termination codons
(PTCs) in transcripts. At the same time, usage of some other alternative
exons does not lead to a PTC but produces active isoforms
of the protein (Mannari et al., 2010; Yan et al., 2006). However,
the available data are insufficient for a full understanding of the splicing
principles of the RUNX1–RUNX1T1 transcripts. Such
knowledge would allow us to clarify the organization of the fusion
gene, its properties and its functional role in leukemogenesis.
Our goal was to find out whether there is any pattern in the local
exon combinatorics of the fusion gene. In this article, the term "local
exon combinatorics" refers to the set of alternative splicing events
generating different mRNA isoforms from a given exon, whereas the
exon combinatorial index (ECI) is a quantitative measure of the
local exon combinatorics. Instead of the conventional linear model,
we used an exon graph model of the fusion gene in which the ECI
is equivalent to the topological index known as node degree and equals the
number of unique splicing events that involve an exon.
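The definition above can be illustrated with a minimal sketch. It assumes that each transcript is given as an ordered list of exon names and that a splicing event is a junction between two consecutive exons; the toy transcripts, the function names `exon_graph` and `eci`, and the split into in-, out- and total-ECI per exon are illustrative choices, not the authors' implementation.

```python
from collections import defaultdict

def exon_graph(transcripts):
    """Collect unique splicing events as directed (donor, acceptor) exon pairs."""
    events = set()
    for exons in transcripts:
        # Consecutive exons in a transcript are joined by one splicing event.
        events.update(zip(exons, exons[1:]))
    return events

def eci(events):
    """Return {exon: (in-ECI, out-ECI, total-ECI)} from unique splicing events."""
    in_eci, out_eci = defaultdict(int), defaultdict(int)
    for donor, acceptor in events:
        out_eci[donor] += 1
        in_eci[acceptor] += 1
    exons = set(in_eci) | set(out_eci)
    return {e: (in_eci[e], out_eci[e], in_eci[e] + out_eci[e]) for e in exons}

# Hypothetical toy transcripts; exon names are for illustration only.
toy = [
    ["1", "5", "6", "8b", "9"],
    ["1", "5", "8b", "9"],
    ["4a", "5", "6", "9"],
]
events = exon_graph(toy)
scores = eci(events)
for exon in sorted(scores, key=lambda e: -scores[e][2]):
    print(exon, scores[exon])
```

In this toy example the exon shared by all isoforms collects the most unique junctions, which is the sense in which total-ECI corresponds to node degree in the exon graph.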
Here we show that the empirical distribution of ECI values of the
RUNX1–RUNX1T1 exons follows a power-law function and has
some specific properties: the vast majority of exons have a low ECI
while a small part is represented by exons-hubs of splicing with
high ECI values, and the distribution is scale-free and sensitive to
targeted skipping of exons-hubs. This distribution is formed by
stochasticity of the splicing machinery and preferred usage of exons
in alternative splicing, where the attractiveness of an exon is mostly
determined by a set of sequence-related features. Altogether, our
results show that alternative splicing of the RUNX1–RUNX1T1
transcripts follows strict rules and that the power-law component
of the fusion gene organization confers a high flexibility to this
process.
2. Materials and methods
2.1. Cell line, patient and healthy donor samples
The t(8;21)-positive AML cell line Kasumi-1 (ATCC CRL-2724™) was obtained from the ATCC (LGC Standards GmbH,
Germany) and cultivated according to the standard protocol.
Twelve young patients with t(8;21)-positive AML were
diagnosed and treated at the Belarusian Research Center for Pediatric Oncology, Hematology and Immunology (Minsk, Belarus).
Mononuclear cells were isolated using Histopaque (Sigma–Aldrich,
St Louis, USA) from patients' bone marrow samples obtained before
the treatment and/or at the time of remission.
Bone marrow mononuclear cells (BMMNC) and peripheral blood
mononuclear cells (PBMNC) were obtained from primary material
of healthy donors using Histopaque (Sigma–Aldrich, St Louis, USA).
CD34+ hematopoietic progenitor/stem cells (HPSC) were isolated
from BMMNC of healthy individuals by magnetic separation with the
EasySep Human CD34 Positive Selection Kit (StemCell Technologies
SARL, Grenoble, France). For subsequent total RNA isolation, we used
only cell samples with purity of CD34+ HPSC 99%.
This study was approved by the institutional ethical committee
and our research team followed the principles of the Declaration of
Helsinki for research involving human subjects.
2.2. cDNA synthesis, standard RT-PCR and real-time PCR
Total cellular RNA was isolated from cells using TRI Reagent
(Sigma–Aldrich, St Louis, USA) according to the manufacturer's
instructions.
3. Results
3.1. The RUNX1–RUNX1T1 gene is a source of unprecedented
diversity of mRNA products
To reconstruct the exon graph, we created a comprehensive collection
of transcripts of the fusion gene. We identified 102 unique
full-length transcripts and 8 unique expressed sequence tags (ESTs)
of the gene of interest in the PubMed, GenBank and ChimerDB 2.0
databases (Benson et al., 2013; Era et al., 1995; Erickson et al., 1992;
Kim et al., 2010; Kozu et al., 1993, 2005; LaFiura et al., 2008; Lasa
et al., 2002; Mannari et al., 2010; Miyoshi et al., 1993; Nisson et al.,
1992; Saunders et al., 1996; Sayers et al., 2012; Tighe and Calabi,
1994; Van de Locht et al., 1994; Yan et al., 2006; Zhang et al., 1997).
In these sources, the exon structure of all transcripts was described,
but the nucleotide sequence of some rare and unique exons was
not published. Therefore, we were able to fully reconstruct the
nucleotide sequence for 61.8% of the full-length transcripts, and the
sequence of the remaining transcripts was restored only partially.
To complete our collection, we created a cDNA library. The
library is based on cDNA from bone marrow samples of 12 young
patients with t(8;21)-positive AML (Supplementary Table 3) and
Kasumi-1 cells. For cDNA amplification, we used forward primers
directed to 5′ UTR exons 1, 4a/4b, 7a/7c, 7d, 8a and 11a and reverse
primers directed to 3′ UTR exons 12a, 15a, 17a and 17 of the fusion
gene. We also used primers specific to internal exons to amplify
rare and poorly detected transcripts (Fig. 1; Supplementary Tables
4 and 5).
In our cDNA library, we identified 33 new full-length and 55
short EST-like transcripts (Supplementary Table 1). This helped us
to expand significantly the list of known transcripts of the fusion
gene: the current collection includes 135 full-length and 63 EST-like
sequences. Of the 55 newly found ESTs, 30 sequences matched the
full-length transcripts of the fusion gene only partially. This means
that t(8;21)-positive leukemia harbors a subset of rare or poorly
amplified full-length transcripts that have not been identified so far.
3.2. Power-law behavior of the local combinatorics of the
RUNX1–RUNX1T1 exons
To find out the character of the local exon combinatorics, we
developed an exon graph of the fusion gene organization. This exon
graph is based on full-length transcripts and includes 99 exons
connected by 163 splicing events (Fig. 2A).
We quantified the exon usage in different alternative splicing
events by the exon graph topology analysis and expressed this
metric with ECI values. This index falls in the range from 1 to 34
with a high standard deviation of 5.1. Visual inspection of the ECI
value distribution led us to the hypothesis that this index follows
a power-law function. To test this hypothesis, we used a three-step
approach (Section 2) based on the mathematical formalism
of Clauset et al. (2009) and Virkar and Clauset (2012). Our statistical
tests supported the power-law model $y = x^{-2.31}$ of the observed
distribution (Fig. 2B).
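As a rough illustration of the first fitting step only, the sketch below estimates the exponent of a discrete power law with the approximate maximum-likelihood formula given by Clauset et al. (2009), alpha ≈ 1 + n / Σ ln(x_i / (x_min − 0.5)). The sample values and the function name `powerlaw_alpha` are hypothetical and are not the paper's data; the approximation is known to be coarse for small x_min, and the goodness-of-fit bootstrap and the comparison with competing models are not reproduced here.

```python
import math

def powerlaw_alpha(values, xmin=1):
    """Approximate MLE of alpha for a discrete power law p(x) ~ x**(-alpha)."""
    # Only observations in the tail (x >= xmin) enter the estimate.
    tail = [x for x in values if x >= xmin]
    log_sum = sum(math.log(x / (xmin - 0.5)) for x in tail)
    return 1.0 + len(tail) / log_sum

# Hypothetical heavy-tailed degree sample (NOT the published ECI values).
sample = [1] * 50 + [2] * 12 + [3] * 6 + [4] * 3 + [5] * 2 + [8, 13, 34]
alpha = powerlaw_alpha(sample, xmin=1)
print(f"estimated exponent: {alpha:.2f}")
```

In practice x_min would itself be chosen by minimizing the Kolmogorov–Smirnov distance between the empirical and fitted tails, which is the quantity D reported for the fitted model.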
Fig. 1. A set of primers specific for terminal or internal exons of human RUNX1 and RUNX1T1 genes was used for the RUNX1–RUNX1T1 gene cDNA library construction.
Analysis of the cumulative distribution shows that approximately 80% of
the exons of the RUNX1–RUNX1T1 gene have a small ECI
value (≤ 3). This exon group comprises cassette exons (mainly UTR exons
and exons from the breakpoint region) and constitutive exons (most of
the exons from the 3′-RUNX1T1 part of the fusion gene) that are
not involved in alternative splicing.
At the same time, the remaining exons (about 20%) have a high
combinatorial index (≥ 4). These exons form the heavy right tail of
the empirical distribution. They are constitutive and are widely
used in alternative splicing: the exons of this group account
for about 80% of the total diversity of splicing events occurring in
the fusion gene transcripts. Exons 5, 6, 8b, 9, 10 and
11 are the most interesting: about 64% of the diversity of splicing
events involves these exons. Exons 5 and 6
encode almost the entire DNA-binding Runt homology domain (RHD)
of the RUNX1–RUNX1T1 protein (Meyers et al., 1993), and exon 8
encodes a polypeptide bridge that connects the RUNX1 and RUNX1T1
parts of the fusion protein. Exons 9, 10 and 11 encode
the first conserved domain, NHR1, from the RUNX1T1 part of the
RUNX1–RUNX1T1 protein (Davis et al., 2003).
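The two proportions discussed above (the fraction of low-ECI exons and the share of splicing events that involve high-ECI exons) can be computed directly from a list of splicing events. The following is a minimal sketch on hypothetical toy events; the function name `hub_summary` is illustrative, and the threshold of 4 simply mirrors the cut-off used in the text.

```python
def hub_summary(events, threshold=4):
    """Return (hub exons, fraction of low-ECI exons, share of events touching a hub)."""
    degree = {}
    for donor, acceptor in events:
        degree[donor] = degree.get(donor, 0) + 1
        degree[acceptor] = degree.get(acceptor, 0) + 1
    # Exons whose total ECI reaches the threshold are treated as hubs.
    hubs = {exon for exon, d in degree.items() if d >= threshold}
    low_fraction = 1 - len(hubs) / len(degree)
    hub_share = sum(1 for a, b in events if a in hubs or b in hubs) / len(events)
    return hubs, low_fraction, hub_share

# Hypothetical toy splicing events (donor exon, acceptor exon).
toy_events = {
    ("1", "5"), ("4a", "5"), ("5", "6"), ("5", "8b"),
    ("6", "8b"), ("6", "9"), ("8b", "9"),
}
hubs, low_fraction, hub_share = hub_summary(toy_events)
print(hubs, round(low_fraction, 2), round(hub_share, 2))
```

Even in this small example a single hub takes part in more than half of all events, which is the kind of imbalance the 80%/20% figures describe for the real exon graph.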
To clarify the relationship between the two groups of exons
mentioned above, we evaluated the splicing preferences of these exons
by Kleinberg's authority score and the assortativity coefficient. We
found that, according to the authority score, all exons can be grouped
into three stable clusters with consensus higher than 0.95. The first
cluster included exons with an extremely low authority score between
4.4e−18 and 5.4e−2 (dark-green balls, Fig. 2C), the second cluster
was composed of exons with a moderate authority score ranging
from 6.6e−2 to 0.3 (red balls, Fig. 2C) and, finally, the outlying exon
8b was always assigned to a third cluster (blue ball, Fig. 2C).
The second cluster is represented by exons with ECI values
ranging from 2 to 31 (mean 4.4), which is on average 2.1 times
higher (p = 0.0006, Mann–Whitney U test) than for exons of the
first cluster, with ECI values ranging from 1 to 9 (mean 2.1). Despite
this, the assortativity coefficient for the whole exon graph is 0.38,
which is apparently due to a significant predominance of exons
with a moderate or low authority score and low ECI values in the
graph.
Fig. 2. The local combinatorics of RUNX1–RUNX1T1 exons follows a power-law behavior. (A) Exon graph of the RUNX1–RUNX1T1 gene. Exons were clustered into 23 groups
(E) based on the genomic origin and/or overlapping of sequences. For each group, a well-known reference exon is shown in parentheses. (B) The power-law behavior of
the local combinatorics of RUNX1–RUNX1T1 exons is supported by statistical tests on plausibility. The power-law function is well fitted (red dashed line) to the heavy
right tail of the empirical data (blue diamonds) and has the lowest Kolmogorov–Smirnov distance D and the highest bootstrap p-value among competing statistical models (see
Section 2). (C) Exons can be grouped into three stable clusters based on Kleinberg's authority score. However, most exons have extremely low or moderate values of the
authority score. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 3. Stochastic noise in the splicing machinery and the positional distribution of exons in transcripts make a minor contribution to the variance of ECI values. (A) There
is a clear and significant difference between the cumulative curve of the empirical ECI values (line 1) and the theoretical cumulative curve for a random exon graph (line 2)
($p = 1.4 \times 10^{-8}$, Mann–Whitney U test). No significant difference was found between the empirical and the non-coding (line 3) or unproductive (line 4) noise-corrected cumulative
distributions (p > 0.05, Mann–Whitney U test). Distributions were normalized to their maximum values. (B) Exons with high ECI values tend to occupy a position close to the center
of the transcripts that include the exon of interest. In this figure, the position of an exon that is close to the 5′ end (left of the center, indicated by 0) is displayed
as a negative value, while an exon close to the 3′ end is indicated by a positive value.
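The two graph measures used for the cluster analysis above, Kleinberg's authority score and the assortativity coefficient, can be sketched in plain Python. The version below is an illustration under stated assumptions: splicing events are treated as directed edges for the authority score, scores are scaled so that the top exon equals 1, and assortativity is computed on undirected degrees following Newman (2002). The function names and the toy edges are hypothetical.

```python
import math

def authority_scores(edges, iterations=100):
    """Kleinberg's authority score by power iteration, scaled so the maximum is 1."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of an exon: summed hub score of the exons spliced onto it.
        auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub score of an exon: summed authority of its downstream partners.
        hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    top = max(auth.values()) or 1.0
    return {n: v / top for n, v in auth.items()}

def degree_assortativity(edges):
    """Pearson correlation between the degrees found at the two ends of each edge."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Every undirected edge contributes both orientations.
    x = [degree[a] for a, b in edges] + [degree[b] for a, b in edges]
    y = [degree[b] for a, b in edges] + [degree[a] for a, b in edges]
    mean = sum(x) / len(x)
    cov = sum((p - mean) * (q - mean) for p, q in zip(x, y))
    var = sum((p - mean) ** 2 for p in x)
    return cov / var if var else float("nan")

# Hypothetical toy graphs.
auth = authority_scores([("a", "c"), ("b", "c"), ("a", "d")])
star = degree_assortativity([("h", "a"), ("h", "b"), ("h", "c")])
print(auth, star)
```

A star-shaped graph, where one hub connects only to low-degree nodes, gives the most negative possible assortativity, which is a useful reference point when interpreting the coefficient reported for the whole exon graph.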
3.3. Stochasticity makes a minor contribution to the variance of
ECI values
The power-law distribution cannot result only from random
splicing of the RUNX1–RUNX1T1 exons. Indeed, the cumulative
distribution of the empirical ECI values is clearly and significantly
different from the theoretical curve for a random graph (Fig. 3A).
Nevertheless, we evaluated the contribution of randomness to the
local combinatorics of the RUNX1–RUNX1T1 exons because it is
an important source of diversity of alternative splicing events in
the human transcriptome (Melamud and Moult, 2009; Pickrell et al.,
2010).
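To make the comparison with a random graph concrete, the sketch below draws a random graph with the same size as the exon graph described earlier (99 nodes, 163 edges) and computes the cumulative distribution of its node degrees. The choice of the G(n, m) model (edges sampled uniformly without replacement), the fixed seed and the function names are assumptions made for illustration; the random model actually used for Fig. 3A is not specified in this section.

```python
import random
from itertools import combinations

def random_graph_degrees(n_nodes, n_edges, seed=0):
    """Degrees of a G(n, m) random graph: n_edges distinct node pairs chosen uniformly."""
    rng = random.Random(seed)
    pairs = list(combinations(range(n_nodes), 2))
    degree = [0] * n_nodes
    for a, b in rng.sample(pairs, n_edges):
        degree[a] += 1
        degree[b] += 1
    return degree

def ecdf(values):
    """Empirical cumulative distribution as sorted (value, probability) pairs."""
    n = len(values)
    points = []
    for i, v in enumerate(sorted(values), start=1):
        if points and points[-1][0] == v:
            points[-1] = (v, i / n)  # keep the highest probability per value
        else:
            points.append((v, i / n))
    return points

degrees = random_graph_degrees(99, 163)
for value, prob in ecdf(degrees):
    print(value, round(prob, 3))
```

In a random graph of this density the degrees cluster around the mean (about 3.3 edges per node) and very large hubs are essentially absent, which is why an empirical maximum ECI of 34 is hard to reconcile with purely random splicing.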
For this purpose, we first identified noise splicing events that
lead to the formation of non-coding transcripts or unproductive
transcripts with a PTC. These two categories of noise account
for about 13% and 26% of the diversity of splicing events in the
RUNX1–RUNX1T1 transcripts, respectively. However, the empirical
cumulative distribution of ECI values becomes slightly different
only after correction for unproductive splicing, but not after correction
for splicing events that lead to non-coding transcripts (Fig. 3A).
Additionally, we evaluated the relationship between the position
of an exon in transcripts and its ECI value. We performed this
analysis because the fusion gene is characterized by a large variety
of cassette UTR exons and exons from the breakpoint region.
We expected that such organization of the gene gives
the nearest constitutive exons a chance to acquire a high-ranking ECI. However, we
found only a moderate correlation between the positional distribution
of the exons and the distribution of their ECI values (ρ = 0.455,
$p = 2.2 \times 10^{-6}$; Fig. 3B).
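A rank correlation of this kind can be sketched as follows. The code assumes that an exon's position is measured as its offset from the center of each transcript (negative toward the 5′ end, positive toward the 3′ end, as in Fig. 3B) and that the correlation is Spearman's; both the helper `centered_positions` and the exact way positions were summarized per exon are assumptions, since the section does not spell them out.

```python
import math

def centered_positions(transcript):
    """Offset of each exon from the transcript center (negative = closer to the 5' end)."""
    center = (len(transcript) - 1) / 2
    return {exon: index - center for index, exon in enumerate(transcript)}

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

print(centered_positions(["1", "5", "6"]))
print(spearman([1, 2, 3, 4], [10, 20, 30, 40]))
```

Working on ranks rather than raw values makes the coefficient insensitive to the strongly skewed ECI distribution, which is presumably why a rank-based measure suits this comparison.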
From a random forests-based nonlinear multiple regression, we
found that the noise splicing and the positional chance explained
no more than 12% of the ECI variance. Therefore, stochasticity is
only a minor factor in the formation of the ECI value.
3.4. Deregulation of the NMD genes in leukemia cells may explain
a high abundance of unproductive RUNX1–RUNX1T1 transcripts
In our dataset, about 38% of mRNA molecules are PTC-containing
transcripts. Although these transcripts are potential targets for the
NMD system, their expression remains at a relatively high level.
For example, inclusion of exon 15a as an internal exon (amplicon
exons 15a-15, Fig. 4A) always leads to the formation of transcripts
with a PTC, whose expression is comparable with that of some
Fig. 4. Activity of the NMD system in pediatric t(8;21)-positive AML cells is deregulated. (A) In the NMD study, five different RUNX1–RUNX1T1 cDNA-based amplicons were
quantified by real-time PCR. The quantity of the amplicon exons 15a-15 indicates the expression level of transcripts comprising exon 15a as an internal exon. When exon 15a
is used as an internal exon, it introduces a PTC in the mature transcript. (B) According to real-time PCR and statistical analysis, RUNX1–RUNX1T1 mRNA isoforms containing
exons 11-12a, 15a, 15a-15, 16-17a or 16-17 are differentially expressed in leukemia cells. The expression level of transcripts with internal exon 15a is similar to that of
transcripts with exons 16-17a, which do not include a PTC (p = 0.79, Mann–Whitney U test). However, it is assumed that exon 15a can be not only an internal but also a
3′ UTR exon. In particular, the overall expression level of transcripts containing exon 15a is significantly higher than the level of the PTC-containing transcripts with exons 15a-15
(p < 0.001, Mann–Whitney U test). (C) Expression of NMD genes is significantly increased or decreased in leukemia cells in comparison with normal hematopoietic cells.
Statistical significance of differences was confirmed with the Mann–Whitney U test. (D) For some mRNA isoforms, we found a strong correlation with expression of NMD genes.
[Figure plot: cumulative probability curves (both axes from 0.0 to 1.0) for the empirical data and for power values 0.8, 1, 1.1, 1.2, 1.4 and 1.6.]
Fig. 6. Sequence features of human RUNX1–RUNX1T1 exons and flanking introns may determine the value of the ECI. (A) All sequence features were extracted from three
classes of mRNA structure elements. The first class includes features of the target exon (exon of interest; $S_{E}^{TE}$) and its flanking 5′ ($S_{5'1}^{TE}$) and 3′ ($S_{3'1}^{TE}$) intronic sequences. The
second class contains features of the upstream first neighboring exons ($S_{E}^{USE}$) and their flanking 5′ ($S_{5'1}^{USE}$) and 3′ ($S_{3'1}^{USE}$) intronic sequences. Finally, the third class includes
features of the downstream first neighboring exons ($S_{E}^{DSE}$) and the corresponding flanking 5′ ($S_{5'1}^{DSE}$) and 3′ ($S_{3'1}^{DSE}$) intronic sequences. (B) Sequence features are not equal in
importance for the prediction of the ECI value. The important features were ranked according to the mean decrease in the accuracy of the ECI value prediction after
random permutation of the original feature values. The inset Venn diagram shows the overlap between the important features selected for the three types of ECI.
(C) A complex relationship exists between the sequence features and the ECI value. No single sequence feature can reliably predict the ECI value; such predictions can be made
only from a compendium of features. The inner track of the Circos plot includes sectors of the combined set of features that were selected as significant in predicting the value of the in-,
out- and/or total-ECI. The width of each sector is proportional to the strength of the corresponding feature's effect on the ECI value. The positive or negative character of this effect was
inferred from the correlation analysis. The outer track of the plot contains features of different subclasses. (D) Our compendium of sequence features permits prediction of
the ECI values by regression random forests with high accuracy. The line plot shows a binned distribution of Spearman's ρ between the real values of the
ECI from the test subset of empirical data and the predicted values. This plot is based on 1000 simulations of the original and randomly permuted ECI values. Lines 1 and
1′ represent the original and permuted total-ECI, lines 2 and 2′ show the original and permuted in-ECI, and lines 3 and 3′ indicate the original and permuted out-ECI,
respectively.
Model experiments demonstrated that the selected features permit
prediction of the ECI value with high accuracy. For instance, the
median Spearman's ρ between the values predicted by the trained
algorithm and the empirical values of the total-ECI is 0.86 (Fig. 6D),
and the adjusted coefficient of determination equals 0.75. We
observed the same results for in- and out-ECIs (Fig. 6D). Altogether,
our data provide evidence that sequence features and the ECI
value of the RUNX1–RUNX1T1 exons are closely interrelated.
3.7. Exons with high ECI values are hot points of the
RUNX1–RUNX1T1 mRNA splicing
A power-law graph is highly sensitive to targeted attacks against
important vertices (Iyer et al., 2013; Schneider et al., 2011). The
RUNX1–RUNX1T1 exon graph has a power-law component, so it
may have the same property. To check this hypothesis, we modeled
the skipping of exons by the splicing system and evaluated the
outcome of such skipping with five metrics (Fig. 8).
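The idea of a targeted attack on exons-hubs can be sketched with a small simulation. The version below makes simplifying assumptions that are not taken from the paper: skipping an exon means deleting it from every transcript that contains it, identical isoforms are then collapsed, and only two of the five metrics (diversity of transcripts and average transcript size in exons) are tracked. The toy transcripts and function names are hypothetical.

```python
def skip_exons(transcripts, skipped):
    """Drop the skipped exons from every transcript and collapse identical isoforms."""
    isoforms = {tuple(e for e in t if e not in skipped) for t in transcripts}
    isoforms.discard(())  # a transcript that lost all its exons no longer exists
    return isoforms

def attack_curve(transcripts, order):
    """Skip exons cumulatively in the given order; report diversity and mean size."""
    curve = []
    skipped = set()
    for exon in order:
        skipped.add(exon)
        isoforms = skip_exons(transcripts, skipped)
        diversity = len(isoforms)
        mean_size = sum(map(len, isoforms)) / diversity if diversity else 0.0
        curve.append((exon, diversity, mean_size))
    return curve

# Hypothetical toy transcripts; exons are attacked in descending order of total ECI.
toy = [
    ["1", "5", "6", "8b", "9"],
    ["1", "5", "8b", "9"],
    ["4a", "5", "6", "9"],
]
curve = attack_curve(toy, order=["5", "6", "8b"])
for exon, diversity, mean_size in curve:
    print(f"after skipping {exon}: {diversity} isoforms, mean size {mean_size:.2f}")
```

Running the same function with the exons listed in ascending order of ECI gives the random-failure-like counterpart of the attack, which is the contrast drawn between panels A and B of Fig. 8.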
Fig. 7. Genes of splicing factors are differentially expressed in t(8;21)-positive AML blasts. (A) The RBFOX3 gene is not expressed, or is expressed below the RT-PCR detection threshold,
in normal hematopoietic cells, but this gene is expressed in leukemia cells. The lanes on the upper electrophoregram: Fermentas GeneRuler™ 100 bp DNA Ladder
Plus (1), amplification of cDNA of the TBP gene from Kasumi-1 cells (2), amplification of cDNA of the RBFOX3 gene from normal PBMNC (3, 5), BMMNC (7, 9), CD34+ HPSC
(11, 13) and from Kasumi-1 cells (15) and amplification of cDNA of the RBFOX3 gene from respective RT negative controls (4, 6, 8, 10, 12, 14, 16). The lanes on the bottom
electrophoregram: Fermentas GeneRuler™ 100 bp DNA Ladder Plus (1), amplification of cDNA of the RBFOX3 gene from the bone marrow samples of nine children with
t(8;21)-positive AML (2, 4, 6, 8, 10, 12, 14, 16, 18) and respective RT negative controls (3, 5, 7, 9, 11, 13, 15, 17, 19). (B) Real-time PCR confirms the differential expression
of the RBFOX3 gene in normal and malignant hematopoietic cells. Expression of the RBFOX3 gene was normalized relative to the expression of the TBP gene, and then
re-normalized to the expression of this gene in Kasumi-1 cells. The picture shows the averaged expression of the RBFOX3 gene in 4 samples of normal CD34+ HPSC, 5 samples
of normal BMMNC, 5 samples of normal PBMNC and 9 bone marrow samples of children with t(8;21)-positive AML. (C) There is a significant (according to the Mann–Whitney U
test) differential expression of the splicing factor genes in leukemia cells in comparison with normal hematopoietic cells. (D) Correlation between expression of the splicing
factor genes and mRNA isoforms of the RUNX1–RUNX1T1 gene.
[Fig. 8 plot, panels A and B. x-axis: fraction of skipped exons (20 to 100); y-axis: 0.0 to 1.0. Legend: diversity of transcripts; average size (in number of exons) of transcripts; average length (in number of nucleotides) of transcripts; average length of ORF; portion of transcripts containing PTC.]
Fig. 8. In silico modeling supports a strong sensitivity of splicing of RUNX1–RUNX1T1 transcripts to skipping of exons with high ECI values. (A) Skipping of exons
listed in descending order of their ECI values: experimentally verified transcripts (top), predicted transcripts (bottom). (B) Same as (A), but
exons were excluded from the splicing process in ascending order of their ECI values.
Davis, J.N., McGhee, L., Meyers, S., 2003. The ETO (MTG8) gene family. Gene 303,
1–10.
del Rio, G., Koschützki, D., Coello, G., 2009. How to identify essential genes from
molecular networks? BMC Syst. Biol. 3, 102.
Dvinge, H., Ries, R.E., Ilagan, J.O., Stirewalt, D.L., Meshinchi, S., Bradley, R.K., 2014.
Sample processing obscures cancer-specific alterations in leukemic
transcriptomes. Proc. Natl. Acad. Sci. U. S. A. 111, 16802–16807.
Era, T., Asou, N., Kunisada, T., Yamasaki, H., Asou, H., Kamada, N., et al., 1995.
Identification of two transcripts of AML1/ETO-fused gene in t(8;21) leukemic
cells and expression of wild type ETO gene in hematopoietic cells. Genes
Chromosomes Cancer 13, 25–33.
Erickson, P., Gao, J., Chang, K.S., Look, T., Whisenant, E., Raimondi, S., et al., 1992.
Identification of breakpoints in t(8;21) acute myelogenous leukemia and
isolation of a fusion transcript, AML1/ETO, with similarity to Drosophila
segmentation gene, runt. Blood 80, 1825–1831.
Gehring, N.H., Kunz, J.B., Neu-Yilik, G., Breit, S., Viegas, M.H., Hentze, M.W., et al.,
2005. Exon-junction complex components specify distinct routes of
nonsense-mediated mRNA decay with differential cofactor requirements. Mol.
Cell 20, 65–75.
Gu, Z., Package circlize. Version 0.2.4.
https://cran.r-project.org/web/packages/circlize/index.html (accessed 20.03.15).
Hatlen, M.A., Wang, L., Nimer, S.D., 2012. AML1–ETO driven acute leukemia:
insights into pathogenesis and potential therapeutic approaches. Front. Med. 6,
248–262.
Heber, S., Alekseyev, M., Sze, S.H., Tang, H., Pevzner, P.A., 2002. Splicing graphs and
EST assembly problem. Bioinformatics 18, S181–S188.
Houseley, J., Tollervey, D., 2010. Apparent non-canonical trans-splicing is
generated by reverse transcriptase in vitro. PLoS ONE 5, e12271.
Hug, B.A., Lazar, M.A., 2004. ETO interacting proteins. Oncogene 23, 4270–4274.
Iyer, S., Killingback, T., Sundaram, B., Wang, Z., 2013. Attack robustness and
centrality of complex networks. PLoS ONE 8, e59613.
Karolchik, D., Barber, G.P., Casper, J., Clawson, H., Cline, M.S., Diekhans, M., et al.,
2014. The UCSC genome browser database: 2014 update. Nucleic Acids Res. 42
(Database issue), D764–D770.
Kashima, I., Yamashita, A., Izumi, N., Kataoka, N., Morishita, R., Hoshino, S., et al.,
2006. Binding of a novel SMG-1-Upf1-eRF1-eRF3 complex (SURF) to the exon
junction complex triggers Upf1 phosphorylation and nonsense-mediated
mRNA decay. Genes Dev. 20, 355–367.
Kervestin, S., Jacobson, A., 2012. NMD: a multifaceted response to premature
translational termination. Nat. Rev. Mol. Cell Biol. 13, 700–712.
Kim, P., Yoon, S., Kim, N., Lee, S., Ko, M., Lee, H., et al., 2010. ChimerDB 2.0: a
knowledgebase for fusion genes updated. Nucleic Acids Res. 38, D81–D85.
Klaus, A., Yu, S., Plenz, D., 2011. Statistical analyses support power law
distributions found in neuronal avalanches. PLoS ONE 6, e19779.
Kleinberg, J.M., 1999. Authoritative sources in a hyperlinked environment. J. ACM
46, 604–632.
Kozu, T., Fukuyama, T., Yamami, T., Akagi, K., Kaneko, Y., 2005. MYND-less splice
variants of AML1–MTG8 (AML1–CBFA2T1) are expressed in leukemia with
t(8;21). Genes Chromosomes Cancer 43, 45–53.
Kozu, T., Miyoshi, H., Shimizu, K., Maseki, N., Kaneko, Y., Asou, H., et al., 1993.
Junctions of the AML1/MTG8(ETO) fusion are constant in t(8;21) acute myeloid
leukemia detected by reverse transcription polymerase chain reaction. Blood
82, 1270–1276.
Krzywinski, M., Schein, J., Birol, I., Connors, J., Gascoyne, R., Horsman, D., et al.,
2009. Circos: an information aesthetic for comparative genomics. Genome Res.
19, 1639–1645.
Kursa, M.B., Rudnicki, W.R., 2010. Feature selection with the Boruta package. J. Stat.
Softw. 36, 1–13.
LaFiura, K.M., Edwards, H., Taub, J.W., Matherly, L.H., Fontana, J.A., Mohamed, A.N.,
et al., 2008. Identification and characterization of novel AML1–ETO fusion
transcripts in pediatric t(8;21) acute myeloid leukemia: a report from the
Children's Oncology Group. Oncogene 27, 4933–4942.
Lasa, A., Nomdedeu, J.F., Carnicer, M.J., Llorente, A., Sierra, J., 2002. ETO sequence
may be dispensable in some AML1–ETO leukemias. Blood 100, 4243–4244.
Liaw, A., Wiener, M., 2002. Classification and regression by randomForest. R News
2, 18–22.
Maciejewski, J.P., Padgett, R.A., 2012. Defects in spliceosomal machinery: a new
pathway of leukaemogenesis. Br. J. Haematol. 158, 165–173.
Majoros, W.H., Lebeck, N., Ohler, U., Li, S., 2014. Improved transcript isoform
discovery using ORF graphs. Bioinformatics 30, 1958–1964.
Mannari, D., Gascoyne, D., Dunne, J., Chaplin, T., Young, B., 2010. A novel exon in
AML1–ETO negatively influences the clonogenic potential of the t(8;21) in
acute myeloid leukemia. Leukemia 24, 891–894.
Margolin, A.A., Wang, K., Lim, W.K., Kustagi, M., Nemenman, I., Califano, A., 2006.
Reverse engineering cellular networks. Nat. Protoc. 1, 662–671.
Melamud, E., Moult, J., 2009. Stochastic noise in splicing machinery. Nucleic Acids
Res. 37, 4873–4886.
Meyers, S., Downing, J.R., Hiebert, S.W., 1993. Identification of AML-1 and the
(8;21) translocation protein (AML-1/ETO) as sequence-specific DNA-binding
proteins: the runt homology domain is required for DNA binding and
protein-protein interactions. Mol. Cell. Biol. 13, 6336–6345.
Migas, A.A., Mishkova, O.A., Ramanouskaya, T.V., Ilyushonak, I.M., Aleinikova, O.V.,
Grinev, V.V., 2014. RUNX1T1/MTG8/ETO gene expression status in human
t(8;21)(q22;q22)-positive acute myeloid leukemia cells. Leukemia Res. 38,
1102–1110.
Miyoshi, H., Kozu, T., Shimizu, K., Enomoto, K., Maseki, N., Kaneko, Y., et al., 1993.
The t(8;21) translocation in acute myeloid leukemia results in production of an
AML1–MTG8 fusion transcript. EMBO J. 12, 2715–2721.
Müller, A.M.S., Duque, J., Shizuru, J.A., Lübbert, M., 2008. Complementing
mutations in core binding factor leukemias: from mouse models to clinical
applications. Oncogene 27, 5759–5773.
Newman, M.E.J., 2002. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701.
Newman, M.E.J., 2003. The structure and function of complex networks. SIAM Rev.
45, 167–256.
Newman, M.E.J., 2005. Power laws, Pareto distributions and Zipf's law. Contemp.
Phys. 46, 323–351.
Nicholson, P., Yepiskoposyan, H., Metze, S., Zamudio Orozco, R., Kleinschmidt, N.,
Mühlemann, O., 2010. Nonsense-mediated mRNA decay in human cells:
mechanistic insights, functions beyond quality control and the double-life of
NMD factors. Cell. Mol. Life Sci. 67, 677–700.
Nisson, P.E., Watkins, P.C., Sacchi, N., 1992. Transcriptionally active chimeric gene
derived from the fusion of the AML1 gene and a novel gene on chromosome 8
in t(8;21) leukemic cells. Cancer Genet. Cytogenet. 63, 81–88.
Nykter, M., Price, N.D., Larjo, A., Aho, T., Kauffman, S.A., Yli-Harja, O., et al., 2008.
Critical networks exhibit maximal information diversity in
structure–dynamics relationships. Phys. Rev. Lett. 100, 058702.
Park, S., Chen, W., Cierpicki, T., Tonelli, M., Cai, X., Speck, N.C., et al., 2009. Structure
of the AML1–ETO eTAFH domain-HEB peptide complex and its contribution to
AML1–ETO activity. Blood 113, 3558–3567.
Pickrell, J.K., Pai, A.A., Gilad, Y., Pritchard, J.K., 2010. Noisy splicing drives mRNA
isoform diversity in human cells. PLoS Genet. 6, e1001236.
Saunders, M.J., Tobal, K., Keeney, S., Liu Yin, J.A., 1996. Expression of diverse
AML1/MTG8 transcripts is a consistent feature in acute myeloid leukemia with
t(8;21) irrespective of disease phase. Leukemia 10, 1139–1142.
Sayers, E.W., Barrett, T., Benson, D.A., Bolton, E., Bryant, S.H., Canese, K., et al., 2012.
Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res. 40, D13–D25.
Schneider, C.M., Moreira, A.A., Andrade, J.S., Havlin, S., Herrmann, H.J., 2011.
Mitigation of malicious attacks on networks. Proc. Natl. Acad. Sci. U. S. A. 108,
3838–3841.
Smyth, G.K., 2005. Limma: linear models for microarray data. In: Gentleman, R.,
Carey, V., Dudoit, S., Irizarry, R., Huber, W. (Eds.), Bioinformatics and
Computational Biology Solutions using R and Bioconductor. Springer, New
York, pp. 397–420.
Ruijter, J.M., Ramakers, C., Hoogaars, W.M.H., Karlen, Y., Bakker, O., van den Hoff,
M.J.B., et al., 2009. Amplification efficiency: linking baseline and bias in the
analysis of quantitative PCR data. Nucleic Acids Res. 37 (6), e45,
http://dx.doi.org/10.1093/nar/gkp045.
Stauffer, D., 2012. Phase transitions on fractals and networks. In: Meyers, R.A. (Ed.),
Mathematics of complexity and dynamical systems. Springer
Science + Business Media, LLC, New York, pp. 1400–1406.
Sun, X.J., Wang, Z., Wang, L., Jiang, Y., Kost, N., Soong, T.D., et al., 2013. A stable
transcription factor complex nucleated by oligomeric AML1–ETO controls
leukaemogenesis. Nature 500, 93–98.
Tahirov, T.H., Inoue-Bungo, T., Morii, H., Fujikawa, A., Sasaki, M., Kimura, K., et al.,
2001. Structural analyses of DNA recognition by the AML1/Runx-1 Runt
domain and its allosteric control by CBFbeta. Cell 104, 755–767.
Tighe, J.E., Calabi, F., 1994. Alternative, out-of-frame runt/MTG8 transcripts are
encoded by the derivative (8) chromosome in the t(8;21) of acute myeloid
leukemia M2. Blood 84, 2115–2121.
Trajanovski, S., Martin-Hernandez, J., Winterbach, W., Van Mieghem, P., 2013.
Robustness envelopes of networks. J. Complex Netw. 1, 44–62.
Van de Locht, L.T., Smetsers, T.F., Wittebol, S., Raymakers, R.A., Mensink, E.J., 1994.
Molecular diversity in AML1/ETO fusion transcripts in patients with t(8;21)
positive acute myeloid leukaemia. Leukemia 8, 1780–1784.
Virkar, Y., Clauset, A., 2012. Power-law distributions in binned empirical data. Ann.
Appl. Stat., 1–33.
Vuong, Q.H., 1989. Likelihood ratio tests for model selection and non-nested
hypotheses. Econometrica 57, 307–333.
Wang, Y., Ma, M., Xiao, X., Wang, Z., 2012. Intronic splicing enhancers, cognate
splicing factors and context dependent regulation rules. Nat. Struct. Mol. Biol.
19, 1044–1052.
Wilkerson, M., Waltman, P., Package ConsensusClusterPlus. Version 1.22.0.
http://bioconductor.org/packages/release/bioc/html/ConsensusClusterPlus.html
(accessed 13.07.15).
Yan, M., Kanbe, E., Peterson, L.F., Boyapati, A., Miao, Y., Wang, Y., et al., 2006. A
previously unidentified alternatively spliced isoform of t(8;21) transcript
promotes leukemogenesis. Nat. Med. 12, 945–949.
Zhang, J., Kalkum, M., Yamamura, S., Chait, B.T., Roeder, R.G., 2004. E protein
silencing by the leukemogenic AML1-ETO fusion protein. Science 305,
1286–1289.
Zhang, Y.W., Bae, S.C., Huang, G., Lu, J., Ahn, M.Y., Kanno, Y., et al., 1997. A novel
transcript encoding an N-terminally truncated AML1/PEBP2 alpha B protein
interferes with transactivation and blocks granulocytic differentiation of
32Dcl3 myeloid cells. Mol. Cell. Biol. 17, 4133–4145.