MBE Advance Access published January 27, 2012

An ancestral miR-1304 allele present in Neanderthals regulates genes involved in enamel formation and could explain dental differences with modern humans

Paper submitted as Research Article

Maria Lopez-Valenzuela1, Oscar Ramírez1, Antonio Rosas2, Samuel García-Vargas2, Marco de la Rasilla3, Carles Lalueza-Fox1 and Yolanda Espinosa-Parrilla1

1 Institut de Biologia Evolutiva (UPF-CSIC), CEXS-UPF-PRBB. C/ Dr. Aiguader, 88, 08003 Barcelona, Catalonia, Spain

2 Paleoanthropology Group, Department of Paleobiology, Museo Nacional de Ciencias Naturales, CSIC. C/ José Gutierrez Abascal, 2, 28006 Madrid, Spain

3 Área de Prehistoria, Departamento de Historia, Universidad de Oviedo. C/ Teniente Alfonso Martínez s/n. 33011 Oviedo, Spain

Corresponding author:

Yolanda Espinosa-Parrilla. Institut de Biologia Evolutiva (UPF-CSIC), CEXS-UPF-PRBB. C/ Dr. Aiguader, 88, 08003, Barcelona, Spain. Tel: +34 93 316 0845. Fax: +34 93 316 0901. E-mail: yolespinosa@gmail.com

Keywords: Neanderthal, gene regulation, miR-SNP, microRNA, miR-1304, tooth development

Running Head: An ancestral miR-1304 allele regulates dental genes

© The Author 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

ABSTRACT

Genetic changes in regulatory elements are likely to result in phenotypic effects that might explain population-specific as well as species-specific traits. MicroRNAs are post-transcriptional repressors involved in the control of almost every biological process. These small non-coding RNAs are present in various phylogenetic groups and a large number of them remain highly conserved at the sequence level. MicroRNA-mediated regulation depends on perfect matching between the seven nucleotides of the microRNA seed region and the target sequence, usually located in the 3'UTR of the regulated gene. Hence, even single changes in seed regions are predicted to be deleterious, as they may affect microRNA target specificity. In accordance with this, purifying selection has acted strongly on these regions. Comparison of the genomes of present-day humans from various populations, Neanderthals and non-human primates revealed a microRNA, miR-1304, that carries a polymorphism in its seed region. The ancestral allele is found in Neanderthals and non-human primates, at low frequency (~5%) in modern Asian populations, and rarely in Africans. Using microRNA target site prediction algorithms, we found that the derived allele increases the number of putative target genes more than tenfold, indicating an important functional evolution for miR-1304. Analysis of the predicted targets of the derived miR-1304 indicates an association with behavior and with nervous system development and function. Two of the predicted target genes of the ancestral miR-1304 allele, enamelin and amelotin, are important genes for tooth formation. MicroRNA over-expression experiments using a luciferase-based assay showed that the ancestral version of miR-1304 reduces enamelin- and amelotin-associated reporter gene expression by 50%, whereas the derived miR-1304 has no effect. Deletion of the corresponding miR-1304 target sites in these dental genes abolished their repression, which further supports their regulation by the ancestral miR-1304. Morphological studies have described several differences between the dentition of Neanderthals and present-day humans, such as slower dentition timing and thicker enamel in present-day humans. The observed miR-1304-mediated regulation of enamelin and amelotin could at least partially underlie these differences between the two Homo species, as well as other still-unraveled phenotypic differences among modern human populations.

INTRODUCTION

Investigating the genetic differences associated with phenotypic diversity among hominins is a crucial step towards understanding human adaptation and evolution. Genetic and genomic alterations in regulatory regions are a significant source of phenotypic diversity underlying important inter-individual and inter-species differences (Carroll 2008; Hindorff et al. 2009). This view has recently been supported in primates by the discovery that the human loss of specific regulatory DNA, in particular the loss of non-coding RNA with enhancer function, is associated with the appearance of specific human traits such as the expansion of specific brain regions (McLean et al. 2011). Increasing evidence indicates that allelic changes involving either microRNAs (miRNAs) or their regulatory machinery are major contributors to phenotypic diversity in human populations and may have a role in the pathophysiology of several disorders (Borel and Antonarakis 2008).

miRNAs are small non-coding RNAs, 19-25 nucleotides long in their mature form and processed from a longer hairpin structure, that act as post-transcriptional regulators of gene expression through either mRNA degradation or translational repression (Krol et al. 2010). It is estimated that miRNAs regulate more than 30% of all protein-coding genes, building complex regulatory networks that control almost every cellular process (Filipowicz et al. 2008). miRNAs act by means of partial complementarity to miRNA binding sites usually located in the 3'UTRs of their target genes (Bartel 2004). Perfect complementarity between the target sequence of the regulated messenger RNA and the so-called seed region (nucleotides 2 through 7 or 8 from the 5' end of the mature miRNA) is thought to determine successful binding and, together with the stability of the RNA hybrids, is the basis of many miRNA target site prediction algorithms (Brennecke et al. 2005; Lewis et al. 2005). Strong purifying selection acts on the mature miRNA and particularly on the nucleotides of the seed region, where mutations are not tolerated because they would most likely change the target spectrum, which could give rise to a novel miRNA (Chen and Rajewsky 2006; Liu et al. 2008; Quach et al. 2009).

Conservation of miRNAs through evolution is well documented and has been used to discover homologous miRNAs across different phylogenetic groups. Conserved miRNAs are assumed to have high functional relevance, and many previous research efforts have therefore focused on finding and characterizing them (Grad et al. 2003; Berezikov et al. 2005; Chen and Rajewsky 2006). Nonetheless, the identification of species-specific miRNAs may help to understand evolutionary novelties among different phylogenetic groups. Several studies have concluded that miRNAs are strongly conserved among primates, yet a set of miRNAs is found only in present-day humans, making them good candidates to contribute to human-specific phenotypes (Bentwich et al. 2005; Berezikov et al. 2005; Liu et al. 2008; Brameier 2010; Lin et al. 2010). The first published draft of the Neanderthal genome revealed that present-day humans differ from Neanderthals by a nucleotide substitution in the seed region of the microRNA miR-1304 (Green et al. 2010), which is therefore likely to change the spectrum of target genes for miR-1304.

Neanderthals are the closest known evolutionary relatives of modern humans. They inhabited parts of Europe and Western Asia during a succession of climatic cycles, exhibiting both behavioural and morphological adaptations probably related to cold climates, until their extinction around 30,000 years ago (Mellars 2004; Finlayson et al. 2006).
The morphological features that distinguish Neanderthals from other humans, including specific cranio-facial and dental traits, first appear in the European fossil record around 400,000 years ago (Stringer and Hublin 1999; Hublin 2009). Genomic data indicate that these Homo populations diverged between 440,000 and 270,000 years ago (Endicott et al. 2010; Green et al. 2010). Analysis of the Neanderthal genome has revealed that about 80 protein-coding genes show fixed amino acid changes between Neanderthals and modern humans (Green et al. 2010). Phenotypes evolve through functional differences in proteins but also, to a large extent, through mutations in regulatory regions (Carroll 2008); genetic differences underlying the distinctive Neanderthal phenotype are therefore unlikely to be restricted to a set of protein-coding genes, and the analysis should be broadened to incorporate gene regulation as well.

Here we analyze the primate-specific miR-1304 by studying genetic variation at the human miR-1304 locus and the spectrum of target genes predicted for the derived and ancestral versions of this miRNA. By means of functional studies we show repression of a cluster of dental genes by the ancestral version of miR-1304, illustrating how a single nucleotide change in a regulatory element may underlie particular phenotypic differences.

MATERIAL AND METHODS

Sequencing of chimpanzee miR-1304 and Neanderthal AMTN 3'UTR

miR-1304 was PCR-amplified from genomic DNA of three chimpanzee individuals using the primers listed in supplementary table 1. Eight clones per individual were sequenced using standard procedures. The target site for miR-1304 in the 3'UTR of Neanderthal AMTN was PCR-amplified from a Neanderthal specimen (SD-1253) from the El Sidrón site (Asturias, Spain), and 55 clones were sequenced (supplementary fig 1) following a previously described methodology (Lalueza-Fox et al. 2007) using specific primers (supplementary table 1). This bone sample has been used to retrieve ~14,000 protein-coding positions (Krause et al. 2007; Lalueza-Fox et al. 2008, 2009; Burbano et al. 2010), as well as 0.1% of the nuclear genome (Green et al. 2010) and the complete mitochondrial genome (Briggs et al. 2009). The degree of contamination has been estimated at 0.27-0.29% for the mtDNA, with an upper bound of 2% for autosomal DNA capture (Krause et al. 2007; Lalueza-Fox et al. 2008); these low figures make this specimen one of the most suitable Neanderthal samples for targeted paleogenetic analysis.

Firefly luciferase constructs and mutagenesis

The 3'UTRs of ENAM, AMTN and GAD1 were PCR-amplified from genomic DNA with Platinum Taq DNA polymerase (Invitrogen, Carlsbad, California), using primers containing an XbaI restriction site at the 5' end (supplementary table 1). PCR fragments were then purified, XbaI-digested and cloned into the pGL4.13 vector (Promega, Madison, Wisconsin) downstream of the firefly luciferase reporter gene. Mutant reporter plasmids were generated as previously described (Muiños-Gimeno et al. 2009) with the QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, California), using either the AMTN or ENAM pGL4.13 construct as a template and primers carrying the desired deletions (supplementary table 1).

Cell culture and transfection

HeLa cells were maintained in Dulbecco's Modified Eagle's Medium supplemented with 10% fetal bovine serum, 100 units/ml penicillin and 100 µg/ml streptomycin (GIBCO, Invitrogen). Co-transfection optimization has been previously described (Guidi et al. 2010). Briefly, HeLa cells were seeded at 1.3 x 10^4 cells/well in 96-well plates and co-transfected 24 h later with the Firefly reporter constructs described above or the empty pGL4.13 vector (24 ng), the Renilla reporter plasmid pGL4.75 (3 ng) and 10 nM miRNA mimic (derived miR-1304, ancestral miR-1304, or negative control #2 or #4; miRIDIAN, Dharmacon, Lafayette, Colorado) using Lipofectamine 2000 (Invitrogen).

Luciferase activity assay

The activity of Firefly and Renilla luciferases was determined 24 h after transfection using the Dual-Glo Luciferase Assay System (Promega). Relative reporter activity was obtained by normalization to the Renilla luciferase activity. To correct for vector-dependent unspecific effects, each relative reporter activity was normalized to that of the empty vector co-transfected with the corresponding miRNA (Guidi et al. 2010). Results were then compared to the mean of the two negative controls. Each experiment was done in triplicate and at least three independent experiments were performed for each miRNA. Statistical significance was determined using Student's t test (p<0.05). Bonferroni correction for multiple comparisons was applied, taking into account the analysis of two independent reporter genes and two miRNA mimics. Under these criteria, the corrected significance level was set to 0.0125 (four comparisons).
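
The normalization chain described for the luciferase assay can be summarized in a short sketch. The Python below is illustrative only: the function names and every luminescence reading are hypothetical and are not taken from the experiments reported here. It simply chains the Renilla normalization, the empty-vector correction, the comparison with the mean of the negative controls and the Bonferroni threshold.

```python
from statistics import mean


def relative_activity(firefly, renilla):
    """Firefly signal normalized to the Renilla signal of the same well."""
    return firefly / renilla


def vector_corrected(construct_ratio, empty_vector_ratio):
    """Correct a 3'UTR construct for vector-dependent effects by dividing by
    the empty-vector ratio obtained with the same miRNA mimic."""
    return construct_ratio / empty_vector_ratio


def percent_of_control(mimic_value, control_values):
    """Express a corrected activity as a percentage of the mean of the
    negative-control mimics."""
    return 100.0 * mimic_value / mean(control_values)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical raw readings (arbitrary luminescence units), for illustration only.
construct_with_mimic = relative_activity(firefly=240.0, renilla=600.0)
empty_with_mimic = relative_activity(firefly=480.0, renilla=600.0)
corrected = vector_corrected(construct_with_mimic, empty_with_mimic)

# Hypothetical corrected activities for the two negative-control mimics.
controls = [1.0, 1.0]

print("activity relative to controls (%):", percent_of_control(corrected, controls))
print("corrected significance level:", bonferroni_alpha(0.05, 4))
```

Dividing by the empty-vector ratio before comparing with the controls keeps any effect a mimic has on the reporter backbone itself from being attributed to the cloned 3'UTR.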

Computational methods

Targets were predicted using the web-based prediction methods TargetScan (www.targetscan.org, release 5.1; Friedman et al. 2009) and TargetRank (www.hollywood.mit.edu/targetrank; Nielsen et al. 2007) on the human genome assembly (NCBI36/hg18, March 2006). Genomic coordinates refer to the following assemblies: Homo sapiens GRCh37/hg19, Pan troglodytes CHIMP2, Gorilla gorilla gorGor3, Pongo pygmaeus PPYG2, Macaca mulatta MMUL_1. Sequences, unless sequenced by ourselves, were obtained from the University of California Santa Cruz (UCSC) Genome Browser (http://www.genome.ucsc.edu) and Ensembl Genome Browser release 61 (www.ensembl.org). Pathway analysis was performed with the Ingenuity Pathway Analysis Software (IPA) version 6.3 (www.ingenuity.com). Human genetic variation in miR-1304 was assessed using genotypes for 1094 individuals in the June 2011 Data Release of the 1000 Genomes Project (www.1000genomes.org). Signatures of natural selection in the human genome were explored by analyzing data from HapMap, using the UCSC browser, and from the 53 populations of the Human Genome Diversity Panel (HGDP), using the HGDP selection browser (http://hgdp.uchicago.edu/) (Pickrell et al. 2009).
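
Both prediction tools start from seed sequence matching. As a rough illustration of that first step only, and not a reimplementation of TargetScan or TargetRank (which apply further scoring criteria), the sketch below extracts the seed of a mature miRNA, derives the perfectly complementary target motif and scans a 3'UTR for it. The function names and the toy miRNA and 3'UTR sequences are hypothetical and chosen only to show how a single seed substitution can remove a match.

```python
# Watson-Crick pairing for RNA; G:U wobble pairs are ignored in this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed(mirna, start=2, end=8):
    """Seed nucleotides start..end (1-based, inclusive) of a mature miRNA
    written 5'->3'."""
    return mirna[start - 1:end]


def target_site(seed_seq):
    """Reverse complement of the seed: the mRNA motif (5'->3') that pairs
    perfectly with it."""
    return "".join(COMPLEMENT[base] for base in reversed(seed_seq))


def find_sites(utr, mirna):
    """0-based start positions of perfect seed matches in a 3'UTR.
    The UTR may be given as DNA or RNA."""
    site = target_site(seed(mirna))
    rna = utr.upper().replace("T", "U")
    return [i for i in range(len(rna) - len(site) + 1)
            if rna[i:i + len(site)] == site]


# Hypothetical toy sequences: two miRNA variants differing at one seed position.
toy_variant_a = "AGCAUUGCCAUG"
toy_variant_b = "AGCGUUGCCAUG"
toy_utr = "AAAGCAATGCTTT"

print("variant a sites:", find_sites(toy_utr, toy_variant_a))
print("variant b sites:", find_sites(toy_utr, toy_variant_b))
```

In this toy case one variant finds a site in the 3'UTR and the other finds none, which is the kind of shift in target spectrum that a seed polymorphism is expected to produce.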

RESULTS

Conservation of miR-1304 among primates

As noted by Green et al. (2010), the Neanderthal draft genome sequence differs from the reference genome of present-day humans at one base in the seed region of the primate-specific miRNA miR-1304 (fig 1). The corresponding genomic sequence is GCCTCGA in Neanderthals and GCCTCAA in the human reference. We took advantage of the recently completed genome sequences of four primates (Macaca mulatta, Pan troglodytes, Gorilla gorilla and Pongo pygmaeus) to compare primate sequences orthologous to human miR-1304 (fig 1). The sequence alignment confirmed that the human reference genome bears the derived state at this nucleotide position, while Neanderthals and the non-human primates share the ancestral state. The rest of the hairpin sequence was identical between Neanderthals and the human reference genome and showed few changes in the other four primates. In the case of chimpanzee, the published shotgun assembly (March 2006 Pan_troglodytes2.1 6x) showed a deletion of 26 nucleotides together with an insertion that separated the two remaining parts of the miRNA. To test whether chimpanzees had lost this miRNA, we sequenced DNA from three chimpanzees and, after analysis of eight clones per individual, identified a consensus sequence identical to the gorilla miR-1304 sequence, which differs at only one nucleotide position from the ancestral miR-1304 (fig 1), indicating that chimpanzees likely also have a functional copy of miR-1304.

Analysis of the genetic variation of miR-1304 in human populations

We checked for genetic variation at the miR-1304 locus among human populations using the dbSNP database. We found one single nucleotide polymorphism (SNP), rs79759099, at this particular position in the seed region, indicating that this change is
not fixed in human populations. Using the June 2011 data release of the 1000 Genomes Project, with 1094 individuals representing 14 populations worldwide, we found that all individuals in the European (GBR, FIN, IBS, CEU and TSI), Colombian (CLM), Mexican (MXL) and Kenyan (LWK) populations carried only the derived allele, whereas the ancestral allele of miR-1304 was present as the minor allele in the Japanese (JPT, MAF=0.067) and Chinese (CHB, MAF=0.072; CHS, MAF=0.05) populations and, at very low frequency, in the Yoruban (YRI, MAF=0.028), Puerto Rican (PUR, MAF=0.009) and African American (ASW, MAF=0.008) populations (supplementary table 2). Because this allelic distribution could result from selective sweeps within recent human populations, we looked for signatures of selection on the derived miR-1304 by comparing linkage disequilibrium (integrated haplotype score, iHS), population differentiation (Fst) and the frequency of rare variants (Tajima's D) along the genomic region using public data from HapMap and HGDP. However, neither a significant excess of rare variants nor significant population differentiation indexes compatible with a selective sweep were found in the region.

Analysis of target gene predictions for miR-1304

To assess the variation in the target gene spectrum between the ancestral and derived miR-1304, we used two prediction algorithms based on seed sequence matching, TargetScan and TargetRank. For the ancestral miR-1304, TargetRank predicted 35 target genes and TargetScan predicted only four (LCORL, RIMBP2, EDF1 and TCF4, the last two being common to both prediction programs; table 1). A limitation of these predictions is that they were performed on the modern human genome; we therefore checked whether the predicted target genes had identical binding sites for the ancestral miR-1304 in the chimpanzee and Neanderthal
genomes. Among the 37 different predicted target genes for this variant, 29 had an identical binding site in chimpanzee, which we considered a valid proxy for conservation in Neanderthals as well. As for the other eight genes, the Neanderthal genome assembly indicated that in four cases (CD24, TMED4, TAD3L and RIMBP2) the target site is identical between the two hominins, while for the other four genes (DLST, RBPMS2, LCORL and SOX17) complete Neanderthal reads were not available (table 1).

Interestingly, for the derived version of miR-1304 we found a large increase in the number of putative targets compared with the ancestral version. TargetRank predicted 515 target genes and TargetScan 140, with an intersection of 79 (supplementary table 3). Next, we analyzed the association of these 79 potential target genes for the derived miR-1304 with biological processes using the Ingenuity Pathway Analysis software. The program was interrogated for enrichment in biological functions and molecular networks, and the statistical significance of the associations was calculated with the right-tailed Fisher's exact test. As shown in table 2 and supplementary fig 2, some of the most relevant associations for the derived miR-1304 target genes were found with genetic disorders (24 genes) and neurological diseases (14 genes) as well as nervous system development and function (13 genes). As for the top networks, the highest significance was found for one related to neurological disease and behaviour and another related to cellular function and maintenance (supplementary fig 2).

Despite the small number of predicted target genes, we observed that two of the 35 genes predicted by TargetRank to be regulated by the ancestral version of miR-1304, ENAM and AMTN, which code for the proteins enamelin and amelotin, respectively, are involved in tooth formation.
The finding of two genes both involved in a process as specific as odontogenesis within a relatively small set of
putative targets was interesting, particularly given that the best-described differences between Neanderthals and modern humans are related to cranial and dental traits. Thus, among the predicted target genes, we focused on ENAM and AMTN.

Analysis of the 3'UTR region of the ENAM and AMTN genes in Neanderthal

Because the human genome was used as a reference for predicting the role of the ancestral variant of miR-1304 in the regulation of ENAM and AMTN, the prediction holds for Neanderthals only if their target site sequences for these genes match those of humans. In the case of the Neanderthal ENAM gene, the sequence corresponding to the target site for this miRNA is identical to the modern human sequence, and the 3'UTR exhibits only two changes in its vicinity (fig 2A). The first is a C to G substitution corresponding to a common SNP also present in modern humans (rs7664896), located seven bp from the seed region binding site. The second is a G to A nucleotide change, which is among the most common forms of post-mortem DNA damage and may be considered an artefact induced by cytosine deamination in the complementary strand (Hofreiter et al. 2001; Briggs et al. 2007). Neither of these changes is predicted to interfere with the correct binding of the miRNA. In the case of the AMTN gene, the sole available read from the Vindija Neanderthal ends at the beginning of the target sequence for miR-1304 and displays two G to A nucleotide changes that, as stated above, could be artefacts of post-mortem DNA damage (fig 2A). To ascertain whether Neanderthals had a complete binding site for the ancestral miR-1304 in the AMTN 3'UTR, we sequenced part of this region in a Neanderthal specimen (SD-1253) from the El Sidrón site (Asturias, Spain), dated to about 49,000 years ago and extracted under controlled conditions (Rosas et al. 2006; Lalueza-Fox et al. 2011). The analysis of 55 clones showed that
the recognition site for miR-1304 in the AMTN gene is identical in both hominin groups (supplementary fig 1). The 3'UTRs of ENAM and AMTN are also highly conserved among the other primates studied (chimpanzee, gorilla, orangutan and rhesus monkey), and all but the rhesus monkey carry the exact matching sequence for the ancestral miR-1304 seed region (fig 2). Accordingly, the seed region of this version of miR-1304 presents perfect complementarity with its predicted target site in the 3'UTRs of ENAM and AMTN (fig 2B) and might regulate both dental genes in Neanderthals as well as in non-human primates and in humans carrying the ancestral allele of miR-1304.

Functional screening of miR-1304 target sites in ENAM and AMTN

To investigate the interaction between ENAM and AMTN mRNA and the different versions of miR-1304, functional validation of the predicted ancestral miR-1304 target site was performed using a dual-luciferase assay in HeLa cells. A luciferase reporter pGL4.13 construct carrying the 3'UTR of either the ENAM or the AMTN gene was co-transfected with the corresponding miRNA mimic: ancestral miR-1304, derived miR-1304 or one of two different control miRNAs. As shown in fig 3A, a statistically significant reduction in luciferase activity was observed for ENAM and AMTN when co-transfected with the ancestral miR-1304 as compared with the derived miR-1304 (p<0.0125, Bonferroni-corrected Student's t test). For the ENAM construct, the associated luciferase activity fell to 51%, and for the AMTN construct to 39%, when co-transfected with the ancestral miR-1304, which is compatible with strong repression of the ENAM and AMTN genes by this regulator. Finally, to demonstrate the specificity of the binding of the ancestral miR-1304 to the ENAM and AMTN 3'UTRs, we used site-directed mutagenesis to delete the target sequences that bind the seed region of the ancestral miR-1304 in the ENAM and
AMTN pGL4.13 constructs and co-transfected the plasmids with the respective miRNAs. A statistically significant recovery of luciferase activity was observed when the ancestral miR-1304 was co-transfected with the deleted ENAM and AMTN plasmids in comparison with the wild-type constructs, reaching recovery levels of about 100% (fig 3B). The observed rescue further supports the predicted repression of these two dental genes by this version of miR-1304.

Analysis of the genetic variation of ENAM and AMTN in human populations

To further characterize the gene-miRNA interaction, we analyzed genetic variation in the ENAM and AMTN miR-1304 target site sequences in present-day populations using the 1000 Genomes Project data. While we did not find genetic variation in the target site of AMTN, the ENAM target site harbors the SNP rs117342040 (G/A, with A being the minor allele). Interestingly, the affected nucleotide in the ENAM target site pairs exactly with the polymorphic position in the miR-1304 seed region, so that the minor allele of rs117342040 would interfere with the proper binding of the ancestral allele of miR-1304. Moreover, the allele frequencies for this SNP are differentially distributed among human populations: the minor allele is absent in European populations, very rare in Africans (one carrier of African origin, from Kenya) and present in about 2.5% of Asian individuals (supplementary table 2). As this variant is found in populations where the ancestral version of miR-1304 is also detected, we tested whether the two were correlated using a hypergeometric test and observed that, for the total of 1094 individuals analyzed, our finding of 3 individuals carrying both minor alleles was significant (p=0.0162), meaning that there are more carriers of both the minor ENAM and miR-1304 alleles than expected by chance.
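
The logic of a hypergeometric test of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact computation behind the reported p-value: the function name is hypothetical, and the numbers of carriers of each minor allele (`carriers_mir` and `carriers_enam` below) are placeholders, so the printed value is not expected to reproduce p=0.0162. Only the total of 1094 individuals and the 3 double carriers come from the text.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws):
    the probability of seeing at least k 'successes' among the draws when
    sampling without replacement."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


total_individuals = 1094   # from the text
double_carriers = 3        # from the text
carriers_mir = 40          # placeholder: carriers of the ancestral miR-1304 allele
carriers_enam = 12         # placeholder: carriers of the minor ENAM allele

p_value = hypergeom_upper_tail(double_carriers, total_individuals,
                               carriers_mir, carriers_enam)
print("P(at least 3 double carriers by chance):", p_value)
```

The upper tail is used because the question is whether double carriers are more frequent than expected if the two alleles were distributed independently across individuals.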

DISCUSSION

The main basis of miRNA action is sequence complementarity with the target regions of the genes that they regulate. Functional SNPs that create or disrupt these miRNA target sites have been shown to have diverse phenotypic implications, sometimes contributing to human disease susceptibility (Borel and Antonarakis 2008). Since one miRNA can have multiple targets, SNPs located in the miRNA regions important for target recognition would be expected to have a broader biological effect than SNPs in the target sequences. So far, very few functional variants in miRNA genes have been described, and only a small number of studies have addressed their functional consequences (Jazdzewski et al. 2008; Sun et al. 2009; Vinci et al. 2011). Analysis of the Neanderthal genome sequence led to the detection of a miRNA, miR-1304, that differs from the reference human miR-1304 sequence at one nucleotide position in the seed region (Green et al. 2010). Interestingly, analysis of 1000 Genomes Project data revealed that this nucleotide change is not fixed in present-day human populations: it is a SNP (rs79759099) present in Asian populations (allele frequency of about 0.06). In European and East African populations the ancestral allele was not found, and populations of West African origin carry the ancestral allele but at very low frequencies. Since this distribution pattern could be the result of selective sweeps within recent human populations, compatible with a beneficial role for the new derived miR-1304 allele, we looked for signatures of selection along the region using HapMap and HGDP data, but no evidence of a selective sweep, such as an excess of rare allele variants or a block of extensive linkage disequilibrium, could be found. Another explanation for the presence of the ancestral allele in Asian populations could be that it was reintroduced through
admixture with Neanderthals; however, because Neanderthal admixture is restricted to non-African populations (Green et al. 2010), and given the presence of the ancestral miR-1304 in some Africans and its low frequency in Asia, the distribution of the ancestral miR-1304 cannot be explained solely in terms of introgression. Furthermore, miR-1304 is not included in the 13 regions identified as candidate gene flow regions between Neanderthals and modern humans through the analysis of extended Neanderthal haplotype blocks in modern human genomes (Green et al. 2010). In this scenario, a combination of admixture in Asia, such as that described by Skoglund and Jakobsson (2011), with rs79759099 being a retained ancient polymorphism in Africa, could be the most plausible explanation.

miR-1304 is a relatively novel primate-specific miRNA that was recently discovered at very low levels in human embryonic stem cells and after differentiation of these cells into embryoid bodies (Morin et al. 2008). Since then, deep RNA sequencing studies have found miR-1304, also at low expression levels, in diverse tissues such as peripheral blood, brain cortex and melanoma (Martí et al. 2010; Stark et al. 2010; Vaz et al. 2010), but its targets and function remain largely unknown. The in silico approach based on target site predictions performed in this work allowed us to gain insights into the putative functional differences between the two versions of miR-1304. One of the main limitations in finding target genes for the ancestral miR-1304 was the dearth of good-quality Neanderthal sequence data. The use of the modern human genome for target gene predictions may have led us to disregard Neanderthal target genes whose 3'UTRs diverged from those of human genes; nevertheless, we did not find false positives, as all target genes for the ancestral miR-1304 that could be tested (33 out of 37, 89%) have an identical and conserved binding site in the Neanderthal and human
genomes. Remarkably, we observed a marked difference in the number of predicted target genes between the two miRNAs: the derived version of miR-1304 has more than 14 times as many predicted targets as the ancestral one, which points to a critical functional evolution for miR-1304. Interestingly, the Ingenuity analysis associated the predicted targets of the derived miR-1304 with biological processes and disorders related to central nervous system development and function, which suggests that the evolutionary change in miR-1304 may affect aspects of human brain function and cognition. Previous work showed that SNP variants in the vicinity of miR-22, miR-148a and miR-488 are associated with panic disorder and that these miRNAs effectively repress candidate genes for anxiety (Muiños-Gimeno et al. 2011). In this context, involvement of miR-1304 in the regulation of brain function raises the interesting possibility that particular neurological phenotypes are associated with one miR-1304 allele or the other. In addition, comparison of the Neanderthal genome with modern human genomes identified a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including regions involved in cognitive abilities, such as that containing NRG3, a gene involved in schizophrenia (Green et al. 2010). Taken together, these data suggest that human cognition has been an important target of recent human evolution that could have been shaped in part by miRNAs.

Among the few predicted target genes for the ancestral miR-1304, CD24 and transcription factor 4 (TCF4) are of interest for their involvement in neurological disorders in humans. CD24 has been associated with multiple sclerosis (Zhou et al. 2003) and TCF4 (predicted by both TargetRank and TargetScan) has recently been associated with a range of neuropsychiatric phenotypes including schizophrenia,
impaired verbal learning and Pitt-Hopkins syndrome (Amiel et al. 2007; Stefansson et al. 2009; Lennertz et al. 2011).

Two other noteworthy target genes are ENAM and AMTN, both involved in odontogenesis. We focused on these genes because of the known dental differences between Neanderthals and modern humans, and demonstrated the conservation of their 3'UTRs among primates as well as their differential response to transfection with either the derived or the ancestral miR-1304. The two genes map near each other in a cluster on chromosome 4, together with other genes involved in dental formation, such as ameloblastin (AMBN), and genes encoding salivary proteins, such as mucin 7 (MUC7). Tooth enamel, the hardest substance in vertebrates, is formed by epithelium-derived ameloblasts. As ameloblasts differentiate, they deposit specific proteins necessary for enamel formation, including amelogenin, ameloblastin and enamelin, in the organic enamel matrix. Enamelin is the largest protein in the enamel matrix of developing teeth and is involved in the mineralization and structural organization of enamel. Amelotin was discovered in mouse, where it is specifically expressed in maturation-stage ameloblasts. It has been hypothesized that it functions as a protease that helps process the enamel matrix at this stage (Iwasaki et al. 2005). We can only speculate about the function of the ancestral miR-1304 in Neanderthals but, given the involvement of ENAM and AMTN in odontogenesis, their observed repression by the ancestral miR-1304 could implicate this miRNA, acting together with other divergent genes, in a range of differences between Neanderthal and modern human dentition (Macchiarelli et al. 2006; Olejniczak et al. 2008). For example, the junction surface between dentine and enamel was more complex in Neanderthals than it is in Homo sapiens. Additionally, the volume of coronal dentine
19

was larger in the extinct Homo and, since enamel volume is similar in both species, this results in significantly thinner cuspal enamel in Neanderthals (Ramirez Rozzi and Bermudez De Castro 2004; Olejniczak et al. 2008). Moreover it is believed that enamel cusps formed faster in Neanderthals as evidenced by a significantly lower periodicity of their enamel growth marks (Retzius marks or perikymata) (Aiello and Dean 1990), a fact that has often lead to overestimation of Neanderthal's age at death (Olejniczak et al. 2008; Smith et al. 2009, 2010). This would imply a different rate of ameloblastic activity, a process that may be influenced by miR-1304. This hypothesis is in agreement with recent studies performed in rodents that denote that miRNAs have a prominent role in teeth development being required for normal ameloblast differentiation and enamel matrix formation (Cao et al. 2010; Michon et al. 2010). Since the ancestral allele of miR-1304 still segregates in some modern humans, this miR-1304 variant could be involved in dental development and be associated, for example, with the degree of enamel thickness. Although little is known about variations of dental morphology among human populations, it has been reported that average enamel thickness is similar among Asian, European and African dentitions (Feeney et al. 2010). Nevertheless, given the fact that only about 0.5% of Asian individuals would be homozygous for the ancestral miR-1304 we do not expect this allele to be involved in phenotypic traits that differ among populations but rather in less common traits somehow related with disease. In this regard, defects in enamel formation create the condition known as Amelogenesis Imperfecta (AI), a disease in which the enamel does not fully form or forms in insufficient amounts and teeth affected may be discolored, sensitive or prone to disintegration. 
AI is commonly inherited as an autosomic trait and ENAM mutations appear to be responsible for a big part of the autosomally inherited cases. However, genetic

20

studies provide evidence for the existence of at least one further autosomal AI locus (Krrman et al. 1997; Dong et al. 2000). The ancestral miR-1304, as repressor of ENAM and AMTN, is a strong candidate to be involved in the susceptibility to AI and other forms of enamel hypoplasia. It would be of interest to test this hypothesis in familial cases of the disorder for which the underlying genetic defect has not been identified yet and furthermore study if Asian populations show higher AI incidence than other populations (the exact incidence of AI is uncertain and estimates vary from 1:700 people to 1:14,000 according to the populations studied; Seow 1993). The finding of correlation between the presence of the ancestral allele of miR-1304 and the minor allele of rs117342040 in ENAM, a variant that would interfere with the binding of the ancestral miR-1304 to the mRNA and would avoid its repression rescuing a hypothetical phenotype, is very appealing. It strengthens the idea that the presence of the ancestral miR-1304 in modern humans could have some adverse effect and thus, removal of down-regulation of ENAM could have been beneficial to modern humans. Unfortunately, we could not find evidences for a selective sweep happening in the miR-1304 gene, which would have reinforced this hypothesis. Finally, we should not forget about other predicted target genes for the ancestral miR-1304 that are also related to disease such as the above stated CD24 and TCF4 genes that are involved in neurological disorders in humans. Future analysis on the differential gene repression by the two alleles of miR-1304 and deeper studies on the functional significance of this regulation would be of great interest not only to gain insights into primate evolution but also in the possible implication of this miRNA in phenotypic differences among present-day human populations and its relationship with complex human disorders.
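The homozygote estimate quoted in the discussion above can be related to allele frequency with a short calculation. The sketch below assumes Hardy-Weinberg proportions (an assumption made here for illustration; the text does not state how the figure was derived), under which the expected frequency of homozygotes for an allele of frequency q is q squared. The function names and the example frequencies are illustrative and are not taken from the study.

```python
import math


def homozygote_frequency(q: float) -> float:
    """Expected homozygote frequency for an allele of frequency q,
    assuming Hardy-Weinberg proportions (q squared)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return q * q


def allele_frequency_for(homozygote_freq: float) -> float:
    """Allele frequency implied by a given homozygote frequency
    under the same assumption (square root)."""
    if not 0.0 <= homozygote_freq <= 1.0:
        raise ValueError("homozygote frequency must lie in [0, 1]")
    return math.sqrt(homozygote_freq)


if __name__ == "__main__":
    # Illustrative allele frequencies only.
    for q in (0.05, 0.07):
        print(f"q = {q:.2f} -> expected homozygotes = {homozygote_frequency(q):.2%}")
    print(f"homozygotes = 0.5% -> q = {allele_frequency_for(0.005):.3f}")
```

Under this assumption, an allele frequency of about 7% corresponds to roughly 0.5% homozygotes, whereas a frequency of 5% corresponds to 0.25%.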

SUPPLEMENTARY MATERIAL Supplementary material includes 2 figures and 3 tables.

ACKNOWLEDGMENTS We would like to thank M. Valls, B. Kagerbauer and F. Sánchez-Quinto for their technical support and advice, and the anonymous referees, whose comments and suggestions greatly improved this manuscript. Support was provided by the Spanish National Institute for Bioinformatics (www.inab.org). This work was supported by the Ministerio de Ciencia e Innovación, España (BFU2010-18477, BFU2009-06974 and CGL2009-09013) and the European Union Seventh Framework Programme (PIOF-GA-2009-236836). This publication has been co-financed by FEDER - European Regional Development Fund "A way to build Europe". Fieldwork at the El Sidrón site has been funded by the Consejería de Cultura, Principado de Asturias. M.L.V. is funded by an FI fellowship from the Generalitat de Catalunya.

REFERENCES

Aiello LC, Dean MC. 1990. The Microanatomy and Development of Teeth. In: An Introduction to Human Evolutionary Anatomy. Academic Press Limited, London. p. 106-132.
Amiel J, Rio M, de Pontual L, et al. (11 co-authors). 2007. Mutations in TCF4, encoding a class I basic helix-loop-helix transcription factor, are responsible for Pitt-Hopkins syndrome, a severe epileptic encephalopathy associated with autonomic dysfunction. Am J Hum Genet. 80:988-993.
Bartel DP. 2004. MicroRNAs: Genomics, Biogenesis, Mechanism, and Function. Cell. 116:281-297.
Bentwich I, Avniel A, Karov Y, et al. (13 co-authors). 2005. Identification of hundreds of conserved and nonconserved human microRNAs. Nat Genet. 37:766-770.
Berezikov E, Guryev V, van de Belt J, Wienholds E, Plasterk R, Cuppen E. 2005. Phylogenetic shadowing and computational identification of human microRNA genes. Cell. 120:21-24.
Borel C, Antonarakis SE. 2008. Functional genetic variation of human miRNAs and phenotypic consequences. Mamm Genome. 19:503-509.
Brameier M. 2010. Genome-wide comparative analysis of microRNAs in three nonhuman primates. BMC Res Notes. 3:64.
Brennecke J, Stark A, Russell RB, Cohen SM. 2005. Principles of microRNA-target recognition. PLoS Biol. 3:e85.
Briggs AW, Good JM, Green RE, et al. (18 co-authors). 2009. Targeted retrieval and analysis of five Neandertal mtDNA genomes. Science. 325:318-321.

Briggs AW, Stenzel U, Johnson PL, et al. (11 co-authors). 2007. Patterns of damage in genomic DNA sequences from a Neandertal. Proc Natl Acad Sci USA. 104:14616-14621.
Burbano HA, Hodges E, Green RE, et al. (20 co-authors). 2010. Targeted Investigation of the Neandertal Genome by Array-Based Sequence Capture. Science. 328:723-725.
Cao H, Wang J, Li X, et al. (11 co-authors). 2010. MicroRNAs play a critical role in tooth development. J Dent Res. 89:779-784.
Carroll SB. 2008. Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution. Cell. 134:25-36.
Chen K, Rajewsky N. 2006. Deep conservation of microRNA-target relationships and 3'UTR motifs in vertebrates, flies, and nematodes. Cold Spring Harb Symp Quant Biol. 71:149-156.
Dong J, Gu TT, Simmons D, MacDougall M. 2000. Enamelin maps to human chromosome 4q21 within the autosomal dominant amelogenesis imperfecta locus. Eur J Oral Sci. 108:353-358.
Endicott P, Ho SY, Stringer C. 2010. Using genetic evidence to evaluate four palaeoanthropological hypotheses for the timing of Neanderthal and modern human origins. J Hum Evol. 59:87-95.
Feeney RN, Zermeno JP, Reid DJ, Nakashima S, Sano H, Bahar A, Hublin JJ, Smith TM. 2010. Enamel thickness in Asian human canines and premolars. Anthropological Science. 118:191-198.
Filipowicz W, Bhattacharyya SN, Sonenberg N. 2008. Mechanisms of posttranscriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet. 9:102-114.
Finlayson C, Pacheco FG, Rodríguez-Vidal J, et al. (26 co-authors). 2006. Late survival of Neanderthals at the southernmost extreme of Europe. Nature. 443:850-853.
Friedman RC, Farh KK, Burge CB, Bartel DP. 2009. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 19:92-105.
Grad Y, Aach J, Hayes GD, Reinhart BJ, Church GM, Ruvkun G, Kim J. 2003. Computational and Experimental Identification of C. elegans microRNAs. Mol Cell. 11:1253-1263.
Green RE, Krause J, Briggs AW, et al. (56 co-authors). 2010. A draft sequence of the Neandertal genome. Science. 328:710-722.
Guidi M, Muiños-Gimeno M, Kagerbauer B, Martí E, Estivill X, Espinosa-Parrilla Y. 2010. Overexpression of miR-128 specifically inhibits the truncated isoform of NTRK3 and upregulates BCL2 in SH-SY5Y neuroblastoma cells. BMC Mol Biol. 11:95.
Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA. 2009. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci USA. 106:9362-9367.
Hofreiter M, Jaenicke V, Serre D, von Haeseler A, Pääbo S. 2001. DNA sequences from multiple amplifications reveal artifacts induced by cytosine deamination in ancient DNA. Nucleic Acids Res. 29:4793-4799.
Hublin JJ. 2009. The origin of Neandertals. Proc Natl Acad Sci USA. 106:16022-16027.
Iwasaki K, Bajenova E, Somogyi-Ganss E, Miller M, Nguyen V, Nourkeyhani H, Gao Y, Wendel M, Ganss B. 2005. Amelotin--a Novel Secreted, Ameloblast-specific Protein. J Dent Res. 84:1127-1132.
Jazdzewski K, Murray EL, Franssila K, Jarzab B, Schoenberg DR, de la Chapelle A. 2008. Common SNP in pre-miR-146a decreases mature miR expression and predisposes to papillary thyroid carcinoma. Proc Natl Acad Sci USA. 105:7269-7274.
Kärrman C, Bäckman B, Dixon M, Holmgren G, Forsman K. 1997. Mapping of the locus for autosomal dominant amelogenesis imperfecta (AIH2) to a 4-Mb YAC contig on chromosome 4q11-q21. Genomics. 39:164-170.
Krause J, Lalueza-Fox C, Orlando L, et al. (13 co-authors). 2007. The derived FOXP2 variant of modern humans was shared with Neandertals. Curr Biol. 17:1908-1912.
Krol J, Loedige I, Filipowicz W. 2010. The widespread regulation of microRNA biogenesis, function and decay. Nat Rev Genet. 11:597-610.
Lalueza-Fox C, Römpler H, Caramelli D, et al. (17 co-authors). 2007. A melanocortin 1 receptor allele suggests varying pigmentation among Neanderthals. Science. 318:1453-1455.
Lalueza-Fox C, Gigli E, de la Rasilla M, Fortea J, Rosas A, Bertranpetit J, Krause J. 2008. Genetic characterization of the ABO blood group in Neandertals. BMC Evol Biol. 8:342.
Lalueza-Fox C, Rosas A, Estalrrich A, et al. (16 co-authors). 2011. Genetic evidence for patrilocal mating behavior among Neandertal groups. Proc Natl Acad Sci USA. 108:250-253.
Lennertz L, Rujescu D, Wagner M, et al. (14 co-authors). 2011. Novel Schizophrenia Risk Gene TCF4 Influences Verbal Learning and Memory Functioning in Schizophrenia Patients. Neuropsychobiology. 63:131-136.
Lewis BP, Burge CB, Bartel DP. 2005. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 120:15-20.
Lin S, Cheung WK, Chen S, Lu G, Wang Z, Xie D, Li K, Lin MC, Kung HF. 2010. Computational identification and characterization of primate-specific microRNAs in human genome. Comput Biol Chem. 34:232-241.
Liu N, Okamura K, Tyler DM, Phillips MD, Chung WJ, Lai EC. 2008. The evolution and functional diversification of animal microRNA genes. Cell Res. 18:985-996.
Macchiarelli R, Bondioli L, Debnath A, Mazurier A, Tournepiche JF, Birch W, Dean MC. 2006. How Neanderthal molar teeth grew. Nature. 444:748-751.
Martí E, Pantano L, Bañez-Coronel M, Llorens F, Miñones-Moyano E, Porta S, Sumoy L, Ferrer I, Estivill X. 2010. A myriad of miRNA variants in control and Huntington's disease brain regions detected by massively parallel sequencing. Nucleic Acids Res. 38:7219-7235.
McLean CY, Reno PL, Pollen AA, et al. (13 co-authors). 2011. Human-specific loss of regulatory DNA and the evolution of human-specific traits. Nature. 471:216-219.
Mellars P. 2004. Neanderthals and the modern human colonization of Europe. Nature. 432:461-465.
Michon F, Tummers M, Kyyrönen M, Frilander MJ, Thesleff I. 2010. Tooth morphogenesis and ameloblast differentiation are regulated by micro-RNAs. Dev Biol. 340:355-368.

Morin R, O'Connor M, Griffith M, et al. (12 co-authors). 2008. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res. 18:610-621.
Muiños-Gimeno M, Guidi M, Kagerbauer B, Martín-Santos R, Navinés R, Alonso P, Menchón JM, Gratacòs M, Estivill X, Espinosa-Parrilla Y. 2009. Allele variants in functional MicroRNA target sites of the neurotrophin-3 receptor gene (NTRK3) as susceptibility factors for anxiety disorders. Hum Mutat. 30:1062-1071.
Muiños-Gimeno M, Espinosa-Parrilla Y, Guidi M, et al. (14 co-authors). 2011. Human microRNAs miR-22, miR-138-2, miR-148a, and miR-488 Are Associated with Panic Disorder and Regulate Several Anxiety Candidate Genes and Related Pathways. Biol Psychiatry. 69:526-533.
Nielsen CB, Shomron N, Sandberg R, Hornstein E, Kitzman J, Burge CB. 2007. Determinants of targeting by endogenous and exogenous microRNAs and siRNAs. RNA. 13:1894-1910.
Olejniczak AJ, Smith TM, Feeney RN, et al. (14 co-authors). 2008. Dental tissue proportions and enamel thickness in Neandertal and modern human molars. J Hum Evol. 55:12-23.
Pickrell JK, Coop G, Novembre J, et al. (8 co-authors). 2009. Signals of recent positive selection in a worldwide sample of human populations. Genome Res. 19:826-837.
Quach H, Barreiro LB, Laval G, et al. (11 co-authors). 2009. Signatures of purifying and local positive selection in human miRNAs. Am J Hum Genet. 84:316-327.
Ramirez Rozzi FV, Bermudez De Castro JM. 2004. Surprisingly rapid growth in Neanderthals. Nature. 428:936-939.

Rosas A, Martínez-Maza C, Bastir M, et al. (18 co-authors). 2006. Paleobiology and comparative morphology of a late Neandertal sample from El Sidrón, Asturias, Spain. Proc Natl Acad Sci USA. 103:19266-19271.
Seow WK. 1993. Clinical diagnosis and management strategies of amelogenesis imperfecta variants. Pediatr Dent. 15:384-393.
Skoglund P, Jakobsson M. 2011. Archaic human ancestry in East Asia. Proc Natl Acad Sci USA. 108:18301-18306.
Smith TM, Harvati K, Olejniczak AJ, Reid DJ, Hublin JJ, Panagopoulou E. 2009. Brief communication: dental development and enamel thickness in the Lakonis Neanderthal molar. Am J Phys Anthropol. 138:112-118.
Smith TM, Tafforeau P, Reid DJ, et al. (14 co-authors). 2010. Dental evidence for ontogenetic differences between modern humans and Neanderthals. Proc Natl Acad Sci USA. 107:20923-20928.
Stark MS, Tyagi S, Nancarrow DJ, Boyle GM, Cook AL, Whiteman DC, Parsons PG, Schmidt C, Sturm RA, Hayward NK. 2010. Characterization of the Melanoma miRNAome by Deep Sequencing. PLoS One. 5:e9685.
Stefansson H, Ophoff RA, Steinberg S, et al. (87 co-authors). 2009. Common variants conferring risk of schizophrenia. Nature. 460:744-747.
Stringer CB, Hublin JJ. 1999. New age estimates for the Swanscombe hominid, and their significance for human. J Hum Evol. 37:873-877.
Sun G, Yan J, Noltner K, Feng J, Li H, Sarkis DA, Sommer SS, Rossi JJ. 2009. SNPs in human miRNA genes affect biogenesis and function. RNA. 15:1640-1651.

Vaz C, Ahmad HM, Sharma P, Gupta R, Kumar L, Kulshreshtha R, Bhattacharya A. 2010. Analysis of microRNA transcriptome by deep sequencing of small RNA libraries of peripheral blood. BMC Genomics. 11:288.
Vinci S, Gelmini S, Pratesi N, Conti S, Malentacchi F, Simi L, Pazzagli M, Orlando C. 2011. Genetic variants in miR-146a, miR-149, miR-196a2, miR-499 and their influence on relative expression in lung cancers. Clin Chem Lab Med. DOI: 10.1515/CCLM.2011.708. Published online ahead of print.
Zhou Q, Rammohan K, Lin S, et al. (14 co-authors). 2003. CD24 is a genetic modifier for risk and progression of multiple sclerosis. Proc Natl Acad Sci USA. 100:15041-15046.

Table 1. Target gene prediction for the ancestral version of miR-1304 by TargetRank

Columns: Gene | Target gene | Score | Total seed matches | Conserved seed matches: 8mer | M8 7mer | A1 7mer | 6mer | Non-conserved seed matches: 8mer | M8 7mer | A1 7mer | 6mer | Conserved in Primates

PTGFRN | prostaglandin F2 receptor negative regulator | 0.337 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
DEK | DEK oncogene | 0.309 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
AMTN | amelotin | 0.302 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
CD24 | CD24 antigen precursor | 0.281 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N
PIK3AP1 | phosphoinositide-3-kinase adaptor protein 1 | 0.281 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
C1orf85 | kidney predominant protein NCU-G1 | 0.263 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | H, N, C
CAMKK2 | calcium/calmodulin-dependent protein kinase b | 0.261 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | H, N, C
KATNAL1 | katanin p60 subunit A-like 1 | 0.259 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
MGC13017 | hypothetical protein LOC91368 | 0.259 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
FOXH1 | forkhead box H1 | 0.257 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
DLST | dihydrolipoamide S-succinyltransferase | 0.253 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | nr
ENAM | enamelin | 0.253 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
RIC8A | resistance to inhibitors of cholinesterase 8 | 0.253 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
C1orf21 | chromosome 1 open reading frame 21 | 0.251 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | H, N, C
TNNI1 | troponin I | 0.240 | 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | H, N, C
RBPMS2 | RNA binding protein with multiple splicing 2 | 0.237 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | nr
IGFBP3 | insulin-like growth factor binding protein 3 | 0.237 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | H, N, C
TCF4 (a) | transcription factor 4 | 0.237 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | H, N, C
FIZ1 | FLT3-interacting zinc finger 1 | 0.236 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | H, N, C
CABP7 | calcium binding protein 7 | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
CLCF1 | cardiotrophin-like cytokine factor 1 | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
GRAP | GRB2-related adaptor protein | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
KLF15 | Kruppel-like factor 15 | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
LRRN2 | leucine rich repeat neuronal 2 | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
PI4K2A | phosphatidylinositol 4-kinase type 2 alpha | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
SLAMF8 | SLAM family member 8 | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
TNIP2 | A20-binding inhibitor of NF-kappaB activation 2 | 0.210 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
ATP1B3 | Na+/K+ -ATPase beta 3 subunit | 0.209 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | H, N, C
L3MBTL3 | Lethal (3) malignant brain tumor-like protein 3 | 0.209 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | H, N, C
SOX17 | SRY-box 17 | 0.209 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | nr
TMED4 | transmembrane emp24 protein transport domain | 0.209 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | H, N
DAPL1 | death associated protein-like 1 | 0.203 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
EDF1 (a) | endothelial differentiation-related factor 1 | 0.203 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | H, N, C
FAM90A1 | hypothetical protein LOC55138 | 0.203 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N, C
TADA3L | transcriptional adaptor 3-like | 0.203 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | H, N

(a), target gene prediction common to TargetScan and TargetRank; A1 indicates presence of an adenosine opposite miRNA base 1; M8 indicates a Watson-Crick match to miRNA base 8. Conserved in Primates indicates conservation of the binding target site among different primate species: H, human; N, Neanderthal; C, chimpanzee; nr, no Neanderthal reads available.
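The seed-match categories used in Table 1 can be illustrated with a minimal sketch. It assumes the usual definitions (6mer: pairing to miRNA bases 2-7; 7mer-M8: an additional match to base 8; 7mer-A1: an additional adenosine opposite base 1; 8mer: both), considers only Watson-Crick pairs, and is not the TargetRank or TargetScan implementation. The miRNA and mRNA fragments are copied from the sequence rows printed for Fig 1 and Fig 2B; reading the Fig 1 miRNA row 3' to 5', as in Fig 2B, is an assumption, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# miRNA sequences as printed in the figures (3' to 5'), reversed here to 5' to 3'.
ANC_MIR1304 = "GUGUAGAGUGACAUCGGAGCUU"[::-1]  # ancestral, from Fig 2B
DER_MIR1304 = "GUGUAGAGUGACAUCGGAGUUU"[::-1]  # derived, from Fig 1 (orientation assumed)

# Neanderthal 3'UTR fragments as printed in Fig 2B (5' to 3').
ENAM_UTR = "CAAGGCCAUGACCAUCCCUGCCUCGAAACUUG"
AMTN_UTR = "AAAUUUUUUCAACUAAGCUGCCUCGAAUUUGG"


def reverse_complement(rna: str) -> str:
    """Watson-Crick reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str) -> list[tuple[int, str]]:
    """Return (0-based start of the 6mer core, site type) for each seed match.

    Both sequences are given 5' to 3'. G:U wobble pairs are not considered.
    """
    core = reverse_complement(mirna[1:7])  # mRNA hexamer pairing with miRNA bases 2-7
    base8_partner = COMPLEMENT[mirna[7]]   # mRNA base pairing with miRNA base 8
    sites = []
    start = utr.find(core)
    while start != -1:
        end = start + len(core)
        m8 = start > 0 and utr[start - 1] == base8_partner  # match to base 8
        a1 = end < len(utr) and utr[end] == "A"             # adenosine opposite base 1
        if m8 and a1:
            kind = "8mer"
        elif m8:
            kind = "7mer-M8"
        elif a1:
            kind = "7mer-A1"
        else:
            kind = "6mer"
        sites.append((start, kind))
        start = utr.find(core, start + 1)
    return sites


if __name__ == "__main__":
    for name, utr in (("ENAM", ENAM_UTR), ("AMTN", AMTN_UTR)):
        print(name, "ancestral:", seed_sites(ANC_MIR1304, utr))
        print(name, "derived:  ", seed_sites(DER_MIR1304, utr))
```

With these fragments, the ancestral sequence yields one 8mer site in each 3'UTR, in line with the single seed match listed for ENAM and AMTN in Table 1, whereas the derived sequence yields none.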

Table 2. Top associations of the predicted target genes for the derived version of miR-1304 with biological processes

Top Networks | Score
Cell Morphology, Cellular Function and Maintenance, Cellular Movement | 51
Neurological Disease, Behavior, Genetic Disorder | 51
Lipid Metabolism, Molecular Transport, Small Molecule Biochemistry | 30
Cell Cycle, Cellular Development, Cellular Growth and Proliferation | 5
Auditory Disease, Genetic Disorder, Neurological Disease | 5

Diseases and Disorders | p-value | Molecules
Genetic Disorder | 5.93E-05 - 4.04E-02 | 24
Neurological Disease | 5.93E-05 - 4.04E-02 | 14
Cancer | 1.27E-03 - 3.10E-02 | 5
Infection Mechanism | 2.12E-03 - 1.81E-02 | 6
Cardiovascular Disease | 4.49E-03 - 4.40E-02 | 4

Molecular and Cellular Functions | p-value | Molecules
Cellular Assembly and Organization | 3.21E-04 - 4.40E-02 | 16
Cell Compromise | 3.89E-04 - 3.10E-02 | 6
Cellular Growth and Proliferation | 1.13E-03 - 4.83E-02 | 11
Cellular Development | 2.00E-03 - 4.68E-02 | 9
Gene Expression | 2.12E-03 - 4.83E-02 | 19

Physiological System Development and Function | p-value | Molecules
Nervous System Development and Function | 5.45E-04 - 4.40E-02 | 13
Tissue Morphology | 5.45E-04 - 4.83E-02 | 8
Tumor Morphology | 1.13E-03 - 3.97E-02 | 5
Connective Tissue Development and Function | 3.48E-03 - 4.83E-02 | 8
Behaviour | 4.49E-03 - 4.49E-03 | 1

The five most significant associations with molecular networks and with different categories of biological functions are shown (Ingenuity Pathway Analysis software).

FIGURE CAPTIONS Fig 1. Genomic sequence conservation of miR-1304 among primates. The first row of the figure shows the RNA sequence of the mature reference human miR-1304. The seed region and the nucleotide change within it are highlighted in grey on the genomic DNA alignment. hsa, Homo sapiens; nea, Homo neanderthalensis; ptr, Pan troglodytes; ggo, Gorilla gorilla; ppy, Pongo pygmaeus; mmu, Macaca mulatta.

Fig 2. Genomic sequence conservation of ENAM and AMTN 3'UTRs among primates. A; 3'UTRs of the ENAM and AMTN genes; the target sequence that binds the seed region of miR-1304 is highlighted in grey. hsa, Homo sapiens; nea, Homo neanderthalensis; ptr, Pan troglodytes; ggo, Gorilla gorilla; ppy, Pongo pygmaeus; mmu, Macaca mulatta. B; Sequence alignments for the ancestral miR-1304 (Anc. miR-1304) showing the predicted binding with the 3'UTRs of the Neanderthal ENAM and AMTN mRNAs.

Fig 3. The ancestral version of miR-1304 targets ENAM and AMTN 3'UTRs. Results of the luciferase-reporter assay used to test the interaction of ancestral (Anc.) and derived (Der.) miR-1304 with the ENAM and AMTN genes. A; HeLa cells were co-transfected with the miRNA mimic of interest, the Renilla reporter plasmid pGL4.75 and either the empty pGL4.13 plasmid or pGL4.13 carrying the 3'UTR of GAD1, ENAM or AMTN downstream of the firefly luciferase reporter gene. The previously proven regulation of GAD1 by miR-7 was used as a positive control. The ratios of Firefly to Renilla luciferase luminescence are presented after normalization to the empty pGL4.13 and to the mean of 2 different mimic controls. B; The binding site for the ancestral miR-1304 was removed from the 3'UTRs in the constructs by mutagenesis and the luciferase assay was repeated. Each experiment was done in triplicate and at least 3 independent experiments were performed. Data reported here are the means ± SD of independent experiments. Significant associations after Bonferroni correction (p<0.0125, corrected Student's t test) are indicated with an asterisk. Anc.miR-1304, ancestral miR-1304 mimic.

Supplementary fig 1. Alignment of the 55 sequenced clones from a Neanderthal specimen (SD-1253) from the El Sidrón site (Asturias, Spain). Vi33.16, sole available read from the Vindija Neanderthal. The target sequence that binds the seed region of miR-1304 is highlighted in grey.

Supplementary fig 2. Target genes common to TargetScan and TargetRank predictions were examined using the Ingenuity Pathway Analysis software. Target genes predicted by both TargetScan and TargetRank are depicted as grey-filled shapes. The two networks with the highest score are shown.
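The normalization described in the Fig 3 caption can be sketched as follows. This is one plausible reading of that caption, not the authors' analysis script: each Firefly/Renilla ratio for a 3'UTR construct is divided by the ratio for the empty vector with the same mimic, and the result is divided by the mean of the same quantity for the control mimics. All luminescence readings and function names below are hypothetical. The Bonferroni helper reproduces the 0.0125 threshold on the assumption of a nominal alpha of 0.05 and four comparisons, which the caption does not state explicitly.

```python
from statistics import mean


def firefly_to_renilla(firefly: float, renilla: float) -> float:
    """Firefly luminescence divided by the Renilla (transfection control) luminescence."""
    return firefly / renilla


def relative_activity(utr_reading, empty_reading) -> float:
    """Ratio for the 3'UTR construct relative to the empty vector, same mimic.

    Each reading is a (firefly, renilla) pair.
    """
    return firefly_to_renilla(*utr_reading) / firefly_to_renilla(*empty_reading)


def normalized_activity(test_pair, control_pairs) -> float:
    """Relative activity for the mimic of interest divided by the mean
    relative activity of the control mimics.

    test_pair is (utr_reading, empty_reading); control_pairs is a list of such pairs.
    """
    control_mean = mean(relative_activity(u, e) for u, e in control_pairs)
    return relative_activity(*test_pair) / control_mean


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical readings: (firefly, renilla) for 3'UTR construct and empty vector.
    test = ((500.0, 1000.0), (1000.0, 1000.0))
    controls = [
        ((1000.0, 1000.0), (1000.0, 1000.0)),
        ((900.0, 1000.0), (900.0, 1000.0)),
    ]
    print("normalized activity:", normalized_activity(test, controls))
    print("Bonferroni threshold:", bonferroni_threshold(0.05, 4))
```

In this made-up example the mimic of interest halves the normalized reporter signal, which is the kind of value plotted on the luciferase activity axis of Fig 3.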

[Fig 1, image not reproduced. Alignment rows: hsa-miR-1304, hsa, nea, ptr, ggo, ppy, mmu. Mature hsa-miR-1304 row as printed: GUGUAGAGUGACAUCGGAGUUU. Human (hsa) genomic reference row: TGAATCACTTGAGCCCAGGGGTTCGAGGCTACAGTGAGATGTGGCAGGATCACATCTCACTGTAGCCTCAAACCGCTGGGCTCAAGTGTTT. The per-species difference rows did not survive extraction.]

[Fig 2, image not reproduced.
A; alignment rows for each gene: hsa, nea, ptr, ggo, ppy, mmu. The per-species difference rows did not survive extraction. Reference rows:
ENAM: CAGACCTTAAAAAAGCAAGAAGGCCATGACCATCCCTGCCTCGAAACTTGCAAGTCACTTGTCTGAGTTAGTTTCCTTTTGCTTGAT
AMTN: CCAGGAATTCAGTAAGCTGTTTCAAATTTTTTCAACTAAGCTGCCTCGAATTTGGTGATACATGTGAATCTTTATCATTGATTATAT
B; predicted pairing of Anc.miR-1304 (3'-GUGUAGAGUGACAUCGGAGCUU-5') with the Neanderthal mRNAs; seven paired positions are marked in the figure:
mRNA Neanderthal ENAM: 5'...CAAGGCCAUGACCAUCCCUGCCUCGAAACUUG...3'
mRNA Neanderthal AMTN: 5'...AAAUUUUUUCAACUAAGCUGCCUCGAAUUUGG...3']

[Fig 3, image not reproduced. A; y axis: luciferase activity (0 to 1.2); x axis: GAD1, ENAM and AMTN reporters; series: hsa-miR-7, Der.miR-1304 and Anc.miR-1304; asterisks mark significant differences. B; y axis: luciferase activity (0 to 1.2); x axis: ENAM wt, ENAM del, AMTN wt and AMTN del; series: Der.miR-1304 and Anc.miR-1304; asterisks mark significant differences.]
