Identification of novel and conserved microRNAs involved in fruit development and ripening in Fragaria vesca

MicroRNAs (miRNAs) are class of noncoding RNAs that regulate gene expression at the post-transcriptional level, either by endonucleolytic cleavage or by translational inhibition. Strawberry is a popular worldwide fresh fruit and is believed to benefit human health. However, the function of miRNAs during this fruit development and ripening remains unknown and miRNAs for specific for these processes are expected to be discovered. In the study, we identified 218 conserved miRNAs and 87 novel miRNAs in Fragaria vesca. Expression profiling of miRNAs during fruit development and ripening was performed, and the expression of targets of the miRNAs was validated by qRT-PCR (quantitative reverse transcription polymerase chain reaction). This study provided data for further research on molecular mechanisms involved in fruit development and ripening.


Digital signature
This PDF has been certified using digital signature with a trusted timestamp to assure its origin and integrity. A verification trust dialog appears on the PDF document when it is opened in a compatible PDF reader. Certificate properties provide further details such as certification time and a signing reason in case any alterations made to the final content. If the certificate is missing or invalid it is recommended to verify the article on the journal website.
indicates that miRNAs are involved in plant development, signaling, abiotic stresses, and symbiotic relationship regulation [15,16]. Nowadays, more than 35,828 mature miRNAs have been identified from 223 species including flies, worms, vertebrates, viruses, and plants (miRBase sequence database release 22, March 12, 2018, http://www.mirbase. org/). A large number of known miRNAs in the plant kingdom from mosses and ferns to higher flowering plants are evolutionarily conserved and has been used as a practical indicator for identification or prediction of miRNAs by homology searches in other species [11,17,18]. The single strand miRNAs in higher plants are usually small RNA molecules with 21-22 nt, and they are processed by Dicer-like 1 (DCL1). Similar to siRNA duplexes, the resulting miRNA/miRNA* duplex possesses characteristic twonucleotide 3' overhangs and is unwound coincident with the association of the mature, single-stranded miRNA with an AGO protein. The mature miRNA then serves to guide the bound AGO protein to target mRNAs based upon complementarity. By cleavage of target mRNAs (translational repression or transcriptional inhibition), mature miRNAs suppress the expressions of their target genes [19,20].
In recent years, small RNAs are getting more and more attention for their key roles in regulating plant development, the response to environment and many biochemical reactions. Zhu et al. reported miR172 functions in regulating the transitions between developmental stages and in specifying floral organ identity by regulating expression of a small group of AP2-like transcription factors in an evolutionary ancient interaction [21]. MiR167 was involved in auxin response and APETALA2 (AP2) was the target gene of miR172 [22,23]. Several conserved and tissue-specific expressed miRNAs were identified and might play a role in fleshy fruit development [23,24]. The higher expression patterns of four novel miRNAs were found in small fruits than in leaves and flowers and thus might be involved in tomato fruit ripening [25][26][27]. CNR and SIAP2a, the auxin response factor (ARF), and ripening regulators were actively modulated by miR156/157 and miR172 during tomato ripening [28,29]. Silva et al. [30] reported that overexpressing an AtMIR156b precursor generated abnormal flower and fruit morphology in tomato by targeting SBP gene. MiR319/miR319a/miR319e were predicted to target ARF2 gene, coding for a transcriptional suppressor involved in ethylene, auxin, ABA, and brassinosteroid pathways to control the onset of leaf senescence, floral organ abscission, and ovule development. In Japanese apricot, the expressions of miR319/ miR319a/miR319e were shown to be higher in imperfect flower buds, with aborted pistils, than in perfect ones. Therefore, it is conceivable that the overexpression of miR319/miR319a/miR319e may contribute to an increase in imperfect flower ratios in pistil development of Japanese apricot [31]. In pear, the miRNA profiles of fruits from different developmental stages were investigated by high-throughput sequencing and proved that miRNAs were widely involved in the regulation of pear fruit development and quality [32]. In sweet orange, Csi-miR164 targeted the NAC transcription factor and showed a high expression level in fruit and thus was involved in fruit ripening [33].
Strawberry fruit is popular worldwide fresh fruit and is beneficial to human health. Nowadays, two members of miR159 (Fa-miR159a and Fa-miR159b) were identified to interact with Fa-GAMYB during the berry receptacle development. They cooperatively changed GA endogenous levels [34][35][36][37]. The genome-wide identification of miRNAs was reported in strawberry, however, the function of miRNAs in diploid wild strawberry fruit development and ripening remains unknown and specific miRNAs in fruit are expected to be identified. In this study, we investigated the miRNA profiles of diploid wild strawberry fruits from different developmental stages with Illumina HiSeq 2000 (Illumina, USA) platform, combined with bioinformatics analysis and experimental verification. The results suggested that miRNAs are widely involved in the regulation of fruit development and quality in F. vesca.

Plant material and RNA extraction
Fragaria vesca ('Hawaii 4') was cultivated in growth chamber at 22 ±1°C in a 13/11-h dark/light photoperiod. Fruit samples were harvested at 18, 24, 30, 36, and 42 DAF (days after flowering). At each developmental stage, 10 representative fruits without achene were sampled, snap-frozen in liquid nitrogen, and kept at −80°C for further use. Total RNA was extracted from strawberry fruit harvested using miRcute miRNA Isolation Kit (Tiangen, China) according to the manufacturer's instructions.
Small RNA library construction, sequencing, and data analysis A total amount of 3 μg total RNA per sample was used as input material for the small RNA library. Sequencing libraries were generated using NEBNext Multiplex Small RNA Library Prep Set for Illumina (NEB, USA) following manufacturer's recommendations and index codes were added to attribute sequences to each sample. Briefly, NEB 3' SR Adaptor was directly and specifically ligated to 3' end of miRNA, siRNA. After the 3' ligation reaction, the SR RT Primer hybridized to the excess of 3' SR Adaptor (that remained free after the 3' ligation reaction) and transformed the single-stranded DNA adaptor into a double-stranded DNA molecule. This step is important to prevent adaptor-dimer formation, besides, dsDNAs are not substrates for ligation mediated by T4 RNA ligase 1 and therefore do not ligate to the 5' SR Adaptor in the subsequent ligation step. Five-prime ends adapter was ligated to 5' ends of miRNAs, siRNA. Then, first strand cDNA was synthesized using M-MuLV reverse transcriptase (RNase H − ). PCR amplification was performed using LongAmp Taq 2X Master Mix (New England Biolabs, USA), SR Primer for Illumina, and index (X) primer. PCR products were purified on a 8% polyacrylamide gel (100 V, 80 min). DNA fragments corresponding to ca. 140-160 bp (the length of small noncoding RNA plus the 3' and 5' adaptors) were recovered and dissolved in 8 μL elution buffer. At last, library quality was assessed on the Agilent Bioanalyzer 2100 (Agilent, USA) system using DNA High Sensitivity chips.
Raw data (raw reads) were firstly processed through custom Perl and Python scripts. In this step, clean data (clean reads) were obtained by removing reads containing poly-N, with 5' adapter contaminants, without 3' adapter or the insert tag, containing poly A, T, G, or C, and low quality reads from raw data. At the same time, Q20, Q30, and GC-content of the raw data were calculated. Then, all the downstream analyses were processed by choosing a certain range of length from clean reads. The small RNA tags were mapped to reference sequence by Bowtie [38] without mismatch to analyze their expression and distribution on the reference.
Mapped small RNA tags were used to looking for known miRNA. MiRBase 22.0 was used as reference, modified software miRDeep2 [39] and srna-tools-cli were used to obtain the potential miRNA and draw the secondary structures.
The characteristics of hairpin structure of miRNA precursor can be used to predict novel miRNA. The available software miRDeep2 [39] and miREvo [40] were integrated to predict novel miRNA through exploring the secondary structure, the Dicer cleavage site, and the minimum free energy of the small RNA tags unannotated in the former steps. At the same time, custom scripts were used to obtain the identified miRNA counts as well as base bias on the first position with certain length and on each position of all identified miRNA, respectively. In the alignment and annotation, some small RNA tags may be mapped to more than one category. To make sure each unique small RNA is mapped to only one annotation, the following priority rule was adapted: known miRNA > rRNA > tRNA > snRNA > snoRNA > repeat > gene > NAT-siRNA > gene > novel miRNA > ta-siRNA. The total rRNA proportion was used as sample quality indicator; usually it should be less than 60% in plant samples to indicate a high-quality sample.

Differential expression and quantitative qRT-PCR analysis of microRNA expression
The expression patters were analyzed by qRT-PCR to investigate the potential target mRNAs. Total RNA was extracted by the Plant RNA Kit (Omega, USA). And total RNA was reverse transcribed into cDNA by the reverse transcription kit (SuperScript III Reverse Transcriptase; China). Real-time PCR was performed using a Light Cycler 96 SW1.1 Real Time PCR System (Roche, Swizerland), with SYBR-Green (Takara, China). The primer sequences used (Tab. S1) were designed by the Beacon Designer 7.9 (Premier Biosoft International, USA). Each reaction was carried out in a 10-μL volume, consisting of 5 μL SYBR, 3.5 μL ddH 2 O, 1 μL diluted template (1 μL of the generated first-strand cDNA diluted by 9 μL ddH 2 O), and 0.5 μL gene-specificity primers (0.25 μL for each). The following program was used for RT-PCR: 95°C for 10 min followed by 40 cycles at 95°C for 20 s, 54°C for 20 s, 72°C for 20 s.

Preparation and deep sequencing of small RNAs library
To identify the miRNAs involved in fruit development and ripening, five independent small RNA libraries from five different developmental stages of diploid wild strawberry fruits (F. vesca 'Hawaii 4') were generated and sequenced by high-throughput Illumina Solexa system. Total RNAs extracted at different developmental stages from 18 days after flowering (DAF) to 42 DAF were employed to build small RNA libraries for sequencing. Total reads, ranging from 6.8 million to 7.5 million of the five libraries, were obtained from Illumina HiSeq 2000, after adapter trimming and removing the contaminants, and the low quality reads. The clean reads varying from 6.3 million to 7.2 million were collected. The total numbers of clean reads, ranging from 18 to 30 nucleotides in length, were yielded as the follows from each of five libraries after precluding the low-quality reads and 3' adaptor and 5' contaminant sequences: 2,779,259 (18 DAF), 5,156,298 (24 DAF), 5,301,319 (30 DAF), 3,632,063 (36 DAF), and 4,585,900 (42 DAF), respectively (Tab. 1). In this study, we also detected known-miRNA, novel-RNA, rRNA, tRNA, snRNA, snoRNA, siRNA, repeat-associated sRNA, and degraded fragments of rRNA introns and exons in each RNA library (Tab. 1).
The length distribution of the small RNAs ranged from 18 to 30 nt (Fig. 1), of which the majority were 21-24-nt long. The numbers of reads with different lengths were counted in the five small RNA libraries. It was observed that small RNAs with length 21 nt and 24 nt were most frequent of all the clean reads of each library, followed by 23 nt and 22 nt. In two libraries comparison, common and specific total and unique small RNA sequences were common to other species. Total sRNAs were greater than 60% and unique sRNAs were only 8-12%, which suggested that there was a less abundant but much more diverse pool of small RNAs (Fig. 2). Overall, these data presented the difference and complexity in fruit different development stages.
In this study, we aligned all reads against the strawberry genome, and more than 1.7 million clean reads (62%) were perfectly mapped to the strawberry genome (Tab. 1). Thirty DAF had mapped the greatest number of reads (4,235,787; 79.90%), while 18 DAF mapped the least (1,726,493; 62.12%). In Fig. 3, data show that all strawberry chromosomes had miRNAs distribution on both sense and antisense strands, but with uneven distribution of miRNAs. The results indicated that miRNAs have uneven distribution on strawberry fruit development stages.

Identification of conserved and novel miRNAs in strawberry fruit
In order to identify the known miRNAs in strawberry fruit, the clean reads of each library were compared to other plant miRNAs deposited in miRBase 22.0. Following the BLASTn searches and further sequence analysis, total reads for five libraries were perfectly matched to 218 known miRNAs (148,178,194,170, and 166 in the 18 DAF, 24 DAF, 30 DAF, 36 DAF, and 42 DAF libraries, respectively) during the fruit development stages (Tab. S2). Among these miRNAs, 130 were found to be shared by all five RNA libraries, which accounted for 52.21% of the 219 known miRNAs, while 51 miRNAs were expressed in at least two of our five small RNA libraries, and 32 miRNAs were detected only once among the five samples (Tab. S2).
In this study, a total of 87 predicted novel miRNAs were obtained (Tab. S2) and the stem-loop structures were predicted (see Fig. S1). Of the 87 predicted novel miRNAs, 48 miRNAs were expressed in all five libraries, while 55 were detected in 18    abundance of both known miRNAs and novel miRNAs (148 and 56, respectively), which implied that more miRNAs remain to be revealed in the 18 DAF stage.

MiRNA expression profiles at different developmental stages
Based on the normalized read count for each identified miRNA, differential expression analysis was performed, and 218 known and 87 novel miRNAs were found to show statistically significant changes during fruit development stages, respectively. In this study, the expression levels were lower in many novel miRNAs (Tab. S1). This result indicates that the novel miRNAs play a different role than known miRNAs. During the several fruit development stages, 18 DAF had the least number of known or novel miRNAs, which might suggest that miRNAs have less significant function at the early period of fruit development.
In addition, all miRNAs were divided into 10 subclusters according to their expression profiles at different stages (Fig. 4). The results showed that some groups of miRNAs were upregulated (Subcluster 2, 3, 5, 8, 9, and 10) and three downregulated (Subcluster 4, 6, and 7). Among these clusters, the miRNAs of Subcluster 1 presented a gentle expression profile during all fruits development stages. Some miRNAs continued to increase (Subcluster 3), while others appeared downregulated (Subcluster 7). The miRNAs of Subcluster 9, 10 were accumulated during the early stages, then significantly decreased at 36 DAF, and increased thereafter. The miRNAs of Subcluster 8 still increased from 18 DAF to 36 DAF, and then went down at the last stage (42 DAF).

Target prediction, correlation with miRNA expression, and functional analysis
MicroRNA regulates gene expression through inhibiting translation or degrading mRNA at a certain site [41]. The major challenge about illustrating the functions of miRNAs is to identify their regulatory targets. Currently, bioinformatics methods have been used to decipher target genes in several studies [42,43]. To better understand the roles which strawberry miRNAs play in fruit development stages, we predicted potential miRNA target genes. In the study, we found 1,536 target genes for 213 miRNAs (Tab. S3). Among these miRNAs, 28 miRNAs had only one target gene, one miRNA (vvi-miR396b) had most target genes (49 target genes) followed by gma-miR396h with 38 target genes.
It is known that target genes of miRNAs are involved in physiological or biochemical processes. Thus, all target genes were searched against the protein database, then the genes' putative functions were analyzed by gene ontology (GO). As shown in Fig. 5, all miRNAs target genes were categorized according to biological process (BP) and molecular function (MF) (see also Tab. S4). To further investigate the biochemical pathways of these targets, we mapped them to terms in the KEGG database and compared this with the whole transcriptome background. One hundred and eighty-nine genes had a KO ID and could be categorized into 76 pathways. Of those, the top 20 pathways are listed in Tab. 2.

Validation of microRNA expression by qRT-PCR
To verify the sequencing results, a conserved miRNAs were randomly selected from each subcluster in Fig. 5 for qRT-PCR analysis. All the five miRNA expression patterns using qRT-PCR were similar to the results from high-throughput sequencing in Fig. 6. The results showed that the expression patterns of microRNA obtained through the sequencing are reliable.

Discussion
The types of fruit ripening are categorized into two groups: climacteric and nonclimacteric. The classification of fruits depends on whether ripening-associated increased respiration occurs in the fruit. During the fruit ripening, many phenomena vary significantly and fruits generally display various biochemical and physiological modifications [44: p. 50-58]. In addition to color, smell, taste, and other quality changes, softening is an important feature of fruit ripening. Also, in strawberry fruit softening is associated with its growth and maturation.
In plants, a high number of experimental and computational studies have indicated that mature miRNAs are evolutionarily conserved. MiRNAs regulate gene expressions post-transcriptionally by degrading or repressing translation of target mRNAs [34]. MiRNA-mediated gene regulation has an ancient phylogenetic origin and plays an important regulatory role in physiological processes [45] and many aspects of plant growth, development, and environmental adaptability. In the past years, some studies examined the miRNA regulation mechanisms in fruit of Fragaria ananassa. Xu et al. [37] used high-throughput sequencing and degradome analysis to identify miRNAs and their targets involved in these fruits postharvest (stored at 20°C for 0 h and 24 h) senescence. Eighty-eight known and 1,224 new candidate miRNAs and 103 targets cleaved by 19 known miRNAs families and 55 new candidate miRNAs were obtained which targets were associated with development, metabolism, defense response, signaling transduction, and transcriptional regulation. Kang et al. [46] were focused on long noncoding RNAs in diploid strawberry Fragaria vesca from 35 different flower and fruit tissues, which provided the first look at the lncRNA landscape in a strawberry fruit.
Fragaria vesca, the woodland strawberry, is becoming a new model organism for octoploid cultivated strawberry, which has a short life cycle and small sequenced genome (2n = 14,240 Mb) [47,48]. Hence, in this study, five libraries from strawberry  Note: rich factor refers to the radio of the number of differentially expressed genes that locate the pathway to the total number of all genes that locate the pathway. The larger the rich factor, the higher the degree of enrichment. Q value is corrected p value and the value ranges from 0 to 1. The smaller the value, the more obvious the enrichment.
fruit five development stages (18 DAF, 24 DAF, 30 DAF, 36 DAF and 42 DAF) were constructed to understand the roles of miRNAs in regulating strawberry fruit maturation. We identified 18-30-nt sRNAs (Fig. 1). Among these five libraries, lengths of 21-and 24-nt sRNA sequences are more frequent than other lengths. During fruit development, 24-nt sRNAs prevailed in the early stage, while 21-nt sRNAs were more abundant in the later stages of fruit development. In plants, 21-nt miRNAs and transacting siRNAs mediate endogenous gene silencing at the post-transcriptional level by guiding mRNA degradation or translational inhibition [49][50][51], while 24-nt siRNAs cause transcriptional gene silencing through directing DNA and histone modification [48,51]. This result suggests that miRNA may regulate target genes though different mechanisms in five stages , and 21-nt sRNAs post-transcriptional regulation is more important in the later stage of strawberry fruit development, while 24-nt sRNAs direct transcriptional gene regulation is the major mechanism in the early fruit developmental stages.
Tab. S2 All known and novel miRNAs.
Tab. S4 KEGG pathway annotation of target gene.