Next-Generation Sequencing of Protein-Coding and Long Non-protein-Coding RNAs in Two Types of Exosomes Derived from Human Whole Saliva

Yuko Ogawa; Masafumi Tsujimoto; Ryohei Yanoshita

doi:10.1248/bpb.b16-00297

Abstract

Exosomes are small extracellular vesicles containing microRNAs and mRNAs that are produced by various types of cells. We previously used ultrafiltration and size-exclusion chromatography to isolate two types of human salivary exosomes (exosomes I, II) that differ in size and proteome. We showed that salivary exosomes contain large repertoires of small RNAs. However, precise information regarding long RNAs in salivary exosomes has not been fully determined. In this study, we investigated the compositions of protein-coding RNAs (pcRNAs) and long non-protein-coding RNAs (lncRNAs) of exosome I, exosome II and whole saliva (WS) by next-generation sequencing technology. Although only 11% of all RNAs were commonly detected among the three samples, the compositions of reads mapping to known RNAs were similar. The most abundant pcRNAs encoded ribosomal proteins, and pcRNAs of some salivary proteins such as S100 calcium-binding protein A8 (protein S100-A8) were present in salivary exosomes. Interestingly, lncRNAs of pseudogenes (presumably, processed pseudogenes) were abundant in exosome I, exosome II and WS. Translationally controlled tumor protein gene, which plays an important role in cell proliferation, cell death and immune responses, was highly expressed as pcRNA and pseudogenes in salivary exosomes. Our results show that salivary exosomes contain various types of RNAs such as pseudogenes and small RNAs, and may mediate intercellular communication by transferring these RNAs to target cells as gene expression regulators.

Human whole saliva (WS) is a complex aqueous mixture of proteins, peptides, hormones, metabolites, DNAs and RNAs. WS contributes to maintaining the integrity of the oral cavity through its lubricating, antibacterial, antiviral and buffering actions, and facilitates chewing and swallowing food. It plays an important role in front-line body defense. Because WS can be collected simply, cheaply and noninvasively, it has been used to monitor human health and disease. Because saliva contains ribonucleases and nucleases from various sources, which influence RNA stability, relatively few studies have analyzed human salivary RNA.¹⁾ Salivary nucleotides are protected from degradation by inclusion in extracellular vesicles such as exosomes. Recently, RNAs and DNAs in WS have been highlighted in the fields of biomarker research, disease diagnostics and forensic study.^2–4)

Exosomes are small (30–100 nm) membrane vesicles of endocytic origin that are released into the extracellular environment upon fusion of multivesicular bodies with the plasma membrane. Exosomes are present in various body fluids including blood, breast milk, malignant ascites, urine, amniotic fluid and saliva.⁵⁾ Exosomes can contain the proteins and nucleic acids of their cell of origin and can transfer their contents to recipient cells at a distance.⁵⁾ Exosomes are now thought to be secreted by various cell types, and numerous components of exosomal proteins, RNAs and lipids are registered in Vesiclepedia, the database of extracellular vesicles.⁶⁾ Previous studies have shown that exosomes also contain protein-coding RNAs (pcRNAs), usually referred to as mRNAs, and small non-coding RNAs (sncRNAs) called microRNAs (miRNAs). These RNA molecules can be transferred to other cells and are functional in the new environment.⁷⁾ Because exosomes can transfer their information from secreted cells to other cells, they are attracting attention in the study of cancer metastasis, immune reaction and biomarker research.^8–10)

Although more than 90% of the human genome is transcribed into RNA, only about 2% is translated into proteins.¹¹⁾ Non-coding RNAs (ncRNAs) do not encode proteins but function directly at the level of the RNA in the cell. NcRNAs are generally classified by size: sncRNAs are less than 200 bases, and long ncRNAs (lncRNAs) are greater than 200 bases.¹¹⁾ miRNAs are a class of small (17–25 nucleotides) single-stranded sncRNAs that control gene expression in animals, plants, and unicellular eukaryotes. In addition to miRNAs, the following regulatory ncRNA gene types are also annotated in the Ensembl database: transfer RNAs (tRNAs), transfer RNAs located in the mitochondrial genome (Mt-tRNAs), ribosomal RNAs (rRNAs), piwi-interacting RNAs (piRNAs), small cytoplasmic RNAs (scRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), miscellaneous other RNAs (misc_RNAs), and long intergenic non-coding RNAs (lincRNAs), along with pseudogenes¹²⁾ (Ensembl Asia website; http://asia.ensembl.org/info/genome/genebuild/ncrna.html). However, the function of each ncRNA is yet to be fully elucidated.

We previously isolated two types of extracellular vesicles in human WS by ultrafiltration and gel-exclusion column chromatography.^13,14) Although our method has not been universally recognized for purifying exosomes, Wyss et al. recently showed that ultrafiltration and size-exclusion are valid methodologies for intact exosome purification.¹⁵⁾ Because exosomal marker proteins (Alix, tsg101, hsp70 and CD63) were detected in both samples by Western blot analysis, we designated the vesicles as exosomes I and II. Exosome I was derived from the void fraction from the column. Exosome II was derived from the second small protein peak with high activity of dipeptidyl peptidase-4 (DPP4). The mean diameter of exosome I was 83.5 nm and that of exosome II was 40.5 nm as calibrated by transmission electron microscopy. We also performed proteome and small RNA transcriptome analyses^14,16) and the results showed that the protein and sncRNA components of exosomes I and II did not completely coincide with each other. The reads mapping to sncRNAs by next-generation sequencing (NGS) showed that sncRNAs of rRNAs, piRNAs, snoRNAs, short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and reads defined as ‘repeats’, including long terminal repeats (LTRs), were abundant in the two types of exosomes and WS, in addition to miRNAs.

In recent years, many studies have performed NGS analysis of exosomal RNA.^17–19) However, most of these analyses were targeted to known RNAs, such as mRNAs and miRNAs, and studies on lncRNAs in exosomes remain limited.²⁰⁾ In this study, we investigated long RNA components (pcRNA and lncRNA) of exosomes I, II and WS by NGS using total RNA samples from our previous study of the small RNA transcriptome.¹⁶⁾

MATERIALS AND METHODS

Isolation of Total RNA from Human WS

Ethical approval was obtained from the institutional review board of Teikyo Heisei University (approval number 26-088). Isolation of total RNA from WS was performed as described previously.¹⁶⁾ Briefly, human WS was collected from a single healthy female volunteer of Japanese origin (38 years old) from our laboratory, and written informed consent for this specific study was obtained. Total RNA was isolated from 1 mL of WS using the RNeasy Protect Saliva Mini Kit (Qiagen, Valencia, CA, U.S.A.), according to the manufacturer’s instructions. The quantity and quality of total RNA were assessed using a bioanalyzer (Agilent 2100; Agilent Technologies, Palo Alto, CA, U.S.A.).

Salivary Exosome Isolation and RNA Isolation

Exosomes were purified from WS as previously described.¹⁴⁾ Briefly, 30 mL of WS was added to an equal volume of Tris-buffered saline (20 mM Tris–HCl, pH 7.4, 150 mM NaCl). The cell debris and bacteria in oral cavity were pelleted and removed by centrifuging the WS sample at 8000×g for 5 min at room temperature. A part of the supernatant was used for RNA isolation from WS. The supernatant was filtered through a 5 µm cellulose acetate filter and ultrafiltered using an Amicon Ultra-15 centrifugal filter device with a 100 kDa exclusion (Millipore Corporation, MA, U.S.A.). The concentrated filtrate was subjected to gel filtration on a Sephacryl S-500 column (GE Healthcare, Buckinghamshire, U.K.) equilibrated with Tris-buffered saline. Void fractions (exosome I) and the subsequent fractions displaying high DPP4 activity (exosome II) were collected and ultrafiltered using Amicon Ultra-4 with a 100 kDa exclusion. Purification was performed seven times from independently collected WS samples. Purified exosome samples were pooled and used for RNA isolation. Total RNA was isolated using the RNeasy Protect Saliva Mini Kit (Qiagen), according to the manufacturer’s instructions. Extracted RNA was concentrated, and the quantity and quality of total RNA were assessed using an Agilent 2100 Bioanalyzer. The total RNA concentrations of exosomes I, II and WS were 0.21, 0.10 and 4.2 ng/mL of saliva, respectively.¹⁶⁾

RNA Library Construction and Sequencing

Isolated RNA samples from exosomes I, II and WS were amplified using the RampUP RNA Amplification Kit (Genisphere, PA, U.S.A.) according to the manufacturer’s protocol. Total RNA (2 ng) from exosomes I, II or WS was used for amplification. The amplification method was as follows: first, cDNA sequences were generated by reverse transcription from all RNA molecules using reverse transcriptase, dT primer and/or random primers. Next, a poly(A) tail was added using 2′-deoxyadenosine 5′-triphosphate (dATP) and terminal deoxynucleotidyl transferase. The T7/T3 template Oligo was annealed to the 3′ end of the cDNA. Klenow enzyme filled in the 3′ end of first strand cDNA to produce a double-stranded T7/T3 promoter. In vitro transcription with T7 RNA polymerase was then performed and sense RNA copies of the original RNA molecules were generated. The sense RNA was subjected to reverse transcription to produce second cDNA. The sense RNA was then degraded by RNase H. The second cDNA was annealed with T3 template Oligo and in vitro transcription with T3 RNA polymerase was performed to generate sense RNA.

Sequencing samples were generated using the amplified sense RNA, according to the manufacturer’s instructions (Illumina Inc., San Diego, CA, U.S.A.). Briefly, first-strand cDNA was synthesized by reverse transcription, and second-strand cDNA was generated using the first-strand cDNA. After treatment by Klenow DNA polymerase, a poly(A) tail was added, and the adaptor oligo sequence was ligated. The cDNA product was size-fractionated by agarose gel electrophoresis and the cDNA fraction in the range of 200–400 nucleotides was extracted. The resulting cDNA was amplified and sequenced for 51 cycles, single-end, on the Illumina Genome Analyzer IIx (Illumina) by Hokkaido System Science Co., Ltd. (Sapporo, Japan).

RNA Genome Mapping

Reads were preprocessed as follows: bases with an estimated quality value (QV) lower than 10 were trimmed, and primer/adaptor sequences were removed using Cutadapt version 1.1 (https://cutadapt.readthedocs.org/en/stable/).²¹⁾ Trimmed reads shorter than 36 bases were then filtered out.
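The trimming and length-filtering step can be illustrated with a minimal sketch. It is not Cutadapt's actual algorithm: it simply removes low-quality bases from the 3′ end and then applies the 36-base minimum length. The function names and the example read are hypothetical; only the two thresholds (QV 10 and 36 bases) come from the text.

```python
# Illustrative sketch only: a naive 3'-end quality trim plus a length filter.
# This is NOT Cutadapt's algorithm; only the thresholds are taken from the text.

MIN_QV = 10       # bases with a quality value below this are trimmed
MIN_LENGTH = 36   # trimmed reads shorter than this are discarded


def trim_low_quality_tail(seq, quals, min_qv=MIN_QV):
    """Drop bases from the 3' end while their quality value is below min_qv."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_qv:
        end -= 1
    return seq[:end], quals[:end]


def passes_length_filter(seq, min_length=MIN_LENGTH):
    """Return True if the trimmed read is long enough to keep."""
    return len(seq) >= min_length


# Hypothetical 51-base read: 40 good bases followed by 11 poor ones.
read_seq = "ACGT" * 12 + "ACG"
read_quals = [30] * 40 + [5] * 11

trimmed_seq, trimmed_quals = trim_low_quality_tail(read_seq, read_quals)
kept = passes_length_filter(trimmed_seq)
print(len(trimmed_seq), kept)
```

In this hypothetical case the read is shortened from 51 to 40 bases and is retained because it still meets the 36-base minimum.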

The read sequences were mapped to the human Ensembl GRCh37.p6 genome assembly using TopHat (http://ccb.jhu.edu/software/tophat/index.shtml).²²⁾ TopHat first maps non-junction reads (those contained within exons) using Bowtie (http://bowtie-bio.sourceforge.net/index.shtml).²³⁾ The reads were mapped to the human genomic sequence, with at most two mismatches or 2 bases of indels allowed. The reads that did not map were used by TopHat to find splice junctions without a reference annotation, using 25-base segments, each of which was allowed up to two mismatches.

The mapped sequences were quantified as expression intensity in RPKM (reads per kilobase of exon per million mapped reads) using Cufflinks version 1.3.0 (http://cufflinks.cbcb.umd.edu/).²⁴⁾ RPKM values of ≥10 were used for further analysis.
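For orientation, the textbook definition of RPKM and the ≥10 cutoff can be written as a short sketch. Cufflinks estimates abundances with its own statistical model, so this is the definition of the unit rather than a reimplementation of that tool; the function names and the example numbers are hypothetical.

```python
# Textbook RPKM definition; Cufflinks itself uses a more elaborate estimator.

RPKM_CUTOFF = 10.0  # threshold stated in the text


def rpkm(read_count, exon_length_bases, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    return read_count * 1e9 / (exon_length_bases * total_mapped_reads)


def is_quantified(value, cutoff=RPKM_CUTOFF):
    """Return True if a gene passes the RPKM >= 10 cutoff."""
    return value >= cutoff


# Hypothetical gene: 50 reads on 1,000 exonic bases, 2 million mapped reads.
example = rpkm(read_count=50, exon_length_bases=1000, total_mapped_reads=2_000_000)
print(example, is_quantified(example))
```

Because the value is normalized by both transcript length and library size, short transcripts covered by many reads can reach very large RPKM values.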

Pseudogenes were surveyed using the pseudogene.org database (http://pseudogene.org/).²⁵⁾ The reads annotated as ‘pseudogene’ by Ensembl were collected, and each gene’s chromosomal position was looked up on pseudogene.org. Pseudogenes whose start and end positions agreed within 100 bases between the two databases were designated as ‘nearest pseudogenes.’ When counting the nearest pseudogenes, redundancy among the parental genes of the pseudogenes was removed because a single pcRNA often has more than one related pseudogene.
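The coordinate-matching rule for ‘nearest pseudogenes’ might be sketched as follows. This is an assumption-laden illustration, not the authors' script: it treats "within 100 bases" as inclusive for both start and end positions, picks the closest candidate when several qualify, and uses made-up identifiers and coordinates.

```python
# Sketch of matching an Ensembl pseudogene locus to a pseudogene.org entry.
# Assumptions: the 100-base window is inclusive and applies to both ends;
# all identifiers and coordinates below are hypothetical.
from typing import NamedTuple, Optional, Sequence

MAX_OFFSET = 100  # maximum allowed difference in start and in end position


class Locus(NamedTuple):
    gene_id: str
    chrom: str
    start: int
    end: int


def find_nearest_pseudogene(query: Locus, candidates: Sequence[Locus],
                            max_offset: int = MAX_OFFSET) -> Optional[Locus]:
    """Return the candidate whose start and end both lie within max_offset."""
    best, best_dist = None, None
    for cand in candidates:
        if cand.chrom != query.chrom:
            continue
        d_start = abs(cand.start - query.start)
        d_end = abs(cand.end - query.end)
        if d_start <= max_offset and d_end <= max_offset:
            dist = d_start + d_end
            if best_dist is None or dist < best_dist:
                best, best_dist = cand, dist
    return best


ensembl_hit = Locus("ENSEMBL_EXAMPLE", "chr1", 1000, 1500)
pseudogene_org = [
    Locus("PGO_EXAMPLE_A", "chr1", 1040, 1510),   # both ends within 100 bases
    Locus("PGO_EXAMPLE_B", "chr1", 1000, 1700),   # end differs by 200 bases
    Locus("PGO_EXAMPLE_C", "chr2", 1000, 1500),   # different chromosome
]
match = find_nearest_pseudogene(ensembl_hit, pseudogene_org)
print(match.gene_id if match else None)
```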

Gene ontology (GO) analysis was performed using CateGOrizer (http://www.animalgenome.org/bioinfo/tools/countgo/).²⁶⁾ GO analysis was performed for biological process and molecular function categories with multiple occurrences. The results with unknown annotations were excluded from the counting.

RESULTS

Sequencing and Annotation of RNAs from Salivary Exosomes and Whole Saliva (WS)

Bioanalyzer profiles of the total RNA isolated from exosomes I, II and WS were previously described.¹⁶⁾ A broad range of RNA sizes was detected in all three samples. rRNA (18S, 28S rRNA) peaks were not as high in our samples compared with those of urinary exosomes.²⁰⁾ Figure 1 shows an overview of the analysis of pcRNAs and lncRNAs in the two types of salivary exosomes and WS. Total RNA yields from exosomes I and II starting from 210 mL of WS were 44 and 21 ng, respectively. The total RNA yield from 1 mL of WS was 4.2 ng. Because the levels of RNAs in the three samples were too low to analyze by NGS, 2 ng of total RNAs of the salivary exosomes and WS were amplified using random primer and dT primer. Bioanalyzer profiles of the amplified RNA showed that the RNA fragment peak was slightly lower than 200 nucleotides (Supplementary Fig. S1). Because the Illumina Genome Analyzer IIx can handle fragments of at most 400 nucleotides, RNAs of 200–400 nucleotides in size were used to construct the cDNA libraries of long RNA. The amounts of amplified sense RNA obtained from exosomes I, II and WS were 944.4, 17578.8 and 1053.6 ng, respectively.

Fig. 1. Overview of the Analysis of Protein-Coding RNAs and Long Non-protein Coding RNAs in Two Types of Salivary Exosomes (Exosomes I, II) and Human Whole Saliva (WS)

We performed NGS on the RNAs of exosomes I, II and WS using Illumina high-throughput RNA sequencing technology. The RNAs amplified from each sample group were reverse transcribed and sequenced. After removing low-quality regions, adaptors and all possible contaminations, we obtained a total of 9935476, 7684370 and 6283177 sequence reads from exosomes I, II and WS, respectively (Table 1, DNA Data Bank of Japan Accession No. DRA003516). The reads generated in this study were subjected to cluster and assembly analyses using TopHat and Cufflinks.

Table 1. Numbers of Mapped Reads

Sample	Total filtered reads	Unique-Total	%	Unique-Genome	%	Unique-Bridged	%	Multiple	%	Unmapped	%
Exosome I	9935476	3103431	31.2	3050922	30.7	52509	0.53	287134	2.89	6544911	65.9
Exosome II	7684370	2277858	29.6	2055046	26.7	222812	2.90	383641	4.99	5022871	65.4
WS	6283177	159803	2.54	143666	2.29	16137	0.26	35395	0.56	6087979	96.9

RNAs were prepared from salivary exosomes (exosomes I, II) and WS. Numbers of reads and their percentages of the total number of filtered reads (Total filtered reads) are shown as the reads mapped to the human genome (Unique-Genome), the reads mapped uniquely to a predicted exon–exon bridging sequence (Unique-Bridged), the total number of reads mapped uniquely to the genome and to a predicted exon–exon bridging sequence (Unique-Total), the reads mapped to multiple loci of the human genome (Multiple) and reads unable to be mapped to the human genome (Unmapped).
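As a consistency check on Table 1, the following sketch recomputes the Unique-Total column and the percentages from the read counts listed in the table. The dictionary layout and function name are illustrative choices; the numbers are those of Table 1.

```python
# Recompute Table 1 percentages from the read counts given in the table.

table1 = {
    "Exosome I": {"total": 9935476, "unique_genome": 3050922,
                  "unique_bridged": 52509, "multiple": 287134, "unmapped": 6544911},
    "Exosome II": {"total": 7684370, "unique_genome": 2055046,
                   "unique_bridged": 222812, "multiple": 383641, "unmapped": 5022871},
    "WS": {"total": 6283177, "unique_genome": 143666,
           "unique_bridged": 16137, "multiple": 35395, "unmapped": 6087979},
}


def percent(part, total):
    """Percentage of the total number of filtered reads."""
    return 100.0 * part / total


summary = {}
for sample, row in table1.items():
    unique_total = row["unique_genome"] + row["unique_bridged"]
    summary[sample] = {
        "unique_total": unique_total,
        "unique_total_pct": percent(unique_total, row["total"]),
        "multiple_pct": percent(row["multiple"], row["total"]),
        "unmapped_pct": percent(row["unmapped"], row["total"]),
        # Unique, multiple and unmapped reads should account for every read.
        "accounted": unique_total + row["multiple"] + row["unmapped"],
    }

for sample, values in summary.items():
    print(sample, values["unique_total"], round(values["unique_total_pct"], 2))
```

The three categories (Unique-Total, Multiple and Unmapped) add up to the total number of filtered reads in every sample.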

Of the 10, 8 and 6 million quality-evaluated reads of exosomes I, II and WS, respectively (Table 1; Total filtered reads), a total of 31, 30 and 2.5% of the reads were mapped uniquely to the human genome or to exon–exon junctions (Table 1; Unique-total). We used 4260922 reads for mapping in exosome I, 3572880 reads in exosome II and 285918 reads in WS. These numbers include the ‘Unique-total’ reads and the ‘Multiple’ reads that mapped to multiple loci. Figure 2A shows the genomic context of pcRNA and lncRNA (intron, intergenic region, rRNA, other ncRNAs and pseudogene classified according to Ensembl annotation) in the three samples. In addition to pcRNA (14/43/27%, exosomes I/II/WS), lncRNAs (86/57/73%, exosomes I/II/WS) were predominantly detected in all three samples. The main contents of lncRNAs were intergenic regions (54/19/32%, exosomes I/II/WS), introns (27/16/17%, exosomes I/II/WS) and pseudogenes (5.3/20/19%, exosomes I/II/WS). Other ncRNA (0.61/1.9/3.9%, exosomes I/II/WS), such as precursors of sncRNA (e.g., miRNA, snoRNA) and rRNA (0.17/1.2/0.49%, exosomes I/II/WS) were detected at low levels. The major portions of sequencing reads in exosome I and WS were intergenic regions, whereas the major portion of exosome II was pcRNA. Because RNAs categorized as intergenic and introns do not correspond to specific genes, they were eliminated from gene quantification (Fig. 2B, Known RNA). The identified numbers of known RNAs were 837872, 2322933 and 145374 sequence reads from exosomes I, II and WS, respectively.

Fig. 2. (A) Genomic Context of Sequencing Reads in Salivary Exosomes and WS; (B) Sequences That Mapped to Known RNAs

Each pie chart represents the percentage of sequencing reads of exosomes I, II and WS. (A) Total RNA reads of 4260922 reads in exosome I, 3572880 reads in exosome II and 285918 reads in WS were used for mapping. (B) The numbers of known RNAs were 837872, 2322933 and 145374 sequence reads from exosomes I, II and WS, respectively.

The RNAs of pseudogenes in the transcriptome suggest that processed pseudogenes were included in salivary exosomes and WS. Recent studies have shown that processed pseudogenes function in gene regulation.²⁷⁾ Therefore, we focused on the relationship of pcRNAs and pseudogenes, as described later.

Analysis of Common Reads in Exosomes I, II and WS

Figure 3 shows Venn diagrams of all RNAs, pcRNAs and pseudogenes classified by Ensembl annotation. The numbers of quantitated genes (RPKM value of ≥10) were 3035, 4826 and 2489 sequence reads from exosomes I, II and WS, respectively (Supplementary Tables S1–S3). A total of 11% (1189 reads) of all RNAs (10350 reads), 10% (671 reads) of pcRNAs (6454 reads), and 17% (399 reads) of pseudogenes (2281 reads) were commonly detected among exosomes I, II and WS. The transcripts detected in exosome I were highly similar to those of exosome II. In exosome I, 68% (2075 reads) of all RNAs (3035 reads), 84% (1297 reads) of pcRNAs (1541 reads) and 74% (532 reads) of pseudogenes (720 reads) were shared with exosome II. In exosome II, a total of 47% (2257 reads) of all RNAs, 50% (1665 reads) of pcRNAs and 32% (298 reads) of pseudogenes were detected only in exosome II. In contrast, the number of pcRNAs in WS was similar to that in exosome I, while fewer total RNAs and pseudogenes were found in WS compared with exosomes I and II.
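The overlaps summarized in the Venn diagrams amount to set intersections over gene identifiers. The sketch below shows the idea on small hypothetical gene sets and then recomputes two of the percentages quoted above from the reported counts; the gene names and helper function are illustrative only.

```python
# Set-based overlap between samples, illustrated with hypothetical gene IDs.

exosome_1 = {"g1", "g2", "g3", "g4"}
exosome_2 = {"g2", "g3", "g4", "g5", "g6"}
whole_saliva = {"g3", "g4", "g7"}

# Genes detected in all three samples (centre of the Venn diagram).
common_to_all = exosome_1 & exosome_2 & whole_saliva


def shared_percent(sample, other):
    """Percentage of genes in `sample` that are also detected in `other`."""
    return 100.0 * len(sample & other) / len(sample)


toy_shared = shared_percent(exosome_1, exosome_2)

# Percentages recomputed from counts reported in the text.
exo1_shared_with_exo2 = round(100 * 2075 / 3035)   # all RNAs, exosome I
common_all_rnas = round(100 * 1189 / 10350)        # all RNAs, three samples

print(sorted(common_to_all), toy_shared, exo1_shared_with_exo2, common_all_rnas)
```

Note that the denominator 10350 equals the sum of the per-sample gene counts (3035 + 4826 + 2489).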

Fig. 3. Venn Diagrams of All RNAs, Protein-Coding RNAs and Pseudogenes Expressed in Exosomes I, II and WS

Numbers indicate RNAs that overlap by mapping. Parenthetical numbers indicate numbers of total RNA expressed in exosomes I, II and WS.

RPKM values of reads in each sample showed a high correlation in exosome I compared with exosome II (r²=0.96), exosome I compared with WS (r²=0.77) and exosome II compared with WS (r²=0.79) by scatter plots (Supplementary Fig. S2). This suggests the high-level transcripts in all three samples were coincident.
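A coefficient of determination such as those reported above can be obtained from the squared Pearson correlation of paired RPKM values. The sketch below is generic: the text does not state whether the values were log-transformed or which software was used, and the example numbers are hypothetical.

```python
# Squared Pearson correlation between paired expression values.
# Assumption: raw (untransformed) values; the source does not specify this.
import math


def pearson_r_squared(xs, ys):
    """Return r^2 for two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / math.sqrt(var_x * var_y)
    return r * r


# Hypothetical RPKM values for the same genes in two samples.
sample_a = [10.0, 20.0, 30.0, 40.0]
sample_b = [21.0, 41.0, 61.0, 81.0]   # exactly 2 * a + 1
linear_r2 = pearson_r_squared(sample_a, sample_b)
weak_r2 = pearson_r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
print(linear_r2, weak_r2)
```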

Characteristics of the Different Classes of RNAs

The 30 most highly expressed RNAs of all transcripts of exosomes I, II and WS are shown in Table 2. Lists of all RNAs of exosomes I, II and WS are shown in Supplementary Tables S1–S3 and contain RPKM rankings of all RNAs, pcRNAs, pseudogenes and pseudogene of highest RPKM with information regarding the nearest pseudogenes (see Materials and Methods). In the RNA population, the lncRNA GenBank accession No. AC091047.1 was the most highly expressed sequence in all three samples. Exosomes I and II shared the same five most highly expressed RNAs. WS also contained the five most highly expressed RNAs of salivary exosomes. The compositions of the 30 most highly expressed RNAs were comparable among the three samples: pseudogenes (50/47/53%, exosomes I/II/WS), other ncRNAs (30/27/27%, exosomes I/II/WS), pcRNAs (17/23/17%, exosomes I/II/WS) and rRNAs (3/3/3%, exosomes I/II/WS). Although pseudogenes and other ncRNAs are not predominant in Fig. 2B, they are the most abundant categories in the list of high RPKM reads (Table 2).

Table 2. The 30 Most Highly Expressed Genes in Exosomes I, II and WS

A. Exosome I
Rank	Ensembl gene ID	Gene short name	Locus	RPKM	Category*
1	ENSG00000252197	AC091047.1	chr8: 70602343–70602417	478152	Other NC
2	ENSG00000240831	AC112777.1	chr12: 20704357–20704522	369166	Pseudogene
3	ENSG00000252229	AC098691.1	chr1: 91852861–91852949	142297	Other NC
4	ENSG00000252318	AC097532.1	chr2: 133038646–133038738	132016	Other NC
5	ENSG00000251948	AC092279.1	chr19: 24184074–24184165	107644	Other NC
6	ENSG00000251705	RN5-8S6	chrY: 10037763–10037915	46183	rRNA
7	ENSG00000242257	AC044839.1	chr11: 45848152–45848297	40074	Pseudogene
8	ENSG00000241482	AC064836.1	chr2: 203210988–203211097	32662	Pseudogene
9	ENSG00000226958	RN28S1	chrX: 108297360–108297792	25448	Pseudogene
10	ENSG00000239935	AC116340.1	chr5: 71146739–71146942	23479	Pseudogene
11	ENSG00000241530	AC006368.1	chr2: 230045487–230045666	18265	Pseudogene
12	ENSG00000241376	AL606830.1	chr6: 120583431–120583547	17007	Pseudogene
13	ENSG00000243013	AL592307.1	chr1: 145277249–145277501	16710	Pseudogene
14	ENSG00000256393	AC138123.2	chr12: 93477373–93477451	16498	Pseudogene
15	ENSG00000242604	AL512503.1	chr1: 120543873–120544125	16141	Pseudogene
16	ENSG00000243185	AC108078.1	chr4: 70296578–70296753	15958	Pseudogene
17	ENSG00000227063	RPL41P1	chr20: 21735865–21736171	12936	Pseudogene
18	ENSG00000243884	AL163011.1	chr14: 90341364–90341577	11654	Pseudogene
19	ENSG00000143546	S100A8	chr1: 153362507–153363664	11260	ProteinCoding
20	ENSG00000210140	J01415.10	chrMT: 5760–5826	8232	Other NC
21	ENSG00000210144	J01415.11	chrMT: 5825–5891	7889	Other NC
22	ENSG00000213741	RPS29	chr14: 50043389–50065408	5407	ProteinCoding
23	ENSG00000243172	AP003035.1	chr11: 85195011–85195304	5173	Pseudogene
24	ENSG00000210174	J01415.16	chrMT: 10404–10469	5038	Other NC
25	ENSG00000252248	AC093693.1	chr7: 68527370–68527457	4998	Other NC
26	ENSG00000201098	RNY1	chr7: 148684227–148684340	4849	Other NC
27	ENSG00000253945	RP11-328L11.1.1	chr8: 96416029–96416122	4658	Pseudogene
28	ENSG00000171195	MUC7	chr4: 71296208–71348714	4465	ProteinCoding
29	ENSG00000205649	HTN3	chr4: 70894129–70902255	4423	ProteinCoding
30	ENSG00000197756	RPL37A	chr2: 217362911–217443903	3743	ProteinCoding
B. Exosome II
Rank	Ensembl gene ID	Gene short name	Locus	RPKM	Category*
1	ENSG00000252197	AC091047.1	chr8: 70602343–70602417	1.02E+06	Other NC
2	ENSG00000240831	AC112777.1	chr12: 20704357–20704522	931345	Pseudogene
3	ENSG00000252229	AC098691.1	chr1: 91852861–91852949	433010	Other NC
4	ENSG00000252318	AC097532.1	chr2: 133038646–133038738	410868	Other NC
5	ENSG00000251948	AC092279.1	chr19: 24184074–24184165	316818	Other NC
6	ENSG00000256393	AC138123.2	chr12: 93477373–93477451	193124	Pseudogene
7	ENSG00000227063	RPL41P1	chr20: 21735865–21736171	173497	Pseudogene
8	ENSG00000242257	AC044839.1	chr11: 45848152–45848297	112919	Pseudogene
9	ENSG00000241482	AC064836.1	chr2: 203210988–203211097	100453	Pseudogene
10	ENSG00000143546	S100A8	chr1: 153362507–153363664	96291	ProteinCoding
11	ENSG00000226958	RN28S1	chrX: 108297360–108297792	74745	Pseudogene
12	ENSG00000241376	AL606830.1	chr6: 120583431–120583547	65417	Pseudogene
13	ENSG00000210140	J01415.10	chrMT: 5760–5826	63515	Other NC
14	ENSG00000251705	RN5-8S6	chrY: 10037763–10037915	63089	rRNA
15	ENSG00000210144	J01415.11	chrMT: 5825–5891	58794	Other NC
16	ENSG00000239935	AC116340.1	chr5: 71146739–71146942	48285	Pseudogene
17	ENSG00000243013	AL592307.1	chr1: 145277249–145277501	47861	Pseudogene
18	ENSG00000242604	AL512503.1	chr1: 120543873–120544125	45872	Pseudogene
19	ENSG00000241530	AC006368.1	chr2: 230045487–230045666	45111	Pseudogene
20	ENSG00000243185	AC108078.1	chr4: 70296578–70296753	42385	Pseudogene
21	ENSG00000243884	AL163011.1	chr14: 90341364–90341577	28801	Pseudogene
22	ENSG00000171195	MUC7	chr4: 71296208–71348714	28477	ProteinCoding
23	ENSG00000205649	HTN3	chr4: 70894129–70902255	28160	ProteinCoding
24	ENSG00000210112	J01415.6	chrMT: 4401–4469	27693	Other NC
25	ENSG00000241800	AC114498.5	chr1: 567995–568067	24865	Pseudogene
26	ENSG00000229117	RPL41	chr12: 56510369–56511727	24393	ProteinCoding
27	ENSG00000210151	J01415.12	chrMT: 7445–7514	23862	Other NC
28	ENSG00000197756	RPL37A	chr2: 217362911–217443903	21331	ProteinCoding
29	ENSG00000131469	RPL27	chr17: 41150445–41154956	20401	ProteinCoding
30	ENSG00000213741	RPS29	chr14: 50043389–50065408	19147	ProteinCoding
C. Whole saliva
Rank	Ensembl gene ID	Gene short name	Locus	RPKM	Category*
1	ENSG00000252197	AC091047.1	chr8: 70602343–70602417	3.65E+06	Other NC
2	ENSG00000252229	AC098691.1	chr1: 91852861–91852949	3.03E+06	Other NC
3	ENSG00000251948	AC092279.1	chr19: 24184074–24184165	2.97E+06	Other NC
4	ENSG00000240831	AC112777.1	chr12: 20704357–20704522	2.38E+06	Pseudogene
5	ENSG00000252318	AC097532.1	chr2: 133038646–133038738	805671	Other NC
6	ENSG00000243185	AC108078.1	chr4: 70296578–70296753	515461	Pseudogene
7	ENSG00000239935	AC116340.1	chr5: 71146739–71146942	231714	Pseudogene
8	ENSG00000242257	AC044839.1	chr11: 45848152–45848297	199273	Pseudogene
9	ENSG00000226958	RN28S1	chrX: 108297360–108297792	183758	Pseudogene
10	ENSG00000251705	RN5-8S6	chrY: 10037763–10037915	172762	rRNA
11	ENSG00000241376	AL606830.1	chr6: 120583431–120583547	151241	Pseudogene
12	ENSG00000241482	AC064836.1	chr2: 203210988–203211097	142535	Pseudogene
13	ENSG00000241530	AC006368.1	chr2: 230045487–230045666	142027	Pseudogene
14	ENSG00000256393	AC138123.2	chr12: 93477373–93477451	138602	Pseudogene
15	ENSG00000243013	AL592307.1	chr1: 145277249–145277501	133734	Pseudogene
16	ENSG00000242604	AL512503.1	chr1: 120543873–120544125	131823	Pseudogene
17	ENSG00000227063	RPL41P1	chr20: 21735865–21736171	127740	Pseudogene
18	ENSG00000243172	AP003035.1	chr11: 85195011–85195304	77836	Pseudogene
19	ENSG00000243884	AL163011.1	chr14: 90341364–90341577	58579	Pseudogene
20	ENSG00000201098	RNY1	chr7: 148684227–148684340	56291	Other NC
21	ENSG00000244469	TRNAU2	chr22: 44546536–44546622	51075	Pseudogene
22	ENSG00000143546	S100A8	chr1: 153362507–153363664	45291	ProteinCoding
23	ENSG00000188846	RPL14	chr3: 40498782–40506549	32624	ProteinCoding
24	ENSG00000258486	RN7SL1	chr14: 50053296–50053596	28711	Other NC
25	ENSG00000241781	AL161626.1	chr9: 79186648–79186950	27011	Pseudogene
26	ENSG00000216144	AL136373.1	chr1: 47006015–47006106	25389	Other NC
27	ENSG00000206696	SNORD58B	chr18: 47018033–47018099	24835	Other NC
28	ENSG00000131469	RPL27	chr17: 41150445–41154956	21620	ProteinCoding
29	ENSG00000229117	RPL41	chr12: 56510369–56511727	20448	ProteinCoding
30	ENSG00000197756	RPL37A	chr2: 217362911–217443903	18974	ProteinCoding

*Categories are provided by Ensembl annotations. ‘Other NC’ means lncRNA except pseudogenes and rRNAs.

The 10 most highly expressed pcRNAs of exosomes I, II and WS are shown in Table 3. In the pcRNA population, S100A8 was the most abundant sequence in all three samples. Ribosomal proteins (ribosomal proteins of the large subunit [RPL] and ribosomal proteins of the small subunit [RPS]) were preferentially detected among the three samples (60/57/53%, exosomes I/II/WS) (Table 3 and Supplementary Tables S1–S3). Salivary proteins including MUC7, HTN3 and STATH were also detected among these proteins. The most highly expressed pcRNA, except for ribosomal proteins and salivary proteins, was tumor protein translationally controlled 1 (TPT1) in exosomes I and II (Table 3).

Table 3. The 10 Most Highly Expressed pcRNAs in Exosomes I, II and WS

Rank	Ensembl gene ID	Gene short name	RPKM	Pseudogene ID (Ensembl)	Nearest pseudogene ID (pseudogene.org)	RPKM of related pseudogene	Redundancy*
Exosome I
1	ENSG00000143546	S100A8**	11260	—	—	—	—
2	ENSG00000213741	RPS29	5407	ENSG00000230777	PGOHUM00000245093	1182	9
3	ENSG00000171195	MUC7**	4465	—	—	—	—
4	ENSG00000205649	HTN3	4423	—	—	—	—
5	ENSG00000197756	RPL37A	3743	ENSG00000226243	PGOHUM00000247540	135	2
6	ENSG00000229117	RPL41	3571	—	—	—	—
7	ENSG00000126549	STATH	3516	—	—	—	—
8	ENSG00000131469	RPL27	3106	ENSG00000225616	PGOHUM00000243951	142	1
9	ENSG00000133112	TPT1	2364	ENSG00000234782	PGOHUM00000236321	118	3
10	ENSG00000138326	RPS24	2209	ENSG00000227008	PGOHUM00000242019	251	7
Exosome II
1	ENSG00000143546	S100A8**	96291	—	—	—	—
2	ENSG00000171195	MUC7**	28477	—	—	—	—
3	ENSG00000205649	HTN3	28160	—	—	—	—
4	ENSG00000229117	RPL41	24393	—	—	—	—
5	ENSG00000197756	RPL37A	21331	ENSG00000226243	PGOHUM00000247540	658	4
6	ENSG00000131469	RPL27	20401	ENSG00000225616	—	1040	3
7	ENSG00000213741	RPS29	19147	ENSG00000230777	—	5103	4
8	ENSG00000126549	STATH	18096	—	—	—	—
9	ENSG00000133112	TPT1	17277	ENSG00000234782	PGOHUM00000236321	696	6
10	ENSG00000138326	RPS24	13362	ENSG00000227008	PGOHUM00000242019	1711	8
Whole saliva
1	ENSG00000143546	S100A8**	45291	—	—	—	—
2	ENSG00000188846	RPL14	32624	ENSG00000241923	PGOHUM00000245093	103	1
3	ENSG00000131469	RPL27	21620	ENSG00000225616	—	1177	4
4	ENSG00000229117	RPL41	20448	—	—	—	—
5	ENSG00000197756	RPL37A	18974	ENSG00000176343	PGOHUM00000247540	574	2
6	ENSG00000171195	MUC7**	14311	—	—	—	—
7	ENSG00000213741	RPS29	14010	ENSG00000235354	—	4525	2
8	ENSG00000114391	RPL24	10812	ENSG00000181524	PGOHUM00000243951	3247	4
9	ENSG00000166441	RPL27A	9283	ENSG00000182383	PGOHUM00000236321	14	1
10	ENSG00000205649	HTN3	8582	—	—	—	—

* Number of pseudogene reads that share the same parental gene. ** Translated proteins were detected in the proteome data.¹⁴⁾

We compared the pcRNA data and proteome data of exosome I (105 proteins) and exosome II (154 proteins).¹⁴⁾ Among the top 10 pcRNAs, the translated proteins of two RNAs (S100A8, MUC7) were detected in both types of exosomes. In total, pcRNAs corresponding to 35 proteins of exosome I (33%) and 66 proteins of exosome II (43%) were detected. In WS, pcRNAs corresponding to 35 proteins of exosome I (33%) and 39 proteins of exosome II (25%) were detected. Therefore, pcRNAs and expressed proteins of the salivary exosome may coexist in secreted exosomes. Moreover, part of the pcRNAs in WS may be derived from salivary exosomes.

Investigating Pseudogenes and Parent Genes

As described above, pseudogenes were detected in all three samples. Thus, we next investigated whether the pseudogenes and their parental genes were detectable together in each sample. The total RNA reads that were annotated as pseudogenes by Ensembl annotation in exosomes I, II and WS were 720, 945 and 616, respectively (RPKM ≥10, Supplementary Tables S1–S3). To detect the nearest pseudogene, the start and end positions of the reads of the pseudogenes were searched in the database Pseudogene.org. The numbers of identified nearest pseudogenes were 388, 608 and 377 in exosomes I, II and WS, respectively (Supplementary Tables S1–S3). The annotations of the parental genes of the nearest pseudogenes were searched in the pcRNA datasets. In terms of the total parental genes of the nearest pseudogenes, RNAs of ribosomal proteins were preferentially detected in all three samples (62/65/73%, exosomes I/II/WS). Because a single pcRNA often has more than one pseudogene, Table 3 shows the nearest pseudogene with the highest RPKM value. All of the nearest pseudogenes with the highest RPKM value are listed in Supplementary Tables S1–S3. After redundancy of the parental gene of the pseudogenes was removed, the numbers of distinct parental genes were 155, 194, and 128 in exosomes I, II and WS, respectively (Supplementary Tables S1–S3). In the parental genes of the nearest pseudogenes with the highest RPKM value, many ribosomal proteins were detected in all three samples (43/38/52%, exosomes I/II/WS). As for TPT1, the RPKM values of both the pcRNA and the pseudogene were high, especially in exosomes I and II. Although the pcRNA of S100A8 showed the highest RPKM value in all three samples, the pseudogene of S100A8 was not detected.
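Removing redundancy among parental genes can be pictured as grouping the nearest pseudogenes by parental gene and keeping the one with the highest RPKM. The sketch below assumes that the ‘Redundancy’ column of Table 3 counts the pseudogenes sharing a parental gene; the record layout, identifiers and values are hypothetical.

```python
# Collapse pseudogenes to one representative per parental gene.
# Assumption: "redundancy" = number of pseudogenes sharing that parental gene.
from collections import defaultdict


def collapse_by_parent(records):
    """records: iterable of (pseudogene_id, parent_gene, rpkm) tuples.

    Returns {parent_gene: (best_pseudogene_id, best_rpkm, redundancy)}.
    """
    grouped = defaultdict(list)
    for pseudogene_id, parent_gene, rpkm_value in records:
        grouped[parent_gene].append((pseudogene_id, rpkm_value))
    collapsed = {}
    for parent_gene, members in grouped.items():
        best_id, best_rpkm = max(members, key=lambda member: member[1])
        collapsed[parent_gene] = (best_id, best_rpkm, len(members))
    return collapsed


# Hypothetical nearest pseudogenes with their parental genes and RPKM values.
nearest_pseudogenes = [
    ("pg_a", "GENE_X", 120.0),
    ("pg_b", "GENE_X", 45.0),
    ("pg_c", "GENE_Y", 15.0),
]
collapsed = collapse_by_parent(nearest_pseudogenes)

# Parental genes that were also detected as pcRNAs (hypothetical set).
detected_pcrnas = {"GENE_X", "GENE_Z"}
shared_parents = set(collapsed) & detected_pcrnas
print(collapsed, shared_parents)
```

The number of keys in the collapsed mapping corresponds to the count of distinct parental genes, and its intersection with the pcRNA set corresponds to the overlap examined next.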

Figure 4 shows Venn diagrams of the intersections between the pcRNAs and the parental genes of the pseudogenes. The results showed that 71% (110/155) of the parental genes of the pseudogenes in exosome I, 86% (166/194) of those in exosome II and 80% (102/128) of those in WS were also detected as pcRNAs.
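The overlap shown in the Venn diagrams amounts to a set intersection between the detected pcRNAs and the parental genes of the pseudogenes. A minimal sketch follows; the function name is illustrative, the toy sets use three gene symbols discussed in the text (S100A8, MUC7, TPT1) together with hypothetical placeholders (GENE_X, GENE_Y), and only the counts in REPORTED are taken from the study.

```python
from typing import Set, Tuple


def shared_fraction(pcrna_genes: Set[str], parent_genes: Set[str]) -> Tuple[Set[str], float]:
    """Return genes found in both sets and their share of the parental genes."""
    shared = pcrna_genes & parent_genes
    return shared, len(shared) / len(parent_genes)


# Toy example: S100A8 and MUC7 have no detected pseudogene, TPT1 has one.
pcrna_genes = {"S100A8", "MUC7", "TPT1", "GENE_X"}
parent_genes = {"TPT1", "GENE_X", "GENE_Y"}
shared, fraction = shared_fraction(pcrna_genes, parent_genes)
print(sorted(shared), f"{fraction:.0%}")

# Reported counts: (parental genes also detected as pcRNA, all parental genes).
REPORTED = {"exosome I": (110, 155), "exosome II": (166, 194), "WS": (102, 128)}
reported_percent = {
    sample: round(100 * common / total) for sample, (common, total) in REPORTED.items()
}
print(reported_percent)
```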

Fig. 4. Venn Diagrams of Protein-Coding RNAs and the Parent Genes of the Pseudogenes Expressed in Exosomes I, II and WS

Numbers of pseudogenes indicate the nearest pseudogene with highest RPKM number.

Gene Ontology Analysis

Because there is little information available on the function of proteins that are translated from the RNAs of salivary exosomes, GO analysis of pcRNAs was performed using CateGOrizer. Similar distributions of biological process (Table 4A) and molecular function categories (Table 4B) were observed among the three samples. The three functional groups most commonly identified in the biological process category were genes associated with cellular processes (29/30/30%, exosomes I/II/WS), metabolism (16/15/14%, exosomes I/II/WS) and macromolecule metabolism (12/12/11%, exosomes I/II/WS). The three functional groups most commonly identified in the molecular function category were genes associated with binding (38/38/39%, exosomes I/II/WS), protein binding (21/21/21%, exosomes I/II/WS) and catalytic activity (11/12/11%, exosomes I/II/WS). Therefore, the RNAs in two types of salivary exosomes and WS may play similar roles in the oral cavity.

Table 4. GO Terms in the Biological Process and the Molecular Function Category

A. Biological process category
GO Class ID	Definitions	pcRNA (%)			Parent gene of pseudogene (%)
GO Class ID	Definitions	Exosome I	Exosome II	WS	Exosome I	Exosome II	WS
GO:0006139	Nucleobase, nucleoside, nucleotide and nucleic acid metabolism	5.7	5.6	5.2	7.3	7.4	6.9
GO:0006810	Transport	2.4	2.4	2.3	2.0	2.1	1.7
GO:0006928	Cell motility	0.61	0.54	0.58	0.27	0.25	0.31
GO:0006944	Membrane fusion	0.034	0.037	0.040	—	—	—
GO:0007154	Cell communication	3.6	3.7	4.1	1.3	1.8	1.9
GO:0007275	Development	2.5	2.7	3.1	0.57	0.71	0.65
GO:0007610	Behavior	0.30	0.18	0.20	0.038	0.028	0.055
GO:0008152	Metabolism	15.8	15.3	14.4	21.8	20.7	20.0
GO:0008219	Cell death	1.3	1.5	1.3	0.71	0.69	0.70
GO:0009056	Catabolism	1.4	1.5	1.3	1.7	1.7	1.5
GO:0009058	Biosynthesis	5.0	4.7	5.0	8.8	8.2	8.6
GO:0009405	Pathogenesis	0.0028	—	—	—	—	—
GO:0009987	Cellular process	29.1	29.6	29.6	28.8	28.7	29.3
GO:0030154	Cell differentiation	1.4	1.6	1.8	0.45	0.24	0.41
GO:0043062	Extracellular structure organization and biogenesis	0.017	0.027	0.035	—	0.014	—
GO:0043170	Macromolecule metabolism	12.2	11.6	11.3	19.3	18.5	19.0
GO:0046903	Secretion	0.45	0.43	0.45	0.13	0.056	0.018
GO:0050789	Regulation of biological process	9.9	10.4	10.7	3.6	4.5	4.3
GO:0050896	Response to stimulus	8.3	8.2	8.6	3.5	4.4	4.7
	Total	100	100	100	100	100	100
	Total counts	35644	69931	37624	5206	7079	5409
B. Molecular function category
GO Class ID	Definitions	pcRNA (%)			Parent gene of pseudogene (%)
GO Class ID	Definitions	Exosome I	Exosome II	WS	Exosome I	Exosome II	WS
GO:0003676	Nucleic acid binding	7.4	6.8	7.5	15.1	13.7	16.5
GO:0003774	Motor activity	0.098	0.11	0.17	0.15	0.23	0.17
GO:0003824	Catalytic activity	10.8	12.1	10.5	8.1	7.1	4.7
GO:0004386	Helicase activity	0.25	0.37	0.25	0.30	0.23	—
GO:0004871	Signal transducer activity	1.2	1.2	1.4	0.61	0.34	0.50
GO:0004872	Receptor activity	1.2	1.2	1.3	0.46	0.23	0.83
GO:0005198	Structural molecule activity	2.1	1.4	2.0	10.5	8.9	12.2
GO:0005215	Transporter activity	2.4	1.8	1.8	1.2	2.9	0.50
GO:0005488	Binding	37.9	37.9	38.8	38.4	38.2	41.7
GO:0005515	Protein binding	20.7	20.5	20.9	14.6	16.8	16.5
GO:0008565	Protein transporter activity	0.24	0.19	0.13	—	0.34	—
GO:0008907	Integrase activity	0.028	0.0064	0.013	—	—	—
GO:0015075	Ion transporter activity	1.3	0.84	1.1	0.91	1.7	0.33
GO:0015267	Channel or pore class transporter activity	0.34	0.22	0.38	—	0.34	—
GO:0016209	Antioxidant activity	0.18	0.21	0.13	—	—	—
GO:0016301	Kinase activity	1.4	1.6	1.6	0.30	0.92	0.17
GO:0016491	Oxidoreductase activity	2.3	2.2	1.6	4.4	2.6	1.7
GO:0016740	Transferase activity	2.8	3.3	2.9	0.30	1.1	0.33
GO:0016787	Hydrolase activity	4.1	4.6	4.6	3.0	2.7	1.8
GO:0016829	Lyase activity	0.27	0.21	0.19	0.15	0.11	0.33
GO:0016853	Isomerase activity	0.17	0.31	0.21	—	0.23	0.33
GO:0016874	Ligase activity	0.91	1.1	0.75	0.15	0.23	0.17
GO:0030234	Enzyme regulator activity	1.8	1.7	1.7	0.91	0.69	0.67
GO:0045182	Translation regulator activity	0.13	0.052	0.091	0.30	0.22	0.50
	Total	100	100	100	100	100	100
	Total counts	7156	15533	7719	656	874	599

Numbers show the percentage in each category. The data are based on pcRNAs and parent genes of the pseudogenes with an RPKM value of ≥10. The GO analysis was performed using CateGOrizer. Total counts show the sum of the counts over all categories. Dashes (—) indicate that no RNAs were assigned to the category.

We also performed GO analysis of the parent genes of the pseudogenes (Table 4). The three functional groups of the parental genes most commonly identified in the biological process category were genes associated with cellular process (29/29/29%, exosomes I/II/WS), metabolism (22/21/20%, exosomes I/II/WS) and macromolecule metabolism (19/19/19%, exosomes I/II/WS). These percentages are similar to those of pcRNAs. The three functional groups of the parental genes most commonly identified in the molecular function category were genes associated with binding (38/38/42%, exosomes I/II/WS), protein binding (15/17/17%, exosomes I/II/WS) and nucleic acid binding (15/14/17%, exosomes I/II/WS). However, the percentages of the catalytic activity category in the parental genes were lower than those in pcRNAs in exosomes I, II and WS (8.1/7.1/4.7%, exosomes I/II/WS).
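The ranking of functional groups can be checked directly against Table 4B. The sketch below uses the exosome I columns for a subset of rows (the seven largest categories in either column) and ranks them by percentage; the dictionary layout and function name are illustrative assumptions, while the values are copied from the table.

```python
from typing import Dict, List, Tuple

# Table 4B, exosome I: category -> (pcRNA %, parent gene of pseudogene %).
TABLE_4B_EXOSOME_I: Dict[str, Tuple[float, float]] = {
    "Nucleic acid binding": (7.4, 15.1),
    "Catalytic activity": (10.8, 8.1),
    "Structural molecule activity": (2.1, 10.5),
    "Binding": (37.9, 38.4),
    "Protein binding": (20.7, 14.6),
    "Oxidoreductase activity": (2.3, 4.4),
    "Hydrolase activity": (4.1, 3.0),
}

PCRNA, PARENT = 0, 1  # column indices in the tuples above


def top_categories(table: Dict[str, Tuple[float, float]], column: int, n: int = 3) -> List[str]:
    """Return the n categories with the highest percentage in the given column."""
    ranked = sorted(table, key=lambda name: table[name][column], reverse=True)
    return ranked[:n]


print("pcRNA:", top_categories(TABLE_4B_EXOSOME_I, PCRNA))
print("parent genes:", top_categories(TABLE_4B_EXOSOME_I, PARENT))
```

In exosome I, catalytic activity is among the top three categories for pcRNAs but is displaced by nucleic acid binding for the parental genes, in line with the lower catalytic activity percentages noted above.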

DISCUSSION

We performed transcriptome analysis of the long RNAs in salivary exosomes and WS using NGS. Palanisamy et al. previously analyzed mRNAs in salivary exosomes by microarray and detected 509 mRNAs.²⁸⁾ To the best of our knowledge, the current study is the first exhaustive analysis of lncRNAs and pcRNAs of salivary exosomes. In addition to pcRNAs, lncRNAs of pseudogenes were also abundant in exosomes. In a previous study, we performed NGS of small non-coding RNAs and demonstrated that miRNAs, piRNAs, snoRNAs, and other small RNAs were found in exosomes I, II and WS.¹⁶⁾ A recent study showed that ncRNAs were highly abundant in exosomes secreted by HeLa and MCF-7 cell lines.²⁹⁾ More recently, ncRNAs of human urinary exosomes were analyzed using NGS.²⁰⁾ Together, these results show that exosomes can entrap many types of RNAs and potentially transfer large amounts of information between cells. Exosomes have been shown to mediate communication between neighboring cells by delivering exosomal RNAs. The mRNAs contained within exosomes can be transcribed into cDNA or translated in the recipient cell.³⁰⁾ However, further study will be needed to elucidate whether the pcRNAs in salivary exosomes are expressed.

The proportion of reads that did not map to the human genome was higher in WS than in exosomes I and II (Table 1). In our preliminary analyses, we detected RNAs of exogenous species such as bacteria and fungi in WS (data not shown). Moreover, in the extracellular environment, most miRNAs are associated with Ago2 proteins and are not encapsulated within exosomes.³¹⁾ Together, these may be reasons why the total RNA yield from WS was larger than the total RNA yield from exosomes. However, the status of these exogenous RNAs is not clear. Further study is needed to identify the exogenous RNAs in WS and exosomes.

The pcRNAs most frequently found among the highly expressed genes were those of ribosomal proteins (RPL, RPS). Previous studies reported that RNAs of ribosomal proteins were predominantly expressed in urinary exosomes²⁰⁾ and salivary exosomes.²⁸⁾ The pcRNA with the highest RPKM was S100A8 in the two types of salivary exosomes and WS. The translated product of S100A8 is S100 calcium-binding protein A8 (protein S100-A8), which was detected in saliva³²⁾ and salivary exosomes.¹⁴⁾ Protein S100-A8 often forms a complex with S100 calcium-binding protein A9 (protein S100-A9); this complex is also called calprotectin.³³⁾ Both the pcRNA and the protein of S100-A9 were found in exosomes I and II, in this study (Supplementary Tables S1, S2) and in our proteome study,¹⁴⁾ respectively. Calprotectin plays important roles in the regulation of inflammatory processes and immune response. It has functional roles in the activation of leukocytes and promotion of cytokine production via Toll-like receptor 4.³³⁾

Our proteomic study showed that exosomes I and II preferentially contained immune-related proteins, such as IgA and polymeric immunoglobulin receptor.¹⁴⁾ However, RNAs of immune-related proteins were rarely detected by NGS. While DPP4 (also known as CD26) is abundantly present in exosome II,¹⁴⁾ the RNA of DPP4 was not detected in any of the three samples. Although specific mRNA and miRNA molecules are thought to be selectively loaded into exosomes, the mechanism is not clear and further investigation is necessary.³⁰⁾

In our study, pcRNAs and the parental genes of the pseudogenes were highly coincident in exosomes I, II and WS. Recent studies showed that transcribed pseudogenes can regulate the translation of homologous protein-coding genes, such as mRNAs of their parental gene, through a small interfering RNA (siRNA)-like function and/or by acting as miRNA sponges.^27,34) Notably, miRNAs were abundantly present in exosomes I and II.¹⁶⁾ In addition, in the GO category of nucleic acid binding, the percentage of parental genes of pseudogenes was higher than that of pcRNAs in all three samples (Table 4B). It is possible that the pseudogenes, along with miRNAs, regulate the corresponding mRNAs in salivary exosomes and those in the target cells of the exosomes. The pseudogenes of highly expressed pcRNAs of salivary proteins such as MUC7 were not detected (Supplementary Tables S1–S3). Because salivary proteins are sequentially secreted, their pseudogenes may not need to be expressed.

The translated product of TPT1 is translationally controlled tumor protein (TCTP), which is a highly conserved protein that is widely expressed in all eukaryotic organisms³⁵⁾ and plays an important role in cell proliferation, cell death and immune responses. Notably, the pseudogene of TPT1 is also highly expressed in exosomes I and II (Supplementary Tables S1, S2). Previous studies showed that TCTP is expressed in salivary glands.³⁶⁾ TCTP was detected in WS but not in exosomes I and II by Western blot analysis (data not shown). The function of TPT1 and the pseudogenes in exosomes in the oral cavity, including in salivary glands, should be examined in future studies.

In conclusion, our study is the first report demonstrating that exosomes contain a large repertoire of lncRNAs, such as processed pseudogenes, in addition to pcRNAs. The mRNA content of exosomes is modulated by the physiological state of the cell and stress conditions and may be useful in investigating the functional state of oral tissue.^37,38) Transcriptional profiles of salivary exosomes can be obtained non-invasively and can be applied to the discovery of new biomarkers of oral diseases such as salivary gland cancer.

Acknowledgments

We thank Dr. Yoshitaka Taketomi and Dr. Makoto Murakami of the Lipid Metabolism Project, Tokyo Metropolitan Institute of Medical Science, for technical assistance with exosomal RNA detection. We acknowledge Dr. Tsukasa Okada of Hokkaido System Science Co., Ltd. for support with RNA data handling. We are grateful to Dr. Kazuma Aoki of Teikyo Heisei University for helpful discussions. This work was supported by JSPS KAKENHI Grant Numbers 25460172 and 25293083.

Conflict of Interest

The authors declare no conflict of interest.

Supplementary Materials

The online version of this article contains supplementary materials.

Fig. S1. Bioanalyzer profiles of amplified RNA isolated from exosome I, exosome II, and WS.

Fig. S2. Scatter plots of exosome I against exosome II, exosome I against WS and exosome II against WS.

Table S1. List of all RNAs of exosome I. RPKM rankings of all RNAs, pcRNAs, pseudogenes and the pseudogene of highest RPKM with information regarding the nearest pseudogenes (see Materials and Methods).

Table S2. List of all RNAs of exosome II. RPKM rankings of all RNAs, pcRNAs, pseudogenes and the pseudogene of highest RPKM with information regarding the nearest pseudogenes (see Materials and Methods).

Table S3. List of all RNAs of WS. RPKM rankings of all RNAs, pcRNAs, pseudogenes and the pseudogene of highest RPKM with information regarding the nearest pseudogenes (see Materials and Methods).

REFERENCES

1) Park NJ, Li Y, Yu T, Brinkman BM, Wong DT. Characterization of RNA in saliva. Clin. Chem., 52, 988–994 (2006).
2) Hu S, Wang J, Meijer J, Ieong S, Xie Y, Yu T, Zhou H, Henry S, Vissink A, Pijpe J, Kallenberg C, Elashoff D, Loo JA, Wong DT. Salivary proteomic and genomic biomarkers for primary Sjogren’s syndrome. Arthritis Rheum., 56, 3588–3600 (2007).
3) Zimmermann BG, Park NJ, Wong DT. Genomic targets in saliva. Ann. N. Y. Acad. Sci., 1098, 184–191 (2007).
4) Zubakov D, Kokshoorn M, Kloosterman A, Kayser M. New markers for old stains: stable mRNA markers for blood and saliva identification from up to 16-year-old stains. Int. J. Legal Med., 123, 71–74 (2009).
5) Raposo G, Stoorvogel W. Extracellular vesicles: exosomes, microvesicles, and friends. J. Cell Biol., 200, 373–383 (2013).
6) Kalra H, Simpson RJ, Ji H, Aikawa E, Altevogt P, Askenase P, Bond VC, Borràs FE, Breakefield X, Budnik V, Buzas E, Camussi G, Clayton A, Cocucci E, Falcon-Perez JM, Gabrielsson S, Gho YS, Gupta D, Harsha HC, Hendrix A, Hill AF, Inal JM, Jenster G, Krämer-Albers EM, Lim SK, Llorente A, Lötvall J, Marcilla A, Mincheva-Nilsson L, Nazarenko I, Nieuwland R, Nolte-’t Hoen EN, Pandey A, Patel T, Piper MG, Pluchino S, Prasad TS, Rajendran L, Raposo G, Record M, Reid GE, Sánchez-Madrid F, Schiffelers RM, Siljander P, Stensballe A, Stoorvogel W, Taylor D, Thery C, Valadi H, van Balkom BW, Vázquez J, Vidal M, Wauben MH, Yáñez-Mó M, Zoeller M, Mathivanan S. Vesiclepedia: a compendium for extracellular vesicles with continuous community annotation. PLoS Biol., 10, e1001450 (2012).
7) Valadi H, Ekstrom K, Bossios A, Sjostrand M, Lee JJ, Lotvall JO. Exosome-mediated transfer of mRNAs and microRNAs is a novel mechanism of genetic exchange between cells. Nat. Cell Biol., 9, 654–659 (2007).
8) Azmi AS, Bao B, Sarkar FH. Exosomes in cancer development, metastasis, and drug resistance: a comprehensive review. Cancer Metastasis Rev., 32, 623–642 (2013).
9) Bobrie A, Colombo M, Raposo G, Thery C. Exosome secretion: molecular mechanisms and roles in immune responses. Traffic, 12, 1659–1668 (2011).
10) Properzi F, Logozzi M, Fais S. Exosomes: the future of biomarkers in medicine. Biomark. Med., 7, 769–778 (2013).
11) Mercer TR, Dinger ME, Mattick JS. Long non-coding RNAs: insights into functions. Nat. Rev. Genet., 10, 155–159 (2009).
12) “Ensembl release 78. Annotation of Non-Coding RNAs.”: ‹http://asia.ensembl.org/info/genome/genebuild/ncrna.html›, cited 8 March, 2016.
13) Ogawa Y, Kanai-Azuma M, Akimoto Y, Kawakami H, Yanoshita R. Exosome-like vesicles with dipeptidyl peptidase IV in human saliva. Biol. Pharm. Bull., 31, 1059–1062 (2008).
14) Ogawa Y, Miura Y, Harazono A, Kanai-Azuma M, Akimoto Y, Kawakami H, Yamaguchi T, Toda T, Endo T, Tsubuki M, Yanoshita R. Proteomic analysis of two types of exosomes in human whole saliva. Biol. Pharm. Bull., 34, 13–23 (2011).
15) Wyss R, Grasso L, Wolf C, Grosse W, Demurtas D, Vogel H. Molecular and dimensional profiling of highly purified extracellular vesicles by fluorescence fluctuation spectroscopy. Anal. Chem., 86, 7229–7233 (2014).
16) Ogawa Y, Taketomi Y, Murakami M, Tsujimoto M, Yanoshita R. Small RNA transcriptomes of two types of exosomes in human whole saliva determined by next generation sequencing. Biol. Pharm. Bull., 36, 66–75 (2013).
17) Zhou Q, Li M, Wang X, Li Q, Wang T, Zhu Q, Zhou X, Wang X, Gao X, Li X. Immune-related microRNAs are abundant in breast milk exosomes. Int. J. Biol. Sci., 8, 118–123 (2012).
18) Nolte-’t Hoen EN, Buermans HP, Waasdorp M, Stoorvogel W, Wauben MH, t Hoen PA. Deep sequencing of RNA from immune cell-derived vesicles uncovers the selective incorporation of small non-coding RNA biotypes with potential regulatory functions. Nucleic Acids Res., 40, 9272–9285 (2012).
19) Huang X, Yuan T, Tschannen M, Sun Z, Jacob H, Du M, Liang M, Dittmar RL, Liu Y, Liang M, Kohli M, Thibodeau SN, Boardman L, Wang L. Characterization of human plasma-derived exosomal RNAs by deep sequencing. BMC Genomics, 14, 319 (2013).
20) Miranda KC, Bond DT, Levin JZ, Adiconis X, Sivachenko A, Russ C, Brown D, Nusbaum C, Russo LM. Massively parallel sequencing of human urinary exosome/microvesicle RNA reveals a predominance of non-coding RNA. PLoS ONE, 9, e96094 (2014).
21) Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 17, 10–12 (2011).
22) Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics, 25, 1105–1111 (2009).
23) Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol., 10, R25 (2009).
24) Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol., 28, 511–515 (2010).
25) Karro JE, Yan Y, Zheng D, Zhang Z, Carriero N, Cayting P, Harrison P, Gerstein M. Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res., 35 (Database), D55–D60 (2007).
26) Zhi-Liang H, Bao J, Reecy JM. CateGOrizer: A Web-Based Program to Batch Analyze Gene Ontology Classification Categories. Online J. Bioinform., 9, 108–112 (2008).
27) Pink RC, Wicks K, Caley DP, Punch EK, Jacobs L, Carter DR. Pseudogenes: pseudo-functional or key regulators in health and disease? RNA, 17, 792–798 (2011).
28) Palanisamy V, Sharma S, Deshpande A, Zhou H, Gimzewski J, Wong DT. Nanostructural and transcriptomic analyses of human saliva derived exosomes. PLoS ONE, 5, e8577 (2010).
29) Gezer U, Ozgur E, Cetinkaya M, Isin M, Dalay N. Long non-coding RNAs with low expression levels in cells are enriched in secreted exosomes. Cell Biol. Int., 38, 1076–1079 (2014).
30) Ramachandran S, Palanisamy V. Horizontal transfer of RNAs: exosomes as mediators of intercellular communication. Wiley Interdiscip. Rev. RNA, 3, 286–293 (2012).
31) Turchinovich A, Weiz L, Langheinz A, Burwinkel B. Characterization of extracellular circulating microRNA. Nucleic Acids Res., 39, 7223–7233 (2011).
32) Ghafouri B, Tagesson C, Lindahl M. Mapping of proteins in human saliva using two-dimensional gel electrophoresis and peptide mass fingerprinting. Proteomics, 3, 1003–1015 (2003).
33) Ehrchen JM, Sunderkötter C, Foell D, Vogl T, Roth J. The endogenous Toll-like receptor 4 agonist S100A8/S100A9 (calprotectin) as innate amplifier of infection, autoimmunity, and cancer. J. Leukoc. Biol., 86, 557–566 (2009).
34) Ebert MS, Sharp PA. Emerging roles for natural microRNA sponges. Curr. Biol., 20, R858–R861 (2010).
35) Bommer UA, Thiele BJ. The translationally controlled tumour protein (TCTP). Int. J. Biochem. Cell Biol., 36, 379–385 (2004).
36) “The Human Protein Atlas Version: 13.”: ‹http://www.proteinatlas.org/›, cited 8 March, 2016.
37) Yáñez-Mó M, Siljander PR, Andreu Z, Zavec AB, Borràs FE, Buzas EI, Buzas K, Casal E, Cappello F, Carvalho J, Colás E, Cordeiro-da Silva A, Fais S, Falcon-Perez JM, Ghobrial IM, Giebel B, Gimona M, Graner M, Gursel I, Gursel M, Heegaard NH, Hendrix A, Kierulf P, Kokubun K, Kosanovic M, Kralj-Iglic V, Krämer-Albers EM, Laitinen S, Lässer C, Lener T, Ligeti E, Linē A, Lipps G, Llorente A, Lötvall J, Manček-Keber M, Marcilla A, Mittelbrunn M, Nazarenko I, Nolte-’t Hoen EN, Nyman TA, O’Driscoll L, Olivan M, Oliveira C, Pállinger É, Del Portillo HA, Reventós J, Rigau M, Rohde E, Sammar M, Sánchez-Madrid F, Santarém N, Schallmoser K, Ostenfeld MS, Stoorvogel W, Stukelj R, Van der Grein SG, Vasconcelos MH, Wauben MH, De Wever O. Biological properties of extracellular vesicles and their physiological functions. J. Extracell. Vesicles, 4, 27066 (2015).
38) Eldh M, Ekström K, Valadi H, Sjöstrand M, Olsson B, Jernås M, Lötvall J. Exosomes communicate protective messages during oxidative stress; possible role of exosomal shuttle RNA. PLoS ONE, 5, e15353 (2010).
