Ureolytic Prokaryotes in Soil: Community Abundance and Diversity

Although the turnover of urea is a crucial process in nitrogen transformation in soil, limited information is currently available on the abundance and diversity of ureolytic prokaryotes. The abundance and diversity of the soil 16S rRNA gene and ureC (encoding a urease catalytic subunit) were examined in seven soil types using quantitative PCR and amplicon sequencing with Illumina MiSeq. The amplicon sequencing of ureC revealed that the ureolytic community was composed of phylogenetically varied prokaryotes, and we detected 363 to 1,685 species-level ureC operational taxonomic units (OTUs) per soil sample, whereas 5,984 OTUs were site-specific OTUs found in only one of the seven soil types.


Supplementary text
The following supplementary text includes a detailed protocol of the experiments conducted in the present study.

DNA extraction and quantitative PCR
Genomic DNA was extracted using the PowerSoil DNA Isolation kit (MO BIO Laboratories).
Sequences of primers L2F_V1 and 733R were taken from another study with minor modifications.
Degenerate bases at positions +3, +6, +9, +12, +15, and +18 relative to the 5′ terminus of the original L2F primer were modified to decrease sequence complexity, i.e. H to C at position +3, Y to C at +6, R to G at +9, N to C at +12, N to C at +15, and Y to C at +18. The PCR mixture had a volume of 20 μl and contained 2 ng of an extracted DNA sample or 2 μl of standard DNA, oligonucleotide primers (0.3 μM each), and 1× SsoFast EvaGreen Supermix (Bio-Rad). The cycling conditions were as follows: 98°C for 2 min; 40 cycles of 98°C for 5 s and 50°C for 10 s; and finally, 65°C to 95°C in 0.5°C increments for melting curve analysis. The assay was conducted in triplicate on a MiniOpticon thermal cycler (Bio-Rad), and specific amplification of the 16S rRNA gene and of ureC was ascertained by agarose gel electrophoresis of the amplicons.
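The effect of the base substitutions described above on primer complexity can be illustrated with a minimal Python sketch. The primer string used here is a placeholder chosen only so that the listed positions carry the listed IUPAC codes; it is an assumption for illustration, and the original primer publication should be consulted for the actual L2F sequence. The function names are likewise hypothetical.

```python
# Number of concrete bases represented by each IUPAC nucleotide code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Substitutions listed in the text: 1-based position from the 5' terminus
# mapped to (original degenerate code, replacement base).
SUBSTITUTIONS = {
    3: ("H", "C"), 6: ("Y", "C"), 9: ("R", "G"),
    12: ("N", "C"), 15: ("N", "C"), 18: ("Y", "C"),
}

# Placeholder sequence (assumption): NOT taken from the source text.
PLACEHOLDER_PRIMER = "ATHGGYAARGCNGGNAAY"


def degeneracy(primer):
    """Return the number of distinct oligonucleotides a degenerate primer encodes."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_SIZE[base]
    return total


def apply_substitutions(primer, substitutions):
    """Replace degenerate codes at the given 1-based positions."""
    bases = list(primer.upper())
    for position, (expected, replacement) in substitutions.items():
        if bases[position - 1] != expected:
            raise ValueError(
                f"position +{position}: expected {expected}, found {bases[position - 1]}"
            )
        bases[position - 1] = replacement
    return "".join(bases)


modified = apply_substitutions(PLACEHOLDER_PRIMER, SUBSTITUTIONS)
print(degeneracy(PLACEHOLDER_PRIMER), "->", degeneracy(modified))
```

For the placeholder, the six substitutions reduce the pool from 3 × 2 × 2 × 4 × 4 × 2 = 384 oligonucleotide variants to a single sequence.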
Genomic DNA of Pseudomonas aeruginosa PAO1 (JCM14847) with 4 and 1 copies of the 16S rRNA gene and ureC, respectively, was used as a standard for quantification. The cells were cultured as recommended by the supplier, genomic DNA was extracted, and DNA concentration was determined as mentioned above. The genomic DNA was serially diluted with distilled water to concentrations of 10⁵ to 10⁰ copies µl⁻¹.
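The arithmetic behind such a genomic DNA standard can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: the genome length of roughly 6.26 Mb for strain PAO1 and the average mass of 650 g/mol per base pair are values assumed here, and the function names are hypothetical. The standard curve is fitted by ordinary least squares on log10 copy number, and amplification efficiency is derived from the slope in the usual way.

```python
import math

AVOGADRO = 6.02214076e23          # molecules per mole
BP_MASS = 650.0                   # g/mol per base pair (assumed average)
PAO1_GENOME_BP = 6_264_404        # approximate genome length (assumption)

# Dilution series from the text: 10^5 down to 10^0 copies per microlitre.
STANDARD_COPIES = [10 ** e for e in range(5, -1, -1)]


def gene_copies_per_ng(genome_bp, copies_per_genome):
    """Gene copies contained in 1 ng of genomic DNA."""
    genomes = 1e-9 * AVOGADRO / (genome_bp * BP_MASS)
    return genomes * copies_per_genome


def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    efficiency = 10 ** (-1.0 / slope) - 1.0   # 1.0 corresponds to 100 %
    return slope, intercept, efficiency


def copies_from_cq(cq, slope, intercept):
    """Interpolate the copy number of an unknown sample from its Cq."""
    return 10 ** ((cq - intercept) / slope)


ureC_per_ng = gene_copies_per_ng(PAO1_GENOME_BP, copies_per_genome=1)
rrn_per_ng = gene_copies_per_ng(PAO1_GENOME_BP, copies_per_genome=4)
print(f"ureC: {ureC_per_ng:.3g} copies/ng, 16S rRNA gene: {rrn_per_ng:.3g} copies/ng")
```

Because the same genomic DNA carries 4 copies of the 16S rRNA gene and 1 copy of ureC, a single dilution series yields two standard curves whose copy numbers differ by a factor of four.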

Bioinformatics
The generated ureC and 16S rRNA gene sequence reads were processed for removal of adapter sequences using cutadapt and for quality trimming using Trimmomatic v0.33 (1) as previously described (10). Reads shorter than 50 bp or with an average Phred-like quality score <30 were removed. Paired-end sequence reads were assembled with the paired-end assembler for Illumina sequences (PANDAseq) (12). The obtained ureC reads were subjected to a blastn search (threshold e-value: 10⁻¹⁰) against the 60,733 known ureC sequences downloaded from the FunGene database (8) and the database of Integrated Microbial Genomes & Microbiome Samples (IMG/MER) (11) to remove non-ureC sequences. As for the 16S rRNA gene, the assembled sequence reads with 97% sequence identity were grouped into an OTU by UCLUST (6). Phylogenetic affiliations of the OTUs were identified using a blastn search against reference sequences in the Greengenes database version 13_5 (4) and in the nr database (National Center for Biotechnology Information). As for ureC, sequence reads with 91% sequence identity were grouped into an OTU, and the phylogenetic affiliation was examined using a blastn search in the nr database. Putative chimeric sequences were removed using UCHIME (7). Alpha diversity indices (observed species, Chao1, Good's coverage, and Simpson's index) were calculated in QIIME (2). Chao1 was computed at a sampling depth of 5,500 reads and 2,500 reads for the 16S rRNA gene and ureC gene, respectively. A phylogenetic tree was constructed using the nucleic acid sequences of ureC by the maximum likelihood method with the Jones-Taylor-Thornton model in the MEGA 6.06 software (14). Cluster analysis was carried out to examine similarities of community composition among the soils using the STAMP software (13).
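The alpha diversity indices named above can be written out with their standard textbook formulas, as in the following minimal Python sketch. It is not the QIIME implementation: the bias-corrected form of Chao1 and the 1 − Σp² form of Simpson's index are assumptions about which variants were reported, the OTU counts are invented for illustration, and the function names are hypothetical. Rarefaction to a fixed depth is shown as random subsampling without replacement.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample an OTU count vector to `depth` reads without replacement."""
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sampling depth exceeds the number of reads")
    picked = random.Random(seed).sample(pool, depth)
    rarefied = [0] * len(counts)
    for otu in picked:
        rarefied[otu] += 1
    return rarefied


def observed_otus(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts if n > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    singletons = sum(1 for n in counts if n == 1)
    doubletons = sum(1 for n in counts if n == 2)
    return observed_otus(counts) + singletons * (singletons - 1) / (2 * (doubletons + 1))


def goods_coverage(counts):
    """Good's coverage: 1 - F1 / N."""
    singletons = sum(1 for n in counts if n == 1)
    return 1.0 - singletons / sum(counts)


def simpson(counts):
    """Simpson's index taken as 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


# Invented OTU table row (reads per OTU), for illustration only.
example_counts = [10, 5, 2, 2, 1, 1, 1, 0]
print(observed_otus(example_counts), chao1(example_counts))
print(round(goods_coverage(example_counts), 4), round(simpson(example_counts), 4))
```

In practice each sample would first be passed through `rarefy` at the chosen depth (5,500 reads for the 16S rRNA gene and 2,500 reads for ureC in the text) so that the indices are compared at equal sequencing effort.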

Evaluation of ureC primers for PCR amplification of the known ureC sequences
Coverage of the previously designed ureC primers was examined by aligning known ureC sequences and the ureC primer sequences, and by counting the primer-template mismatches. A total of 17,312 ureC sequences were downloaded from the database of IMG/MER (11). We found that the ureC sequences derived from some bacterial species were heavily overrepresented because many genome sequences have been determined and deposited, e.g. 3,174 genomes were deposited in the database entry for the species Escherichia coli (accessed on 16th May 2017). On the other hand, for the majority of bacterial species, only a limited number of genome sequences was available. To evaluate the coverage of ureC primers uniformly across phylogenetically distinct ureC sequences, the ureC sequences derived from genome sequences affiliated with the same bacterial species were grouped, and one representative ureC sequence per species was subjected to sequence alignment. Alignment of 2,653 ureC sequences was performed using the MUSCLE software under default conditions (16 iterations) (5). Regions in which primers L2F_V1, 733R, ureC_F, and ureC_R hybridised were manually examined, and primer-template mismatches were counted.
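The two steps described above, reducing the data set to one sequence per species and counting primer-template mismatches, were carried out manually in the study; a minimal Python sketch of the same logic is given here for illustration. All names are hypothetical, the choice of the first sequence seen as the representative is an assumption, and the example sequences are invented. A template base counts as a match when it belongs to the set encoded by the IUPAC code at that primer position.

```python
# Bases encoded by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code, used for reverse primers.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def pick_representatives(records):
    """Keep the first sequence seen for each species (assumed selection rule)."""
    representatives = {}
    for species, sequence in records:
        representatives.setdefault(species, sequence)
    return representatives


def reverse_complement(primer):
    """Reverse complement of a (possibly degenerate) primer."""
    return "".join(COMPLEMENT[base] for base in reversed(primer.upper()))


def count_mismatches(primer, site):
    """Count positions where the template base is not covered by the primer code.

    `site` is the template region under the primer, same length and strand
    orientation as `primer`; gaps or ambiguous template bases count as mismatches.
    """
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    return sum(
        1 for p, t in zip(primer.upper(), site.upper()) if t not in IUPAC[p]
    )


# Invented example: a six-base degenerate primer against two template sites.
print(count_mismatches("ATHGGY", "ATCGGT"))
print(count_mismatches("ATHGGY", "ATGGGA"))
```

For a reverse primer, the primer would first be passed through `reverse_complement` so that it can be compared with the template strand in the orientation used in the alignment.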