2013 Volume 63 Issue 1 Pages 31-41
As tomatoes are one of the most important vegetables in the world, improvements in the quality and yield of tomato are strongly required. For this purpose, omics approaches such as metabolomics and transcriptomics are used not only for basic research to understand relationships between important traits and metabolism but also for the development of next-generation breeding strategies for tomato plants, because increased knowledge of taste and quality, stress resistance and/or potentially health-beneficial metabolites can guide improvements in the biochemical composition of tomatoes. Such omics data can be applied to network analyses to potentially reveal unknown cellular regulatory networks in tomato plants. The high-quality tomato genome that was sequenced in 2012 will likely accelerate the application of omics strategies, including next-generation sequencing, for tomato breeding. In this review, we highlight current studies of omics network analyses of tomatoes and other plant species, in particular gene coexpression networks. Key applications of omics approaches are also presented as case examples to improve economically important traits for tomato breeding.
The tomato (Solanum lycopersicum) is one of the most important vegetables in the world, both as a fresh fruit and as the main ingredient of staple products like tomato puree and ketchup. As consumer demands for more variation, improvement of quality and taste, year-round availability and human health benefits have increased, scientists have had to obtain new insights into underlying genetic factors and regulation of metabolic pathways related to biochemical traits to accommodate such demands.
Omics approaches such as transcriptomics, proteomics and metabolomics constitute a trilogy in the post-genomics era to elucidate key steps in cellular events. Metabolomics, transcriptomics, or an integration of these 2 omics strategies has been used to investigate the metabolic networks of tomatoes to improve quality and yield, because increased knowledge of taste and quality, stress resistance, and/or potentially health-beneficial metabolites (e.g. antioxidants) is connected to the improvement of the biochemical composition of tomato plants. The tomato genome was sequenced in 2012 (Sato et al. 2012); therefore, annotations of tomato gene expression arrays provided by manufacturers can now be made more precise using this information. Furthermore, metabolomic approaches based on chromatographic separation techniques connected with mass spectrometry (MS) as well as nuclear magnetic resonance spectroscopy (NMR) have been widely used for tomato metabolomics research because metabolites underlie beneficial traits such as taste, fragrance, softness and colour and are the ultimate phenotypic representatives of homeostasis in highly complex biochemical networks (Bovy et al. 2007, Deshmukh et al. 2003, Kusano et al. 2011a, Le Gall et al. 2003, Moco et al. 2008, Stark et al. 2008, Tikunov et al. 2005). Fig. 1 presents the current coverage of the tomato metabolome using our MS-based metabolomics platforms that consist of gas chromatography–electron ionization–time-of-flight–MS (GC-EI-TOF-MS), ultraperformance liquid chromatography–electrospray ionization–quadrupole–TOF–MS (UPLC-ESI-Q-TOF-MS) and capillary electrophoresis–ESI–TOF–MS (CE-ESI-TOF-MS) (Kusano et al. 2011a). These platforms covered more than 80% of the tomato metabolome when we evaluated the coverage by comparing the physicochemical properties of the detected metabolites with those in the LycoCyc database (http://solgenomics.net/tools/solcyc/) (Mazourek et al. 2009).
To-date coverage of the tomato metabolome using the MS-based metabolomics platform in PRIMe (Platform for RIKEN Metabolomics, http://prime.psc.riken.jp/). Principal component analysis was performed using the physicochemical properties of the metabolomic dataset obtained from Kusano et al. (2011a) and those from the LycoCyc database (http://solgenomics.net/tools/solcyc/) (Mazourek et al. 2009). We used the latest version of the metabolite information in our custom database and ChemSpider (http://www.chemspider.com/). PC, principal component.
In this review, we will first highlight omics network studies to identify and infer cellular regulatory networks that have crucial roles in metabolic regulations of tomato plants and other plant species using gene-to-gene correlation analysis, which were generated by microarray and next-generation sequencer (NGS)-based technologies. Second, key applications of omics approaches aimed to improve economically important traits for tomato breeding are presented.
A high-quality genome sequence of the tomato (Sato et al. 2012) facilitates a better understanding of molecular mechanisms regulating important traits such as yield and fruit quality characteristics. In this section, we focus on the role of the ‘omics’ network analysis using microarray- and NGS-based technologies in plants, including tomatoes.
Genomic- and post-genomic resources in tomatoes
The Expressed Sequence Tag (EST) database for the tomato has many sequences corresponding to 40,000 UniGenes (http://www.sgn.cornell.edu) (Mueller et al. 2005). A large-scale collection of >13,000 full-length cDNAs generated from the tomato cultivar Micro-Tom has been previously reported (Aoki et al. 2010). TOMATOMICS, which is the integrated omics database for tomato plants, is constructed from the latest UniGene set (KTU4) that is made up of 125,883 ESTs from 9 cDNA libraries and 196,912 other publicly available ESTs from the Sol Genomics Network (SGN), resulting in 58,083 UniGenes (http://bioinf.mind.meiji.ac.jp/tomatomics/). These genomic resources contain fundamental information reflecting complex gene expression in a plant cell. Developments in microarray technology have had a striking impact on the ability of researchers to monitor the expression of thousands of genes simultaneously. In the tomato, many types of microarray platforms, including TOM1, TOM2, Affymetrix GeneChip, Agilent custom array and TomatoArray (COMBIMATRIX), have enabled the investigation of responses to several stress conditions (Cantu et al. 2009, Khodakovskaya et al. 2011, Sun et al. 2010), the comparison of the expression profiles of wild-type and transgenic or mutant plants (Kumar et al. 2012, Martinelli et al. 2009, Nashilevitz et al. 2010, Povero et al. 2011) and the study of host-pathogen interactions (Alkan et al. 2012, Balaji et al. 2008, Owens et al. 2012). These comprehensive datasets are archived in public repositories such as NCBI GEO (Barrett et al. 2009) and ArrayExpress (Parkinson et al. 2009). Fig. 2 shows information about the collection of 393 Affymetrix tomato GeneChip data from NCBI GEO, ArrayExpress and TFGD (Fei et al. 2011).
Recently, tomato expression datasets have also been generated by NGS-based RNA-sequencing (RNA-seq) and digital expression analysis; 7 RNA-seq datasets have been stored in the Sequence Read Archive (SRA) at NCBI (http://www.ncbi.nlm.nih.gov/sra). Other public databases for tomato research are summarized by Suzuki and colleagues (Suzuki et al. 2009).
The current status of (A) experimental information and (B) tissues in 23 tomato microarray datasets. We used a collection of 393 Affymetrix tomato GeneChip data from the NCBI GEO, ArrayExpress and TFGD (Fei et al. 2011) public databases for this calculation (collection date, July 2012).
The publicly available datasets mentioned above have facilitated the development of in silico tools and databases to predict the functions of unknown genes. These are useful for model plants such as Arabidopsis and rice (see Aoki et al. 2007, Usadel et al. 2009) and also for research on crops like tomatoes. Most tools and databases use a ‘gene coexpression’ approach. Gene coexpression is based on expression similarities between gene pairs across many experimental samples. It is predicted that coexpressed gene pairs share a similar function in biological processes and have related regulatory mechanisms. In coexpression approaches, Pearson’s product-moment correlation coefficient is the most widely used similarity measure. Pearson’s correlation coefficient, r, can range from −1 to 1. An r = 1 indicates a perfect positive linear relationship between the expression levels of 2 genes and r = −1 indicates a perfect negative linear relationship. An r = 0 implies no linear relationship. Pearson’s correlation coefficient is not robust to outliers, and significance tests based on it assume that the data are approximately normally distributed. On the other hand, Spearman’s rank correlation coefficient captures monotonic rather than strictly linear relationships and is more resistant to outliers than Pearson’s correlation coefficient. In many cases, such associations can be described as a ‘coexpression network’, where nodes represent genes and links between nodes represent significant correlations that are higher than a threshold, resulting in a large-scale undirected graph. Accordingly, when interpreting coexpressed gene pairs, the properties of the similarity measure used should be kept in mind.
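The contrast between the two similarity measures, and the thresholding step that turns pairwise correlations into an undirected network, can be sketched as follows. This is a minimal illustration in plain Python, not the procedure used by any of the cited tools; the gene names, expression values and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson's product-moment correlation (undefined for constant profiles)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def coexpression_edges(expr, threshold, measure=pearson):
    """Links of an undirected network: gene pairs whose correlation
    reaches the threshold (illustrative cut-off, not a significance test)."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = measure(expr[g1], expr[g2])
        if r >= threshold:
            edges.append((g1, g2, r))
    return edges

# Hypothetical expression profiles across 5 samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneD": [1.0, 2.0, 3.0, 4.0, 50.0],  # rises with geneA, one outlier
}

pearson_net = coexpression_edges(expr, 0.9, pearson)
spearman_net = coexpression_edges(expr, 0.9, spearman)
print([e[:2] for e in pearson_net])
print([e[:2] for e in spearman_net])
```

In this toy dataset the single outlier in geneD pulls its Pearson correlation with geneA below the threshold, whereas the rank-based measure still links the two genes, so the resulting networks differ depending on the measure chosen.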
Detecting meaningful associations in transcriptomic data
As several groups have demonstrated, coexpression network-based approaches are useful to characterise the functions of genes involved mainly in secondary metabolism, such as the flavonoid and glucosinolate pathways in Arabidopsis (Saito et al. 2008, Usadel et al. 2009). However, a correlation does not always reflect a linear relationship and does not necessarily reflect causal relationships. Markowetz and Spang indicated that coexpression networks visualised by undirected graphs cannot easily explain the difference between direct and indirect dependencies in gene networks, although they should contain causal regulatory relationships (Markowetz and Spang 2007). A partial correlation coefficient may be used for constructing coexpression networks. It measures the correlation between the expression of 2 genes, x and y, conditional on a third variable z. Although partial correlation still does not indicate causal relationships, it can be useful to exclude many indirect relationships (de la Fuente et al. 2004). In this framework, Ma et al. (2007) inferred Arabidopsis coregulation patterns between genes using the graphical Gaussian model (GGM), which provides a robust estimate of direct relationships between variables. By measuring general dependencies between variables, mutual information based on information theory can identify coexpressed genes in large-scale high-throughput data (Steuer et al. 2002). As an alternative to Pearson’s correlation, Galili and colleagues presented a novel bioinformatics tool called ‘Gene Coordination’ (Less and Galili 2009, Less et al. 2011). This method is based on the number of biological perturbations (e.g. drought stress) in which both genes of a given gene pair are significantly upregulated or significantly downregulated together, compared to non-treated controls. The authors demonstrated that the approach identified highly coordinated genes involved in the aspartate-family pathway in Arabidopsis.
Reshef et al. (2011) proposed the maximal information coefficient (MIC), which is a novel measure of the association between variables. This method can identify both linear and nonlinear relationships, allowing us to explore variable relationships in a given dataset in a non-biased manner. The construction of coexpression networks using tomato RNA-seq data is expected to increase in the near future, as described for mouse RNA-seq data (Iancu et al. 2012).
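How conditioning on a third variable removes an indirect association can be illustrated with the standard first-order partial correlation formula, r_xy·z = (r_xy − r_xz r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)). The sketch below is only an illustration under stated assumptions: the profiles are hypothetical, constructed so that genes x and y are each driven by a common regulator z plus independent deviations, and it is not the GGM procedure of the cited studies.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """First-order partial correlation of x and y given z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data: z is a shared regulator; e1 and e2 are deviations
# chosen to be uncorrelated with z and with each other.
z = [-5, -3, -1, 1, 3, 5]
e1 = [1, -1, 0, 0, -1, 1]
e2 = [1, 1, -2, -2, 1, 1]
x = [zi + ei for zi, ei in zip(z, e1)]
y = [zi + ei for zi, ei in zip(z, e2)]

r_xy = pearson(x, y)
r_xy_given_z = partial_correlation(x, y, z)
print(round(r_xy, 3), round(r_xy_given_z, 3))
```

Here x and y are strongly correlated, yet the partial correlation given z is essentially zero, which is the sense in which partial correlation helps to exclude indirect relationships. With many genes, conditioning on all remaining variables at once requires the full GGM machinery rather than this three-variable formula.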
Expanding network concepts from single- to multinetwork
Integrating other omics data is another promising strategy to predict unknown gene functions. The first integrative study to construct a comprehensive multinetwork model of molecular interactions in Arabidopsis was performed using multiple microarray datasets from plants treated with different combinations of carbon and nitrogen sources (Gutierrez et al. 2007, 2008). Gutierrez et al. integrated multiple types of information, including protein-DNA interactions, protein-protein interactions, metabolic pathways and molecular interactions from literature mining. Coruzzi and coworkers experimentally assessed subnetworks inferred from the multinetwork (Gifford et al. 2008, Katari et al. 2010, Vidal et al. 2010). From a probabilistic point of view, functional networks integrated from multiple genomic-scale data have recently emerged for several species, including Arabidopsis and rice (Lee et al. 2010, 2011). Of these, AraNet (Lee et al. 2010) includes a million functional links among 19,647 genes, corresponding to ~73% of the total Arabidopsis genes, and the functional map demonstrates the usefulness of the network by characterising the predicted function of several genes based on the reverse genetics approach (Table 1).
species | database | URL | reference |
---|---|---|---|
Arabidopsis | AraNet | http://www.functionalnet.org/aranet/ | (Lee et al. 2010) |
Arabidopsis | CYPedia | http://www-ibmp.u-strasbg.fr/~CYPedia/ | (Ehlting et al. 2008) |
Arabidopsis | VirtualPlant | http://virtualplant.bio.nyu.edu/cgi-bin/vpweb/ | (Katari et al. 2010) |
rice | OryzaExpress | http://riceball.lab.nig.ac.jp/oryzaexpress/ | (Hamada et al. 2011) |
soybean | Soybean Proteome Database | http://proteome.dc.affrc.go.jp/Soybean/ | (Sakata et al. 2009) |
tomato | TOMATOMICS | http://bioinf.mind.meiji.ac.jp/tomatomics/ | |
Arabidopsis and six crop species | PlaNet | http://aranet.mpimp-golm.mpg.de | (Mutwil et al. 2011) |
Attempts to compare multiple molecular networks are growing rapidly (Fukushima et al. 2009a, Kourmpetis et al. 2011, Lysenko et al. 2011). Investigations of coexpression between adjacent genes along genomes in Arabidopsis and rice demonstrated that the physically neighbouring genes were relatively highly coexpressed (Ren et al. 2005, 2007, Zhan et al. 2006). In bacterial genomes, multiple gene clusters of coregulated genes called operons were found, which have similar functions or belong to the same pathways. For a long time, eukaryotic genomes were not considered to have operons. However, several lines of evidence indicating the presence of operon-like gene clusters in plants have recently been reported (Field and Osbourn 2008, Field et al. 2011), including genes related with triterpene biosynthetic pathways in Arabidopsis and oat, genes in the cyclic hydroxamic acid pathways in maize and genes associated with diterpenoid momilactone production in rice (see the reviews by Mizutani and Ohta (2010), Takos and Rook (2012)). CYPedia is an expression database of Arabidopsis cytochrome P450 monooxygenases (CYPs) (Ehlting et al. 2008). It may be useful to have knowledge about coexpressed gene pairs between each CYP and a gene in the database; the orthologous gene pairs can be searched in tomato transcript datasets. Furthermore, novel statistical methods have recently emerged to extract operon-like gene clusters, including CYPs (Wada et al. 2012).
Using a combination of sequence similarity and gene coexpression, we can examine similar expression patterns and differences among multiple species, including humans and mice (Piasecka et al. 2012) and important cereal crops (Davidson et al. 2012, Van Bel et al. 2012). PlaNet (Mutwil et al. 2011) provides multiple coexpression networks from 7 plants based on an algorithm of network comparisons. A tool in PlaNet can extract conserved network modules across plant species, allowing us to identify reliable homologs. Methods are also available that evaluate the number and significance of shared orthologs between coexpressed modules (Chikina and Troyanskaya 2011, Movahedi et al. 2011, Zarrineh et al. 2011). Interestingly, many groups have reported conserved modules associated with photosynthesis, translation, cell division and DNA metabolism in dicot and monocot plants (Ficklin and Feltus 2011, Fukushima et al. 2008, 2009b, 2012, Mao et al. 2009, Mentzen and Wurtele 2008, Movahedi et al. 2011, Mutwil et al. 2011).
Differential network analyses in plant science
Generally, graph clustering methods such as Markov clustering (Van Dongen 2000) and DPClus (Altaf-Ul-Amin et al. 2006) can be used for detecting coexpressed modules or clusters in a non-biased manner. Graph clustering algorithms efficiently extract densely connected genes in coexpression networks. Using this type of network-based approach and >60 GeneChips, Ozaki and colleagues characterised a transcription factor regulating flavonoid pathways in the tomato (Ozaki et al. 2010). This approach has also provided insights into transcriptional organization in Arabidopsis and rice as well as the tomato (Fukushima et al. 2009b, 2012, Ma et al. 2007, Mao et al. 2009, Mentzen and Wurtele 2008). Together with conserved coexpression, a differential network strategy (Ideker and Krogan 2012) has been applied to animals and plants (Choi et al. 2005, de la Fuente 2010, Fukushima et al. 2012, Gillis and Pavlidis 2009). Differential metabolomic correlation analysis has been used for dissecting complex metabolism (Fukushima et al. 2011, Morgenthal et al. 2006, Weckwerth et al. 2004). As we introduced in the review (Table 1), we believe that omics network approaches are useful to understand underlying molecular mechanisms regulating important agronomic traits such as yield and fruit development in the tomato.
In this section, we will introduce application examples of omics approaches to improve agronomically and economically important traits as well as metabolite composition in the tomato.
Metabolite quantitative trait loci analyses
Plant breeding uses genetic variation to identify interesting traits and characteristics for growers and consumers. Uncovering the genetic basis of quantitative variation in the tomato is important for breeding purposes. Expression quantitative trait loci (eQTL) analysis, as a genome-wide association study, is becoming a powerful tool to identify the ‘hot spot(s)’ of genetic location(s) of DNA sequence variation at the mRNA transcript level across a segregating population. Association analysis between genome-wide genetic variation and the levels of metabolites is also a promising approach in this field. In recent studies, eQTL analysis has been applied to Arabidopsis, maize, wheat and tree species (see the review by Druka et al. (2010)), whereas metabolite QTL (mQTL) analysis has been applied to Arabidopsis, rice, maize, potato, populus and tomatoes (Carreno-Quintero et al. 2012, Lisec et al. 2008, Matsuda et al. 2012, Morreel et al. 2006, Riedelsheimer et al. 2012, Toubiana et al. 2012, Wentzell et al. 2007). Of these, we will summarise the work on mQTL analysis based on metabolite profiling approaches to improve interesting traits in the tomato.
Tomato fruit quality is one of the most important traits. The use of introgression lines (ILs), which were generated by crossing the cultivar Solanum lycopersicum cv. M82 with its wild relative S. pennellii, enabled mQTL analysis of the ripe pericarp of each IL using GC-EI-MS (Schauer et al. 2006). This was the first publication of mQTL analysis using metabolomics and correlation networks. The authors quantified 74 metabolites in the metabolite profiles that were obtained from 2 independent experiments across 2 years (each data matrix: 76 ILs vs. 74 metabolites). Using these data matrices, 889 mQTLs were identified by analysis of variance (ANOVA) tests. Next, correlation network analysis was conducted to integrate metabolic and phenotypic traits of the ILs, and the network was then visualised by cartographic representation to display the patterns of intra- and intermodule connections (Guimera and Nunes Amaral 2005). The results showed that 50% of the metabolites were considered to be morphologically associated. Additional mQTLs were identified by examining changes in the metabolite levels in the fruit profiles of the additional year’s harvest and then the extent of heritability of metabolites was investigated (Schauer et al. 2008). Metabolite profiling of the fruit pericarp of the heterozygous hybrids between ILs and M82 (ILHs) as well as that of the 76 homozygous ILs was performed. Then, the authors identified 332 putative mQTLs. Furthermore, the use of additional ILHs provided the chance to classify each putative QTL derived from S. pennellii into 4 types of mode-of-inheritance categories using the method described by Semel et al. (2006). The authors found that most of the putative wild species QTLs showed an increasing effect on the metabolite content when compared to that in the S. lycopersicum line and that these effects were inherited in a dominant or additive manner.
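The kind of ANOVA test mentioned above can be sketched for a single metabolite and a single introgression line. This is a generic one-way ANOVA F statistic in plain Python, assuming replicate measurements of one metabolite in M82 and in one IL; the exact experimental design and significance thresholds of the cited studies are not reproduced here, and all values are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA: between-group mean square
    divided by within-group mean square."""
    all_values = [v for g in groups for v in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical relative levels of one metabolite in 4 replicate fruits.
m82_levels = [10.0, 11.0, 9.0, 10.0]   # recurrent parent M82
il_levels = [14.0, 15.0, 13.0, 14.0]   # one introgression line

f_stat = one_way_anova_f([m82_levels, il_levels])
print(f_stat)
```

A large F value relative to the F distribution with (k − 1, N − k) degrees of freedom suggests that the introgressed segment affects the level of that metabolite; repeating the test for every metabolite and every line would additionally call for a multiple-testing correction.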
Toubiana and colleagues demonstrated mQTL analysis using tomato seeds of 76 ILs; metabolite profiles of each fruit pericarp were harvested in parallel (Schauer et al. 2006, 2008) to determine the association between seed quality traits and mQTLs (Toubiana et al. 2012). They compared the extent of the broad-sense heritability of each metabolite content in seed and fruit tissues of the ILs and the control line M82 using ANOVA and coefficient of variation tests. The observed patterns suggested that the broad-sense heritability of metabolite content in the seed was greater than that in the fruit. By using the seed and fruit datasets, network analysis based on metabolite-to-metabolite associations was applied to investigate the degree of correlation between metabolites in the seed and the fruit. Metabolite-to-metabolite correlation analyses have been widely applied to visualise genotype-dependent relationships in Arabidopsis (Allen et al. 2010, Kusano et al. 2007), potato (Weckwerth et al. 2004), melon (Biais et al. 2009) and the tomato (Ursem et al. 2008). This approach revealed more intensified correlation relationships in seed metabolism than those found in the fruit. Furthermore, the correlation analysis emphasized the centrality of the amino acid module in the seed metabolic network.
Taken together, these approaches introduced here show that mQTL analysis in combination with introgression breeding is likely to be useful not only to enhance biochemical traits but also to find important ‘hubs’, which are highly connected genes and/or metabolites in coexpression modules, in tomato metabolism.
Parthenocarpy
Parthenocarpy is the ability to produce fruits in the absence of pollination. It is an economically valuable trait for many horticultural crops and vegetables, including tomatoes. Parthenocarpy can remove the need for mechanical vibration of the flowers to ensure pollen shedding. For manufacturing, parthenocarpic tomatoes can reduce the processing required for tomato products. As such, parthenocarpy is an important trait for tomato breeding. However, the molecular mechanism of fruit development, particularly the onset of ovary development, remains unclear. To address this issue, omics approaches have been applied in combination with reverse genetic approaches.
It is known that downregulation of indole acetic acid 9 (IAA9) in tomatoes showed pollination-independent fruit production (Wang et al. 2005). IAA9 is a negative auxin response regulator belonging to the Aux/IAA transcription factor gene family (Abel et al. 1995, Nebenfuhr et al. 2000). Integrated analysis of transcript and metabolite profiling of MicroTom tomato lines downregulated in the expression of the IAA9 gene was applied to investigate underlying molecular events during fruit set (Wang et al. 2009). Three independent experiments were carefully designed; these datasets provided information about (i) natural pollination-induced fruit set, (ii) pollination-dependent fruit set and (iii) genotype-dependent comparison of fruit set between the control (wild type) and 2 independent transgenic antisense lines. Such comprehensive analysis generated a model of the molecular events mediated by IAA9 and those underlying the fruit set process in tomatoes. The model showed that 1455 genes were involved in bud-to-flower transition and 1650 genes were involved in flower-to-fruit transition. Particularly, novel pathways such as photosynthesis, auxin and ethylene signalling as well as the requirement of a high number of transcriptional regulators associated with natural pollination fruit set and pollination-independent fruit set were found using the omics approach.
The second example is the transcript profiling analysis of a parthenocarpic line, the pat3/pat4 mutant (Pascual et al. 2009). Three major mutants that show parthenocarpic growth in tomatoes have been identified: pat, pat-2 and pat3/pat4 (Carmi et al. 2003, Fos et al. 2001, Gorguet et al. 2008, Philouze 1983, 1985). Pascual and colleagues performed transcript profiling of the tomato carpel and fruit of the pat3/pat4 mutant and the UC82 line as a representative parthenocarpic and non-parthenocarpic line, respectively. Time-series experiments allowed the extraction of 2842 differentially expressed genes during carpel development and fruit set. Of these, the major differences between the pat3/pat4 and UC82 lines were observed at the anthesis stage. Genes involved in cell division and cell cycle events and genes responsible for gibberellin (GA) and ethylene biosynthetic pathways were highly expressed in the pat3/pat4 line at the anthesis stage, suggesting that the transition point at the anthesis stage is shorter in the parthenocarpic line pat3/pat4 than in the non-parthenocarpic line, and that this event may be regulated by phytohormones such as GAs and ethylene.
The third example of a parthenocarpy study is the downregulation of chalcone synthase (CHS), which encodes the enzyme catalysing the first step in the flavonoid pathway (Schijlen et al. 2007). Flavonoids are a type of phenylpropanoid, which are involved in one of the major pathways in the plant kingdom. Although the study did not employ an omics approach, the data were interesting because suppression of CHS caused not only inhibition of pigment accumulation in the flower tissue but also male sterility in petunia (van der Meer et al. 1992). These data imply that flavonoids and/or genes in flavonoid pathways may have crucial roles in fruit development. Indeed, RNAi-mediated suppression of CHS in tomato plants caused parthenocarpy. These findings encourage us to apply omics approaches to the study of parthenocarpy, particularly focusing on metabolomic changes in flavonoids in parthenocarpic tomatoes.
Investigation of other important traits using omics approaches
Omics approaches have been applied to improve other beneficial traits, e.g. fruit architecture and biochemical traits. Aharoni’s group performed transcript and metabolite profiling of tomato peel (outer layers) and flesh (pericarp after removal of the peel) tissues to obtain new insights into gene expression patterns and metabolite composition of the fruit surface (Mintz-Oron et al. 2008). As expression patterns and changes in metabolite levels, including primary and secondary metabolites in peel and flesh tissues, were well captured at the 5 growth stages in fruit, omics approaches have great potential to provide insights for improvement of the tomato fruit surface. To date, the GC-EI-MS technique has been used for determination of cutin and wax components in the tomato peel (Adato et al. 2009, Saladie et al. 2007). This technique is one of the gold standards for metabolomics. Thus, it is easy to apply peak pretreatment and data alignment techniques developed in the metabolomics field to cutin and wax analyses. Using our metabolomics pipeline based on GC-EI-TOF-MS, we recently conducted metabolite fingerprinting analysis of cutin and wax fractions of a tomato mutant that carries both the sticky peel (pe) and light green (lg) mutations, as an application example of metabolomic techniques for cutin and wax analysis, although complete identification of cutin and wax monomers requires authentic standards (Kimbara et al. 2012).
Tomato fruits contain flavonoids, which are health-protecting components in the human diet. Recently, flavonoids have been shown to have critical roles in plant physiology, e.g. auxin transport, allelopathy, sterility and stress resistance (Buer et al. 2010, Mol et al. 1998, Peer and Murphy 2007, Ulm and Nagy 2005). It is well known that flavonoid accumulation is strongly induced to protect against ultraviolet (UV)-induced damage when plants are exposed to UV-B light (Kootstra 1994, Kusano et al. 2011b). Giuntini and colleagues treated tomato fruits with a normal sunlight spectrum deprived of the UV-B region using a polyethylene film (Giuntini et al. 2008). They used 2 tomato genotypes: the hybrid Esperanza F1 with low lycopene levels in the fruit and the hybrid DRW 5981 with high lycopene content in the fruit. Tomato fruits derived from the 2 cultivars were harvested at 3 growth stages, and then flavonoid content was quantified using LC-ESI-MS. The flavonoid profiles of the 2 cultivars showed distinct patterns in the presence or absence of the UV-B region. A similar approach was employed by the same group to investigate how the high pigment-1 mutant responds under UV-B depleted conditions; this mutant accumulates fruit pigments and has a mutation in the tomato HIGH-PIGMENT1/UV-DAMAGED DNA-BINDING PROTEIN 1 (HP1/LeDDB1) gene (Calvenzani et al. 2010). They found that (i) flavonoid biosynthetic genes and genes involved in light signal transduction were induced by UV-B at the early growth stage and (ii) the expression level of LeDDB1 was not regulated by UV-B.
Tomato genome information provides us with more precise gene annotations and probe sets for microarray analysis and thus, we can use more ‘reliable’ microarray chips, comparable to those for Arabidopsis and rice, at a lower cost than at present. More datasets, including microarray and RNA-seq data, will be stored in public databases like NCBI GEO. Publicly available datasets enable us to generate coexpression networks in tomatoes. Highly connected hub genes in coexpression modules tend to be important for genomic reasons. It should be noted that such ‘hub’ genes are not always good candidate genes for tomato breeding when these hub genes are found to be present in a certain gene cluster (for example, see Fig. 3). Cytosolic glutamine synthetase (GS) and plastidic GS have crucial roles in ammonium assimilation and recycling in various plant species (Bernard and Habash 2009, Diaz et al. 2010, Kusano et al. 2011c). Plants often showed lethal or visible phenotypic changes when these hub genes were knocked out (Martin et al. 2006, Tabuchi et al. 2005). To identify genes and metabolites that contribute to important traits for tomato breeding, multiple datasets obtained from phenotyping and integrated omics analysis as introduced in this review provide great opportunities to conduct a systems biological approach based on multinetwork analysis. This approach will shed light on the points of robustness and weakness in metabolic systems of tomato plants; breeders and breeding companies may use this information for the development of next-generation breeding strategies.
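One simple way to nominate hub candidates in a network of this kind is to count the links of each node (its degree) and keep the nodes above a cut-off. The following sketch assumes an undirected edge list without duplicate links or self-loops; the gene identifiers, the edges and the cut-off of 3 are hypothetical and do not come from the tomato multinetwork.

```python
from collections import Counter

def node_degrees(edges):
    """Number of links per node in an undirected edge list."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees

def hub_candidates(edges, min_degree):
    """Nodes with at least min_degree links, most connected first."""
    degrees = node_degrees(edges)
    hubs = [node for node, d in degrees.items() if d >= min_degree]
    return sorted(hubs, key=lambda node: (-degrees[node], node))

# Hypothetical links (e.g. coexpression or protein-protein interactions).
edges = [
    ("gene1", "gene2"), ("gene1", "gene3"), ("gene1", "gene4"),
    ("gene1", "gene5"), ("gene2", "gene3"), ("gene4", "gene5"),
    ("gene5", "gene6"),
]

hubs = hub_candidates(edges, min_degree=3)
print(hubs)
```

Degree is only one of several centrality measures, and, as noted above, a highly connected gene is not automatically a good breeding target, so such a list is best read as a starting point for further inspection.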
An example of a tomato multinetwork. (A) We constructed a tomato multinetwork that consisted of 10209 nodes and 85352 links. Information for the tomato metabolic pathway was obtained from the LycoCyc database (http://solgenomics.net/tools/solcyc/) (Mazourek et al. 2009), coexpression information (Fukushima et al. 2012), protein-protein interactions from interolog (Yu et al. 2004) and miRNA-gene relationships from PMRD (Zhang et al. 2010). (B) An expansion of the glutamine synthetase (GS)-related network in the multinetwork. Because the genes annotated as GS1 (SGN-U313258), GTS1 (SGN-U313257) and CGS (SGN-U314517) have many links to other genes, these genes are thought to be ‘hub’ gene candidates. PPI, protein-protein interaction; miRNA, micro RNA; GTS, putative GS; CGS, chloroplastic GS.
The work was partly supported by the foundation of Bio-oriented Technology Research Advancement Institution (BRAIN), Japan and a Grant-in-Aid for Young Scientists (B; grant no. 23700355 to A.F.) from the Ministry of Education, Culture, Sports, Science and Technology.