Abstract
Photosynthesis is a good target for phylogenetic profiling, because chloroplasts are descendants of a cyanobacterium-like endosymbiont and genomic data for many photosynthetic organisms are now available. We have developed a software tool, GCLUST, to extract clusters of homologous proteins encoded in these genomes. GCLUST uses all-against-all BLASTP data and extracts homolog groups by progressively increasing the threshold E-value. Except for the largest cluster, which includes multidomain proteins, the homolog groups represent phylogenetically related families of proteins. We applied phylogenetic profiling to these homolog groups to analyze the relationship between cyanobacteria and photosynthetic eukaryotes, the relationships among various photosynthetic eukaryotes, and the phylogenetic status of the nucleomorph. In addition, we extracted 40 homolog groups of unknown function that are conserved in cyanobacteria, a red alga and a plant, for further functional genomic studies. The results indicated that cyanobacterial disruptants displayed various defects in photosynthesis, as expected.
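The general idea of grouping proteins from all-against-all BLASTP hits and then filtering the groups by phylogenetic profile can be illustrated with a minimal sketch. This is not the GCLUST implementation: it assumes plain single-linkage grouping at one fixed E-value cutoff, and every name in it (`cluster_by_threshold`, `conserved_in`, the protein identifiers, the genome labels and the E-values) is hypothetical toy data chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy input: (query, subject, E-value) triples such as might be
# parsed from all-against-all BLASTP output. All values are invented.
HITS = [
    ("syn_1", "cm_1", 1e-40),
    ("cm_1", "at_1", 1e-35),
    ("syn_2", "at_2", 1e-30),
    ("syn_1", "syn_2", 1e-3),   # weak hit linking the two families
]

# Hypothetical mapping from protein identifier to source genome.
GENOME_OF = {
    "syn_1": "cyanobacterium", "syn_2": "cyanobacterium",
    "cm_1": "red_alga",
    "at_1": "plant", "at_2": "plant",
}


def cluster_by_threshold(hits, max_evalue):
    """Single-linkage grouping: join proteins linked by a hit with E-value <= max_evalue."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving; unseen proteins start as their own root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)  # registers both proteins
        if evalue <= max_evalue and root_q != root_s:
            parent[root_q] = root_s

    groups = defaultdict(set)
    for protein in list(parent):
        groups[find(protein)].add(protein)
    return list(groups.values())


def phylogenetic_profile(group, genome_of):
    """Return the set of genomes that contribute at least one member to the group."""
    return frozenset(genome_of[p] for p in group)


def conserved_in(groups, genome_of, required):
    """Keep only the groups whose profile contains every required genome."""
    return [g for g in groups if required <= phylogenetic_profile(g, genome_of)]


if __name__ == "__main__":
    strict = cluster_by_threshold(HITS, 1e-10)
    wanted = {"cyanobacterium", "red_alga", "plant"}
    for group in conserved_in(strict, GENOME_OF, wanted):
        print(sorted(group))
```

With the strict cutoff the toy data split into two groups, and only the group with members in all three genomes passes the profile filter. Loosening the cutoff to `1e-2` lets the weak hit merge both groups into one, which illustrates why the choice of threshold matters for keeping families separate.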