Abstract
I present an approach for comparing the genome contents of cyanobacteria, photosynthetic bacteria, non-photosynthetic bacteria, and eukaryotes by whole-genome clustering using the homology group method. Organellar genomes were included as part of the eukaryotic genomes. Clustering was performed at several thresholds of the E-value reported by the BLASTP program. I found about 70 homology groups that are shared by Arabidopsis and cyanobacteria and/or photosynthetic bacteria but are absent from non-photosynthetic organisms. The genes in half of these groups are related to photosynthesis. Many Arabidopsis genes in the remaining half possess a putative transit sequence and are candidates for novel photosynthesis-related genes. A phylogenetic relationship was also inferred from the presence or absence of homology groups; it differed from the phylogeny based on 16S rRNA sequences in some important aspects.
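
As a rough illustration of the homology group idea summarized above, the following Python sketch clusters proteins by single linkage over BLASTP hits that pass an E-value threshold, then selects the groups whose presence/absence pattern matches "Arabidopsis plus at least one photosynthetic prokaryote, but no non-photosynthetic organism". The single-linkage rule, the function names, the protein identifiers, and the toy hits are all assumptions made for illustration; the abstract does not specify the clustering details, so this should not be read as the actual implementation.

```python
from collections import defaultdict


def cluster_homology_groups(hits, evalue_threshold):
    """Group proteins by single linkage over hits with E-value <= threshold.

    hits: iterable of (query_id, subject_id, evalue) tuples.
    Single linkage is an assumption of this sketch, not a documented detail.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving; unseen ids become singletons.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        # Register both proteins even if the hit is too weak to link them.
        root_q, root_s = find(query), find(subject)
        if evalue <= evalue_threshold and root_q != root_s:
            parent[root_q] = root_s

    groups = defaultdict(set)
    for protein in list(parent):
        groups[find(protein)].add(protein)
    return list(groups.values())


def select_groups(groups, genome_of, required, any_of, excluded):
    """Keep groups present in all `required` genomes and at least one
    `any_of` genome, and absent from every `excluded` genome."""
    selected = []
    for group in groups:
        genomes = {genome_of[p] for p in group}
        if required <= genomes and genomes & any_of and not genomes & excluded:
            selected.append(group)
    return selected


# Hypothetical toy data: identifiers and E-values are invented.
hits = [
    ("ath_1", "syn_1", 1e-40),
    ("ath_2", "eco_1", 1e-30),
    ("ath_2", "syn_2", 1e-25),
    ("ath_3", "syn_3", 1e-3),
]
genome_of = {
    "ath_1": "Arabidopsis", "ath_2": "Arabidopsis", "ath_3": "Arabidopsis",
    "syn_1": "Synechocystis", "syn_2": "Synechocystis", "syn_3": "Synechocystis",
    "eco_1": "Escherichia",
}

strict_groups = cluster_homology_groups(hits, 1e-10)
strict_selected = select_groups(
    strict_groups, genome_of,
    required={"Arabidopsis"},
    any_of={"Synechocystis"},
    excluded={"Escherichia"},
)
```

With the strict threshold of 1e-10, only the group containing `ath_1` and `syn_1` is selected: the `ath_2` group is rejected because it also contains a protein from a non-photosynthetic genome, and `ath_3` stays a singleton. Relaxing the threshold to 1e-2 merges `ath_3` with `syn_3` and adds a second selected group, which is why the choice of E-value threshold matters for this kind of comparison.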