Please note: This catalog
is part of a manuscript:
Mekhedov, S., de Ilárduya,
O., and Ohlrogge, J. (2000)
TOWARDS A FUNCTIONAL CATALOG
OF THE PLANT GENOME:
A SURVEY OF GENES FOR LIPID
BIOSYNTHESIS, Plant Physiology 122:389-401
Abstract
Conclusions
The current version of this catalog contains more than 2600 sequence files,
many of them with annotation and results of our analysis. This version is
updated as of Aug. 1999 and includes essentially all publicly available
genomic, cDNA, EST and GSS sequences for 62 plant polypeptides involved in
lipid metabolism in higher plant species. An important feature of the
catalog is the set of multiple alignments of amino acid sequences deduced from
genomic and EST sequences. This version of the dataset accounts for
approximately 70% of the Arabidopsis genome.
NOTE: Many of the pages of this database display correctly only on a 17-inch
or larger monitor or at a screen resolution of 800 x 600 or greater.
Multiple alignments and some other files are large and may require
substantial time to download via modem or from locations outside the USA.
If access to information from this site is too slow from your computer, you
may request a CD-ROM version by sending a written request, via non-electronic
mail, to
Sergei Mekhedov, Dept. Botany and Plant Pathology, Michigan State University,
East Lansing, MI 48824, USA
CATALOG
Table 1.
List of reactions surveyed and estimated numbers of genes for each reaction
in Arabidopsis and rice.
Schematic of fatty acid biosynthesis.
Table 2 and histograms.
Comparison of numbers of ESTs in public databases for plant enzymes involved
in fatty acid and lipid metabolism.
Table 3.
Genes which are missing from GenBank for plant enzymes involved in fatty
acid and lipid metabolism.
Please send comments to Sergei Mekhedov at
mekhedov@pilot.msu.edu
Multiple alignments provided a method to estimate the number of genes in gene
families. Further analysis of sequences allowed us to tentatively identify
several previously undescribed genes. For example, two genomic sequences were
identified as candidates for the palmitate-specific
monogalactosyldiacylglycerol desaturase (FAD5). A candidate genomic sequence
for keto-acyl-ACP synthase involved in mitochondrial fatty acid biosynthesis
was also identified. Biotin carboxyl carrier protein (BCCP) in Arabidopsis is
encoded by at least two genes and the most abundant BCCP transcript so far has
not been characterized. The large number (>165,000) of plant ESTs also
provides an opportunity to perform "digital northern" comparisons of gene
expression levels across many genes. EST abundance in general correlated with
biochemical and flux characteristics of the enzymes in Arabidopsis leaf
tissue. In a few cases, statistically significant differences in EST abundance
levels were observed for several genes which catalyze similar reactions in
fatty acid metabolism. For example, the FatB acyl-ACP thioesterase ESTs occur
21 times compared to 7 times for FatA acyl-ACP thioesterase although flux
through the FatA reaction is several fold higher than FatB. Such comparisons
may provide initial clues toward previously undescribed regulatory phenomena.
The abundance of ESTs for ACP compared to that of stearoyl-ACP desaturase and
enoyl-ACP reductase suggests that concentrations of some enzymes of fatty acid
synthesis may be higher than their acyl-ACP substrates.
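As an illustration of the "digital northern" comparison described above, the following minimal sketch tests whether two EST counts differ by more than chance would explain. It uses an exact two-sided binomial test, which is an assumption made here for illustration; this page does not state which statistical test was used in the manuscript. The function name `two_sided_binomial_p` is hypothetical, and the test further assumes that both counts come from the same pool of ESTs and that each EST is an independent draw.

```python
from math import comb


def two_sided_binomial_p(count_a: int, count_b: int) -> float:
    """Exact two-sided binomial p-value for two EST counts.

    Null hypothesis (an assumption of this sketch): both transcripts are
    equally abundant, so each of the n = count_a + count_b ESTs is equally
    likely to belong to either gene.
    """
    n = count_a + count_b
    k = max(count_a, count_b)
    # Probability of a split at least this lopsided in one direction.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric (p = 0.5), so doubling the tail
    # gives the two-sided value; cap at 1 for near-equal counts.
    return min(1.0, 2 * upper_tail)


# EST counts quoted in the text: FatB 21, FatA 7.
fatb_ests, fata_ests = 21, 7
p_value = two_sided_binomial_p(fatb_ests, fata_ests)
print(f"FatB vs FatA: p = {p_value:.4f}")
```

Under these assumptions the FatB/FatA split gives a p-value of roughly 0.0125, which is consistent with the statement that the difference is statistically significant. A test of this kind says nothing about why the counts differ; library bias and transcript length can also affect EST abundance.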
By surveying GenBank data and ESTs, the data mining analyses described
in the manuscript have yielded several new types of information. First, a
number of previously undescribed genes for plant glycerolipid synthesis have
been putatively identified. Second, the extent to which proteins of the plant
lipid synthesis pathway are encoded by gene families, and the size of each
family, have been estimated. Third, more than 160,000 publicly available ESTs
have been analyzed to provide a digital northern estimate of gene expression
levels for 62 plant proteins involved in plant lipid metabolism. With only
a few exceptions, the EST abundance patterns for Arabidopsis and rice are
very similar, adding support to this method of estimating relative gene expression
levels. Such a pathway-wide overview has not been available through previous
analyses and has provided new insights regarding the regulation of expression
of the pathway. In the near future, with further rapid accumulation of sequence
data, such a detailed analysis of gene sequences and ESTs for many metabolic
pathways will become an essential approach that will contribute to the development
of a functional catalog of plant genes. Of course, such analyses require
revision as more information becomes available, and the establishment of a
website provides users with a convenient source of such updates. The database
constructed from this survey will be updated as the complete Arabidopsis
and rice genome sequences become available.