http://www.abbs.info E-mail: [email protected]
ISSN 1672-9145
Acta Biochim Biophys Sin
2005, 37(2): 88–96
CN 31-1940/Q
Predicting Protein Subcellular Location Using Digital Signal Processing
Yu-Xi PAN1,2, Da-Wei LI1,2, Yun DUAN2,1, Zhi-Zhou ZHANG1,2, Ming-Qing XU1,2,
Guo-Yin FENG1,2, and Lin HE2,3*
1 Bio-X Life Science Research Center, Shanghai Jiaotong University, Shanghai
200030, China;
2 Institute for Nutritional Sciences, Shanghai Institutes for Biological
Sciences, Chinese Academy of Science, Shanghai 200030, China;
3 Neuropsychiatric & Human Genetics Group, Bio-X Center, Shanghai Jiaotong
University, Shanghai 200030, China
Abstract
The biological functions of a protein are closely related to its attributes in a
cell. With the rapid accumulation of newly found protein sequence data in databanks,
it is highly desirable to develop an automated method for predicting the
subcellular location of proteins. The establishment of such a predictor will
expedite the functional determination of newly found proteins and the process
of prioritizing genes and proteins identified by genomic efforts as potential
molecular targets for drug design. Traditional algorithms for predicting
subcellular location were based solely on amino acid composition and took no
account of the sequence order effect. To improve the prediction
quality, it is necessary to incorporate such an effect. However, the number of
possible patterns in protein sequences is extremely large, posing a formidable
difficulty for realizing this goal. To deal with this difficulty, a well-developed
tool in digital signal processing, the digital Fourier transform (DFT) [1], was
introduced. Each protein was first translated into a digital signal according to
the hydrophobicity of each amino acid and then analyzed by DFT in the
frequency domain. The set of frequency spectrum parameters thus obtained was
taken to represent the sequence order effect. A significant
improvement in prediction quality was observed when the frequency
spectrum parameters were combined with the conventional amino acid composition. One of the
crucial merits of this approach is that many existing tools in mathematics and
engineering can be easily applied in the prediction process. It is anticipated
that digital signal processing may serve as a useful vehicle for many other
areas of protein science.
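
The feature construction summarized in the abstract can be illustrated with a minimal sketch. It maps a sequence to a numeric signal by per-residue hydrophobicity, takes the amplitudes of the first few DFT coefficients, and appends them to the 20-component amino acid composition. Several choices here are assumptions made only for illustration and are not taken from the abstract: the Kyte-Doolittle hydropathy scale, the normalization of amplitudes by sequence length, the number of coefficients kept (n_coeffs), and all function names. The classifier itself (the covariance discriminant algorithm listed in the key words) is not shown.

```python
import cmath

# Kyte-Doolittle hydropathy values. This scale is an assumption:
# the abstract does not say which hydrophobicity scale was used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def hydrophobicity_signal(seq):
    """Translate a protein sequence into a numeric signal, one value per residue."""
    return [KD[aa] for aa in seq]


def dft_amplitudes(signal, n_coeffs):
    """Amplitudes |X_k| / N of the first n_coeffs DFT coefficients (k = 0, 1, ...)."""
    n = len(signal)
    amps = []
    for k in range(n_coeffs):
        # Direct evaluation of X_k = sum_t x_t * exp(-2*pi*i*k*t/N).
        x_k = sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, s in enumerate(signal))
        amps.append(abs(x_k) / n)
    return amps


def composition(seq):
    """Conventional amino acid composition: 20 occurrence frequencies."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def feature_vector(seq, n_coeffs=10):
    """Composition followed by low-frequency spectrum amplitudes (hypothetical layout)."""
    return composition(seq) + dft_amplitudes(hydrophobicity_signal(seq), n_coeffs)


if __name__ == "__main__":
    # Illustrative sequence only, not a protein from the paper's data set.
    features = feature_vector("MKTAYIAKQRQISFVKSHFSRQ", n_coeffs=5)
    print(len(features), [round(v, 3) for v in features[20:]])
```

Because the composition alone is unchanged by any reordering of the residues, the appended spectrum amplitudes are the only components of this vector that depend on sequence order. The direct summation costs O(N) per coefficient, which is adequate when only a few low-frequency terms are kept; a fast Fourier transform would be the usual choice for a full spectrum.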
Key words
sequence order effect; digital signal processing; digital Fourier
transform (DFT); frequency domain; covariance discriminant algorithm;
bioinformatics; proteomics
-----------------
Received: September 29, 2004 Accepted: December 25, 2004
This work was supported by grants from the Major State Basic Research
Development Program of China (No. 001CB510301), the National High Technology
Research and Development Program of China (No. 2002AA223021), the National Natural
Science Foundation of China, and the Shanghai Municipal Commission for Science and
Technology.
*Corresponding author: Tel, 86-21-62822491; Fax, 86-21-62822491;
E-mail, [email protected] & [email protected]