http://www.abbs.info e-mail:[email protected]

ISSN 0582-9879 ACTA BIOCHIMICA et BIOPHYSICA SINICA 2003, 35(11):965-975 CN 31-1300/Q

Mini Review

Proteomic Technology and Its Biomedical Application

LAU Andy T. Y.^1,2, HE Qing-Yu^2,3, CHIU Jen-Fu^1,2*

( ¹Institute of Molecular Biology, ²Open Laboratory of Chemical Biology of the Institute of Molecular Technology for Drug Discovery and Synthesis, ³Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong, China )

Abstract Proteomics has its origins in two-dimensional gel electrophoresis (2-DE), a technique developed more than twenty years ago. 2-DE has a high-resolution capacity, and was initially used primarily for separating and characterizing proteins in complex mixtures. 2-DE remains an important tool for protein identification, but is now normally coupled with mass spectrometry (MS), a technique which has advanced considerably in recent years. The recent completion of the Human Genome Project has produced a large DNA database which can be utilized through bioinformatics, and the next challenge for scientists is to uncover the entire proteome of a particular organism. The integration of genomic and proteomic data will help to elucidate the functions of proteins in the pathogenesis of diseases and the ageing process, and could lead to the discovery of novel drug target proteins and biomarkers of diseases. This review describes recent advances in proteomic technology and discusses the potential applications of proteomics in biomedical research.

Key words proteomics; two-dimensional gel electrophoresis (2-DE); matrix assisted laser desorption/ionization-time of flight-mass spectrometry (MALDI-TOF-MS); biomarkers

The term 'proteomics' seems to have been coined in 1995, to describe the large-scale characterization of the entire protein complement of a cell type, tissue or whole organism［1－3］. Proteomics studies the global protein expression profile instead of the behaviour of single proteins.

The study of genes alone cannot provide much information on the properties of proteins, yet proteins are the molecules responsible for cellular functions (e.g. signal transduction). Proteins may undergo more than 200 different types of post-translational modification, including phosphorylation, glycosylation, acetylation, deamination, farnesylation, myristoylation, palmitoylation, and proteolysis［4］. Such a wide range of modifications cannot be predicted purely from DNA sequences. Only through the study of proteins themselves can their characteristics and functions be elucidated. From the data gathered in the genome project, we now have an estimated number of proteins encoded by the genome. However, it is difficult to predict the actual number of proteins encoded based on genomic data, for a number of reasons［5］. Firstly, the exon-intron structure cannot be accurately predicted from genomic DNA［6］, so genomic information needs to be integrated with data obtained from protein studies to confirm the existence of a particular gene. Secondly, alternative splicing of a transcript can yield more than one protein product［7］. Therefore, the direct analysis of mRNA or the genome does not reflect the exact number of protein products in a cell. Thirdly, as a result of compartmentalization and translocation, the same protein can be found with different properties and functions in different locations (Fig.1)［8］. These problems can only be solved by proteomics, which can directly identify the proteins and complement the genomic information through the appropriate integration of genomic and proteomic data (Fig. 2). Proteomics is a growing new discipline. Scientists worldwide are applying proteomic technology to solve problems which cannot be resolved by traditional methods. There is little doubt that the next decade will be the era of proteomics.

In this article, we describe recent advances in proteomic technology, and discuss the potential applications of proteomics in biomedical research.

Fig.1 The flow of genomic information to protein products
The concept of one-gene to one-protein is over-simplified since an RNA can be differentially spliced and can produce various protein products. Furthermore, the protein may be affected by more than 200 different types of post-translational modifications. Finally, as a result of compartmentalization and translocation, the same protein can be found with different properties and functions in different locations.

Fig. 2 Categories, potential applications of proteomics, and the benefits of integrating proteomic and genomic data

1 Types of proteomics

Technically, proteomics can be classified into three types (Fig. 2).

The first type is protein expression proteomics. It is the quantitative study of protein expression between samples. In this approach, protein expression of the entire proteome (ideally) or of subset proteomes can be compared. Novel proteins (e.g. disease-specific biomarkers) can also be identified. We recently used protein expression proteomics to study serum samples from hepatitis B virus (HBV) infected individuals［9］ and tissues from primary oral tongue squamous cell carcinoma patients (in press, Proteomics, 2003); our results identified a number of biomarkers that were altered in diseased individuals.

The second type is structural proteomics. The main goal of this approach is to map out the structure of protein complexes or the proteins present in a subcellular localization or an organelle［10］. This approach can identify all the structural proteins or protein species within a compartment such as mitochondria, chloroplasts and nuclei, or protein-protein interactions in a complex such as the transcriptional machinery, where many proteins work as a gigantic complex during transcription.

The third type is functional proteomics. Analyzing protein profiles at subcellular sites is an important approach in understanding the functional organization of cells at the molecular level. In this respect, information about the specific subcellular localization of a protein may help to elucidate its function. The combination of protein identification by mass spectrometry with fractionation techniques such as immunoprecipitation or chromatography for the enrichment of particular subcellular structures is a profitable avenue of research, and this approach has been termed 'subcellular proteomics'［11］. In addition, the analysis of proteins at the subcellular level is the basis for monitoring important aspects of dynamic changes in the proteome such as protein translocation.

The aim of all types of proteomics is not only to identify all the proteins in a cell but also to create a complete three-dimensional (3-D) map of protein localizations in a cell or an organism. The completion of this map will certainly require contributions from various disciplines such as biochemistry, molecular biology, biophysics and bioinformatics. However, it should be noted that the proteome of a particular cell is in a dynamic state, and is likely to change at any moment upon external stimuli. Thus, studying the proteome is similar to taking a snapshot of the global expression pattern at a particular time.

2 Technology of proteomics

2.1 Separation and isolation of proteins

In order to study the global protein expression profile and characterize or identify proteins of interest, the proteins must be separated and then identified by techniques such as Western blotting or mass spectrometry. Initially, one-dimensional gel electrophoresis (1-DE) was the method of choice to resolve proteins. 1-DE can separate proteins ranging from ten to several hundred kilodaltons on the basis of molecular mass, and is easy to manipulate and highly reproducible. In the past, partial protein purification by chromatography followed by preparative 1-DE was the preferred method of purifying proteins of interest in a quantity suitable for further characterization or for analysis by techniques such as enzyme kinetics, amino acid sequencing by Edman degradation, and amino acid composition analysis［12］. However, 1-DE technology has its limitations. It can only be used to separate proteins in certain applications, as it is only able to distinguish proteins by mass difference. In a cell extract, many proteins have an identical or very similar molecular mass. A single resolved band in 1-DE may not consist of homogeneous protein, as more than one type of protein may migrate the same distance on the gel. As a result of the limited resolution capability of 1-DE, alternative methods must be employed to separate complex or crude samples. 2-DE specifically addresses this problem, since it separates proteins based on two properties: their sizes and their isoelectric points. This combination of properties increases the resolution capability of 2-DE tremendously, and no other gel-based method currently matches it. The most useful application of 2-DE is to resolve protein isoforms of the same size, or proteins that have undergone post-translational modifications that change their electric charge, such as phosphorylation. Usually, the phosphorylated forms of a protein can be resolved from the nonphosphorylated counterpart and appear as a horizontal chain of spots on the 2-D gel［13］. The ability of 2-DE to resolve thousands of proteins in one gel makes it a technology that is hard to beat, and it is expected to remain in use for another decade.

The main objective of 2-DE is to obtain protein profiles. In this regard, the protein expression of different samples, e.g. control and treatment, can be qualitatively and quantitatively compared. The appearance or disappearance of specific protein spots indicates differential protein expression, while the spot intensity provides quantitative information about protein expression levels. Another objective of 2-DE is proteomic mapping in cells. 2-DE is used to map proteins from various microorganisms［14, 15］, organelles［16］, protein complexes［17］, and even subproteomes. As a result, many 2-DE databases have been established, and are accessible on the Internet.

2-DE has recently been improved with the introduction of immobilized pH gradients. Originally the pH gradient for isoelectric focusing in 2-DE was created by carrier ampholytes. These are mixtures of a few hundred different homologues of amphoteric buffers, synthesized together in one reaction flask. The mixtures contain buffers with isoelectric points evenly distributed over a wide spectrum from pH 3 to pH 10, and they have high buffering power at their isoelectric points. When an electric field is applied, they migrate according to their charges toward the anode or the cathode, and automatically form stable pH gradients. The pH gradient works well when protein separation is performed in native conditions. However, in order to achieve high-resolution 2-D protein separation, electrophoresis is usually performed under denaturing conditions, which prolongs the running time and destabilizes the pH gradient. Gradient drift during prolonged isoelectric focusing causes the loss of almost all basic and some of the acidic proteins. As a result, the patterns were insufficiently reproducible. Today, immobilized pH gradients［18］ are prepared by co-polymerizing acrylamide monomers with acrylamide derivatives containing carboxylic and tertiary amino groups. Because the buffering groups that form the pH gradient are fixed, the gradient cannot drift and is not influenced by the sample composition. The use of immobilized pH gradients has allowed many methodological innovations for 2-DE. Pre-manufactured gel strips and instruments are available as commercial products, which is a prerequisite for the development of 2-DE into a highly reproducible and reliable method. The immobilized strips come in various lengths (7 cm, 13 cm, 18 cm, and 24 cm) and cover various pH ranges, including the broad range pH 3－10, the narrow ranges pH 4－7 and pH 6－11, and ranges spanning a single pH unit. The method can be adapted to different experimental purposes and conditions, as the degree of separation can be enhanced by using a larger gel format and narrow pH range immobilized strips.

2-DE has also been improved recently with the introduction of DIGE (difference gel electrophoresis) technology. This technology utilizes fluorescent tagging of two protein samples with two different dyes. These dyes are amine reactive and have the same molecular mass in order to eliminate mass differences between tagged samples. The tagged proteins are mixed together and run on the same 2-D gel. After image acquisition by a fluorescent scanner using the excitation wavelength of each dye, the gel images are superimposed to identify differences［19］. This technique can be used for experiments with large sample sizes (e.g. clinical samples from normal individuals and patients). It minimizes gel-to-gel variation and reduces workload, as fewer gels need to be run.

Recently, a novel method for protein expression profiling has been invented, which does not require the separation of proteins by 2-DE. This method is called isotope-coded affinity tags (ICAT). Protein samples from two different sources are labeled with two chemically identical reagents that differ only in mass as a result of isotope composition［20］. Differential labeling of samples by mass allows the relative amounts of proteins in the two samples to be quantitated in the mass spectrometer. The major advantage of this method is that it obviates the need to perform 2-DE, enabling a larger sample to be loaded for low copy number proteins. The main disadvantages are that this method works only for cysteine-containing proteins, and peptides must contain appropriately spaced protease cleavage sites flanking the cysteine residues［21］. Moreover, the ICAT label is large and remains with each peptide throughout the analysis. This can make database searching more complicated, especially for small peptides with limited sequences［22, 23］.
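
As a rough illustration of how relative quantitation by mass-coded labeling can work, the sketch below pairs peaks separated by the light/heavy tag spacing and reports their intensity ratio. It assumes the original d0/d8 reagent pair, in which the heavy tag carries eight deuterium atoms in place of hydrogen (about 8.05 Da per labeled cysteine). This spacing, the function name, the tolerance and any peak list passed in are illustrative assumptions, not values taken from the studies cited above.

```python
# Mass difference between heavy (d8) and light (d0) tags, per labeled cysteine.
# Assumption: eight hydrogens are replaced by deuterium in the heavy reagent.
DEUTERIUM_SHIFT = 8 * (2.014102 - 1.007825)


def pair_icat_peaks(peaks, n_cys=1, charge=1, tolerance=0.02):
    """Pair light and heavy peaks and return their intensity ratios.

    peaks: list of (m/z, intensity) tuples from one spectrum.
    Returns a list of (light_mz, heavy_mz, heavy/light ratio).
    """
    spacing = n_cys * DEUTERIUM_SHIFT / charge
    ordered = sorted(peaks)
    pairs = []
    for light_mz, light_int in ordered:
        if light_int <= 0:
            continue  # a ratio is undefined without light signal
        for heavy_mz, heavy_int in ordered:
            if abs((heavy_mz - light_mz) - spacing) <= tolerance:
                pairs.append((light_mz, heavy_mz, heavy_int / light_int))
    return pairs
```

A ratio above 1 would suggest that the peptide is more abundant in the heavy-labeled sample. Real spectra would also require isotope-envelope handling and charge-state assignment, which this sketch leaves out.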

Although the resolution capability of 2-DE is high, the technique has its limitations. Firstly, it is a labor-intensive and tedious procedure. In our laboratory two days are needed to carry out isoelectric focusing (IEF), electrophoresis, silver staining and image acquisition. Secondly, many large or hydrophobic proteins do not enter the gel during the first dimension. In addition, proteins with extreme isoelectric points (below pH 3 or above pH 10) are not separated［24］, but are focused as vertical lines on both sides of the gel. Finally, low-copy number proteins are difficult or impossible to detect on a 2-D gel. Although this problem can be overcome by increasing the sample loading, there is a risk of overloading the system and reducing the resolution［25］.

2.2 Methods of protein detection, image analysis and documentation

There are many ways to detect proteins in 2-D gels. The method used is very dependent on protein loading on the gel (analytical or preparative), the purpose of the gel (for protein quantitation or for blotting), and the sensitivity required. The most common methods are silver staining［26, 27］ and Coomassie blue staining［28－31］. Other methods, including [³⁵S]-Met or ¹⁴C radiolabelling［32, 33］, colloidal gold［28, 34, 35］, zinc imidazole［36, 37］, Ponceau S, amido black［28, 31］, and India ink［28, 38, 39］, can also be used in different applications to achieve better sensitivity and performance in particular cases. However, some drawbacks are obvious. For example, glycoproteins are not stained by Coomassie blue［30］, and many organic dyes are unsuitable for protein detection on PVDF membranes if samples are to be used for MALDI-TOF［28］. In addition, although most means of protein detection can give some indication of the quantities of protein present, in general they cannot be used for global quantitation. This is because no protein stain is able to detect proteins over a wide range of concentrations, isoelectric points and amino acid compositions, especially given the variety of post-translational modifications［30, 38］.

After 2-D electrophoresis and protein visualization by staining, images of gels are digitized for computer analysis by an image scanner or fluorescent scanner, and are then subjected to analysis by special image analysis software (either ImageMaster from Amersham Biosciences or PDQUEST from BioRad). The 2-D patterns are very complex, and special software tools are required to find differentially expressed proteins in a series of gels, such as up- and down-regulated proteins and post-translationally modified proteins. Image spots on the gels are initially detected, manually edited and then matched. Each spot's intensity (volume) is processed by background subtraction and total spot volume normalization, and the resulting spot volume percentage is used for comparison. The reliability of quantitative determinations of protein amounts in spots is largely dependent on the protein detection technique applied. Usually, only significantly up- or down-regulated spots, or spots that appear or disappear, are selected for analysis with mass spectrometry.
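
The background subtraction and total spot volume normalization described above can be sketched as follows. This is a minimal illustration, not the algorithm used by any particular image analysis package. The function names, the dictionary layout and the per-spot background values are assumptions made for the example.

```python
def normalize_spot_volumes(raw_volumes, background):
    """Return each spot's share of the total spot volume, in percent.

    raw_volumes: {spot_id: integrated intensity} for one gel.
    background:  {spot_id: local background estimate}; missing spots use 0.
    """
    corrected = {
        spot: max(volume - background.get(spot, 0.0), 0.0)
        for spot, volume in raw_volumes.items()
    }
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no spot signal above background")
    return {spot: 100.0 * volume / total for spot, volume in corrected.items()}


def fold_change(control_pct, treated_pct, spot):
    """Ratio of normalized volumes for one matched spot (treated / control)."""
    return treated_pct[spot] / control_pct[spot]
```

Because every gel is scaled to its own total spot volume, differences in loading or staining between gels are partly compensated before matched spots are compared.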

2.3 Analysis of protein using mass spectrometry

After resolving the protein mixtures and image analysis, the next step is protein identification. The protein spots are excised and in-gel digested with an enzyme (e.g. trypsin or chymotrypsin). The digest is then applied onto a sample plate and coated with matrix. If necessary, the in-gel digest is extracted with acetonitrile and then concentrated and desalted by a ZipTip prior to application on the sample plate. The matrix is typically a small energy-absorbing molecule such as 2,5-dihydroxybenzoic acid or α-cyano-4-hydroxycinnamic acid. The analyte is spotted, along with the matrix, on the sample plate and allowed to dry, resulting in the formation of crystals. The plate is then put into the MALDI-TOF mass spectrometer, the laser is automatically targeted to specific places on the plate, and peptide mass spectra are obtained. If the protein is digested with trypsin, the trypsin autolytic fragment peaks (m/z 906.5049, 1153.5741 and 2163.0570) can serve as internal standards for mass calibration. Several software packages are available to perform database matching, such as MASCOT at www.matrixscience.com［40］, ProFound at www.prowl.com［41］ and Protein Prospector at www.prospector.ucsf.edu/［42］. The degree of accuracy, reliability and speed differs from package to package, and the choice among them depends on user preference. However, regardless of which software is used, four variables are normally required for a peptide mass fingerprinting (PMF) search: (1) peptide mass list; (2) specification of the cleavage agent; (3) error tolerance, i.e., the accuracy of mass measurement; and (4) peptide modifications (e.g. N-terminal acetylation). The criteria for matching can be made more stringent by setting a smaller error tolerance (i.e. requiring better mass accuracy), requiring a greater number of matching peptide masses, and specifying narrower molecular weight and pI ranges. The species of origin of the unknown protein is also important during the matching.
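
To illustrate internal calibration and the error tolerance used in a PMF search, the sketch below fits a straight-line correction that maps the observed positions of the three trypsin autolysis peaks quoted above onto their reference values, and expresses mass error in parts per million. The linear correction in the m/z domain is a simplifying assumption (instrument software typically calibrates on flight time), and the function names are hypothetical.

```python
# Reference m/z values of the trypsin autolysis peaks quoted in the text.
TRYPSIN_AUTOLYSIS_MZ = (906.5049, 1153.5741, 2163.0570)


def fit_linear_calibration(observed, reference):
    """Least-squares fit of reference = slope * observed + intercept."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(mz, slope, intercept):
    """Correct one measured m/z value with the fitted line."""
    return slope * mz + intercept


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6
```

In use, the fitted slope and intercept would be applied to every peak in the same spectrum before the mass list is submitted, and the search tolerance would be chosen to reflect the residual ppm error after calibration.
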

Analogous to the DNA chip technologies, the ProteinChip technology coupled with SELDI-TOF-MS (surface-enhanced laser desorption/ionization-time of flight-mass spectrometry) has recently been developed by Ciphergen Biosystems, Inc. to facilitate protein profiling of complex biological mixtures［43, 44］. This technology utilizes patented ProteinChip arrays to capture individual proteins from complex mixtures, which are subsequently resolved by mass spectrometry using the same principle as MALDI-TOF. The efficacy of the SELDI technology for the discovery of prostate cancer protein markers in serum, seminal plasma, and cell extracts has been demonstrated previously［45, 46］. In a recent study, we also utilized the ProteinChip SELDI-TOF-MS system to detect potential alterations of protein expression in rat lung epithelial cells (LEC) during arsenic-induced malignant cell transformation［47, 48］.

2.4 Protein identification

Proteins can usually be identified by peptide mass fingerprinting (PMF)［37, 49－52］ and database searching. However, it is sometimes necessary to use post source decay (PSD) to confirm the search result, when an insufficient number of proteolytic peptides is available for confident matching. Because PSD can deduce the amino acid sequence of peptides from normal, post-translationally modified or novel proteins of interest (Fig.3), it can greatly enhance the accuracy of the protein identification process, and sometimes leads to the discovery of new proteins.

Fig.3 A general scheme for protein identification by mass spectrometry using either PMF, peptide amino acid sequencing/PSD, or both, to improve the success rate or throughput of protein identification

2.4.1 Peptide mass fingerprinting (PMF) by database matching In PMF, the peptide masses of unknown proteins are compared to the predicted masses of peptides from the theoretical digestion of proteins in a database (Fig.4). The more peptides that match a protein in the database, the more likely it is that the unknown protein is that database entry. The advantage of using PMF is that the protein identification process is fast and user-friendly. We can routinely identify a hundred unknown protein spots in a single day's work. But the success of the method can be compromised by several factors: (1) insufficient peptides are obtained in the peptide mass fingerprint to submit to the database search, i.e. there is insufficient data to identify the unknown protein; (2) PMF cannot analyze samples containing a mixture of proteins, since they generate mixtures of peptides after digestion; (3) mass redundancy, i.e. peptides with the same mass but different amino acid compositions, can cause ambiguity in protein identification. In addition, PMF cannot identify post-translationally modified peptides, since no such information is available from the database. When such problems occur, peptide amino acid sequencing/post source decay (PSD) is used for further validation.
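
The matching step can be sketched in a few lines: each database sequence is digested in silico, the masses of the resulting peptides are calculated, and candidates are ranked by how many observed masses they explain. The sketch assumes the commonly used trypsin rule (cleavage after K or R unless followed by P) and standard monoisotopic residue masses. The two toy sequences, the names and the simple match count are hypothetical, and missed cleavages and modifications are ignored. Search engines such as those named earlier use more elaborate probability-based scoring.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.010565, 1.007276

# Hypothetical two-entry "database" used only for illustration.
TOY_DATABASE = {
    "protein_A": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "protein_B": "MSDNGELEDKPPAPPVRMSSTIFSTGGKDPLANK",
}


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mh(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed_masses, sequence, tolerance_ppm=50.0):
    """Number of observed masses explained by the theoretical digest."""
    theoretical = [peptide_mh(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(m - t) / t * 1e6 <= tolerance_ppm for t in theoretical)
        for m in observed_masses
    )


def rank_candidates(observed_masses, database, tolerance_ppm=50.0):
    """Sort database entries by number of matched peptide masses."""
    scores = {
        name: count_matches(observed_masses, seq, tolerance_ppm)
        for name, seq in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Tightening `tolerance_ppm` or requiring a minimum number of matches corresponds to the more stringent matching criteria discussed in section 2.3.
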

2.4.2 Peptide amino acid sequencing/Post source decay (PSD) With peptide amino acid sequencing, the amino acid sequence of unknown peptides can be determined (Fig. 4), and then used to search the database to identify the protein from which they were derived. Unlike PMF, PSD can be used for gel spots or bands containing more than one protein. This advantage greatly enhances the protein identification process, since protein bands from a 1-D gel can be identified whether homogeneous or not. PMF data can also be supplemented with partial amino acid sequence information in the database search, i.e. an unsuccessful search of the protein database with the PMF data alone may be rescued by adding a partial sequence. The drawback of PSD is that it is not user-friendly, since the process is not easily automated. As a result, MS analysis and database searching take considerable time, and must be performed by an experienced operator.
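
The principle behind reading a sequence from fragment ions can be illustrated with a short sketch: when consecutive ions of a single fragment series are available, the mass difference between neighbouring ions corresponds to one amino acid residue. The example assumes a clean, complete ladder from one ion series and standard monoisotopic residue masses. Real PSD spectra contain mixed ion series and neutral losses, which is one reason interpretation is hard to automate. The function name and tolerance are hypothetical.

```python
# Standard monoisotopic residue masses (Da). Leucine and isoleucine share
# the same mass and cannot be distinguished by mass difference alone.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_sequence_tag(ladder_masses, tolerance_da=0.02):
    """Translate gaps between consecutive fragment ions into residues.

    ladder_masses: m/z values of singly charged ions from ONE ion series.
    Returns one entry per gap: a residue letter, "L/I"-style alternatives
    when several residues fit, or "?" when no residue fits.
    """
    ordered = sorted(ladder_masses)
    tag = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        hits = [
            aa for aa, mass in RESIDUE_MASS.items()
            if abs(delta - mass) <= tolerance_da
        ]
        tag.append("/".join(hits) if hits else "?")
    return tag
```

Even a short tag of three or four residues obtained this way can be combined with the peptide mass to narrow a database search considerably.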

Fig. 4 PMF and PSD in mass spectrometric analysis
For PMF, total ions are transmitted through quadrupoles for mass determination. For PSD, a selected parent ion is transmitted into the collision chamber and then fragmented, resulting in numerous daughter ions for amino acid sequence determination.

3 Biomedical applications
3.1 Study on the pathogenesis of human diseases--arsenic carcinogenesis

Our recent studies demonstrated that a low level (1.5 μmol/L) of arsenite induces B[a]P-treated lung cell transformation［47, 48］(Fig.5). We used ProteinChips to identify differences in protein expression which could potentially be important for cell transformation induced by this toxic agent. The overall protein profiles of cell extracts from all samples, including the control, B[a]P-treated, and (B[a]P+As)-treated cells, were similar. However, surface-enhanced laser desorption/ionization time of flight (SELDI-TOF) analysis with Cu-ProteinChips and WCX-ProteinChips revealed several dramatically different protein peaks that appeared in lung cells after transformation by treatment with 1.5 μmol/L arsenite for 12 weeks. Some of these proteins were found to be present in mitochondria and to participate in the mitochondrial respiratory chain and ATP production. Results from this study also suggested that the expression of the pro-apoptotic protein Bax was suppressed in arsenite-induced transformed cells. The SAX2 ProteinChip also identified prominent protein peaks that were preferentially expressed in control cells. Interestingly, by using the SAX2 chip, we were able to detect several protein peaks whose expression was increased in lung cells treated only with B[a]P. Identification and characterization of these proteins may reveal the molecular basis of arsenite-induced cell transformation and help to elucidate the mechanisms by which arsenic induces carcinogenesis.
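
The comparison of peak lists between treatment groups can be sketched as below: a peak counts as specific to one sample if no peak in any other sample falls within a relative mass tolerance. This is only an illustration of the idea, not the software used in the study. The function names and the 0.2% tolerance are assumptions, and real analyses would also compare peak intensities across replicate spectra.

```python
def has_peak(peak_list, mass, rel_tol=0.002):
    """True if peak_list contains a peak within rel_tol (fraction) of mass."""
    return any(abs(peak - mass) <= rel_tol * mass for peak in peak_list)


def unique_peaks(target, *others, rel_tol=0.002):
    """Peaks in `target` with no counterpart in any of the `others`."""
    return [
        mass for mass in target
        if not any(has_peak(other, mass, rel_tol) for other in others)
    ]
```

Called with the peak list of the arsenite-transformed cells as `target` and those of the control and B[a]P-treated cells as `others`, such a filter would single out peaks like the two reported in the Fig. 5 legend.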

Fig. 5 Protein spectra and gel views of SELDI analysis of cellular proteins bound to copper ProteinChip array[48]
Cellular extracts of control LEC cells, cells treated with B[a]P, and cells treated with B[a]P+arsenite were spotted onto a Cu-ProteinChip array. Two protein peaks with Mr of 4099.3 and 8175.5 were present in the As-transformed LEC cells, but absent in the control and B[a]P-treated LEC cells.

3.2 Identification, characterization and clinical application of biomarkers of human diseases--serum biomarkers in HBV-infected patients

Hepatitis B virus (HBV), a serious and widespread infectious human pathogen, represents a major health problem worldwide. Chronic HBV infection has a very high chance of evolving into hepatocellular carcinoma. Although considerable progress has been made in the past few years, the pathogenesis of HBV infection is still elusive, and a definite diagnosis of HBV-infected liver inflammation still relies on histological examination of biopsies. Our recent studies used proteomic technology to globally examine HBV-infected serum samples in a search for disease-associated proteins that can be used as serological biomarkers for diagnosis and/or target proteins for pathogenetic study［9］(Fig.6). After a comparison with normal serum samples, we found that at least seven proteins were significantly changed in HBV-infected sera. These greatly altered proteins were identified as haptoglobin β and α2 chains, apolipoprotein A-I and A-IV, α1-antitrypsin, transthyretin and DNA topoisomerase IIβ. The alterations in these proteins are evident not only in their quantities but also in their patterns (or specificity), some of which can be correlated with the necroinflammatory scores. In particular, apolipoprotein A-I displays heterogeneous changes in expression level among different isoforms, and α1-antitrypsin produces evidently different fragments, implying diverse cleavage pathways. These phenomena have no known parallel and appear to be specific to HBV infection. A combined analysis that considers both the quantities and the isoforms of these proteins could provide useful serum biomarkers for HBV diagnosis and therapy.

Fig. 6 Three representative 2D-gel images for normal, low NIS and high NIS serum samples, respectively (A), and an enlarged low NIS gel displaying the common features of human serum proteins (B)
Area 1, haptoglobin β & cleaved β chain; area 2, haptoglobin α2 chain; area 3, apolipoprotein A-I; area 4, apolipoprotein A-I & A-IV; areas 5 & 6, α1-antitrypsin; area 7, DNA topoisomerase IIβ (NIS: necroinflammatory score)［9］

3.3 Proteomic studies on chemotherapeutic agents
3.3.1 To study the mechanisms of drug actions Studies to investigate the protein changes in rat livers have been conducted by using various agents including hepatotoxicants, methapyrilene, cyproterone acetate and dexamethasone［53］. Two-dimensional polyacrylamide gel electrophoresis and mass spectrometry were used for the identification of compound-specific biomarkers. The different treatments caused distinct changes in the rat liver proteome. Many of the protein changes could be associated with the known pharmacological and toxicological mechanisms of action of these drugs. This approach could open up new avenues for the exploration of molecular mechanisms of toxicity, and is a good illustration of how proteomics can provide valuable information on the biochemical consequences elicited by hepatotoxic drugs.

In another study, a single dose of puromycin aminonucleoside (PAN) given intraperitoneally to rats induced ultrastructural glomerular changes and a nephrotic syndrome similar in many respects to human minimal change nephropathy［54］. Increased plasma protein excretion in urine is a consequence of nephrotic syndrome and nephropathy. 2-DE has therefore been used to profile urinary proteins during PAN-induced nephrotoxicity and subsequent recovery in rats. In addition, urinary high performance liquid chromatography (HPLC) profiles and high resolution proton nuclear magnetic resonance (NMR) spectroscopy have been used to detect toxin-induced changes in the relative concentrations of a number of metabolites. This demonstrates that a proteomic approach, used in conjunction with other techniques, has the potential to provide valuable information which traditional clinical chemistry is unable to supply.

3.3.2 To monitor drug effectiveness in clinical trials Recent technological progress in genomics and proteomics has created a unique opportunity for significantly improving the pharmaceutical drug development process. The fact that cells and whole organisms express specific inducible responses to drug treatment implies that unique expression patterns and molecular fingerprints indicating a drug's efficacy and potential toxicity are accessible［55］.

Bodily fluids such as cerebrospinal fluid (CSF) and serum can be analyzed throughout the course of a disease. Changes in the protein composition of CSF may be indicative of altered protein expression in the central nervous system (CNS), and may be useful for studying causation or as diagnostic biomarkers［56］. These findings can be strengthened through subsequent proteomic analysis of specific brain areas implicated in the pathology. This may facilitate pre-clinical and clinical development of more specific disease markers and new selective, fast-acting therapeutics.

Severe adverse drug reactions occur in approximately 7% of hospital patients. In some cases the side effect is difficult to predict or elucidate because the pharmacology of the causative agent is unknown. Proteomics may have some predictive value, but is likely to be of greater use in diagnosis, e.g. by recognizing a drug signature in an accessible tissue［57］. This may be possible using a blood sample or biopsy taken at presentation. Alternatively, an in vitro assay that replaces rechallenging the patient with a drug would be helpful. The goal is to identify target proteins of the causative drug, permitting the development of a safer alternative.

Clinical proteomics, a new and exciting sub-discipline of proteomics, involves the bench-to-bedside clinical application of proteomic tools. Unlike the genome, there are potentially thousands of proteomes: each cell type has its own unique proteome. Moreover, each cell type can alter its proteome depending on the unique tissue microenvironment in which it resides, giving rise to multiple permutations of a single proteome. Since proteomics has nothing equivalent to the polymerase chain reaction, identifying and discovering diseased human cells in a biopsy specimen remains a daunting challenge. New micro-proteomic technologies are being developed, and still need to be developed further, for application to clinical proteomes. Cancer, as a model disease, provides an excellent environment for the study of the application of proteomics at the bedside. The promise of clinical proteomics and related technological development is that cancer can be identified earlier through the discovery of biomarkers, that the next generation of targets can also be identified, and that we can then apply this knowledge to patient-tailored therapy［58］. The ultimate goals of personalized medicine are to take advantage of a molecular understanding of disease, both to optimize drug development and to direct preventive resources and therapeutic agents at individuals at risk while they are still well［59］. The benefits will include improved lead selection and optimized monitoring of drug efficacy and safety in pre-clinical and clinical studies based on biologically relevant tissue and surrogate markers［55］.

3.4 Development of a tool for therapy
3.4.1 To accelerate the advance of gene therapy Antisense oligonucleotides are synthetic stretches of DNA that hybridize with specific mRNA strands corresponding to target genes. They are a new class of therapeutic agent. By binding to the mRNA, antisense oligonucleotides prevent the target gene from being translated into a protein, thereby blocking the action of the gene. Several genes known to be important in the regulation of apoptosis, cell growth, metastasis, and angiogenesis have proved feasible as molecular targets for gene therapy. Furthermore, new targets are rapidly being uncovered through the integration of functional genomics and proteomics efforts［60］. With the proteomic approach, proteins that are altered in diseased individuals compared with normal individuals can be selected as targets for gene therapy, since antisense oligonucleotides can be designed from the sequence encoding the target protein. This can effectively advance gene therapy by pinpointing the specific defects on a 'reverse genetic' basis.
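The hybridization principle described above can be illustrated with a minimal sketch. It assumes that the mRNA sequence encoding the protein of interest is already known, and it simply returns the DNA reverse complement of a chosen window. The function name, the example sequence and the window length are hypothetical, and practical antisense design involves further considerations (target accessibility, specificity, chemical modification) that are not modelled here.

```python
# Watson-Crick complements for DNA bases (antisense oligonucleotides are DNA).
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return the DNA antisense oligo (written 5'->3') for one mRNA window.

    The oligo is the reverse complement of mrna[start:start + length],
    so it can base-pair with that stretch of the transcript.
    """
    # Work in the DNA alphabet: treat U in the mRNA as T.
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the mRNA")
    unknown = set(target) - set(_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases in target: {sorted(unknown)}")
    # Reverse the window, then complement each base.
    return "".join(_COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    # Hypothetical mRNA fragment, used only for illustration.
    example_mrna = "AUGGCGCACGCUGGGAGAACAGGG"
    print(antisense_oligo(example_mrna, start=0, length=12))
```

In this sketch the window position is chosen by the caller; which region of a transcript makes a good antisense target is an experimental question outside the scope of the example.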

3.4.2 To identify disease-associated membrane targets for development of antibody based therapy Membrane proteins are responsible for some of the most important functions in cells, including the regulation of cell signaling through surface receptors, cell-to-cell interactions, and the intracellular compartmentalization of organelles. Recently, proteomic techniques have focused on high-throughput analyses of membrane proteins using liquid chromatography coupled to mass spectrometry (LC/MS). This can identify large numbers of membrane proteins and modifications, and may also provide insights into protein topology and orientation in membranes［61］.
HER2 (erbB2/neu) is a member of the erbB family of receptor tyrosine kinases and is involved in regulating the growth of several types of human carcinomas. The discoveries that this gene is amplified in breast tumors and that its protein product is overexpressed at the cell surface have led to an effective form of therapy for breast cancer which utilizes an antibody that targets HER2［62］. The elucidation of the role of cell-surface growth factor receptors, such as the epidermal growth factor receptor, in signaling and in uncontrolled cell proliferation has led to the development of new anticancer therapies that target specific components of the epidermal growth factor receptor signal transduction pathway. Selective compounds have been developed to target the extracellular ligand-binding region of the epidermal growth factor receptor. Thus, protein profiling of the cell surface proteome would have important implications. In cancer, cell surface proteins whose expression is restricted to specific cancer(s) could be utilized for antibody-based therapy, as in the case of HER2, or for vaccine development or other forms of immunotherapy.

3.5 Studies on cellular processes including protein-protein interactions and signal transduction pathway
Post-translational modification of proteins is a fundamental cellular regulatory mechanism. Protein phosphorylation is the main mechanism by which cells modulate enzyme activity and protein-protein interactions. The study of protein kinases is the key to the identification of signal transduction pathways. So far 518 human protein kinases have been identified, collectively termed the 'human kinome'. They control protein activity by catalyzing the addition of a negatively charged phosphate group to other proteins. Protein kinases modulate a wide variety of biological processes, especially those that carry signals from the cell membrane to intracellular targets and coordinate complex biological functions. Protein-protein interactions play a central role in numerous processes in the cell and are one of the main fields of functional proteomic study［63］. MS can be used to identify novel phosphoproteins, measure changes in the phosphorylation state of proteins in response to an effector, and determine phosphorylation sites in proteins. Identification of phosphorylation sites can provide information about the mechanism of enzyme regulation and the protein kinases and phosphatases involved. Proteomics can be used to study global phosphorylation changes in response to stimuli through simultaneous analysis of the phosphoproteome. A common approach in studying phosphorylation is the labeling of phosphoproteins with 32P in vivo. The phosphoproteome of cells (e.g. normal versus diseased cells) can be analyzed by culturing cells with 32P and then obtaining the labeled cell lysates. Changes in the phosphorylation state of the proteins can then be studied by 2-DE and autoradiography. Proteins of interest are excised from the gel and microsequenced by MS. MALDI-TOF mass spectrometry can also be used to identify phosphopeptides［64－67］. When phosphorylated peptides are subjected to ionization by MALDI, phosphate groups are frequently liberated from the peptides. This is the case for phosphoserine- and phosphothreonine-containing peptides, which can liberate HPO3 or H3PO4, resulting in a neutral loss of 80 or 98 Daltons, respectively. Careful examination of the spectrum for differences in peptide masses of 80 Daltons that are not found in the unphosphorylated peptide control can identify phosphopeptides. Phosphopeptides can also be identified by treating one of two identical samples with protein phosphatase to liberate phosphate groups. Once a phosphopeptide is identified, it can be sequenced by MS/MS to identify the phosphorylation site［67］.
During the last decade, several studies have aimed to discover or design small molecules that block protein dimerization or protein (peptide)-receptor interaction or, conversely, induce protein dimerization［63］. Mass spectrometry-based proteomics can reveal protein-protein interactions on a large scale. In a study that investigated the epidermal growth factor receptor (EGFR) pathway［68］, stable isotope labeling by amino acids in cell culture (SILAC) was used to differentially label proteins in EGF-stimulated versus unstimulated cells. Combined cell lysates were affinity-purified over the SH2 domain of the adapter protein Grb2 (GST-SH2 fusion protein), which specifically binds phosphorylated EGFR and Src homologous and collagen (Shc) protein. In total, 228 proteins were identified, of which 28 were selectively enriched upon stimulation. SILAC combined with modification-based affinity purification is a useful approach to detect specific and functional protein-protein interactions. Indeed, a significant number of human diseases can be attributed to defects in cellular signal transduction pathways. Proteomics can define critical components of signal transduction networks, thereby contributing to the development of more effective therapeutic agents that can specifically target individual disease-altered proteins.

4 Future prospects in proteomic technology
Although proteomic technology is certainly capable of characterizing the proteome of a given cell or organism, its two core techniques, 2-DE and MS, continue to have their limitations. The resolving power of 2-DE remains unchallenged, and mass spectrometry has become more sensitive, faster, and more reproducible. However, examination of the proteome of a cell or organism is like taking a snapshot of its activity at a single point in time. This may underestimate or miss the significance of processes taking place over time. One of the greatest challenges for proteomics is the study of low-copy number proteins. Many classes of proteins, e.g. transcription factors and some enzymes, are present in low copy numbers. These proteins are unlikely to be detected on 2-D gels unless they are enriched or partially purified. Therefore, resolution is a bottleneck for the advancement of proteomics for the time being, and the development of higher-resolution techniques is an urgent issue. We believe that technological advances will result in the gradual maturation of proteomics, and that we will ultimately be able to study the proteome comprehensively. We forecast that future proteomic technology will focus on real-time proteomics, i.e. monitoring the proteome in a real-time or time-lapse manner, and this will greatly enhance our knowledge of how proteins behave. The era of proteomics is on its way.

Acknowledgements
We thank Dr. D. Wilmshurst for reviewing the manuscript, and Amersham Biosciences and Yuan Zhou for technical assistance in our proteomic studies at the University of Hong Kong.

References

1Anderson NG, Anderson NL. Twenty years of two-dimensional electrophoresis: Past, present and future. Electrophoresis, 1996, 17(3): 443－453
2Wasinger VC, Cordwell SJ, Cerpa-Poljak A, Yan JX, Gooley AA, Wilkins MR, Duncan MW et al. Progress with gene-product mapping of the Mollicutes: Mycoplasma genitalium. Electrophoresis, 1995, 16(7): 1090－1094
3Wilkins MR, Sanchez JC, Gooley AA, Appel RD, Humphery-Smith I, Hochstrasser DF, Williams KL. Progress with proteome projects: Why all proteins expressed by a genome should be identified and how to do it. Biotechnol Genet Eng Rev, 1996, 13: 19－50
4Krishna RG, Wold F. Post-translational modification of proteins. Adv Enzymol Relat Areas Mol Biol, 1993, 67: 265－298
5Eisenberg D, Marcotte EM, Xenarios I, Yeates TO. Protein function in the post-genomic era. Nature, 2000, 405(6788): 823－826
6Dunham I, Shimizu N, Roe BA, Chissoe S, Hunt AR, Collins JE, Bruskiewich R et al. The DNA sequence of human chromosome 22. Nature, 1999, 402(6761): 489－495
7Newman A. RNA splicing. Curr Biol, 1998, 8(25): R903－905
8Colledge M, Scott JD. AKAPs: From structure to function. Trends Cell Biol, 1999, 9(6): 216－221
9He QY, Lau GK, Zhou Y, Yuen ST, Lin MC, Kung HF, Chiu JF. Serum biomarkers of hepatitis B virus infected liver inflammation: A proteomic study. Proteomics, 2003, 3(5): 666－674
10Blackstock WP, Weir MP. Proteomics: Quantitative and physical mapping of cellular proteins. Trends Biotechnol, 1999, 17(3): 121－127
11Dreger M. Proteome analysis at the level of subcellular structures. Eur J Biochem, 2003, 270(4): 589－599
12Matsudaira P. A Practical Guide to Protein and Peptide Purification for Microsequencing, 2nd ed. San Diego: Academic Press, 1993
13Lewis TS, Hunt JB, Aveline LD, Jonscher KR, Louie DF, Yeh JM, Nahreini TS et al. Identification of novel MAP kinase pathway signaling targets by functional proteomics and mass spectrometry. Mol Cell, 2000, 6(6): 1343－1354
14Cash P. Proteomics in medical microbiology. Electrophoresis, 2000, 21(6): 1187－1201
15Shevchenko A, Jensen ON, Podtelejnikov AV, Sagliocco F, Wilm M, Vorm O, Mortensen P et al. Linking genome and proteome by mass spectrometry: Large-scale identification of yeast proteins from two dimensional gels. Proc Natl Acad Sci USA, 1996, 93(25): 14440－14445
16Jung E, Heller M, Sanchez JC, Hochstrasser DF. Proteomics meets cell biology: The establishment of subcellular proteomes. Electrophoresis, 2000, 21(16): 3369－3377
17Rappsilber J, Siniossoglou S, Hurt EC, Mann M. A generic strategy to analyze the spatial organization of multi-protein complexes by cross-linking and mass spectrometry. Anal Chem, 2000, 72(2): 267－275
18Bjellqvist B, Ek K, Righetti PG, Gianazza E, Gorg A, Westermeier R, Postel W. Isoelectric focusing in immobilized pH gradients: Principle, methodology and some applications. J Biochem Biophys Methods, 1982, 6(4): 317－339
19Unlu M, Morgan ME, Minden JS. Difference gel electrophoresis: A single gel method for detecting changes in protein extracts. Electrophoresis, 1997, 18(11): 2071－2077
20Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol, 1999, 17(10): 994－999
21Haynes PA, Yates JR 3rd. Proteome profiling-pitfalls and progress. Yeast, 2000, 17(2): 81－87
22Aebersold R, Rist B, Gygi SP. Quantitative proteome analysis: Methods and applications. Ann N Y Acad Sci, 2000, 919: 33－47
23Gygi SP, Rist B, Aebersold R. Measuring gene expression by quantitative proteome analysis. Curr Opin Biotechnol, 2000, 11(4): 396－401
24Gorg A, Obermaier C, Boguth G, Harder A, Scheibe B, Wildgruber R, Weiss W. The current state of two-dimensional electrophoresis with immobilized pH gradients. Electrophoresis, 2000, 21(6): 1037－1053
25Corthals GL, Wasinger VC, Hochstrasser DF, Sanchez JC. The dynamic range of protein expression: A challenge for proteomic research. Electrophoresis, 2000, 21(6): 1104－1115
26Rabilloud T. A comparison between low background silver diammine and silver nitrate protein stains. Electrophoresis, 1992, 13(7): 429－439
27Hochstrasser DF, Merril CR. 'Catalysts' for polyacrylamide gel polymerization and detection of proteins by silver staining. Appl Theor Electrophor, 1988, 1(1): 35－40
28Strupat K, Karas M, Hillenkamp F, Eckerskorn C, Lottspeich F. Matrix-assisted laser desorption ionization mass spectrometry of proteins electroblotted after polyacrylamide gel electrophoresis. Anal Chem, 1994, 66: 464－470
29Gharahdaghi F, Atherton D, Demott M, Mische SM. Amino acid analysis of PVDF-bound proteins. In: Ageletti RH ed. Techniques in Protein Chemistry III, San Diego: Academic Press, 1992, 249－260
30Goldberg HA, Domenicucci C, Pringle GA, Sodek J. Mineral-binding proteoglycans of fetal porcine calvarial bone. J Biol Chem, 1988, 263(24): 12092－12101
31Sanchez JC, Ravier F, Pasquali C, Frutiger S, Paquet N, Bjellqvist B, Hochstrasser DF et al. Improving the detection of proteins after transfer to polyvinylidene difluoride membranes. Electrophoresis, 1992, 13(9-10): 715－717
32Garrels JI, Franza BR Jr. The REF52 protein database. Methods of database construction and analysis using the QUEST system and characterizations of protein patterns from proliferating and quiescent REF52 cells. J Biol Chem, 1989, 264(9): 5283－5298
33Latham KE, Garrels JI, Solter D. Two-dimensional gel analysis of protein synthesis. Methods Enzymol, 1993, 225: 473－489
34Yamaguchi K, Asakawa H. Preparation of colloidal gold for staining proteins electrotransferred onto nitrocellulose membranes. Anal Biochem, 1988, 172: 104－107
35Eckerskorn C, Strupat K, Karas M, Hillenkamp F, Lottspeich F. Mass spectrometric analysis of blotted proteins after gel electrophoretic separation by matrix-assisted laser desorption/ionization. Electrophoresis, 1992, 13(9-10): 664－665
36Ortiz ML, Calero M, Fernandez Patron C, Patron CF, Castellanos L, Mendez E. Imidazole-SDS-Zn reverse staining of proteins in gels containing or not SDS and microsequence of individual unmodified electroblotted proteins. FEBS Lett, 1992, 296(3): 300－304
37James P, Quadroni M, Carafoli E, Gonnet G. Protein identification by mass profile fingerprinting. Biochem Biophys Res Commun, 1993, 195(1): 58－64
38Li KW, Geraerts WP, van Elk R, Joosse J. Quantification of proteins in the subnanogram and nanogram range: Comparison of the AuroDye, FerriDye, and India ink staining methods. Anal Biochem, 1989, 182(1): 44－47
39Hughes JH, Mack K, Hamparian VV. India ink staining of proteins on nylon and hydrophobic membranes. Anal Biochem, 1988, 173(1): 18－25
40Perkins DN, Pappin DJ, Creasy DM, Cottrell JS. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 1999, 20(18): 3551－3567
41Zhang W, Chait BT. ProFound: An expert system for protein identification using mass spectrometric peptide mapping information. Anal Chem, 2000, 72(11): 2482－2489
42Clauser KR, Baker P, Burlingame AL. Role of accurate mass measurement (+/－10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal Chem, 1999, 71(14): 2871－2882
43Hutchens TW, Yip TT. New desorption strategies for the mass spectrometric analysis of macromolecules. Rapid Commun Mass Spectrom, 1993, 7: 576－580
44Fung ET, Thulasiraman V, Weinberger SR, Dalmasso EA. Protein biochips for differential profiling. Curr Opin Biotechnol, 2001, 12(1): 65－69
45Wright GL Jr, Cazares LH, Leung SM, Nasim S, Adam BL, Yip TT, Schellhammer PF et al. Proteinchip(R) surface enhanced laser desorption/ionization (SELDI) mass spectrometry: A novel protein biochip technology for detection of prostate cancer biomarkers in complex protein mixtures. Prostate Cancer Prostatic Dis, 1999, 2(5/6): 264－276
46Paweletz CP, Gillespie JW, Ornstein DK, Simone NL, Brown MR, Cole KA, Wang QH et al. Rapid protein profiling of cancer progression directly from human tissue using a protein biochip. Drug Dev Res, 2000, 49: 34－42
47Lau AT, Chiu JF. Arsenic is a paradoxical toxic metal: Carcinogen and anticarcinogenic agent. Recent Res Devel Cell Biochem, 2003, 1: 1－19
48He QY, Yip TT, Li M, Chiu JF. Proteomic analyses of arsenic-induced cell transformation with SELDI-TOF ProteinChip(R) technology. J Cell Biochem, 2003, 88(1): 1－8
49Jensen ON, Podtelejnikov AV, Mann M. Identification of the components of simple protein mixtures by high-accuracy peptide mass mapping and database searching. Anal Chem, 1997, 69(23): 4741－4750
50Mann M, Hojrup P, Roepstorff P. Use of mass spectrometric molecular weight information to identify proteins in sequence databases. Biol Mass Spectrom, 1993, 22(6): 338－345
51Pappin DD, Hojrup JP, Bleasby AJ. Rapid identification of proteins by peptide-mass finger printing. Curr Biol, 1993, 3: 327－332
52Yates JR 3rd, Speicher S, Griffin PR, Hunkapiller T. Peptide mass maps: A highly informative approach to protein identification. Anal Biochem, 1993, 214(2): 397－408
53Man WJ, White IR, Bryant D, Bugelski P, Camilleri P, Cutler P, Heald G et al. Protein expression analysis of drug-mediated hepatotoxicity in the Sprague-Dawley rat. Proteomics, 2002, 2(11): 1577－1585
54Cutler P, Bell DJ, Birrell HC, Connelly JC, Connor SC, Holmes E, Mitchell BC et al. An integrated proteomic approach to studying glomerular nephrotoxicity. Electrophoresis, 1999, 20(18): 3647－3658
55 Steiner S, Anderson NL. Expression profiling in toxicology-potentials and limitations. Toxicol Lett, 2000, 112-113: 467－471
56 Rohlff C. Proteomics in molecular medicine: applications in central nervous systems disorders. Electrophoresis, 2000, 21(6): 1227－1234
57 Wilkins MR. What do we want from proteomics in the detection and avoidance of adverse drug reactions. Toxicol Lett, 2002, 127(1-3): 245－249
58 Krieg RC, Paweletz CP, Liotta LA, Petricoin EF 3rd. Clinical proteomics for cancer biomarker discovery and therapeutic targeting. Technol Cancer Res Treat, 2002, 1(4): 263－272
59 Ross JS, Ginsburg GS. Integration of molecular diagnostics with therapeutics: implications for drug discovery and patient care. Expert Rev Mol Diagn, 2002, 2(6): 531－541
60 Jansen B, Zangemeister-Wittke U. Antisense therapy for cancer--the time of truth. Lancet Oncol, 2002, 3(11): 672－683
61 Wu CC, Yates JR. The application of mass spectrometry to membrane proteomics. Nat Biotechnol, 2003, 21(3): 262－267
62 Slamon DJ, Leyland-Jones B, Shak S, Fuchs H, Paton V, Bajamonde A, Fleming T et al. Use of chemotherapy plus a monoclonal antibody against HER2 for metastatic breast cancer that overexpresses HER2. N Engl J Med, 2001, 344(11): 783－792
63 Archakov AI, Govorun VM, Dubanov AV, Ivanov YD, Veselovsky AV, Lewi P, Janssen P. Protein-protein interactions as a target for drugs in proteomics. Proteomics, 2003, 3(4): 380－391
64 Jonscher KR, Yates JR 3rd. Matrix-assisted laser desorption ionization/quadrupole ion trap mass spectrometry of peptides. Application to the localization of phosphorylation sites on the P protein from Sendai virus. J Biol Chem, 1997, 272(3): 1735－1741
65 Qin J, Chait BT. Identification and characterization of posttranslational modifications of proteins by MALDI ion trap mass spectrometry. Anal Chem, 1997, 69(19): 4002－4009
66 Zhang W, Czernik AJ, Yungwirth T, Aebersold R, Chait BT. Matrix-assisted laser desorption mass spectrometric peptide mapping of proteins separated by two-dimensional gel electrophoresis: Determination of phosphorylation in synapsin I. Protein Sci, 1994, 3(4): 677－686
67 Zhang X, Herring CJ, Romano PR, Szczepanowska J, Brzeska H, Hinnebusch AG, Qin J. Identification of phosphorylation sites in proteins separated by polyacrylamide gel electrophoresis. Anal Chem, 1998, 70(10): 2050－2059
68 Blagoev B, Kratchmarova I, Ong SE, Nielsen M, Foster LJ, Mann M. A proteomics strategy to elucidate functional protein-protein interactions applied to EGF signaling. Nat Biotechnol, 2003, 21(3): 315－318

Received: August 12, 2003    Accepted: August 25, 2003
This work was supported by University of Hong Kong grants #10204004 and #10204007, Hong Kong Research Grant Council grants #HKU7218/02M, #HKU7395/03M and #HKU7227/02M, and a grant from the Hong Kong University Grants Committee under the Area of Excellence Scheme
*Corresponding author: Tel, (852) 2299-0777; Fax, (852) 2817-1006; e-mail, [email protected]

Updated at: 12-18-2003