Identifying and quantifying proteolytic events and the natural N terminome by terminal amine...

Date post: 12-Nov-2023
© 2011 Nature America, Inc. All rights reserved.

PROTOCOL

1578 | VOL.6 NO.10 | 2011 | NATURE PROTOCOLS

Identifying and quantifying proteolytic events and the natural N terminome by terminal amine isotopic labeling of substrates

Oded Kleifeld (1,2,5,6), Alain Doucet (1,2,6), Anna Prudova (1,2,6), Ulrich auf dem Keller (1,2,5,6), Magda Gioia (1,2,5,6), Jayachandran N Kizhakkedathu (3,4) & Christopher M Overall (1,2)

(1) Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada. (2) Department of Oral Biological and Medical Sciences, University of British Columbia, Vancouver, British Columbia, Canada. (3) Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada. (4) Department of Chemistry, University of British Columbia, Vancouver, British Columbia, Canada. (5) Present addresses: Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Australia (O.K.); Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada (A.D.); Institute of Cell Biology, ETH Zurich, Zurich, Switzerland (U.a.d.K.); Department of Experimental Medicine and Biochemical Sciences, Università di Roma Tor Vergata, Roma, Italy (M.G.). (6) These authors contributed equally to this work. Correspondence should be addressed to C.M.O. ([email protected]).

Published online 22 September 2011; doi:10.1038/nprot.2011.382

Analysis of the sequence and nature of protein N termini has many applications. Defining the termini of proteins for proteome annotation in the Human Proteome Project is of increasing importance. Terminomics analysis of protease cleavage sites in degradomics for substrate discovery is a key new application. Here we describe the step-by-step procedures for performing terminal amine isotopic labeling of substrates (TAILS), a 2- to 3-d (depending on method of labeling) high-throughput method to identify and distinguish, with high confidence, protease-generated neo–N termini from mature protein N termini with all natural modifications. TAILS uses negative selection to enrich for all N-terminal peptides and uses primary amine labeling-based quantification as the discriminating factor. Labeling is versatile and suited to many applications, including biochemical and cell culture analyses in vitro; in vivo analyses using tissue samples from animal and human sources can also be readily performed. At the protein level, N-terminal and lysine amines are blocked by dimethylation (formaldehyde/sodium cyanoborohydride) and isotopically labeled by incorporating heavy and light dimethylation reagents or stable isotope labeling with amino acids in cell culture (SILAC) labels. Alternatively, easy multiplex sample analysis can be achieved using amine blocking and labeling with isobaric tags for relative and absolute quantification, also known as iTRAQ. After tryptic digestion, N-terminal peptide separation is achieved using a high-molecular-weight dendritic polyglycerol aldehyde polymer that binds internal tryptic and C-terminal peptides that now have N-terminal alpha amines. The unbound naturally blocked (acetylation, cyclization, methylation and so on) or labeled mature N-terminal and neo-N-terminal peptides are recovered by ultrafiltration and analyzed by tandem mass spectrometry (MS/MS). Hierarchical substrate winnowing discriminates substrates from the background proteolysis products and non-cleaved proteins by peptide isotope quantification and bioinformatics search criteria.

INTRODUCTION
Terminal amine isotopic labeling of substrates (TAILS) is a high-throughput quantitative proteomic platform for protease substrate discovery and N terminome analysis (Fig. 1) [1]. The TAILS workflow is composed of the following steps:

(i) Protein collection (REAGENT SETUP, Box 1) and proteolysis by test protease (if desired for substrate discovery; PROCEDURE Steps 1–6)
(ii) Isotopic labeling and primary amine blocking (PROCEDURE Steps 7–19), followed by tryptic digestion (PROCEDURE Steps 20–26)
(iii) High-efficiency, polymer-based negative selection of blocked peptides (PROCEDURE Steps 27–40)
(iv) Identification of N-terminal peptides by liquid chromatography (LC)-tandem mass spectrometry (MS; PROCEDURE Steps 41–42) and data analysis of the TAILS-tandem MS spectra (Figs. 2 and 3 and PROCEDURE Steps 43–55)
(v) Identification of protease substrates by the sequence of the cleavage sites, or loss of cleaved natural N-terminal peptides (Fig. 3 and PROCEDURE Steps 56–66)

The following step-by-step protocol describes exactly how to apply TAILS to the study of the substrate repertoire of a protease and simultaneously annotate the natural N terminome with all N-terminal modifications identifiable in the studied samples. This protocol includes a detailed description of three labeling approaches that can be used for TAILS. These were reported in the original TAILS publication [1] and its accompanying step-by-step protocol available on the Protocol Exchange [2], as well as in two sequential development publications [3,4]. This protocol is further streamlined and ready for routine adoption in the laboratory. In addition, we introduce an improved bioinformatics data analysis protocol specific to TAILS data.

Below we will review the following: proteases and proteolytic processing; positional proteomic approaches to study the natural N termini of proteins and protease substrate degradomics; an overview of TAILS; and the components of TAILS.

Proteases and proteolytic processing
The N-terminal sequence of proteins and the post-translational modifications of the α-amino group, or side chain of the N-terminal residue, determine the cellular localization [5], activity [6–9] and turnover of most proteins [10]. Hence, the sequence and nature of all the protein amino termini (N termini) within the proteome (the N terminome) provide valuable functional annotation [11–15], as translation start sites, N-terminal isoforms, modifications and truncations determine the cellular localization, activity and fate of most proteins [10]. As ~85% of eukaryotic proteins have an acetylated N terminus [16], and because all proteins undergo proteolysis either as part of protein maturation and secretion or by specific processing that alters functionality or during their degradation and

turnover [17], these are two of the most ubiquitous and most important post-translational modifications [18,19]. The protein amino terminus is susceptible to amino-terminal peptidase processing, modification of the α-amino group and side chain–specific changes that can target a protein for ubiquitination and degradation or protect it from rapid turnover, and thus determine its half-life [10]. In addition to constitutive proteolysis, regulated processing of protein amino termini can irreversibly change the protein activity or properties [6,8,9], but the full extent to which proteolysis sculpts the proteome is unknown [18]. Hence, it is important to determine the cleavage site(s) within each protease substrate, as the biological activity of the cleavage products is commonly determined by the precise fragmentation pattern.

With 569 members, proteases are the second largest enzyme class in humans [20] and make up 5–10% of drug targets [21]. To link a specific protease with a defined biological pathway or to develop drugs that can target it, it is necessary to determine the protease's substrate repertoire, or substrate degradome [22]. Once this is achieved, it may be possible to generate hypotheses on the protease's role and to provide biomarkers for disease diagnosis and drug efficacy studies. Nevertheless, for about half of the proteases in humans no substrates are known, and for the other half annotation of the substrate degradome is incomplete [17,22]. Thus, specific degradomics techniques [22] are needed to rapidly identify and quantify the N terminome for the Human Proteome Project, to reveal the functional state of key molecules and to identify new protease substrates and their cleavage sites.

Positional proteomic approaches to study protease degradomics
Starting from proteomes that differ only isotopically after labeling, the activation, inhibition or silencing of a selected protease can be readily distinguished with a mass spectrometer. However, in most cases, proteolytic activity induces little change in the treated proteomes (i.e., less than a few percent of tryptic peptides). Therefore, the application of a selection approach to enrich for protease-cleaved N termini of a proteome is needed to (i) reduce sample complexity, (ii) maximize MS detection of significant peptides, (iii) simplify MS data analysis and (iv) increase the dynamic range of MS analyses.

Positional proteomics approaches that isolate only the N-terminal peptides of proteins, the N terminome [11–15] (Table 1), have been proposed for sample simplification before MS analysis and for proteome annotation, but coverage is often limited [11–14,23,24]. Apart from combined fractional diagonal chromatography (COFRADIC) [11], most of these early approaches were not reported for global protease cleavage site analysis. The main obstacles to this latter goal are to identify the neo–N termini of the substrates generated by specific proteolysis and to distinguish these not only from the natural N termini, but also from N termini generated by background proteolysis of the proteins in a sample and by trypsin digestion (internal tryptic peptides) in proteomic workflows [17,25]. The high complexity of proteomes typically leads to incomplete coverage of the samples, particularly at the peptide level, making this task even more challenging (see supplementary discussion in ref. 1). Solving these problems requires innovative strategies to circumvent the very similar chemical properties of the primary amines of the lysine side chains and N termini; this has been termed the 'lysine problem' [1].

Recent reports of different proteomics-based approaches (Table 1) to tackle this difficult task include lysine guanidination blocking of intact proteins to expose only amino termini for biotinylation, followed by affinity capture [26] or amino-terminal isotopic labeling and in silico selection defined by MS/MS database search parameters of protease-generated neo-N-terminal peptides [27]; specific subtiligase enzyme-mediated biotinylation of unblocked α-amine groups, followed by enrichment of the biotinylated N-terminal peptides [15]; and MS/MS analysis of multiple gel slices of protease-cleaved samples [28,29]. Although these approaches represent a welcome

[Figure 1 panels, top to bottom: Proteolysis; Blocking of primary amines; Trypsin digestion; Reaction with the HPG-ALD polymer; Ultrafiltration; Enriched N-terminally blocked peptides: original and neo–N termini]

Figure 1 | Schematic representation of the TAILS workflow. Protease (represented by scissors) is either present in the sample or added to the sample, which generates a prime-side neo-N-terminal fragment and a non–prime-side neo-C-terminal fragment of substrates. Primary amines of natural and protease-generated N termini (red NH2), and lysine residues (red K) are chemically modified and blocked by dimethylation or iTRAQ (red stars) at the protein level. After mixing protease-treated and control proteins, the sample is digested with trypsin, which generates internal tryptic peptides with free N termini (green NH2). The newly formed internal tryptic peptides are removed by reaction of the tryptic peptide N-termini with the amine-reactive, high-molecular-weight, aldehyde-derivatized polymer in the presence of the reductive agent sodium cyanoborohydride. Isotopically labeled natural and neo-N-terminal peptides are separated from the polymer-internal tryptic peptide complexes by ultrafiltration. The unbound peptides in the eluate from the ultrafiltration device are now depleted of internal tryptic peptides and thus are highly enriched for natural and neo-N-terminal peptides. The sample is then analyzed and quantified by high-accuracy LC-MS/MS.
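
The selection logic summarized in Figure 1 can be mimicked in a few lines of Python. The following is only a minimal sketch under stated assumptions and is not part of the published protocol: the example sequence, the cleavage position and the names Peptide, argc_digest and tails_in_silico are hypothetical. Trypsin is modeled as cleaving after every arginine (lysines are blocked at the protein level, so they are skipped), and missed cleavages, proline effects and naturally blocked termini are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Peptide:
    sequence: str
    start: int            # 1-based position of the first residue in the intact protein
    n_term_blocked: bool  # True if the alpha-amine was labeled at the protein level


def argc_digest(fragment: str, offset: int = 0) -> List[Peptide]:
    """Cleave after every arginine; blocked lysines are not cleaved (assumption)."""
    peptides = []
    begin = 0
    for i, residue in enumerate(fragment):
        if residue == "R" or i == len(fragment) - 1:
            # Only the first peptide of a fragment carries the N terminus that
            # existed (and was therefore blocked) before trypsin digestion.
            peptides.append(Peptide(fragment[begin:i + 1], offset + begin + 1, begin == 0))
            begin = i + 1
    return peptides


def tails_in_silico(protein: str, protease_cut_after: Optional[int] = None) -> List[Peptide]:
    """Return the peptides expected in the ultrafiltration flow-through."""
    fragments = [(0, protein)]
    if protease_cut_after is not None:
        # The test protease acts before labeling, creating a neo-N terminus.
        fragments = [
            (0, protein[:protease_cut_after]),
            (protease_cut_after, protein[protease_cut_after:]),
        ]
    recovered = []
    for offset, fragment in fragments:
        for peptide in argc_digest(fragment, offset):
            # Negative selection: peptides with a free alpha-amine bind the
            # aldehyde polymer and are retained; blocked peptides pass through.
            if peptide.n_term_blocked:
                recovered.append(peptide)
    return recovered


if __name__ == "__main__":
    protein = "MAKQLRGSEKAVRLLDTR"  # hypothetical sequence
    for label, cut in (("control", None), ("protease-treated", 8)):
        found = [(p.sequence, p.start) for p in tails_in_silico(protein, cut)]
        print(label, found)
```

In this toy model the control sample yields only the natural N-terminal peptide, whereas the protease-treated sample additionally yields the neo-N-terminal peptide that begins at the cleavage site; the internal tryptic peptides never reach the flow-through.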


Box 1 | CELL CULTURE AND PROTEIN COLLECTION

Option A describes the procedure for cell culture for dimethylation or iTRAQ labeling, whereas option B describes the SILAC procedure.

(A) Cell culture for dimethylation and iTRAQ labeling ● TIMING 1–7 d (depending on cells grown and required amounts)
1. Grow cells in appropriate medium up to 70% confluence.
2. Decant media and wash cells extensively (at least three times) with PBS to remove serum proteins and any dead cells.
3. Add serum-free medium (i.e., the same medium used for growing the cells but without serum); we usually add 20 ml per T175 flask.
4. Grow cells overnight to synchronize them.
5. Decant medium and wash cells at least three times with PBS.
6. Add fresh serum-free, phenol-free medium. By using a lesser amount of medium than that used for normal cell culture, the secreted proteins will be more concentrated. The time of addition is set as the starting time.
7. Grow cells for the required time, which is usually 24 h, depending on the requirements of the experiment and tolerance of cells to serum-free conditions.
CRITICAL STEP After 24 h, serum starvation might occur. If cells are grown for shorter times, a larger number of flasks will be required to accrue sufficient quantities of protein in the medium. Note that this procedure is for secretome analysis, but it can be easily adapted for cell lysates or subcellular fractions.

Cell culture medium collection ● TIMING 1 h
8. Collect conditioned medium in 50-ml tubes (i.e., two flasks per 50-ml tube).
9. Centrifuge conditioned medium at 2,200g at 4 °C for 5 min to remove any cells.
10. Add protease inhibitors such as PMSF (1 mM final), EDTA (1 mM final) and E64 according to the experimental question being addressed. For a complete list of protease inhibitors for all classes of proteases, see reference 17. It is important to minimize any background proteolysis that inevitably occurs in all proteome samples after collection.
CRITICAL STEP Although excess and reversible protease inhibitors will be removed in the following steps by dialysis, when the protease of interest is added in vitro after secretome collection, inhibitors of that protease should be avoided.

Medium concentration and buffer exchange ● TIMING 3–6 h (depending on sample volume and protein concentration)
11. Filter supernatant using a Millipore Steriflip or equivalent.
PAUSE POINT At this point, it is possible to freeze the samples in liquid nitrogen and store them at −80 °C for a few months. However, for best results proceed with the next steps using the freshly prepared samples.
12. Apply the protein samples to protein concentration devices such as Millipore Amicon-Ultra 15 concentrators. Concentrate conditioned medium proteins at 4 °C following the manufacturer's instructions to ~1 ml volume. Please note that to minimize the time for this and the following steps, it is recommended to use several concentrators for treating each sample (i.e., one concentrator per 40 ml of collected conditioned medium proteins). The use of microconcentrators also avoids concentrating the yellow-colored riboflavin present in all cell culture medium, which occurs if solid-phase extraction C18 and C4 cartridges are used for protein concentration from conditioned cell culture medium [35].
13. Add 9 ml of the desired buffer to each concentrator. We recommend using 100 mM HEPES, pH 7.0.
CRITICAL STEP TAILS is based on the labeling of peptide primary amines, and thus other molecules with primary amines will interfere with the labeling step, resulting in incomplete labeling of peptides. Thus, primary amine-containing buffers such as ammonium bicarbonate or Tris must not be used. The purpose of the following buffer-exchange steps is to deplete the sample of free amino acids and other compounds with primary amines. If the protease of interest is added in vitro after secretome collection, the buffer of choice, pH and other additives should allow the optimal activity of the studied protease. We also recommend excluding any detergents, as these can interfere with MS later.
14. Concentrate the sample again to a volume of ~1 ml or less.
15. Repeat Steps 13 and 14 at least three times. (Optional) If several concentrators were used for each sample, pool all concentrates and bring them to a final volume of ~1 ml. Carefully recover as much protein as possible from each concentrator by gently pipetting the sample over the membrane before removal. Measure protein concentration using your method of choice, e.g., BCA or Bradford assay.
16. Bring protein concentration to ~1 mg ml−1 using your buffer of choice. We recommend using 100 mM HEPES, pH 7.0.
PAUSE POINT Samples can be stored at −80 °C after rapid freezing in liquid nitrogen. However, to avoid nonspecific cleavages, and for a more streamlined procedure, we recommend continuing to the next steps. In general, speed matters, and all results are improved by using fresh samples and avoiding storage and freezing at all times.

(B) Cell culture for SILAC
When dealing with protease substrate identification in the simplest experimental setup, two isotopes are used: one form for labeling the 'naive' proteome and another for the same proteome exposed to the protease. SILAC labeling offers two different strategies for generating proteomes. One approach follows proteolysis on live cells, by growing cells expressing the protease of interest in heavy medium (heavy arginine) and by growing the same cells expressing an inactive protease mutant or transfected with the vector in light medium. A convenient complementary approach examines in vitro protease activity, performed by growing protease-null cells separately in heavy and light medium and, after harvesting of the secretome in the conditioned medium, adding recombinant protease to the heavy proteome and inactive or inactivated protease to the light control proteome.
Adaptation of cells to SILAC medium and the complete incorporation of heavy amino acids into the proteome of the studied cells is a prerequisite for using metabolic labeling (SILAC-TAILS).
1. Grow cell line of choice in DMEM medium in the required amount of FBS. ● TIMING overnight


Box 1 | CONTINUED

2. Split cells and subculture in two parallel dishes, using light and heavy SILAC-supplemented medium, respectively. From this point on, heavy and light media must be used separately and preferably in parallel to minimize environmental differences.
CRITICAL STEP For SILAC, all steps should be performed in parallel for light- and heavy-labeled cells while verifying that heavy and light cells are cultured at the same growth rate. If the same cell line is used for heavy and light samples, always start culturing using the same number of cells in order to synchronize their growth. If cells with different growth rates are used for light and heavy samples, synchronize the time at which they will reach confluence by using appropriate amounts of each to compensate for the different growth rates.
3. Grow cells in culture medium in roller bottles or T-175 tissue culture flasks for at least seven doubling times to gain complete incorporation of SILAC amino acids. We recommend that the cells be split every day to ensure constant doubling. ● TIMING At least 7 d, assuming that the cell doubling time is 24 h.
CRITICAL STEP In each subculture, verify that cell morphology is not changed.
4. When cells are confluent, discard medium and wash gently three times with serum-free medium (or where possible, use room temperature (20–25 °C) PBS instead). ● TIMING 15 min
CRITICAL STEP In SILAC experiments, serum proteins can be technically distinguished from proteins of the cell proteome; nevertheless, it is always advisable, when possible, to adapt cells at the lowest serum concentration the cell line can withstand (serum protein leftover could mask low-abundance proteins of interest).
5. Detach cells using a few milliliters of Versene (or trypsin–200 mg ml−1 EDTA).
6. Centrifuge, separately, heavy and light SILAC cells at 500g for 5 min.
7. Wash each cell pellet with 5 ml of PBS and centrifuge again as in Step 6.
8. Resuspend pellet in 1–2 ml of PBS and use an aliquot to count live cells with a hemocytometer chamber using the Trypan-blue dye-exclusion method. ● TIMING 20 min
9. Split the remainder of each sample (light and heavy): keep 10% for the incorporation test (Steps 12–18) and freeze the rest (90%) as starting stocks for future experiments (Steps 10 and 11). ● TIMING 5 min
10. For storage, pellet light and heavy cells by centrifugation at 500g for 5 min.
11. Resuspend each pellet in heavy or light DMEM supplemented with 10% (vol/vol) DMSO and 20% (vol/vol) serum, freeze and store at −80 °C. Save and store separately the cells from a few flasks of heavy and light SILAC-labeled cells in 10% (vol/vol) DMSO and 20% (vol/vol) serum to serve as starting stocks for the actual experiments. ● TIMING 5 min
PAUSE POINT In the case where SILAC amino acid incorporation is confirmed, the frozen heavy and light cells can be used as stocks ready for subculture before TAILS or other experiments.
12. For a SILAC incorporation test, first pellet the cells and remove all liquid.
13. Resuspend cells in lysis buffer and leave at 4 °C for 5 min, pellet debris by centrifuging at 4 °C for 10 min at 700g and collect the supernatant while taking care to avoid DNA shearing and sample contamination. ● TIMING 20 min
14. Determine protein concentration by Bradford or BCA assay. Keep a small aliquot of each sample (i.e., light and heavy SILAC-labeled cells, respectively) and combine heavy and light cell lysates in a 1:1 ratio. ● TIMING 35 min
PAUSE POINT Samples can be stored at −20 °C for a few months.
15. Resolve heavy, light and heavy-light (1:1) proteome in three separate lanes by simple 1D gel SDS-PAGE. Load the same amount of proteome in each lane (70–90 µg is the recommended amount). ● TIMING 1.5 h
16. Run the gel following the manufacturer's recommendations, stain the gel with Coomassie blue R250 and destain. Choose a couple of protein bands and excise these from each of the three lanes. ● TIMING 2–12 h (depending on staining procedure)
PAUSE POINT Gels or cut gel slices can be stored at 4 °C for a short time as long as they are kept hydrated.
17. Perform trypsin in-gel digestion (follow any standard protocol suited for MS) [74]. ● TIMING 1 h, then overnight, then 1 h
CRITICAL STEP To check SILAC amino acid incorporation, it is not essential to obtain maximum peptide coverage from trypsin in-gel digestion. To save time, we recommend skipping the reduction and alkylation steps of the in-gel digestion procedure.
PAUSE POINT Gel-extracted tryptic peptides can be stored at −20 °C for long periods.
18. Analyze in-gel digested samples according to standard procedures commonly used in mass spectrometry proteomics. The labeling yield must be greater than 95%. ● TIMING 30–60 min
CRITICAL STEP SILAC cells are properly labeled only when all their proteins (meaning more than 95%) contain heavy amino acids. Hence, in the cell-labeling procedure, the percentage of light proteins should decrease to zero. Thus, the presence of light peptides in the heavy proteome would indicate that the amino acid incorporation was not complete. The degree of SILAC amino acid incorporation can be quickly determined by manually inspecting the MS1 spectra and performing a database search where the modified masses of the SILAC amino acids have been incorporated into the list of possible modifications. The heavy and light combined proteome should present pairs of tryptic peptides with the expected mass shift due to the isotopic label in a 1:1 ratio. Look for the presence of light tryptic peptides, thereby ensuring that these peptides are absent in the heavy proteome [75].
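
Two of the quantities in Box 1 follow from simple dilution arithmetic, sketched below in Python. This is an illustration under idealized assumptions, not part of the published protocol, and the function names are hypothetical. For SILAC it assumes that all newly synthesized protein is heavy and that pre-existing light protein is only diluted twofold per cell doubling (no turnover, no source of light amino acids). For buffer exchange it assumes that small amine-containing molecules pass freely through the concentrator membrane, so that each dilute-and-concentrate cycle (about 1 ml retained, 9 ml of buffer added) reduces them tenfold.

```python
def heavy_fraction(doublings: int) -> float:
    """Fraction of protein that is heavy after the given number of doublings."""
    # Pre-existing light protein is halved (by dilution) at every doubling.
    return 1.0 - 0.5 ** doublings


def min_doublings(target: float) -> int:
    """Smallest number of doublings whose heavy fraction exceeds the target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in the range [0, 1)")
    n = 0
    while heavy_fraction(n) <= target:
        n += 1
    return n


def residual_fraction(retained_ml: float, buffer_ml: float, cycles: int) -> float:
    """Fraction of a freely filterable solute left after repeated exchange cycles."""
    per_cycle = retained_ml / (retained_ml + buffer_ml)
    return per_cycle ** cycles


if __name__ == "__main__":
    print(f"heavy fraction after 7 doublings: {heavy_fraction(7):.4f}")
    print(f"doublings needed to exceed 95%: {min_doublings(0.95)}")
    print(f"residual solute after 3 cycles: {residual_fraction(1.0, 9.0, 3):.1e}")
```

Under these assumptions, seven doublings give about 99.2% heavy protein, comfortably above the 95% labeling yield required in Step 18 of option B, and three buffer-exchange cycles leave roughly 0.1% of the original small-molecule amines. Real samples may deviate from both figures, which is why the incorporation test remains necessary.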


step forward for studying protease-generated neo-N-terminal peptides, they are still limited in different respects: quantification [26]; coverage [27]; the amount of sample required, which can exceed 50–100 mg [15]; potential bias due to subtiligase specificity [15]; being MS analysis-intensive and hence expensive [11,28–30]; and, just as importantly, being incapable of analyzing naturally blocked N-terminal peptides [11–15,26,27]. Hence, these approaches are also less suited for the high-throughput, wide coverage of diverse N-terminal modifications required for the Human Proteome Project.

To date, COFRADIC [30–33] and TAILS [1,3,4] are the only N-terminomics approaches that provide both broad coverage and the isotopic quantification that is essential for the study of protease substrate degradomes with unknown or broad cleavage-site recognition motifs, as well as to completely annotate the N terminome [3,4,30–33]. Although the well-established COFRADIC is a more expensive and time-consuming procedure involving multiple enzymatic and chemical steps, multiple HPLC fractionations, and up to 150 MS/MS analyses per experiment [30], selected fractions can now be analyzed after pooling, thus reducing the total number of analyses and hence cost and time.

TAILS overview
To overcome these problems we developed TAILS1, a combined N-terminomics and protease substrate discovery degradomics platform for the simultaneous quantitative analysis of the N terminome and proteolysis on a proteome-wide scale (Fig. 1). TAILS is designed for comparison of multiple (from two up to eight) protease-treated and control proteomes. In TAILS, it is important to note that the primary amines of both N termini and lysine side chains are chemically derivatized before trypsin cleavage. This step, at the protein level, blocks the ‘native protein’ amines and can simultaneously introduce the stable isotope-labeled moiety. Next, the labeled proteomes are mixed and digested with trypsin. Because trypsin skips the blocked lysine residues and cleaves with ArgC specificity, most of the short neo–N-terminal peptides are effectively lengthened, which enables these longer peptides to be identified non-redundantly. After trypsin cleavage, the internal tryptic and C-terminal peptides are removed through the reactivity of the free N-terminal amines generated by trypsin cleavage. This step enriches for all forms of blocked N-terminal peptides by negative selection. To overcome nonspecific peptide binding and the low capacity of derivatized chromatographic beads, we developed a novel class of dendritic polyglycerol aldehyde polymers optimized for efficient, high-capacity tryptic peptide binding with virtually no nonspecific interactions. Through this massive sample simplification before mass spectrometric analysis, good proteome coverage with a dynamic range of six orders of magnitude can be obtained in as little as one MS/MS analysis1, although we now typically perform ten MS/MS analyses per sample following offline strong cation exchange (SCX) chromatographic fractionation to further improve coverage3,4. Rather than deliberately excluding acetylated proteins12,15,26,27, TAILS provides wide coverage of all forms of naturally blocked N-terminal peptides and, in many cases, allows for their quantification through isotopic labeling

[Figure 2 graphic. Panel a: natural N termini and neo–N termini, with and without protease, labeled by dimethylation or SILAC (MS1 quantification; intensity versus elution time) or by iTRAQ (MS2 quantification; intensity versus m/z). Panel b: elution profiles (intensity versus elution time); light: 348.868 m/z, +3, area 0, maximum intensity 0; heavy: 350.881 m/z, +3, area 7.38 × 10^6, maximum intensity 4.05 × 10^5. Panel c: MS/MS spectrum (intensity versus m/z, 100–1,300) with an inset of the reporter region (113–115).]

Figure 2 | Peptide relative quantification with dimethylation, SILAC or iTRAQ isotopic-labeling strategies. (a) Schematic representation of the different labeling and quantification strategies of natural N-terminal and protease-generated neo-N-terminal peptides. Primary amine-reductive dimethylation and SILAC-labeling strategies can be used for peptide-relative quantification in MS1 mode. For reductive dimethylation (left column), primary amines (lysine and N termini) are labeled with the heavy (red star) or light (blue star) formaldehyde at the protein level for the protease-treated and the control sample, respectively. In SILAC labeling, proteins are isotopically labeled by culturing cells in media containing heavy (red circled R) or light (blue circled R) arginine for the protease-treated and the control sample, respectively. Proteins are then isolated and their primary amines are blocked by reductive dimethylation using light formaldehyde for both samples (blue star). Tryptic peptide pairs from natural N termini and basal degradation products unaffected by the protease are equal in abundance in the protease-treated and control samples. The elution profiles of each peptide pair from the reverse-phase column have similar intensities, as represented in the top two chromatograms. Neo–N termini are only found in the protease-treated samples and are found as singletons (bottom two chromatograms; the light-labeled versions of peptides are not present). For quantification by MS/MS (right), protein primary amines of the protease-treated and control samples are labeled with iTRAQ reagents containing the heavy (115 Da, red square) or light (114 Da, blue square) reporter moiety, respectively. The iTRAQ reporter ions are released upon collision-induced fragmentation of the peptides and give peaks in the low-molecular-mass region of the peptide MS2 spectra. Natural N termini and basal degradation products have iTRAQ reporter ions of equal intensity, and neo–N termini have only the 115-Da reporter ion. (b) Example of a heavy-labeled (SILAC labeling strategy) singleton identified in the initial mass-to-charge-ratio (m/z) spectrum (MS1 mode). The chromatographic elution profiles of the light (top) and heavy (bottom) peptide pair are presented. Only the heavy-labeled peptide is detected, indicating that this peptide is only found in the protease-treated sample. (c) MS/MS spectrum of the protease-generated peptide EVGAPGAPGGKGDSGAPGER, in which only the 115-Da iTRAQ reporter ion is found. Inset: zoom-in of the low-molecular-mass region of the spectrum.


of lysine within the peptide sequence1,3,4. The labeling approaches used for TAILS introduce either nonbiological (isobaric tags for relative and absolute quantification (iTRAQ)) or biologically rare (dimethylation) protein N-terminal modifications34, allowing very clear discrimination between naturally blocked protein N termini and naturally free protein N termini that were blocked during the TAILS procedure. In addition to annotating the proteome, we use the abundance ratio distribution of these natural N-terminal peptides to form a statistical classifier for the TAILS experiments, which determines statistically valid isotope ratio cutoffs.

To study the substrate repertoire of a specific protease (also known as the substrate degradome) using TAILS, the protease of interest and an inactivated form can be added separately to the distinct proteomes to be compared. The protease can be added in vitro after proteome isolation1,3,4. Alternatively, cells expressing the active protease can be compared with cells expressing the inactive form of the protease1,35,36, or with cells in which protease expression is knocked down or its activity is inhibited37. Proteomes of cells or tissues from protease-knockout mice can also be compared with those from wild-type animals, with or without induction of a specific stress or disease state1.

The ability to reliably identify substrates and their cleavage sites requires two essential steps: high-confidence peptide identification and high-confidence peptide quantification. Quantification is required to correctly distinguish protease-generated neo–N termini from background proteolysis, which presents itself as stable cleavage fragments that are present in every sample. For substrates of a protease that cleaves a canonical recognition motif, such as caspases and granzymes, this is straightforward: the N termini of the identified peptides are simply scanned manually for the protease-recognition motif. However, for proteases with unknown or broad cleavage specificities, this cannot be done. We have solved this problem by peptide quantification. The neo-N-terminal peptides specific to the protease of interest appear only in the protease-treated sample and therefore show a high protease/control abundance ratio, which distinguishes them from background proteolysis products that appear in all samples with abundance ratios centered on 1. In TAILS, a stable isotope is introduced either by chemical modification or by metabolic labeling (Fig. 2). Chemical labeling can be performed universally on any biological source of proteome, including human body fluids and biopsy samples, whereas metabolic labeling strategies require cells in culture.
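The ratio-based discrimination described above, together with the 3 × s.d. cutoff given in the Figure 3 legend, can be sketched in a few lines. This is an illustrative sketch only, not the TAILS-ANNOTATOR implementation: the function and label names are hypothetical, and centering on the mean (rather than the median) of the natural N termini is an assumption.

```python
import math
import statistics

HIGH = "high ratio (candidate neo-N terminus)"
LOW = "low ratio (candidate depleted natural N terminus)"
UNCHANGED = "unchanged (background or natural N terminus)"


def classify_by_natural_termini(log2_ratios, natural_ids, n_sd=3.0):
    """Classify peptides from their log2(protease/control) ratios.

    log2_ratios: dict mapping peptide -> log2 ratio (math.inf for singletons
                 seen only in the protease-treated sample).
    natural_ids: peptides annotated as natural N termini; their finite ratios
                 define the normalization offset and the experimental spread.
    """
    natural = [log2_ratios[p] for p in natural_ids]
    center = statistics.mean(natural)      # systematic offset to remove
    cutoff = n_sd * statistics.stdev(natural)
    calls = {}
    for peptide, ratio in log2_ratios.items():
        normalized = ratio - center
        if normalized > cutoff:
            calls[peptide] = HIGH
        elif normalized < -cutoff:
            calls[peptide] = LOW
        else:
            calls[peptide] = UNCHANGED
    return calls


if __name__ == "__main__":
    # Hypothetical ratios: five natural N termini plus three other peptides.
    ratios = {"nat1": -0.1, "nat2": 0.0, "nat3": 0.1, "nat4": 0.2, "nat5": -0.2,
              "neo": 3.0, "depleted": -2.0, "singleton": math.inf}
    natural = ["nat1", "nat2", "nat3", "nat4", "nat5"]
    for peptide, call in classify_by_natural_termini(ratios, natural).items():
        print(peptide, "->", call)
```

Peptides that fall inside the cutoff are treated as unchanged, which covers both natural N termini and background proteolysis products present in all samples.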

Dimethylation-TAILS
Chemical labeling can be performed in a single step using amine-reactive isotopic reagents. Isotopic labeling by dimethylation of

[Figure 3 graphic. Flowchart components: mass spectrometer raw files; mzWiff, ReAdW and msconvert; mzXML; MzXML2Search (.mgf); Mascot and X! Tandem; Mascot2XML and Tandem2XML (pepXML); spectrum-to-peptide assignment (TPP); PeptideProphet with quantification (XPRESS, Libra); iProphet; PepXML viewer; tab-delimited file (.xls); TAILS-ANNOTATOR (positional annotation: natural termini with Met intact, Met removed, signal peptide removed or propeptide removed, and internal; double validation by peptide, oxidation state, charge and CID). Inset: density versus log2(protease/control), centered on 0, with depleted natural N termini and generated neo-N termini lying beyond the 3 s.d. cutoffs; substrate identification based on natural N termini.]

Figure 3 | Schematic representation of the bioinformatics analysis pipeline for TAILS data. Raw mass spectrometer output files are first converted to the open format mzXML. For high-confidence peptide assignments, spectra are searched using at least two search engines (e.g., Mascot and X! Tandem). Search results are validated by PeptideProphet, and quantification analysis is performed by the appropriate analysis tool (XPRESS or Libra). The results for each biological sample are merged, and peptide identifications are validated a second time using iProphet with a <1% false discovery rate. These high-confidence N-terminal peptides are analyzed by the TAILS-ANNOTATOR software, which assigns their position in the mature protein and checks for double validation based on multiple identifications in different charge states, methionine (Met) oxidation states and the collision-induced dissociation (CID) number. Abundance ratios of natural N termini (Met1 intact, Met1 removed, signal peptide removed, propeptide removed, all with or without N-terminal acetylation or N-terminal cyclization) are expected to be centered on 1.0 and are used to normalize data for experimental systematic errors. The normalized log2-ratio distribution of natural N termini defines the experimental variation and is used to determine neo–N termini generated, or natural N termini depleted, by activity of the test protease with a stringent cutoff of three times the standard deviation (3× s.d.). High protease/control ratios or singletons correspond to protease-generated neo–N termini. These can be easily distinguished from background proteolysis products that occur in both samples and from natural N-terminal peptides, which have an isotope ratio centered on 1.0. Low-ratio peptides correspond to protease cleavages close to the original N terminus, which deplete the original mature N-terminal peptide from the sample. Hence, low-ratio peptides provide strong but indirect evidence for cleavage based on substrate depletion. Alternatively, for iTRAQ labeling and quantification, a more sophisticated statistically based data analysis pipeline is described by auf dem Keller et al.4.


Table 1 | Positional LC degradomics approaches to study N termini and proteolysis products.

Technique: Acetylation of primary amines followed by tryptic digestion and biotinylation of the free N termini of the internal tryptic peptides12,14
Features:
- Chemical labeling (acetylation) of free N termini and lysines
- Negative selection of blocked N termini
- Impossible to distinguish between naturally blocked N termini and naturally free internal N termini after acetylation
- Low numbers of peptides reported
- Lacks utility for comprehensive proteome N-terminomics suited for the human proteome project
- No isotopic labeling and thus lacks quantitative aspects
- Cannot distinguish between background proteolysis products and experimental proteolysis products
- Was not reported for the study of proteolysis

Technique: Lysine guanidination followed by biotinylation of protein N termini26
Features:
- Specific chemical blocking of lysine residues first; completeness is hard to reach
- Chemical tagging of free protein N termini for positive selection of N-terminal peptides
- Does not capture naturally blocked N termini and thus lacks the ability for statistical modeling using non-cleaved peptides and lacks utility for comprehensive proteome N-terminomics suited for the human proteome project
- Does not use isotopic labeling and thus lacks quantitative aspects
- Without labeling, and unless cleavage-site specificity is already known, it cannot separate background proteolysis products from the experimental protease cleavage sites
- Suited mainly for proteases with known sequence specificity, which can be used to manually curate the data sets, or for nonspecific proteolytic signatures of samples

Technique: Subtiligase biotinylation of protein N termini15
Features:
- Enzymatic labeling of protein N-terminal peptides (requires patent-protected enzyme)
- Advantageous, as it does not introduce lysine-blocking chemicals
- Without lysine blocking, many cleaved neo-N-terminal peptides will be too short for nonredundant peptide/protein identification
- Output highly dependent on subtiligase specificity, which is biased
- Does not capture naturally blocked N termini
- Lacks utility for comprehensive proteome N-terminomics suited for the human proteome project
- Requires 50–100 mg of peptide sample
- Does not use isotopic labeling and thus lacks quantitative aspects
- Without labeling, and unless cleavage-site specificity is already known, it cannot separate background proteolysis products from the experimental protease cleavage sites
- Therefore, only suitable for proteases with known sequence specificity so as to manually curate the data sets

Technique: iTRAQ-labeling of protein N termini27
Features:
- High sample complexity (no enrichment of N termini)
- iTRAQ reagents are moderately expensive
- Restricted to MALDI mass spectrometers
- In silico selection of neo–N termini
- Very low neo-N-terminal peptide numbers reported
- Does not capture naturally blocked N termini
- Lacks utility for comprehensive proteome N-terminomics suited for the human proteome project

Technique: Combined fractional diagonal chromatography (COFRADIC)11,30,31,33
Features:
- Negative selection of N termini
- Separates both naturally blocked N termini and protease-generated neo–N termini but requires different labels to distinguish natural- versus chemical-labeled acetylated N termini
- Separation depends on amino acid modifications such as methionine oxidation not occurring during sample handling; otherwise the peptides elute from the second HPLC column at a different position
- Can suffer from high carryover of unlabeled tryptic peptides in analyses, but these can be distinguished by search parameters
- Flexible labeling options (C-terminal 18O, acetylation or SILAC reported)
- Quantitative and position-based identification of neo–N termini
- High-ratio neo-N-terminal peptides provide direct evidence for substrates
- Requires multiple chemical processing and complex chromatography and mass spectrometry schemes
- Up to 150 MS/MS analyses per sample, but samples can be pooled, and thus the number of analyses can be reduced
- Highly suited for proteases with broad or unknown specificity
- Well established and successful methodology

(continued)


two samples using either 12CH2-formaldehyde (light) or 13CD2-formaldehyde (heavy), with sodium cyanoborohydride (NaBH3CN) as the catalyst38, was originally described for TAILS by Kleifeld et al.1. Formaldehyde is a well-characterized protein and peptide modifier that readily reacts with primary amines (and to a lesser extent with thiols), converting them into imines, also called Schiff bases, which can then crosslink with several amino acid residues (including glutamine, asparagine, tryptophan, histidine, arginine, cysteine and tyrosine)39. However, addition of NaBH3CN reduces the Schiff base to a secondary amine. As this is more reactive than the primary amine, it reacts with another formaldehyde unit and is then reduced to form a dimethylamino group.

Dimethylation (as well as iTRAQ, see below) maintains the ionic state of the peptide and actually assists ionization, as observed in a 15% higher coverage of dimethylated samples38. Quantification of such labeling is performed on the initial mass-to-charge-ratio (m/z) spectrum of a peptide (also known as MS1 mode) (Fig. 2). This labeling approach is fast, robust and efficient and uses relatively cheap reagents (~$1 per labeling reaction). The labeling procedures are carried out separately for the control and protease-treated samples before pooling. When using dimethylation labeling, sample comparisons are typically limited to two samples in a duplex experiment. However, it is often advantageous to compare more than two samples simultaneously. To do so, isotopically labeled cyanoborohydride can allow the multiplexing of three samples using the dimethylation protocol40. Alternatively, chemical labeling using iTRAQ or tandem mass tag reagents can allow up to eight samples to be compared41,42. Labeling can also be accomplished in vivo by stable isotope labeling with amino acids in cell culture (SILAC) metabolic labeling, in which up to five samples43 can be simultaneously compared. However, SILAC is too expensive for most laboratories to use in animals and is not feasible for human tissue analysis.
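For MS1 quantification, it helps to know how far apart the light and heavy members of a dimethylated peptide pair should appear. The sketch below computes this from standard monoisotopic atomic masses. It assumes that every labeled amine is fully dimethylated, that the hydride supplied by NaBH3CN is 1H (unlabeled reductant), and that the helper name `pair_spacing_mz` is purely illustrative.

```python
# Monoisotopic atomic masses in Da (standard values, rounded to 8 decimals).
H1 = 1.00782503
D2 = 2.01410178
C12 = 12.0
C13 = 13.00335484

# Each methyl group adds one carbon and two hydrogens from formaldehyde
# (net CH2 per methyl); two methyl groups are added per primary amine.
LIGHT_DIMETHYL = 2 * (C12 + 2 * H1)   # 12CH2-formaldehyde, about +28.0313 Da
HEAVY_DIMETHYL = 2 * (C13 + 2 * D2)   # 13CD2-formaldehyde, about +34.0631 Da


def pair_spacing_mz(n_labeled_amines, charge):
    """Expected m/z distance between heavy and light forms of one peptide.

    n_labeled_amines: number of dimethylated primary amines in the peptide
                      (N terminus plus lysine side chains).
    charge:           precursor charge state.
    """
    if charge < 1 or n_labeled_amines < 0:
        raise ValueError("charge must be >= 1 and amine count >= 0")
    return n_labeled_amines * (HEAVY_DIMETHYL - LIGHT_DIMETHYL) / charge


if __name__ == "__main__":
    # A neo-N-terminal peptide with one internal lysine has two labeled amines.
    print(round(pair_spacing_mz(n_labeled_amines=2, charge=2), 4))
```

A naturally blocked (e.g., acetylated) N-terminal peptide carries the label only on its lysines, so it is quantifiable by dimethylation only if it contains at least one lysine.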

SILAC-TAILS
The SILAC procedure is easy for any laboratory that uses cell culture. Therefore, this strategy can be routinely used as a labeling technique. Metabolic labeling allows the analysis of ex vivo processing or inhibition of a given protease using underivatized biological samples. Further, metabolic labeling allows mixing of the examined proteomes at early stages of the procedure, thus reducing systematic errors and experimental noise caused by the separate handling of different samples that is inherent in chemical labeling. Another advantage of metabolic labeling over chemical tagging of proteins is that it enables fast, reliable discrimination between authentic cell-derived proteins and contaminants such as serum proteins and human keratin. Using proper combinations of amino acids, SILAC can be used for multiplex analysis of up to five samples43; however, it is generally a prohibitively expensive option for use in animal experiments and, crucially, SILAC is not suitable for the analysis of clinically relevant human samples that cannot be metabolically labeled.

By using arginine and lysine as labeled amino acids in combination with trypsin digestion, all peptides are quantifiable, with two exceptions: the carboxyl-terminal peptides of the proteins and rare semitryptic internal peptides (which will be subsequently generated by trypsinization in the workflow) occurring N-terminal to a protease cleavage site. Note that the blocking of N-terminal and lysine residue primary amines, which is fundamental for TAILS, is performed after mixing the proteomes and before trypsinization, and can be done with any amine-reactive reagent (Fig. 2). We choose to perform this step by dimethylation using the light version of formaldehyde for the reasons specified above44.

A disadvantage of SILAC is the price of stable isotope-labeled amino acids. However, chemical derivatization of the lysine side-chain amine group in TAILS imposes ArgC specificity on trypsin, which results in all peptides having a C-terminal arginine. Accordingly, only isotope-labeled arginine is necessary for N-terminal and neo-N-terminal peptide detection and quantification with SILAC. When only one amino acid is used for isotopic labeling, only two samples can be simultaneously compared.

iTRAQ-TAILS
Labeling by dimethylation or SILAC results in increased sample complexity in MS1 mode because each peptide from the different samples is labeled to show a unique mass difference. Identical peptides in the combined samples are therefore detected at different m/z values, which doubles the number of peaks in duplex experiments for MS analysis. Therefore, this increase in

Table 1 | Positional LC degradomics approaches to study N termini and proteolysis products (continued).

Technique: Terminal amine isotopic labeling of substrates (TAILS)1,3,4
Features:
- Negative selection of all N termini
- Identifies all naturally blocked N termini, all unblocked N termini and protease-generated neo–N termini
- Suitable for human proteome project N-terminome analysis
- Flexible and can be easily adapted to different sample preparation schemes or isotopic-labeling methodologies (dimethylation, SILAC and multiplexing; e.g., iTRAQ is possible)
- High-ratio neo-N-terminal peptides provide direct evidence for substrates
- Increased substrate coverage by analysis of low-ratio peptides providing strong but indirect evidence for cleavage based on substrate depletion
- iTRAQ reagents are moderately expensive
- Polymer is commercially available
- Polymer chemistry enables simple and highly efficient removal of internal tryptic peptides
- High-efficiency separation enables small sample amounts to be used (100–300 µg)
- One to ten MS/MS analyses required per sample
- Quantitative and position-based identification of neo–N termini
- Highly suited for proteases with broad or unknown specificity


sample complexity with every isotopic combination reduces the chances for identification and correct quantification of low-abundance proteins. Although spectral counting for quantification45,46 could be performed after amine blocking, >15 spectra are required for each peak quantified to be statistically valid; this greatly increases the number of MS analyses required, and hence the cost, to achieve statistical significance for every peptide in large data sets. This statistical rigor often effectively negates any cost and time savings originally hoped for when applying spectral counting.

To more easily analyze and quantify multiple samples simultaneously, which enables adding the time dimension to static time-point proteomics or more reproducible analysis of sample replicates, we developed a third labeling option using iTRAQ reagents, as described by Prudova et al.3,4. iTRAQ-TAILS labeling and MS analysis come at an intermediate price point and provide the following advantages: (i) four or up to eight samples can be simultaneously analyzed in multiplex experiments by using the four- and eight-plex iTRAQ reagents, respectively; (ii) no increase in sample complexity in MS1 mode; and (iii) signal amplification of low-abundance proteins, as intensities for the same peptide/parent ion from the contributing samples are effectively added together in MS1, followed by fragmentation in MS2 for high-accuracy identification and quantification from the same spectra (Fig. 2). As with dimethylation, the labeling procedure must be carried out separately for the control and protease-treated sample(s), but any minor systematic errors such as pipetting differences are readily corrected and normalized during data analysis using the ratio distribution of the natural N-terminal peptides, which are unaffected by the investigated protease. Hence, it is important to be able to analyze the natural N termini to make any such minor corrections.

TAILS components
Tryptic digestion of labeled samples. As in many other proteomics procedures, the sample must be digested to prepare it for proteomics analysis. It is highly recommended to use trypsin for this purpose, although GluC or chymotrypsin can also be used. Of note, using endopeptidases with similar or overlapping specificity to the test protease might reduce the number of identified peptides47, but the neo-N-terminal peptides generated by the test protease can be positively identified by the label at their N termini. In TAILS, trypsin cannot cleave after derivatized (i.e., by iTRAQ or dimethylation) lysine, and thus will cleave with ArgC specificity (cleaving C-terminal to arginine residues only). This generates longer peptides and significantly improves the likelihood of identifying neo-N-terminal peptides that have already been shortened by the protease under study1. In general, we found the MS/MS assignment to be very good, as indicated by the low false discovery rate in our data sets (<1% at the peptide level). Nevertheless, not all generated tryptic N-terminal peptides are amenable to MS/MS; therefore, repeating TAILS experiments with proteases other than trypsin increases the coverage of the N-terminal peptide repertoire.
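The effect of lysine blocking on trypsin specificity can be illustrated with a small in silico digestion. This is a simplified sketch: the function name `digest` is hypothetical, cleavage is assumed to be complete, and the common rule that trypsin cleaves poorly before proline is deliberately ignored. The example sequence is the peptide shown in Figure 2c.

```python
def digest(sequence, lysines_blocked):
    """Cleave C-terminal to R (and to K only if lysines are not blocked).

    Simplifying assumptions: complete cleavage, no missed cleavages and no
    proline rule. With blocked lysines the result mimics ArgC specificity.
    """
    cleavable = {"R"} if lysines_blocked else {"K", "R"}
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue in cleavable:
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):          # trailing peptide without K/R
        peptides.append(sequence[start:])
    return peptides


if __name__ == "__main__":
    peptide = "EVGAPGAPGGKGDSGAPGER"
    print("free lysines:   ", digest(peptide, lysines_blocked=False))
    print("blocked lysines:", digest(peptide, lysines_blocked=True))
```

With free lysines the sequence is split at the internal lysine; with blocked lysines it survives as a single, longer peptide, which is the form identified in the TAILS workflow.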

Negative selection of blocked peptides using HPG-ALD polymer. This step eliminates internal tryptic peptides and enriches the naturally blocked, as well as the isotope-labeled and blocked, N-terminal peptides by negative selection. In the previous steps, the original N termini of the proteins and protease-generated neo–N termini were chemically blocked (i.e., by iTRAQ or dimethylation); thus, together with the naturally blocked N termini of proteins (e.g., by acetylation, methylation or cyclization), they all have unreactive N termini. Trypsin digestion generates internal peptides with free, reactive N termini. The high-molecular-weight, aldehyde-derivatized (HPG-ALD) polymers that we developed for TAILS contain multiple aldehyde functional groups that readily react with and bind the free N termini of internal tryptic and C-terminal peptides when mixed with the digested sample in the presence of the reducing agent NaBH3CN. In contrast, the naturally blocked and chemically labeled mature N-terminal and neo-N-terminal peptides (as well as blocked amino groups of lysine side chains) are unreactive and remain unbound. These blocked peptides are separated from the polymer-bound internal tryptic peptides and recovered by ultrafiltration.

This protocol refers to HPG-ALD type II (HPG-ALDII), which is the preferred polymer for TAILS because it provides a good balance between aldehyde content and peptide binding capacity (Mn ~90 kDa with ~500 functional groups per molecule)1. HPG-ALDII polymer has a binding capacity of 2.5 mg of peptide per milligram of polymer. However, there are different versions of the HPG-ALD polymer, each with a different binding capacity1. Therefore, if a different version of HPG-ALD is used, the amount of polymer for the capture should be modified accordingly. The polymers are readily obtained, without commercial or company restriction, from Flintbox Innovation Network, The Global Intellectual Exchange and Innovation Network (http://www.flintbox.com/public/project/1948/). To ensure complete binding of internal tryptic peptides, we recommend using an amount of polymer whose binding capacity is in approximately fivefold excess of the peptides. Thus, 100 µg of digested proteome should be incubated with 200 µg of polymer (200 µg of HPG-ALDII binds 500 µg of peptides, representing a fivefold excess of polymer). Removal of the internal tryptic peptides results in an approximately 95% decrease in the total peptide content. Therefore, for 100 µg of starting material, a maximum of ~10 µg of peptides can be recovered in the N-terminal–enriched sample. The peptide content is, in fact, lower (around 2.5 µg) because of sample loss through the different steps.
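The polymer amount follows directly from the binding capacity and the recommended fivefold excess. The short sketch below reproduces the worked example in the text (100 µg of digest, 200 µg of HPG-ALDII); the function name is hypothetical, and the capacity must be replaced with the appropriate value if a different HPG-ALD version is used.

```python
HPG_ALD_II_CAPACITY = 2.5   # mg peptide bound per mg polymer (value from the text)


def polymer_needed_ug(peptide_ug, capacity=HPG_ALD_II_CAPACITY, fold_excess=5.0):
    """Micrograms of polymer whose binding capacity is `fold_excess` times
    the amount of digested peptide to be captured."""
    if peptide_ug < 0 or capacity <= 0 or fold_excess <= 0:
        raise ValueError("amounts, capacity and excess must be positive")
    return fold_excess * peptide_ug / capacity


if __name__ == "__main__":
    # 100 µg of digested proteome, fivefold excess of HPG-ALDII capacity.
    print(polymer_needed_ug(100.0), "µg of polymer")
```

Because the capacity is per unit mass, the same function works in milligrams as long as input and output use the same unit.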

TAILS LC-tandem MS. TAILS-enriched N-terminal peptides have been analyzed on quadrupole time-of-flight QStar3,4 and LTQ-Orbitrap1 mass spectrometers, but they can also be analyzed on any similar or better tandem mass spectrometer. An LTQ-Orbitrap mass spectrometer is preferred for SILAC and dimethylation because of its fast duty cycle time and high mass accuracy. Using the LTQ-Orbitrap, TAILS data coverage has proven excellent without sample prefractionation steps, owing to the massive sample simplification achieved after removal of the internal tryptic peptides1. However, higher coverage and potentially better quantification accuracy can be obtained using a 2D peptide separation system following TAILS3,4, such as SCX chromatography to generate ten fractions, each of which is then analyzed separately on the mass spectrometer.

Use of hydrogen-based isotopes can lead to partial resolution of light and heavy tags during LC separation, which may reduce the accuracy of quantification48. However, there are contradicting reports in the literature regarding such isotopic effects when deuterated formaldehyde is used for dimethylation23,38,40,49, and the effect also seems to depend on the specific LC characteristics used40. If prefractionation is performed, it is important to choose an LC step that is not likely to be affected by the deuterium effect, such as SCX40. An advantage of TAILS is that its labeling is flexible: if the deuterium effect proves to be a problem, it is simple to switch labeling strategies.

Data analysis of the TAILS-tandem MS spectra. Unlike most proteomics procedures, the successful outcome of TAILS negative selection is the generation and isolation of peptide ‘single hits’, where a protein identification is based on a single peptide (i.e., only the original or neo-N-terminal peptide of each protein)50. This was formerly also a concern in phosphoproteomics, in which single phosphorylated peptides are often used to identify proteins. Nevertheless, unlike most positive-selection N-terminome procedures, negative selection approaches such as TAILS and COFRADIC often identify both the mature or blocked original N terminus and one or more neo-N-terminal peptides. Indeed, by TAILS, typically ~50% of proteins are identified by two or more different and unique peptides3,4, and for these proteins TAILS therefore meets the standard of conventional proteomics analyses for protein identification.

Solving the single-peptide problem using a double-validation workflow. For the remaining ~50% of proteins identified by one peptide, setting the selection criteria on a single search tool and using a strict cutoff might seem to be a simple approach. However, coverage is reduced, and spectra assigned this way do not give as high a confidence in peptide (and hence protein) identification as when two or more independent identifications of the same peptide are used for cleavage-site identification and two or more peptides are used for protein substrate identification. To address those substrates identified by only one peptide, we set statistical and bioinformatics criteria to ensure that each peptide is identified by at least two different high-quality spectra, either in different biological replicates (preferred), at different points of the LC elution (such as from missed tryptic cleavages), or in two or more charge or modification states, thereby forming what we term a ‘double-validated’ strategy1. Identification of a peptide in two states or forms can arise from different charge states or from modifications to the peptide, such as methionine oxidation or amino acid deamidation. By applying double-validation criteria to N-terminal peptide analysis, a very high confidence in single-peptide identifications, and hence cleavage sites, is achieved. However, this approach is quite conservative: the numbers of cleavage sites and substrates identified by TAILS are lower, because many potential sites and substrates are discarded for not being identified in two samples, forms or states. Nevertheless, TAILS experiments identifying >1,000 high-confidence double-validated peptides are relatively easy to produce. Less strict criteria can be used, but such cleavage sites are identified with less confidence and need independent validation, such as by using in vitro cleavage assays.
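The double-validation criterion can be expressed as a simple grouping operation over peptide-spectrum matches (PSMs). The sketch below is illustrative and is not the TAILS-ANNOTATOR code: the `PSM` record and its field names are hypothetical, the input is assumed to contain only PSMs that already passed the identification filter, and the criterion of distinct LC elution points is omitted for brevity.

```python
from collections import defaultdict
from typing import NamedTuple


class PSM(NamedTuple):
    """One high-confidence peptide-spectrum match (hypothetical record)."""
    peptide: str               # unmodified peptide sequence
    replicate: str             # biological replicate identifier
    charge: int                # precursor charge state
    modifications: frozenset   # e.g. frozenset({"Oxidation(M)"})


def double_validated(psms):
    """Return peptides observed in at least two distinct forms, where a form
    is a (replicate, charge state, modification state) combination."""
    forms = defaultdict(set)
    for psm in psms:
        forms[psm.peptide].add((psm.replicate, psm.charge, psm.modifications))
    return {peptide for peptide, seen in forms.items() if len(seen) >= 2}


if __name__ == "__main__":
    no_mods = frozenset()
    psms = [
        PSM("AAR", "rep1", 2, no_mods),
        PSM("AAR", "rep2", 2, no_mods),                         # second replicate
        PSM("MGR", "rep1", 2, no_mods),
        PSM("MGR", "rep1", 2, frozenset({"Oxidation(M)"})),     # second mod state
        PSM("TTR", "rep1", 3, no_mods),
        PSM("TTR", "rep1", 3, no_mods),                         # same form twice
    ]
    print(sorted(double_validated(psms)))
```

Repeated spectra of the same form in the same replicate do not count as independent evidence in this sketch, which keeps the criterion conservative.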

The TAILS double-validated bioinformatics workflow1,3,4 is shown in Figure 3. We choose to analyze the data using the open-source Trans-Proteomics Pipeline (TPP) software from the Institute for Systems Biology in Seattle51, which allows input from different mass spectrometers and can incorporate MS/MS search results from different search engines, including free and open-source engines such as X! Tandem and OMSSA, as well as the commonly used commercial programs Sequest and Mascot. We provide examples of neo-N-terminal peptide identification based on data originating from a Thermo LTQ-Orbitrap instrument (‘.RAW’ file format) for dimethylation-TAILS and SILAC-TAILS, and from a QStar XL instrument (‘.wiff’ file format) for iTRAQ-TAILS, both analyzed by Mascot52 and X! Tandem53 for uninterpreted database searches. The naturally blocked N-terminal peptides are identified by applying appropriate database search parameters (i.e., including peptide N-terminal acetylation or cyclization) and by using protein database annotations. Data analysis instructions (Steps 43–54) exemplify this strategy for N-terminally acetylated proteins. These steps are described briefly; detailed information regarding the use of TPP can be found on the TPP wiki (http://tools.proteomecenter.org/wiki/index.php?title=Main_Page), in the tutorial (http://tools.proteomecenter.org/wiki/index.php?title=TPP_Demo2009), on the TPP users discussion list (http://groups.google.com/group/spctools-discuss?pli=1) and in the links and references provided below.

First, for the identification of N-terminal peptides, database searches are performed using at least two search engines. Combining the outcome of two search engines ensures higher numbers of peptide identifications54,55 and also improves the validity of spectrum-to-peptide identification for those peptides found by both search engines55,56. For this purpose, we use Mascot and X! Tandem, but other search engines can also be used. The search parameters include N-terminal and lysine modifications. To reduce the false discovery rate of peptide identification, decoy sequences can be included in the searched database, or software analyses can be performed in which statistical models are created for each data set using programs such as PeptideProphet57,58. We use both approaches, i.e., we use databases containing target protein sequences and labeled decoys for Mascot or X! Tandem searches, followed by PeptideProphet1,3,4,44. For more information about decoy sequences, their generation and their incorporation into the database, see Mascot help (http://www.matrixscience.com/help/decoy_help.html).
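The idea behind decoy sequences can be illustrated with the simple target-decoy estimate, in which the number of decoy hits above a score threshold approximates the number of false target hits. The sketch below is generic and is not how Mascot, X! Tandem or PeptideProphet compute their statistics; the input layout (a list of `(score, is_decoy)` pairs) and the function names are assumptions made for illustration.

```python
def decoy_fdr(matches, threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) at or above `threshold`.

    `matches` is a list of (score, is_decoy) pairs (hypothetical layout).
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold(matches, max_fdr=0.01):
    """Return the lowest observed score whose estimated FDR is <= max_fdr, else None."""
    for threshold in sorted({score for score, _ in matches}):
        if decoy_fdr(matches, threshold) <= max_fdr:
            return threshold
    return None
```

Other variants of the estimator exist (for example, for concatenated versus separate decoy searches), so the formula should be matched to the search setup actually used.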

TPP quantitative analysis of Mascot search data for dimethylation-TAILS (http://tools.proteomecenter.org/wiki/index.php?title=TPP:Mascot_and_the_TPP) requires running two separate searches, one for only heavy-labeled peptides and one for only light-labeled peptides. Similar analyses of SILAC-TAILS44 depend on the type of heavy amino acids used. For example, analysis of only arginine-labeled samples can be done by a single search, whereas analysis of lysine- and arginine-labeled samples requires two separate searches. For clarity and simplicity we describe only the option of separate searches. Quantitative analysis of X! Tandem searches for dimethylated peptides with the TPP (http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tandem_search) does not require separate searches for heavy- and light-labeled peptides, but for simplicity we describe the same search approach. To combine X! Tandem search results with Mascot results, it is important to use the same database for both searches.

Furthermore, with the recent iProphet56 tool, the TPP provides an additional layer of secondary validation of spectrum-to-peptide assignments by combining results from multiple search engines and replicate experiments. Similar to our manual double-validated peptide scheme, iProphet takes into account whether a peptide was identified in more than one experiment and whether it was also present as a missed-cleavage peptide, in multiple charge states or in multiple modification forms. As iProphet may still assign high probabilities to peptides based on only a single assigned spectrum, we introduced an additional step to annotate peptides in iProphet-filtered lists that have been identified by two or more spectra. Rather than doing this manually, as described in our previous report1, we now use TAILS-ANNOTATOR, an in-house Perl script available online


(see Scripts section in http://www.clip.ubc.ca/resources/cliptails.html) for the community, and Venny59 as convenient programs to plot the overlap in lists and sample replicates.

Finally, if the pre-pullout fractions in TAILS samples are also analyzed by MS/MS, then several tryptic peptides of the substrate protein may also be identified. This is particularly useful for protein isoform assignment using the Isoform Assignment Score algorithm we developed4. Hence, although TAILS identifies >50% of substrates by two or more different peptides, the remaining substrates are identified by the same peptide identified in two or more different biological replicate samples, forms or states, thereby giving high-confidence cleavage sites at the peptide level and high-confidence substrate identification at the protein level1.

Hierarchical substrate winnowing. After high-confidence peptide, cleavage site and substrate protein identification, bona fide substrates of the test protease are distinguished from the background cleavage products present in every sample by a series of steps we termed ‘hierarchical substrate winnowing’1. Key to this is the use of peptide quantification (Fig. 2 and Box 2), because without quantitative proteomics approaches, background cleavage products cannot be reliably distinguished from those of the test protease1,3,4,44. Relative quantification (protease treated versus control) of each peptide found in the database searches is determined, and only peptides whose ratios pass a statistically derived cutoff are selected as high-confidence protease-generated peptides. It should be noted that lower cutoff values could be applied when a protease with known and narrow cleavage specificity is being examined, such as caspases or GluC, as the cleavage sites can be easily vetted manually. Indeed, for such proteases, non-quantitative degradomics approaches that do not use isotopic labeling can also be used15,26,28. Next, data originating from three or more independent biological replicates are compared. If two or more independent interpretations produce the same result, then it is very likely that the result is correct. This serves to greatly reduce the number of false positives. Nevertheless, other high- or low-ratio peptides can still be considered candidate substrates but with lower confidence, and thus require biochemical validation. Finally, the substrate selection is further winnowed down to select the most biologically relevant candidate substrates for the tested protease. For proteases belonging to a family, candidate substrates can be selected from the candidates that are known to be cleaved by other proteases in the family, or from those that are protein family members of a substrate cleaved by the protease1,35–37. Although such an approach is reliable and produces many good results, interpreting unbiased proteomics data in a biased manner also leads to many interesting proteins in new substrate classes being missed. For instance, ‘moonlighting proteins’, those classified as intracellular proteins yet also having bona fide extracellular roles in certain circumstances60,61, have been found to be an exciting new class of matrix metalloproteinase (MMP) substrates.
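The replicate-comparison step of the winnowing can be sketched as a simple counting operation. This is an illustrative sketch only; the input layout (one set of peptide identifiers per biological replicate, each containing the peptides that passed the ratio cutoff in that replicate) and the function name are assumptions.

```python
from collections import Counter


def consistent_candidates(replicate_hits, min_replicates=2):
    """Return peptides that passed the ratio cutoff in >= `min_replicates` replicates.

    `replicate_hits` is a list of sets, one per biological replicate
    (hypothetical input layout).
    """
    counts = Counter(pep for hits in replicate_hits for pep in set(hits))
    return {pep for pep, n in counts.items() if n >= min_replicates}
```

Peptides seen in only one replicate are not discarded outright in the protocol; they remain lower-confidence candidates that require biochemical validation.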

Positional annotation of N-terminal peptides and public repositories. Crucial to the analysis of N terminomes by techniques selective for protein N termini, such as TAILS, is the annotation of identified peptides for their position within the corresponding protein. Therefore, not only the position within the amino acid sequence of the unprocessed protein precursor as derived from its open reading frame, but also the position within the processed mature protein needs to be determined. This can be achieved by querying annotated databases such as UniProt-SwissProt. To ease this task for extensive lists of N-terminal peptides, we created a Perl script termed TAILS-ANNOTATOR that can be downloaded with accompanying documentation from the URL listed above.
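The kind of positional annotation described above can be illustrated with a short sketch. It is not the TAILS-ANNOTATOR script; the function name, the returned labels and the `signal_peptide_len` argument (which in practice would come from database annotation such as UniProt) are assumptions made for this example. Only the first occurrence of the peptide in the sequence is considered.

```python
def annotate_terminus(protein_seq, peptide, signal_peptide_len=0):
    """Classify a peptide's N terminus by its 1-based start in the precursor.

    Returns (start, label); (None, "not found") if the peptide is absent.
    """
    idx = protein_seq.find(peptide)
    if idx < 0:
        return None, "not found"
    start = idx + 1  # convert 0-based index to 1-based position
    if start == 1:
        label = "natural N terminus (initiator Met retained)"
    elif start == 2 and protein_seq[0] == "M":
        label = "natural N terminus (initiator Met removed)"
    elif signal_peptide_len and start == signal_peptide_len + 1:
        label = "natural N terminus (signal peptide removed)"
    else:
        label = "internal (candidate neo-N terminus)"
    return start, label
```

A fuller annotation would also consider propeptides, transit peptides and alternative isoforms, all of which require curated database features rather than sequence position alone.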

Identified termini should then be submitted to the Termini oriented protein Function Inferred Database (TopFIND)62. Termini submitted to TopFIND are associated with extensive evidence information based on controlled vocabulary. TopFIND not only makes the identified termini publicly available but also

Box 2 | ABUNDANCE RATIO–BASED SELECTION OF NEO-N-TERMINAL PEPTIDES
Protein neo-N termini generated by the test protease are discriminated from unaffected N-terminal peptides by their relative abundance in the protease-treated and the control sample. To reliably determine a critical protease/control abundance ratio for substrate neo-N-terminal peptides, we developed several statistical models that are described in detail in refs. 1,3,4,44. In addition to high-ratio neo-N-terminal peptides, natural N-terminal peptides with low ratios correspond to protease cleavages close to the original protein N terminus, as the original mature N-terminal peptide is depleted from the protease-treated sample because of protease activity1,3,4. Hence, low-ratio peptides provide strong but indirect evidence for cleavage based on substrate depletion. Below we describe the statistical approach based on the protein natural N-termini quantification data for correction of experimental variations and the identification of potential protease substrates1,3,4,44.
Filtering poor or unreliable quantifications is an essential step in order to set the cutoff for selection of protease-generated substrates. To do so, more sophisticated statistically based analyses are required. As abundance ratios of natural N termini in protease-treated and control samples are mostly unaffected by the test protease and thus are expected to be equal (protease/control ~1), they can be used to adjust for experimental variation caused by pipetting errors. We perform centroiding of only the mature N-terminal peptides that are not affected by proteolysis (Steps 56–59), that is, those protein N termini that retain the methionine at position 1, that start at position 2 after methionine removal, or that belong to proteins with the signal peptide removed. As such analyses do not include outliers and bona fide substrates in the calculation, they form a more accurate model of the data than including every peptide identified.
Furthermore, the corrected abundance ratio distribution is used for determining statistical protease/control ratio cutoffs to discriminate basal proteolysis from cleavage events resulting from activity of the test protease. We apply a stringent cutoff of three times the standard deviation for log2(protease/control) ratios of natural N termini (Steps 60–61 and Fig. 3), as these have a low probability of having high or low ratios by chance. However, for cell-based experiments or tissue analyses, this is more difficult to implement or interpret, as protease addition or depletion can modify cell-signaling circuits. If cell responses alter the synthesis and turnover of some proteins, it complicates the statistical modeling, rendering it more difficult to determine those proteins that truly have unaltered abundances versus those that have reduced abundance because of degradation or reduced synthesis.
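The correction and cutoff described in Box 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the statistical models of the cited references: the log2 ratios of natural N termini are centered on their mean (the choice of the mean as the center is an assumption of this sketch), and cutoffs are placed at three standard deviations on either side. The function names are hypothetical.

```python
import math
import statistics


def ratio_cutoffs(natural_ratios, n_sd=3.0):
    """Center log2(protease/control) ratios of natural N termini and derive cutoffs.

    Returns (center, low, high) in log2 space; low and high apply to
    centered values.
    """
    logs = [math.log2(r) for r in natural_ratios]
    center = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return center, -n_sd * sd, n_sd * sd


def classify(ratio, center, low, high):
    """Label one peptide by its centered log2(protease/control) ratio."""
    x = math.log2(ratio) - center
    if x > high:
        return "high ratio"
    if x < low:
        return "low ratio"
    return "unchanged"
```

In use, `ratio_cutoffs` would be called once on the natural N termini of an experiment and `classify` applied to every quantified peptide; high-ratio peptides are candidate neo-N termini and low-ratio natural N termini are candidates for depletion by cleavage.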


performs positional annotation. Data from TAILS experiments can then be analyzed in relation to other cleavage sites in the protein and known functions of the protein cleaved at these sites. A web interface and an application-programming interface (API) provide easy access to protein terminus information as well as its positional correlation with proteolytic cleavage sites, functional protein domains, mutations and the comprehensive evidence metadata. A powerful filtering mechanism allows for limiting the analyzed and displayed information based on a wide range of evidence parameters such as confidence, physiological relevance, source laboratory, methodology or tissue localization. Hence submission of identified termini makes the information available to the community, integrates it with identifications from different studies and provides access to positional annotation that can spark new hypotheses. TopFIND is available at http://clipserve.clip.ubc.ca/topfind. Alternatively, data from TAILS experiments can be searched for on a case-by-case basis in TopFIND for similar or different sites in the same or other cleaved proteins. Cleavage sites can be analyzed in relation to protein domains and known mutations that might be associated with disease.

Replicates and labeling swaps. The need to doubly validate TAILS output necessitates multiple biological replicates. We recommend and use at least three biological experiments to achieve confident identification of substrates and cleavage sites. These should include separate and independent preparations of the tested proteomes. For experiments in which the specificity of the tested protease is being assessed from in vitro addition of the protease to a defined relevant proteome, technical replicates can be prepared by splitting one stock of the studied proteome into the desired number of repeats and then incubating each separately with the tested protease.

Nevertheless, meaningful output can be obtained even from a single experiment; in such cases it is essential to validate the potential substrates by other means such as biochemical validation or cell biology approaches17. As an alternative to a single-experiment approach, two parallel inverse labeling experiments can be conducted, in which the labeling is reversed in the second experiment (i.e., in the first experiment the control proteome is light isotope labeled and the protease-treated proteome is heavy labeled; in the repeat experiment the labels are swapped)44. The advantage of this approach is the prompt assignment of false positives: in the swapped experiment, all meaningful peptide ratios should be inverted. By applying the inverse labeling strategy, the effort spent in analyzing irrelevant peptides with no differential change (i.e., pairs with an intensity ratio of ~1) is eliminated. Moreover, if a protein derives from leftover serum contamination rather than from a response to the perturbation of the system, it will not occur consistently in the swapped experiment, whereas in a single-experiment analysis its peptide would simply have no isotopic counterpart. Although an additional experiment must be carried out, this procedure easily tackles three problems: data reduction of irrelevant signals, quick focus on the signals of interest only and minimal ambiguity. The setup can also be considered a technical replicate, thus improving confidence in assignments for substantially altered peptides and thus proteins.
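The expectation that meaningful ratios invert in the label-swap experiment can be expressed as a simple check in log2 space, where a genuine change satisfies forward ≈ −swapped. The sketch below is illustrative; the function name and the `tolerance` value (in log2 units) are arbitrary assumptions, not recommendations from this protocol.

```python
import math


def swap_consistent(forward_ratio, swapped_ratio, tolerance=1.0):
    """Check that a heavy/light ratio inverts when the labels are swapped.

    Both arguments are heavy/light ratios for the same peptide, measured in
    the forward and in the label-swapped experiment. For a genuine change the
    two log2 ratios should sum to approximately zero.
    """
    total = math.log2(forward_ratio) + math.log2(swapped_ratio)
    return abs(total) <= tolerance
```

A peptide with heavy/light = 8 in the forward experiment and 0.125 in the swapped experiment passes; one that shows 8 in both experiments does not, which flags it as a label-specific artifact or contaminant rather than a protease-dependent change.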

MATERIALS
REAGENTS
• HPLC-grade acetonitrile (CH3CN; Sigma-Aldrich, for HPLC, cat. no. 34851)
• HPLC-grade water
• Acetone (CH3COCH3; Sigma-Aldrich, ACS reagent, ≥99.5%, cat. no. 179124)
• Argon gas
• Ammonium bicarbonate (Sigma-Aldrich, BioUltra, ≥99.5% (T), cat. no. 09830), 1.0 M, pH 8.0 CRITICAL Ammonium bicarbonate stock solution should be freshly prepared.
• DMSO (Sigma-Aldrich, ACS spectrophotometric grade, ≥99.5%, cat. no. 154938) ! CAUTION DMSO has a relatively low toxicity; however, it is a superior solvent that readily penetrates the skin and increases absorption of certain compounds. In addition, it rapidly dissolves nitrile gloves recommended for use with this protocol. Therefore, extra care should be taken while handling this solvent to minimize any contact with skin.
• EDTA (Sigma-Aldrich, ACS reagent, ≥99.0%, cat. no. 03680)
• Formic acid (Sigma-Aldrich, for mass spectroscopy, ~98.0%, cat. no. 94318)
• N-(trans-epoxysuccinyl)-l-leucine 4-guanidinobutylamide (E64; Sigma-Aldrich, cat. no. E3132)
• Guanidine hydrochloride (GuHCl; Sigma-Aldrich, ≥99.0% (Cl), cat. no. G4505)
• HEPES (Sigma-Aldrich, ≥99.5%, cat. no. H3375), 1.0 M, pH 7.0 (for dimethylation labeling) and 1.0 M, pH 8.0 (for iTRAQ labeling)
• HPG-ALD polymer at ~35 mg ml−1 CRITICAL HPG-ALD polymers for proteomics are available through Flintbox, The Global Intellectual Exchange and Innovation Network (http://www.flintbox.com/public/project/1948/). Prepare as described in Box 3.
• Iodoacetamide (Sigma-Aldrich, BioUltra, cat. no. I1149), 0.5 M stock in water CRITICAL Iodoacetamide stock solution should be freshly prepared and kept in the dark at 4 °C.
• Methanol (Sigma-Aldrich, BioReagent suitable for protein sequencing, cat. no. M1770)
• Rapigest SF Surfactant (Waters, cat. no. 186001860)
• Sodium chloride (NaCl; Sigma-Aldrich, molecular biology grade, cat. no. S3014)
• SDS-PAGE solutions to prepare 10% (wt/vol) cross-linked gels plus loading and running buffers and silver stain solutions63
• Sodium hydroxide (NaOH; Sigma-Aldrich, BioXtra, >98.0%, cat. no. S8045)
• Hydrochloric acid (HCl; Sigma-Aldrich, 36.5–38.0%, BioReagent, for molecular biology, cat. no. H1758)
• Liquid nitrogen
• PMSF (Sigma-Aldrich, ≥99.0%, cat. no. 78830), 100 mM stock (100×) ! CAUTION PMSF is carcinogenic and toxic. The stock solution in ethanol or acetonitrile should be freshly prepared.
• PBS, sterile tissue culture grade (purchase prepared or combine 138 mM NaCl, 2.7 mM KCl, 20 mM Na2HPO4 and 1.5 mM KH2PO4, pH 7.4)
• Protein assay kits such as Pierce BCA protein assay kit (Thermo Fisher Scientific, cat. no. 23225) or Bradford Assay reagents
• Test protease for TAILS assay. Alternatively, cell culture medium or cell extracts from protease-transfected or knockout cell lines can be used together with the matching control.
• Trifluoroacetic acid (TFA; Sigma-Aldrich, spectrophotometric grade, ≥99%, cat. no. 302031)
• Trichloroacetic acid (TCA; Sigma-Aldrich, ACS reagent, ≥99.0%, cat. no. T6399)
• Trypsin, mass spectrometry grade (Promega, cat. no. V5111)
• Versene, sterile (purchase prepared or combine 140 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM NaH2PO4, 0.5 mM EDTA and 1 mM glucose, pH 7.4)
• Microcapillary tubes (Hamilton)
Labeling by dimethylation
• DTT (Sigma-Aldrich, BioUltra, for molecular biology, ≥99.5% (RT), cat. no. 43815)
• 12CH2-formaldehyde (CH2O), 12.3 M for light labeling (37% (wt/wt), Sigma-Aldrich, cat. no. 252549) ! CAUTION Formaldehyde solutions and formaldehyde vapors are toxic; prepare solution in a fume hood.
• 13C2H2-formaldehyde (13CD2O), 6.6 M for heavy labeling (20% (wt/wt) in D2O, 99% 13C, 98% D, Cambridge Isotopes, cat. no. CDLM-4599-1) ! CAUTION Formaldehyde solutions and formaldehyde vapors are toxic; prepare solution in a fume hood.


• Sodium cyanoborohydride (NaBH3CN), 1 M (Sterogene, cat. no. 9704-01) CRITICAL NaBH3CN solution should be kept at 4 °C and not stored past the expiry date to ensure good labeling efficiency.
Labeling by multiplex amine-reactive isobaric tags (iTRAQ)
• iTRAQ reagents application kit (protein, peptide) or bulk iTRAQ reagents only (Applied Biosystems, cat. no. 4374321)
• Tris(2-carboxyethyl)phosphine hydrochloride solution (TCEP; Sigma-Aldrich, cat. no. C4706)
Labeling by SILAC
• Amino acids for light labeling: l-arginine monohydrochloride (Sigma-Aldrich, cat. no. A6969), l-lysine monohydrate (Sigma-Aldrich, cat. no. L9037), l-leucine (Sigma-Aldrich, cat. no. L8912)
• Amino acids for heavy labeling: l-arginine-13C6 hydrochloride (Sigma-Aldrich, cat. no. 633440); l-arginine-13C6 15N4 hydrochloride (Sigma-Aldrich, cat. no. 608033); l-lysine-13C6 15N2 hydrochloride (Sigma-Aldrich, cat. no. 608041); l-lysine-4,4,5,5-D4 hydrochloride (Sigma-Aldrich, cat. no. 616192) CRITICAL Choose the SILAC amino acid(s) required for the desired SILAC experimental design. A combination of two amino acids is recommended to increase the number of quantifiable peptides and enhance sensitivity, as well as to allow wider multiplex SILAC analysis43, even if in principle heavy arginine is sufficient to achieve complete labeling of the tryptic peptides generated in TAILS procedures (remember that the labeled and blocked lysine residues are not cleaved by trypsin, and thus a labeled arginine will label all tryptic peptides in the TAILS procedure).
• Any tissue cell line that can be grown in DMEM
• Culture-dialyzed DMEM with high glucose, l-Glu, sodium pyruvate and pyridoximetile without l-lysine and l-arginine (Caisson Laboratory, cat. no. DML10-1000ML)
• DTT (Sigma-Aldrich, BioUltra, for molecular biology, ≥99.5% (RT), cat. no. 43815)
• Dialyzed FBS, supplements and selection reagents appropriate for the cell type used (Invitrogen, cat. no. 26400-044)
• 12CH2-formaldehyde (CH2O), 12.3 M for primary amine blocking (37% (wt/wt), Sigma-Aldrich, cat. no. 252549) ! CAUTION Formaldehyde solutions and formaldehyde vapors are toxic; prepare solution in a fume hood.
• Sodium cyanoborohydride (NaBH3CN), 1.0 M (Sterogene, cat. no. 9704-01) CRITICAL NaBH3CN solution should be stored at 4 °C and not stored past the expiry date to ensure good labeling efficiency.
• Trypan blue (Sigma-Aldrich, BioReagent, cat. no. T6146) solution, 0.4% (wt/vol)
• Tris-HCl
• Sodium chloride (NaCl; Sigma-Aldrich, molecular biology grade, cat. no. S3014)
• EDTA (Sigma-Aldrich, ACS reagent, ≥99.0%, cat. no. 03680)
• Zwittergent 3-16 detergent (EMD, cat. no. 693023)
• PMSF (Sigma-Aldrich, ≥99.0%, cat. no. 78830), 100 mM stock (100×) ! CAUTION PMSF is carcinogenic and toxic. The stock solution in ethanol or acetonitrile should be freshly prepared.

EQUIPMENT
• Liquid chromatography–coupled mass spectrometer capable of quantitative proteomics using the chosen labeling technique. Note: We have used Thermo Scientific LTQ Orbitrap XL for dimethylation and SILAC-TAILS and the AB Sciex QStar XL or Pulsar for iTRAQ-TAILS.
• HPLC system with appropriate SCX column for offline fractionations. We have used the Agilent Technologies 1100 HPLC with the PolySULFOETHYL A column (PolyLC; 100 mm × 4.6 mm, 5 µm, 300 Å).
• Centrifuge for up to 50-ml tubes at up to 20,000g
• Tabletop centrifuge accommodating 2-ml reaction tubes
• Computer equipped with proteomic analysis software
• Centrifugal protein concentrator with 5-kDa molecular weight cutoff, such as Millipore Amicon-Ultra 15 concentrator
• Slide-A-Lyzer G2 Dialysis cassettes, with a 10-kDa molecular weight cutoff (Pierce, 87729-87731)
• Filtering devices (50 ml, 0.22 µm; Millipore Steriflip, cat. no. SCGP00525 or equivalent)
• HPLC equipped with UV-Visible detector and a fraction collector
• PolySULFOETHYL A SCX column (100 mm × 4.6 mm, 5 µm, 300 Å column; PolyLC)
• LC-MS/MS instrument
• pH test strips, range 5–10 (EMD Chemicals, ColorpHast pH test strips (non-bleeding), cat. no. 9588)
• Reversed-phase solid-phase extraction cartridges for 1 ml volume (Waters, Sep-Pak C18 light, cat. no. WAT023501) and for <100 µl volumes (Agilent, OMIX C18, 5–100 µl, cat. no. A57003100). Please note that it is also possible to use StageTips64, if applicable. See Box 4 and Figure 4.
• SDS-PAGE apparatus and power supply
• Spin-filter device (Millipore, Microcon spin-filter device with a 30-kDa molecular weight cutoff, cat. no. 679)
• Standard laboratory equipment for cell culture, molecular biology and protein chemistry
• Clear microcentrifuge tubes (1.5 and 2 ml; Eppendorf, cat. no. 022364111/022363352) CRITICAL Polymers released from tubes and surfaces upon exposure to chemicals and solvents contaminate and interfere with mass spectrometric analysis. We have found that microtubes from Eppendorf show good chemical resistance and are suitable for the procedures described in this protocol.
• Polyethylene sterile tubes (15 and 50 ml; e.g., Nunc, Corning or equivalent) CRITICAL These tubes are to be used for acetone precipitation; thus they require chemical resistance to acetone and methanol, and centrifugation at 15,000g.
• Tubes for collection of cell-conditioned medium, 50 ml
• Vacuum evaporation system (e.g., SpeedVac, Thermo Scientific Savant)
• Mascot database searching engine (Matrix Science; http://www.matrixscience.com/search_form_select.html)
• X! Tandem database searching engine (open source, http://www.thegpm.org/tandem/)
• Bottle-top vacuum filter (Corning, 500 ml bottle-top vacuum filter, 0.22-µm pore, cat. no. 430513)
• C18-SCX-C18 StageTips64,65 (20-gauge, P200 pipette; prepared as described in Box 4)

REAGENT SETUP
Cell culture and protein collection TAILS is based on the quantitative comparison of N-terminal peptides from protease-treated and control samples. The following protocols developed for studies of cell-conditioned medium

Box 3 | POLYMER PREPARATION ● TIMING 10–12 h
The following steps refer to HPG-ALD type II (HPG-ALDII), which is the preferred polymer for TAILS because it provides a good balance between aldehyde content and peptide binding capacity (Mn ~90 kDa with ~500 functional groups per molecule)1.
CRITICAL STEP HPG-ALD polymer is usually supplied at a concentration of ~35 mg ml−1. Although the polymer was dialyzed extensively, we recommend dialyzing again before use. If the supplied polymer is different from HPG-ALDII, adjust final amounts according to the reported binding capacity1.
1. Dialyze 0.5 ml of HPG-ALDII polymer against 4 liters of water overnight at room temperature with agitation.
2. Split HPG-ALDII stock into 20-µl aliquots in microcentrifuge tubes. Each aliquot should contain 0.7 mg of polymer with binding capacity of 1.8 mg of peptides1.
3. Flow argon gas on top of the liquid for 1 min for each tube.
! CAUTION Do not use strong gas flow, as it will cause the polymer solution to be ejected out of the tube.
4. Close the microcentrifuge tubes and freeze the polymer solution in liquid nitrogen. Store the polymer at −80 °C. These aliquots are ready to be used for experiments.
CRITICAL STEP If the polymer solution is frozen other than by liquid nitrogen, a gel-like, opaque solution will be formed upon thawing, which will require about 1 h to form a clear, usable solution.
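The aliquot figures in Box 3 follow from the stated stock concentration: 20 µl at ~35 mg ml−1 contains 0.7 mg of polymer. The short sketch below reproduces this arithmetic and checks a peptide load against the stated capacity of 1.8 mg per aliquot. The function names are hypothetical, and the check considers nominal capacity only; it does not account for any excess of polymer that the procedure may call for, or for volume changes during dialysis.

```python
def polymer_mass_mg(conc_mg_per_ml, volume_ul):
    """Polymer mass (mg) contained in `volume_ul` of stock solution."""
    return conc_mg_per_ml * volume_ul / 1000.0


def within_capacity(peptide_mg, n_aliquots=1, capacity_mg_per_aliquot=1.8):
    """True if `peptide_mg` does not exceed the nominal binding capacity."""
    return peptide_mg <= n_aliquots * capacity_mg_per_aliquot
```

If a polymer other than HPG-ALDII is supplied, the capacity argument should be replaced by the binding capacity reported for that polymer.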


Box 4 | PROCEDURE FOR FRACTIONATION OF N TERMINOME PEPTIDE SAMPLE BY STAGETIPS
See also refs. 65,76.
MATERIALS
● A blunt-tipped 20-gauge needle
● A plunger of a 50-µl Hamilton syringe
● A 20-ml disposable syringe
● Gilson P200 pipette tips
● High-performance extraction discs, C18 and strong cation exchange (SCX), 3M Empore (Empore Products)
REAGENTS
● Acetonitrile
● Ammonium acetate (AcONH4)
● Perfluoropentanoic acid (nonafluoropentanoic acid; PFPA; Sigma, cat. no. 396575)
● Trifluoroacetic acid (TFA; Sigma-Aldrich, spectrophotometric grade, ≥99%, cat. no. 302031)
REAGENT PREPARATION
● Buffer A: 0.1% (vol/vol) TFA, 5% (vol/vol) acetonitrile. Mix 9.5 ml dH2O, 0.5 ml acetonitrile and 10 µl TFA
● Buffer B: 0.1% (vol/vol) TFA and 80% (vol/vol) acetonitrile. Mix 2.0 ml dH2O, 8.0 ml acetonitrile and 10 µl TFA
● AcONH4 (500 mM). Dissolve 0.7708 g AcONH4 in 20.0 ml dH2O
● AcONH4 (20 mM), 0.1% (vol/vol) PFPA, 15% (vol/vol) acetonitrile. Mix 8.10 ml dH2O, 0.40 ml 500 mM AcONH4, 1.5 ml acetonitrile and 10 µl PFPA
● AcONH4 (50 mM), 0.1% (vol/vol) PFPA, 15% (vol/vol) acetonitrile. Mix 7.5 ml dH2O, 1.0 ml 500 mM AcONH4, 1.5 ml acetonitrile and 10 µl PFPA
● AcONH4 (100 mM), 0.1% (vol/vol) PFPA, 15% (vol/vol) acetonitrile. Mix 6.5 ml dH2O, 2.0 ml 500 mM AcONH4, 1.5 ml acetonitrile and 10 µl PFPA
● AcONH4 (425 mM), 0.1% (vol/vol) PFPA, 15% (vol/vol) acetonitrile. Mix 8.5 ml 500 mM AcONH4, 1.5 ml acetonitrile and 10 µl PFPA
Note: PFPA (5.8 µl) yields 0.1% (wt/vol) PFPA, whereas 10 µl PFPA yields 0.1% (vol/vol) PFPA.
C18-SCX-C18 StageTip assembling ● TIMING 1 min per tip
1. Place a disk of C18 3M Empore material on a flat, clean surface such as a Petri dish.
2. Punch out a small piece of the C18 3M Empore material using a blunt-tipped 20-gauge needle. In doing so, a small piece of the disk now sticks in the needle and can be transferred into a pipette tip. (Optional) To facilitate the excision, use methanol to wet the disk material before the extraction.
3. Push the extracted piece out and immobilize it inside the tapered end of a Gilson P200 pipette tip, by positioning the loaded needle inside the tip first (Fig. 4a), and then by inserting the plunger of a Hamilton syringe inside the needle (Fig. 4b). Push the plunger to seat the disk at the end of the pipette tip (Fig. 4c).
4. Place a disk of SCX 3M Empore on a flat, clean surface. (Optional) To facilitate the excision, use methanol to wet the disk material before the extraction.
5. Stamp out a small piece of SCX 3M Empore material using the needle.
6. Stack the extracted SCX portion on top of the C18 disk, pushing the plunger of a Hamilton syringe through the needle.
7. Punch out a second piece of C18 3M Empore material with the needle.
8. Stack the second stamped-out C18 piece on top of the SCX disk, pushing the plunger of a Hamilton syringe through the needle.
PAUSE POINT The C18-SCX-C18 StageTips can be stored dry in a tip box at room temperature.
Sample preparation
9. Acidify the sample to achieve a final concentration of 0.5% acetic acid.
Note: Consider that 100 µg is the maximum capacity of the stuffed material (based on SCX capacity76). Sample volume is best at ~20 µl, but up to ~150 µl can be used (if necessary reduce sample volume by SpeedVac) and the pH should be between 1 and 2.5.
PROCEDURE
For all the following steps, load the indicated volumes to the wide top of the StageTip. Use a 20-ml syringe and position the rubber plunger head at the end of the scale. Without turning the tip upside down, plug the syringe tightly to the wide end of a StageTip. Hold the tip to prevent it from popping off. Thereafter, press the syringe to push the liquid through the stuffed 3M Empore materials. The starting sample can be loaded at up to ~300 µl min−1; other solutions should be loaded at 10–30 µl min−1.
Note: StageTips cannot be used with a pipette.
Conditioning steps
1. Load 20 µl of methanol and discard the solvent.
2. Load 20 µl of Buffer B and discard the flow-through.
3. Load 20 µl of Buffer A and discard the flow-through.
4. Load 20 µl of 500 mM AcONH4 and discard the solvent.
5. Load 20 µl of Buffer A and discard the flow-through.

(Continued)


proteins (secretome) can be easily adapted for cell lysates or for samples from other sources. The introduction of the protease of interest, or its inhibition or silencing, can be done at the cellular level before proteome collection, or in vitro after the proteome has been harvested. The latter requires collection under conditions that maintain the native structure of the constituent proteins. The minimum recommended protein amount is 100 µg for each sample (i.e., 100 µg for control and 100 µg for protease-treated samples), which can generally be achieved for secretome analysis by collecting serum-free conditioned medium from at least six cell culture flasks (175 cm2, T175) at approximately 70–80% confluence. See Box 1 for detailed procedures for culturing cells.

Lysis buffer Combine 50 mM Tris-HCl, 150 mM NaCl, 10 mM EDTA and 0.2% (wt/vol) Zwittergent, pH 8.0. Filter through 0.22-µm filters. This solution can be stored for 1–2 weeks at 4 °C. Immediately before use, add PMSF to a final concentration of 1 mM.

Box 4 | CONTINUED
Loading sample
6. Apply sample (premix sample and Buffer A in 1:3 ratio).
7. Load 20 µl of Buffer A and discard the flow-through.
Sample fractionation
8. Elute with 20 µl of Buffer B and collect in a fresh tube as flow-through fraction.
9. Load 50 µl of 20 mM AcONH4, 0.1% (vol/vol) PFPA and 15% (vol/vol) acetonitrile.
10. Wash with 20 µl of Buffer A.
11. Elute with 20 µl of Buffer B, collect as fraction 1.
12. Load 50 µl of 50 mM AcONH4, 0.1% (vol/vol) PFPA, 15% (vol/vol) acetonitrile.
13. Wash with 20 µl of Buffer A.
14. Elute with 20 µl of Buffer B, collect as fraction 2.
15. Load 50 µl of 100 mM AcONH4, 0.1% (vol/vol) PFPA and 15% (vol/vol) acetonitrile.
16. Wash with 20 µl of Buffer A.
17. Elute with 20 µl of Buffer B, collect as fraction 3.
18. Load 50 µl of 500 mM AcONH4, 0.1% (vol/vol) PFPA, 15% (vol/vol) acetonitrile.
19. Wash with 20 µl of Buffer A.
20. Elute with 20 µl of Buffer B, collect as fraction 4. ● TIMING 20 min
Sample preparation for LC-MS/MS
21. Allow the five fractions to evaporate to near dryness (approximately 1–2 µl) using a SpeedVac. Do not dry down completely.
22. Add 3 µl of Buffer A. ● TIMING 2–5 min
PAUSE POINT Samples can be kept at −80 °C for long-term storage.
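The ammonium acetate recipes listed under reagent preparation in Box 4 can be checked with the dilution relation C1·V1 = C2·V2. The sketch below is a consistency check only; the function names are hypothetical, the default final volume of 10 ml is the sum of the listed component volumes (ignoring the 10 µl of PFPA), and the molar mass of ammonium acetate is taken as 77.08 g mol−1.

```python
def stock_volume_ml(target_mM, final_ml=10.0, stock_mM=500.0):
    """Volume of stock (ml) needed for `final_ml` at `target_mM` (C1*V1 = C2*V2)."""
    return target_mM * final_ml / stock_mM


def molarity_mM(mass_g, molar_mass_g_per_mol, volume_ml):
    """Concentration (mM) of `mass_g` of solute dissolved to `volume_ml`."""
    return mass_g / molar_mass_g_per_mol / (volume_ml / 1000.0) * 1000.0
```

For example, `stock_volume_ml(20)` gives the 0.40 ml of 500 mM stock used for the 20 mM solution, and `molarity_mM(0.7708, 77.08, 20.0)` confirms that the stock itself is 500 mM.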

PROCEDURE
Proteolysis by a test protease of the collected proteome ● TIMING 3–24 h
CRITICAL The following steps are required only if the proteome is to be exposed to the protease of interest in vitro. For other studies focusing on the nature of the N-terminal peptides of proteins, for proteolysis studies using samples that have already undergone proteolysis in cell culture, or for in vivo samples, proceed to isotopic labeling (Step 6).

1| If possible, spike into the investigated proteome a known substrate of the test protease that will serve as a positive control to confirm proteolytic activity and assess the sensitivity of the TAILS procedure. We recommend a known substrate that, after cleavage by the test protease, amine labeling and trypsin digestion, will generate a peptide that is suitable for effective and unambiguous MS analysis. Specifically, the length of the sequence upstream of the cleavage site and until the next arginine should be between 7 and 20 amino acids, and the sequence should not be homologous to the proteins naturally found in the analyzed sample (for example, you can use a protein from a different species that matches these criteria). Typically, 0.5–1 µg of known substrate can be added to 200 µg of proteome.

Figure 4 | Schematic representation of the immobilization procedure for a single piece of Empore material on a pipette tip. Left, a–d show a step-by-step guide for positioning a single piece of Empore material inside a pipette tip. Right, a C18-SCX-C18 StageTip is assembled, repeating the procedure three times. See Box 4. (Figure labels: panels a–d; plunger, syringe, pipette tip, Empore material; C18-SCX-C18 StageTip.)


2| Divide the proteome into two equal aliquots.

3| Add activated protease to the sample and an equivalent amount of buffer or inactive protease to the control sample. Typical protease-to-proteome ratios are 1:1,000–1:50 (wt/wt), with 1:100 (wt/wt) being a good starting ratio for a first experiment. This will likely ensure that cleaved neo-N-terminal peptides can be identified. If necessary, the protease-to-proteome ratio can be reduced in follow-up experiments.
CRITICAL Avoid the use of buffers containing primary amines. If the test protease is supplied in such a medium, perform buffer exchange.

4| Incubate for 1–24 h at a temperature suitable for the protease under investigation. Incubation times are based on the in vitro cleavage assays of the studied protease. If such information is not available, 12 h is a good starting point.

5| (Optional) Ideally, process the sample immediately. If this is not possible, inactivate the protease by heating or by adding inhibitors to both samples.

6| Keep a small aliquot (minimally 1–2 µg, but preferably 10 µg) of each sample (control and protease treated) for quality control purposes, designated ‘before labeling’.
PAUSE POINT Samples can be stored at −80 °C following rapid freezing in liquid nitrogen. However, to avoid nonspecific cleavages, and for a more streamlined procedure, we recommend continuing to the next steps. In general, speed matters, and all results are improved by using fresh samples and avoiding storage and freezing at all times.
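The spike-in criterion in Step 1 can be checked before ordering a control substrate. The following is a minimal sketch, not part of the published protocol. It assumes that the peptide of interest runs from the first residue after the scissile bond to the next arginine (inclusive), and that trypsin cuts only after arginine because lysines are blocked before digestion. The function names and the example sequence are hypothetical.

```python
def neo_n_terminal_peptide(sequence: str, cleavage_index: int) -> str:
    """Return the peptide from the cleavage site to the next arginine (inclusive).

    cleavage_index is the 0-based index of the first residue after the
    scissile bond. Lysines are ignored on the assumption that they are
    blocked, so trypsin is taken to cut after arginine only.
    """
    downstream = sequence[cleavage_index:]
    arg_pos = downstream.find("R")
    if arg_pos == -1:
        return downstream  # no arginine: peptide runs to the C terminus
    return downstream[: arg_pos + 1]


def is_suitable_spike_in(sequence: str, cleavage_index: int,
                         min_len: int = 7, max_len: int = 20) -> bool:
    """Apply the 7-20 residue length window from Step 1."""
    peptide = neo_n_terminal_peptide(sequence, cleavage_index)
    return min_len <= len(peptide) <= max_len


# Hypothetical sequence, for illustration only.
example = "MKTAYIAKQRGLSDEEKLAVTFGHPRWWNDK"
peptide = neo_n_terminal_peptide(example, 10)
print(peptide, len(peptide), is_suitable_spike_in(example, 10))
```

The homology requirement in Step 1 is not covered here and still needs a separate database search.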

Isotopic labeling and amine blocking
7| For dimethylation proceed to option A; for iTRAQ labeling proceed to option B; for primary amine blocking of SILAC-labeled samples, proceed to option C.
(A) Isotopic labeling and blocking of primary amines by dimethylation ● TIMING 9.5–24 h
(i) Add 8.0 M GuHCl to a final concentration of 4.0 M GuHCl to denature all proteins in the sample, ensuring that all N termini and lysine side chains are exposed for labeling and blocking reactions.
CRITICAL STEP As per Box 1, at this stage the samples should be in buffer without primary amines (100 mM HEPES, pH 7.0, is recommended).

(ii) Check pH by pipetting 1 µl of sample onto a pH strip. Hamilton microcapillary tubes can also be used for 100-nl volumes.
(iii) Adjust pH to 8.0 by addition of small volumes of 1 N HCl or 1 N NaOH.
(iv) Reduce cysteine residues by adding 1.0 M DTT to a final concentration of 5 mM.
(v) If desired, TCEP can be used for reduction, as described in Step 7B(iii).
(vi) Incubate the sample for 1 h at 65 °C.

CRITICAL STEP The labeling reaction in Step 7A(vii–xv) can be carried out efficiently at lower denaturant concentrations but will require more time. It is very important not to use urea, which will modify amino acid residues in the sample and thus reduce peptide identifications. For difficult samples, we sometimes add 0.2% (wt/vol) Rapigest.

(vii) Cool samples to room temperature (≤25 °C). Alkylate cysteine by adding 0.1 M iodoacetamide to a final concentration of 15 mM and mix thoroughly. Incubate samples at 25 °C in the dark for 30 min. Pulse-spin to bring the sample to the bottom of the tube.

(viii) (Optional) Under certain conditions, excess of iodoacetamide can lead to alkylation of other amino acid side chains, and more importantly the N terminus of peptides66. Although we have not noticed this problem in our samples, such unwanted side reactions can be avoided by addition of excess of DTT to quench the residual iodoacetamide66. Alternatively, if needed, a lower iodoacetamide concentration (5 mM) can be used as described in Step 7B(iv), but only if 1 mM TCEP is used as the reducing agent.
! CAUTION Increasing temperature, iodoacetamide concentration or extending reaction time will result in iodoacetamide side reactions with side chains of lysine. Cooling the sample prior to addition of iodoacetamide prevents lysine modification by iodoacetamide67.

(ix) Prepare 2.0 M working stocks of 13CD2-formaldehyde (heavy) and 12CH2-formaldehyde (light) in water. It is important to note that the concentrations of the light and heavy formaldehyde stock solutions, as supplied by the manufacturer, are different: 37% (wt/wt) or 12.3 M and 20% (wt/wt) or 6.6 M, respectively.
CRITICAL STEP By itself, formaldehyde reacts with primary amines (and to a lesser extent with thiols) and can induce cross-linking between different amino acids39. Such modifications will be eliminated in the presence of NaBH3CN, which will cause the complete dimethylation of all primary amines68,69. Thus, addition of NaBH3CN should be done immediately after the addition of formaldehyde.
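Because the light and heavy formaldehyde stocks differ in molarity, the two 2.0 M working stocks need different dilutions. A minimal sketch of the C1·V1 = C2·V2 arithmetic follows. The 1-ml batch size and the function name are illustrative assumptions; the stock molarities are those quoted in Step 7A(ix).

```python
def dilution_volumes(stock_molar: float, target_molar: float,
                     final_volume_ul: float) -> tuple[float, float]:
    """Return (stock_ul, water_ul) for a fixed final volume, using C1*V1 = C2*V2."""
    if target_molar > stock_molar:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_molar * final_volume_ul / stock_molar
    return stock_ul, final_volume_ul - stock_ul


# 1 ml of 2.0 M working stock from each supplied stock (batch size is arbitrary).
light_stock_ul, light_water_ul = dilution_volumes(12.3, 2.0, 1000.0)
heavy_stock_ul, heavy_water_ul = dilution_volumes(6.6, 2.0, 1000.0)
print(f"light: {light_stock_ul:.1f} µl stock + {light_water_ul:.1f} µl water")
print(f"heavy: {heavy_stock_ul:.1f} µl stock + {heavy_water_ul:.1f} µl water")
```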

(x) Add light formaldehyde to one sample (control) and heavy formaldehyde to the protease sample to a final concentration of 40 mM light/heavy formaldehyde. Please note that if the experiment is repeated several times, labeling swaps are recommended for validation. By convention, heavy labels are used for the protease sample.


(xi) Add 1.0 M NaBH3CN to each sample to a final concentration of 20 mM.
(xii) Vortex samples and adjust pH to 6–7 if required, by adding 1.0 N HCl or 1.0 N NaOH (check pH as in Step 7A(ii)).
(xiii) Incubate for at least 4 h at 37 °C; overnight incubation is recommended.

! CAUTION Although dimethylation labeling has been shown to be very efficient and primary amine specific in different proteomic studies23,38,40,49,70, it should be noted that under certain conditions, especially when using high concentrations of formaldehyde and cyanoborohydride combined with prolonged incubation times, some other modifications of different amino acids (arginine and tryptophan) might occur71. Hence, it is important to use the suggested formaldehyde and NaBH3CN concentrations that have been optimized for TAILS.

(xiv) Quench excess formaldehyde by adding 1.0 M ammonium bicarbonate to each sample up to a final concentration of 100 mM.
(xv) Vortex samples and check pH (as in Step 7A(ii)). If required, adjust pH to 6–7 by adding small volumes of 1.0 N HCl or 1.0 N NaOH.
(xvi) Incubate for at least 4 h at 37 °C.
(xvii) Keep a small aliquot (1–2 µg protein) of each sample for labeling validation (for troubleshooting purposes); label samples as ‘heavy’ and ‘light’.
PAUSE POINT Samples can be stored at −80 °C following rapid freezing in liquid nitrogen. However, for a more streamlined procedure, we recommend continuing to the next steps.
? TROUBLESHOOTING
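Several steps in option A ask for a stock to be added "to a final concentration of" some value, which means the added volume itself dilutes the sample. A minimal sketch of that calculation follows. It solves stock × v = final × (V + v) for each addition in turn. The 500-µl starting volume is an arbitrary assumption, and the helper names are hypothetical. Each later addition slightly dilutes the reagents added before it, which this sketch ignores.

```python
def spike_volume(current_volume_ul: float, stock_mM: float, final_mM: float) -> float:
    """Volume of stock to add so the reagent reaches final_mM in the enlarged volume."""
    if final_mM >= stock_mM:
        raise ValueError("final concentration must be below the stock concentration")
    return current_volume_ul * final_mM / (stock_mM - final_mM)


def plan_additions(start_volume_ul: float, additions):
    """Apply additions in order; return ([(name, volume_ul), ...], total volume)."""
    volume = start_volume_ul
    plan = []
    for name, stock_mM, final_mM in additions:
        added = spike_volume(volume, stock_mM, final_mM)
        volume += added
        plan.append((name, round(added, 2)))
    return plan, round(volume, 2)


# Stock and final concentrations as given in Steps 7A(x), 7A(xi) and 7A(xiv).
DIMETHYLATION = [
    ("formaldehyde", 2000.0, 40.0),
    ("NaBH3CN", 1000.0, 20.0),
    ("ammonium bicarbonate", 1000.0, 100.0),
]

plan, total_ul = plan_additions(500.0, DIMETHYLATION)
for name, volume_ul in plan:
    print(f"{name}: add {volume_ul} µl")
print(f"final volume: {total_ul} µl")
```

The same helper reproduces the denaturation in Step 7A(i): bringing a sample to 4.0 M GuHCl from an 8.0 M stock requires one sample volume of stock.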

(B) Isotopic labeling and blocking of primary amines by iTRAQ ● TIMING 2.5–3 h
(i) Denature protein samples by adding 8.0 M GuHCl to a final concentration of 2.5 M GuHCl to ensure that all N termini and lysine side chains are exposed for labeling and blocking reactions. Use 1 M HEPES, pH 8.0, to adjust to 100–250 mM HEPES final and a final protein concentration of 1–2 mg ml−1. Please note that, although it is possible to work with samples with lower starting protein concentrations of 1 mg ml−1 or less, it is desirable to start with higher protein concentrations of 2–3 mg ml−1 in order to minimize extreme sample dilution and loss in the following steps.
CRITICAL STEP It is essential not to use urea, because it will modify amino acid residues in the sample and thus reduce peptide identifications.

(ii) Incubate samples at 65 °C for 15 min to initiate protein denaturation. Denaturation time can be extended up to 1 h or longer without any harmful consequences. However, for a more streamlined procedure, we recommend concomitant reduction and denaturation (see the following step).

(iii) Reduce cysteine residues by adding 50–350 mM TCEP to a final concentration of 1 mM and mix thoroughly by gentle pipetting. Incubate sample at 65 °C for an additional 45 min. Pulse-spin to bring the sample to the bottom of the tube. Depending on the volume of the reactions, choose the concentration of the TCEP working stock solution carefully to minimize sample volume.
! CAUTION Do not use DTT or any other thiols for cysteine reduction, as this results in decreased labeling efficiency by iTRAQ in subsequent steps.

(iv) Cool samples to room temperature (≤25 °C). Alkylate cysteines by adding 0.1 M iodoacetamide to a final concentration of 5 mM and mix thoroughly. Incubate samples at 25 °C in the dark for 30 min. Pulse-spin to bring the sample to the bottom of the tube.
! CAUTION Increasing iodoacetamide concentration and/or increasing temperature while extending reaction time will result in iodoacetamide side reactions with side chains of lysine, serine and threonine residues. An alternative to iodoacetamide is MMTS (2 mM).
CRITICAL STEP This protocol was optimized to achieve high-efficiency, high-fidelity cysteine alkylation using the minimal possible iodoacetamide concentration in the shortest time. No subsequent iodoacetamide quenching is required under these conditions.

(v) Keep a small aliquot (~1 µg protein) of each sample for labeling validation (Step 7B(xii)).
(vi) Dissolve each iTRAQ reagent using 100% DMSO in a volume equivalent to the volume of the reaction after Step 7B(iii). To calculate the amount of iTRAQ per labeling reaction, iTRAQ is used at 5:1 iTRAQ/protein in each reaction. Typically, if each reaction contains 0.5 mg protein in 0.5 ml, then dissolve 2.5 mg of each iTRAQ reagent in 0.5 ml of 100% DMSO.
! CAUTION iTRAQ reagents rapidly hydrolyze in aqueous solutions or when exposed to moisture in the air. Prepare the DMSO working solutions immediately before use.
! CAUTION Although DMSO has a relatively low toxicity, it is a superior solvent that readily penetrates the skin and increases absorption of certain compounds. In addition, it rapidly dissolves nitrile gloves recommended for use with this protocol. Therefore, extra care should be taken while handling this solvent in order to minimize any contact with skin.
CRITICAL STEP The iTRAQ reagent tubes as supplied by the manufacturer contain the same amount of reagent in each tube (see the accompanying technical bulletin for the exact amounts) dissolved in a minimal amount of stabilizing organic solvent. The exact volume differs slightly for each iTRAQ label, and from batch to batch. Before making solutions, vortex the tube gently and spin to collect the reagent at the bottom of the tube, and then measure the exact


volume in each tube using a pipette. Calculate the amounts to be used on the basis of the volume present. For iTRAQ reagents supplied in bulk, make note of the amount/volume used and remaining, recap tightly, seal with laboratory film to minimize contact with air moisture and store at − 80 °C until future use.

(vii) Add individual iTRAQ reagent solutions to control and protease-treated sample(s) to achieve a 50% final DMSO concentration and a 1:5 protein-to-iTRAQ (wt/wt) ratio. Mix by gentle pipetting and spin to collect the samples at the bottom of the tubes. Please note that if the experiment is repeated several times, labeling swaps are recommended for validation and to exclude any potential label bias.
CRITICAL STEP Final concentrations of DMSO <50% result in decreased labeling efficiency and promote side reactions with side chains of tyrosine, threonine and serine residues. Concentrations of DMSO >50% provide no further advantage and are not desired, as they result in more diluted samples and higher losses in subsequent cleanup steps. Note that these steps differ from those recommended with the iTRAQ kits, which are provided for labeling of peptides. In TAILS, proteins are labeled, so the conditions have been optimized for this difference.
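The reagent amounts in Steps 7B(vi) and 7B(vii) follow directly from the protein mass and reaction volume. A minimal sketch of that arithmetic is shown below. It assumes the DMSO volume equals the reaction volume (which gives 50% DMSO) and ignores the small volume of stabilizing solvent that comes with the reagent. The function name is hypothetical.

```python
def itraq_plan(protein_mg: float, reaction_volume_ml: float,
               itraq_to_protein: float = 5.0) -> tuple[float, float, float]:
    """Return (iTRAQ mg, DMSO ml, final DMSO fraction) for one labeling reaction."""
    itraq_mg = protein_mg * itraq_to_protein
    dmso_ml = reaction_volume_ml  # equal volumes give 50% (vol/vol) DMSO
    final_dmso_fraction = dmso_ml / (reaction_volume_ml + dmso_ml)
    return itraq_mg, dmso_ml, final_dmso_fraction


# Worked example from Step 7B(vi): 0.5 mg protein in 0.5 ml.
itraq_mg, dmso_ml, dmso_fraction = itraq_plan(0.5, 0.5)
print(itraq_mg, dmso_ml, dmso_fraction)
```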

(viii) Incubate at room temperature for 30 min or 1 h when four-plex or eight-plex reagents are used, respectively.
CRITICAL STEP The reaction is ~97% complete after a 15-min incubation when using four-plex reagents, with virtually no side reactions with side chains of threonine, serine or tyrosine residues. If eight-plex iTRAQ reagents are used, incubate at room temperature for 1 h.

(ix) Quench excess iTRAQ reagent by adding 1.0 M ammonium bicarbonate, pH 8.5, to each sample up to a final concentration of 100 mM.

(x) Incubate for an additional 15–30 min at room temperature.
(xi) Keep a small aliquot (~1 µg protein) of each sample for labeling validation.
(xii) (Optional) A fast labeling test can be performed by analyzing the non-labeled (from Step 7B(v)) and labeled samples by matrix-assisted laser desorption/ionization–time of flight (MALDI-TOF) MS or by SDS-PAGE followed by silver staining. On a MALDI-TOF MS spectrum, successful labeling is indicated by a complete shift of the major peaks by ∆(N × 141) m/z in the labeled samples compared with the non-labeled samples, where N = the number of lysine residues + 1 (representing the N terminus). If the peak shift is not complete, but rather shows a comb pattern with a spacing of 141, it indicates incomplete labeling. If SDS-PAGE is chosen for quality control, use gels with large wells and first dilute the samples four- or fivefold with water before adding Laemmli buffer in order to avoid precipitation. Similarly, silver staining of the gels should reveal a complete shift of the major bands to higher molecular weights.
? TROUBLESHOOTING
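The expected MALDI-TOF shift in Step 7B(xii) can be computed from a protein sequence. The sketch below is illustrative only. It uses the per-label shift of 141 quoted in the text as a default parameter (substitute the value for the reagent actually used), assumes a free N terminus, and the function names and example sequence are hypothetical.

```python
def labeling_sites(sequence: str) -> int:
    """N = number of lysine residues + 1 for the N terminus."""
    return sequence.upper().count("K") + 1


def expected_shift(sequence: str, shift_per_label: float = 141.0) -> float:
    """Total shift for complete labeling: N x shift_per_label."""
    return labeling_sites(sequence) * shift_per_label


def partial_label_ladder(sequence: str, shift_per_label: float = 141.0) -> list[float]:
    """Shifts for 0..N labels; a visible ladder like this indicates incomplete labeling."""
    return [i * shift_per_label for i in range(labeling_sites(sequence) + 1)]


# Hypothetical sequence with three lysines, so N = 4.
seq = "ACDKEFGKHIK"
print(expected_shift(seq))
print(partial_label_ladder(seq))
```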

(C) Amine blocking of SILAC-labeled samples ● TIMING 9–24 h
(i) For efficient and complete primary amine blocking, separately denature each concentrated conditioned medium SILAC-labeled protein sample by adding 8.0 M GuHCl to a final concentration of 4.0 M GuHCl.
CRITICAL STEP To avoid undesired proteolytic processing due to the mixing of the protease and control samples, it is essential to denature samples before mixing them together. Do not use urea, which might modify amino acid residues in the sample and thus reduce peptide identifications.

(ii) Mix all isotopically labeled samples together.
(iii) Check pH by pipetting 1 µl of the combined sample onto a pH strip. Hamilton microcapillary tubes can also be used for 100-nl volumes.
(iv) Adjust pH to 7.0 by addition of small volumes of 1 N HCl or 1 N NaOH.
(v) Reduce and alkylate cysteine residues as described in Step 7A(iii–viii) for DTT or Step 7B(iii–iv) for TCEP, as preferred.
(vi) Prepare a 2.0 M working stock of 12CH2-formaldehyde (light) in water. Please note that the concentration of the light formaldehyde stock solution as supplied by the manufacturer is 37% or 12.3 M. There is no isotopic label incorporation in these steps; this is a simple primary amine blocking by dimethylation reaction.
CRITICAL STEP On its own, formaldehyde reacts with primary amines (and, to a lesser extent, with thiols) and can induce cross-linking between different amino acids39. Such modifications will be eliminated in the presence of NaBH3CN, which will cause the complete dimethylation of all primary amines68,69. Thus, addition of NaBH3CN should be done immediately after the addition of formaldehyde.

(vii) Add the light formaldehyde to a final concentration of 40 mM formaldehyde.
(viii) Proceed with dimethylation as described in Step 7A(xi–xvii).
? TROUBLESHOOTING

Labeling and blocking reagents cleanup ● TIMING 6–24 h
8| Combine dimethylated or iTRAQ-labeled samples in a 15- or 50-ml tube. SILAC samples have already been mixed together in Step 7C(ii), so add the single combined sample to a 15- or 50-ml tube.

9| Keep a small aliquot for quality control and label it ‘labeled samples before precipitation’.


10| Add eight sample volumes of ice-cold acetone and one sample volume of methanol to the labeled proteins.
! CAUTION Acetone and methanol should be stored in chemical-resistant glass containers. The use of plasticware for storage results in the extraction of contaminating plastic polymers into the solvent, which will badly affect MS results.

11| Aliquot 1.2 ml of sample into 1.5-ml microcentrifuge tubes (unless a centrifuge capable of 15,000g for 15-ml tubes is available, in which case continue using the 15-ml tubes). Precipitate labeled proteins for at least 3 h at −80 °C.
CRITICAL STEP Precipitation can be extended overnight, which may increase the yield of proteins; however, we also find that this results in higher amounts of MS-interfering plastic leaching into solution.
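The solvent volumes in Step 10 and the number of microcentrifuge tubes needed in Step 11 scale with the sample volume. A minimal sketch of that bookkeeping follows. The function name is hypothetical, and the 1-ml sample is an arbitrary example.

```python
import math


def precipitation_plan(sample_ml: float, aliquot_ml: float = 1.2):
    """Return (acetone_ml, methanol_ml, total_ml, n_tubes) for Steps 10 and 11."""
    acetone_ml = 8 * sample_ml    # eight sample volumes of ice-cold acetone
    methanol_ml = 1 * sample_ml   # one sample volume of methanol
    total_ml = sample_ml + acetone_ml + methanol_ml
    # Small tolerance so that exact multiples do not round up to an extra tube.
    n_tubes = math.ceil(total_ml / aliquot_ml - 1e-9)
    return acetone_ml, methanol_ml, total_ml, n_tubes


acetone_ml, methanol_ml, total_ml, n_tubes = precipitation_plan(1.0)
print(acetone_ml, methanol_ml, total_ml, n_tubes)
```

A 1-ml labeled sample therefore becomes 10 ml of precipitation mixture, which fits in a single 15-ml tube or nine 1.5-ml tubes at 1.2 ml each.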

12| Centrifuge the samples at 14,000g at 4 °C for 20 min and carefully discard the supernatant.

13| Add 1 ml of ice-cold methanol to each tube (or 5 ml if a 15-ml tube is used).
CRITICAL STEP Washing the acetone pellet with methanol prevents unwanted acetylation of the N termini of tryptic peptides if there is any acetone carryover and also removes precipitated GuHCl.

14| Centrifuge the samples at 14,000g at 4 °C for 20 min and carefully discard the supernatant.

15| Repeat Steps 13 and 14.

16| Carefully invert the tubes at a slight angle and air-dry the sample.
CRITICAL STEP Do not overdry the sample, as it will be difficult to dissolve.

17| Resuspend the sample in a minimal volume of 50 mM NaOH (start with 20 µl and, only if required, gradually add 2.5–5 µl aliquots until complete resuspension has been achieved). Alternatively, for SILAC- and dimethylation-labeled samples only, it is possible to resuspend the sample in 8.0 M GuHCl. For 1.5-ml tubes, we recommend starting with 20 µl and increasing the volume if required. Use the minimal volume required to completely resuspend the sample.
! CAUTION Use of large volumes of NaOH can lead to protein modification and degradation, which will alter MS results.

18| Combine the samples from all tubes and add 50 mM HEPES, pH 8.0, to adjust the volume to obtain a final protein concentration of ~1 mg ml−1 (i.e., if a total of 1 mg of protein for all conditions was used, the volume when combined should be 1 ml).
CRITICAL STEP If GuHCl was used, the final concentration of GuHCl should not exceed 0.75 M, which is compatible with trypsin digestion. If another protease is used in place of trypsin, e.g., GluC or chymotrypsin, the final GuHCl concentration should be adjusted accordingly.

19| Keep a small aliquot (1% of the total or 1 µl if testing by MALDI-TOF) for quality control and label ‘labeled samples after precipitation’.
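Step 18 combines two constraints: a final protein concentration of about 1 mg ml−1 and, if GuHCl was used for resuspension, a final GuHCl concentration of no more than 0.75 M. The sketch below checks both. It assumes that the pellet itself contributes negligible volume and that no GuHCl is carried over from earlier steps. The function name and the example volumes are hypothetical.

```python
def final_resuspension(protein_mg: float, resuspended_ul: float,
                       guhcl_molar_in_resuspension: float = 0.0,
                       target_mg_per_ml: float = 1.0,
                       guhcl_limit_molar: float = 0.75):
    """Return (hepes_ul_to_add, final_guhcl_molar, guhcl_ok)."""
    final_ul = protein_mg / target_mg_per_ml * 1000.0
    if final_ul < resuspended_ul:
        raise ValueError("resuspension volume already exceeds the target volume")
    hepes_ul = final_ul - resuspended_ul
    final_guhcl = guhcl_molar_in_resuspension * resuspended_ul / final_ul
    return hepes_ul, final_guhcl, final_guhcl <= guhcl_limit_molar


# 1 mg protein resuspended in a combined 80 µl of 8.0 M GuHCl.
hepes_ul, guhcl_molar, guhcl_ok = final_resuspension(1.0, 80.0, 8.0)
print(hepes_ul, round(guhcl_molar, 2), guhcl_ok)
```

With 100 µl of 8.0 M GuHCl per milligram of protein the final concentration would be 0.8 M, so the check fails and more dilution or less GuHCl is needed.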

Tryptic digestion of labeled samples ● TIMING 18–24 h
20| Check pH and, if required, adjust to pH 8.0 by adding small volumes of 1.0 N HCl or 1.0 N NaOH.

21| Add MS grade trypsin to a final ratio of 1:100 protease/protein (i.e., 2 µg trypsin per 200 µg sample) and gently pipette up and down to mix sample.

22| Incubate overnight (18 h) at 37 °C.

23| (Optional) Add additional trypsin by repeating Step 21 and incubate for an additional 4 h at 37 °C to ensure complete digestion.

24| Keep a small aliquot for quality control and label it ‘labeled samples after digestion’. It is highly recommended to plan ahead and prepare sufficient starting material to allow for an additional MS analysis of the sample prior to polymer negative selection. An additional aliquot (~10% of the sample) should be stored for this purpose at this point (designated ‘before pullout’).
CRITICAL STEP Analysis of the ‘before pullout’ sample is especially important for SILAC-labeled samples, not only for quality control but also, critically, for the normalization of the heavy/light ratio distribution.
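The enzyme amounts in Step 21 are weight ratios, so they convert to pipetting volumes once the stock concentration is known. A minimal sketch follows. The 0.5 µg µl−1 trypsin stock is a hypothetical value chosen for illustration, as are the function names.

```python
def enzyme_amount_ug(protein_ug: float, ratio_denominator: float = 100.0) -> float:
    """Mass of enzyme for a 1:ratio_denominator (wt/wt) enzyme/protein ratio."""
    return protein_ug / ratio_denominator


def stock_volume_ul(amount_ug: float, stock_ug_per_ul: float) -> float:
    """Volume of stock that contains amount_ug of enzyme."""
    return amount_ug / stock_ug_per_ul


trypsin_ug = enzyme_amount_ug(200.0)            # 1:100 for a 200-µg sample
trypsin_ul = stock_volume_ul(trypsin_ug, 0.5)   # assumed 0.5 µg/µl stock
print(trypsin_ug, trypsin_ul)
```

The same helper covers the test-protease ratios in Step 3, for example `enzyme_amount_ug(200.0, 1000.0)` for a 1:1,000 ratio.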


Quality control for labeling and trypsin digestion ● TIMING 4–12 h
25| To verify the successful completion of the above steps, analyze the aliquots that were stored in the previous steps (labeled samples before precipitation, labeled samples after precipitation and labeled samples after digestion) by 10% (wt/vol) SDS-PAGE followed by silver staining. Ensure that similar protein bands and intensities appear before and after precipitation, and that all bands higher than 10 kDa disappear after trypsin digestion. Mismatching protein bands before and after precipitation indicate sample losses that will reduce the quality of the MS analysis. Protein bands remaining after tryptic digestion indicate incomplete digestion (e.g., due to a high concentration of remaining GuHCl or low trypsin activity), and thus require repetition of Steps 21–24. In such a case, we highly recommend diluting the sample twofold, adding fresh trypsin and incubating for at least 6 h; this should be followed by an additional assessment of digestion efficiency.
? TROUBLESHOOTING

26| (Optional) For assurance of labeling completeness and as supporting data, it is highly recommended to analyze some of the samples now (before the polymer negative selection). Such ‘before pullout’ analysis should be performed under exactly the same LC–tandem MS conditions that will be used for the analysis of samples after the polymer negative selection as described in Steps 41–55. This will allow the identification of mislabeled amino acids and other issues that can alter the negative selection effectiveness.

Negative selection of blocked peptides using HPG-ALD polymer ● TIMING 6–12 h
27| Add HPG-ALDII to the trypsinized sample. We recommend capturing 100 µg of peptides with 500 µg of HPG-ALDII, representing a fivefold excess of polymer. Therefore, if the polymer solution concentration is ~35 mg ml−1, ~15 µl of polymer stock should be added per 100 µg of tryptic digest.

28| Add NaBH3CN reagent to final concentration of 20 mM.

29| Check pH, and if required adjust pH to a value ranging from 6–7 by adding small volumes of 1.0 N HCl or 1.0 N NaOH.

30| Incubate overnight at 37 °C.
? TROUBLESHOOTING

31| Add 1.0 M ammonium bicarbonate to a 100 mM final concentration.
CRITICAL STEP This step is used for blocking the excess functional aldehyde groups of the polymer, which improves yield and reduces nonspecific binding of peptides to the polymer.

32| Check pH; if required, adjust to pH 6–7 by adding small volumes of 1.0 N HCl or 1.0 N NaOH.

33| Incubate at 37 °C for 30 min.
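The polymer volume in Step 27 depends on the amount of digest and on the concentration of the particular polymer batch. A minimal sketch of the calculation is given below, assuming a fivefold (wt/wt) excess of polymer over peptide as stated in Step 27. The function name is hypothetical.

```python
def polymer_volume_ul(peptide_ug: float, stock_mg_per_ml: float,
                      fold_excess: float = 5.0) -> float:
    """Volume of HPG-ALD stock giving fold_excess (wt/wt) polymer over peptide."""
    polymer_ug = peptide_ug * fold_excess
    # mg/ml is numerically the same as µg/µl, so this division yields µl.
    return polymer_ug / stock_mg_per_ml


volume_ul = polymer_volume_ul(100.0, 35.0)
print(f"{volume_ul:.1f} µl of polymer stock per 100 µg of digest")
```

For a 35 mg ml−1 stock this gives about 14.3 µl, in line with the ~15 µl quoted in Step 27.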

Recovery of unbound blocked and labeled peptides ● TIMING up to 3 h
34| Pretreat a 10-kDa (molecular weight cutoff) Microcon spin-filter with 400 µl of water as per the manufacturer’s instructions. Do not allow the membrane to dry out before adding the sample.

35| Load the tryptic digest/polymer reaction mixture.

36| Filter by centrifugation at 14,000g for 15 min.

37| Monitor the sample volume above the filter and centrifuge until just a few microliters remain on the filter, in which the polymer and covalently bound internal tryptic peptides are retained.
CRITICAL STEP Keeping the filter (and sample) wet is important to prevent breakdown of the polymer, which might lead to leakage of internal tryptic peptides into the filtrate.

38| Collect the filtrate, which contains the enriched N-terminal peptides. For larger reaction volumes, repeat Steps 35–37 as necessary.

39| Wash the filter by adding 200 µl of 100 mM ammonium bicarbonate buffer and centrifuge again.

40| Collect the filtrate and combine it with the filtrate of Step 38.


Identification of N-terminal peptides by LC-tandem MS
41| If offline prefractionation is required, proceed directly to Step 41B. Follow the steps in option A for desalting of the blocked and labeled peptide solution without SCX prefractionation. This part is performed using a C18 reverse-phase solid-phase extraction cartridge. We use the Sep-Pak light, which accommodates a relatively small volume with a high binding capacity, thus providing convenient sample concentration. Follow the steps in option B for sample prefractionation by offline SCX HPLC. Alternatively, the sample from Step 40 can be desalted, concentrated and fractionated by C18-SCX-C18 solid reverse-phase StageTips (see Box 4).
(A) Desalting of blocked and labeled peptide solution without SCX prefractionation ● TIMING 3 h (strongly depends on SpeedVac speed)
(i) Acidify the pooled filtrates from Step 40 to pH 3 by adding formic acid and dilute to 3 ml with 0.1% (vol/vol) formic acid in water.
(ii) Condition a Sep-Pak light C18 cartridge by injecting 5 ml of 80% (vol/vol) acetonitrile, 20% (vol/vol) water and 0.5% (vol/vol) formic acid with a syringe.
(iii) Discard the flow-through.

! CAUTION Do not dry the cartridge by introducing air at the end of the injection. Always keep the cartridge wet.
(iv) Rinse the Sep-Pak light C18 cartridge with 5 ml of water with 0.1% (vol/vol) formic acid and discard the flow-through.
(v) Apply the sample to the cartridge at a maximum of 1 ml min−1 and collect the flow-through.
CRITICAL STEP Measure the flow with a timer using the syringe volume marks.
(vi) Reapply the sample to the cartridge to improve peptide binding and recovery.
(vii) Wash the Sep-Pak light C18 cartridge twice with 5 ml of 0.1% (vol/vol) formic acid in water and discard the flow-through.
(viii) Elute peptides with 1.5 ml of 80% (vol/vol) acetonitrile, 20% (vol/vol) water and 0.5% (vol/vol) formic acid at a maximum of 1 ml min−1. Collect the eluate into a microcentrifuge tube.
(ix) Evaporate the organic solvent under vacuum (using a SpeedVac) to a final volume of 1–2 µl.

! CAUTION Do not dry completely, as this will make sample resuspension difficult and will lead to sample loss. The vacuum time and temperature setting required are strongly dependent on each particular SpeedVac’s performance. If these are not known, it is recommended to test the drying time at different temperatures using a similar solution and volume as in Step 41A(viii). If the sample did reach complete dryness, perform the next step with rigorous vortexing or use a sonication bath to help dissolve the pellet.

(x) Resuspend the peptides in 20 µl of 3% (vol/vol) acetonitrile, 97% (vol/vol) water and 0.1% (vol/vol) formic acid. Store the samples at − 80 °C until MS analysis.

(B) Sample prefractionation by offline SCX HPLC ● TIMING 3–6 h (depending on SpeedVac speed and number of samples/fractions)
(i) Ensure that all HPLC capillaries and the SCX column are filled with water (and not with organic solvents that might be used to fill the equipment while not in use) before applying any salt-containing solutions.
CRITICAL STEP Failure to do so may result in severe precipitation and capillary blockage if high-salt solutions are directly mixed with a mobile phase of high organic content.

(ii) Prepare the samples from Step 40 for loading by adjusting the ionic strength and pH to those of Buffer A (10 mM potassium phosphate and 25% acetonitrile (vol/vol), pH 2.7). If the sample volume is small (<500 µl) and a larger loading loop (several ml) is available, you may dilute the sample with Buffer A. Check pH, adjust to ~2.7 if needed. If the sample volume is larger and/or it contains high amounts of salt, then dilute the sample with water and acidify to pH 2.7 using TFA. Alternatively, the sample can be desalted/concentrated first using C18 cartridges (see Step 41A(i–x)).

(iii) Load peptides on the SCX column using a flow rate of 1 ml min−1 and wash for 15 min with 100% Buffer A.
(iv) To elute peptides, gradually increase Buffer B (10 mM potassium phosphate and 25% (vol/vol) acetonitrile, 1.0 M NaCl, pH 2.7) to 30% from 15 to 37 min, followed by a sharper increase to 40% by 43 min. Increase Buffer B to 100% at 45 min and maintain at 100% for 8 more min.

(v) Switch the mobile phase back to Buffer A from 53 to 55 min and equilibrate the column with 100% Buffer A for an additional 10 min so that it is ready for the injection of the next sample.
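The SCX program in Steps 41B(iii–v) can be written down as a table of time points and interpolated to find the Buffer B fraction, and hence the approximate NaCl concentration, at any time. The sketch below assumes linear ramps between the stated set points, including between 43 and 45 min, which the text does not specify. All names are hypothetical.

```python
# (time in min, % Buffer B) set points taken from Steps 41B(iii-v).
SCX_GRADIENT = [
    (0.0, 0.0), (15.0, 0.0),     # load and wash with 100% Buffer A
    (37.0, 30.0), (43.0, 40.0),  # shallow ramp, then a steeper one
    (45.0, 100.0), (53.0, 100.0),
    (55.0, 0.0), (65.0, 0.0),    # return to Buffer A and re-equilibrate
]


def percent_b(t_min: float, program=SCX_GRADIENT) -> float:
    """Linearly interpolate % Buffer B at time t_min (assumed linear ramps)."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def nacl_mM(t_min: float) -> float:
    """Approximate NaCl concentration; Buffer B contains 1.0 M NaCl."""
    return percent_b(t_min) * 10.0


for t in (10.0, 26.0, 40.0, 50.0):
    print(t, percent_b(t), nacl_mM(t))
```

This ignores the system dwell volume, so the composition reaching the column lags the programmed value by an instrument-specific delay.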

(vi) Monitor peptide separation and elution by absorbance at 214 and 280 nm. Collect 1.5-min (equivalent to 1.5 ml) fractions using an automated fraction collector. At the beginning and the end of the elution profile (as judged by absorbance), combine several fractions together, as they do not contain as much material as those in the middle. Using this protocol, a typical sample produces ten fractions.

(vii) SpeedVac the samples to reduce the volume to approximately 50–100 µl.
! CAUTION Do not use heat while reducing the sample volume until at least the first 500 µl has evaporated. Heating the samples that contain 25% (vol/vol) acetonitrile (from the mobile phase) will promote plastic leaching into the


solution, thus interfering with MS in subsequent steps. Do not store the samples before evaporating at least the first 500 µl (to remove acetonitrile), as failure to do so will result in plastic polymers leaching into the sample.
PAUSE POINT Although not recommended, if necessary, the protocol may be paused at this point and the reduced-volume samples can be stored overnight in the fridge or freezer.

(viii) Desalt and further concentrate the fractions by using appropriate C18 reverse-phase solid-phase extraction tips (see Step 41A(i–x)). After eluting in high-concentration acetonitrile solution, quickly evaporate the organic phase using a SpeedVac at a low temperature setting.

(ix) This step is performed by using a C18 reverse-phase solid-phase extraction cartridge. We use C18 OMIX 100-µl tips, which accommodate a relatively small volume with a high binding capacity, thus providing convenient sample concentration at this step.

(x) Check the pH of the SpeedVac-reduced volume fractions and adjust to pH 2.5 with TFA if needed.
(xi) Condition a C18 100-µl tip by aspirating 100 µl of 80% (vol/vol) acetonitrile, 20% (vol/vol) water and 0.1% (vol/vol) TFA. Discard the flow-through. Repeat 3–5 times.
! CAUTION Do not dry the cartridge by introducing air at the end of the injection. Always keep the cartridge wet.
(xii) Equilibrate the tip with 100 µl of water with 0.1% (vol/vol) TFA and discard the flow-through. Repeat 3–5 times.
(xiii) Slowly aspirate the sample up and down several times to ensure more efficient binding and recovery. Collect the flow-through for quality control.
(xiv) Wash the loaded sample tip with 100 µl of 0.1% (vol/vol) TFA in water and discard the flow-through. Repeat 3–5 times.
(xv) Elute peptides with 100 µl of 80% (vol/vol) acetonitrile, 20% (vol/vol) water and 0.1% (vol/vol) TFA into a clean microcentrifuge tube.
(xvi) Evaporate the organic solvent and TFA under vacuum (using a SpeedVac).
! CAUTION Do not use heat while reducing the sample volume at this stage. Heating the samples will promote plastic leaching into the solution, thus interfering with MS in subsequent steps. Do not store the samples before evaporating to at least 10 µl (to remove acetonitrile), as failure to do so will result in plastic polymers leaching into the sample.

Inline HPLC and MS analysis ● TIMING 2–3 h per fraction (depends on the gradient length)

42| A description of the LC-MS/MS setup is outside the scope of this protocol; therefore, we only briefly outline the conditions used for the LTQ-Orbitrap TAILS analysis1 (option A) and the QStar-XL3,4 (option B). These steps can be easily adapted to other mass spectrometers.

(A) Orbitrap: suitable for dimethylation-TAILS and SILAC-TAILS

(i) Load peptides on a C18 reverse-phase (3-µm ReproSil Pur C18 beads) capillary column (15-cm, 75-µm inner diameter fused silica emitter with an 8-µm opening) with a nanoflow HPLC in line with the mass spectrometer.

(ii) Elute the peptides from the reverse-phase column with a gradient composed of Buffer A (0.5% (vol/vol) acetic acid) and Buffer B (0.5% (vol/vol) acetic acid and 80% (vol/vol) acetonitrile) and inject them directly into the mass spectrometer by ion-spray ionization. The gradient runs from 6 to 30% Buffer B within 60 min, then from 30 to 80% Buffer B within 10 min, and is held at 80% Buffer B for 5 min.

(iii) Acquire MS1 scans between 350 and 1,500 m/z at a resolution of 60,000 and select the five most intense ions for fragmentation. Repeat this cycle for the period of the gradient.

? TROUBLESHOOTING

(B) QStar: suitable for iTRAQ-TAILS

(i) Load N terminome samples onto a C18 reverse-phase (150 mm × 100 µm) column at a flow rate of 100–200 nl min−1 with a nanoflow HPLC in line with the mass spectrometer as described.

(ii) After loading the samples onto a trapping column, wash the column with 5% (vol/vol) acetonitrile containing 0.1% (vol/vol) formic acid.

(iii) Elute and separate peptides with a 40–100 min linear 5–40% (vol/vol) acetonitrile gradient (containing 0.1% (vol/vol) formic acid) at a flow rate of 150–200 nl min−1.

(iv) Acquire MS1 scans using an information-dependent acquisition method consisting of a 1-s TOF MS survey scan of mass range 400–1,500 a.m.u. and three 3-s product ion scans of mass range 75–1,500 a.m.u. Of the ion peaks with a signal over 20 counts (charge state +2 to +4), select the three most intense for fragmentation. Repeat this cycle for the period of the gradient.

? TROUBLESHOOTING
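
The Orbitrap gradient of option A(ii) can be written as a piecewise-linear function of time, which may help when transferring the method to another LC system. The sketch below is illustrative only: it assumes that the gradient starts at t = 0 and that each ramp is linear (the protocol gives only the start and end compositions of each segment), and the function name percent_buffer_b is hypothetical.

```python
def percent_buffer_b(t_min: float) -> float:
    """Return the % Buffer B at time t_min (minutes) for the option A gradient.

    Assumed breakpoints (time in min, % Buffer B), taken from Step 42A(ii):
    6 -> 30% over 60 min, 30 -> 80% over 10 min, hold at 80% for 5 min.
    Linear interpolation between breakpoints is an assumption.
    """
    breakpoints = [(0.0, 6.0), (60.0, 30.0), (70.0, 80.0), (75.0, 80.0)]
    if not 0.0 <= t_min <= breakpoints[-1][0]:
        raise ValueError("time is outside the 0-75 min gradient")
    # Find the segment that contains t_min and interpolate within it.
    for (t0, b0), (t1, b1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")
```

For example, percent_buffer_b(30) gives the composition halfway through the first ramp.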

Data analysis of the TAILS–tandem mass spectrometry spectra

43| Make a directory for each analysis (Mascot protease cleavage, X! Tandem protease cleavage, Mascot N-terminal, X! Tandem N-terminal, and so on). CRITICAL STEP Computation times vary with the number of LC/MS runs, sample complexity, richness of spectra, LC gradient length, and the size and format of the database, and they also depend strongly on the hardware specifications of the computational system that is being used. In the following sections we list typical timing for a single Orbitrap run (i.e., no prefractionation) of dimethylation-TAILS or SILAC-TAILS. For iTRAQ-TAILS, timing is for 10 LC-MS/MS runs on a QStar.
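
The directory layout of Step 43 can be created programmatically so that every experiment uses the same structure. The following is a minimal sketch; the directory names and the function make_analysis_dirs are hypothetical and should be adapted to local conventions.

```python
from pathlib import Path

# Hypothetical directory names, one per analysis listed in Step 43.
ANALYSES = [
    "mascot_protease_cleavage",
    "xtandem_protease_cleavage",
    "mascot_n_terminal",
    "xtandem_n_terminal",
]


def make_analysis_dirs(root, analyses=ANALYSES):
    """Create one sub-directory per analysis under root and return the paths."""
    root = Path(root)
    created = []
    for name in analyses:
        directory = root / name
        # exist_ok=True makes the call safe to repeat for the same experiment.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```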

44| Convert files to mzXML using option A (Orbitrap) or option B (QStar).

(A) Orbitrap: suitable for dimethylation-TAILS and SILAC-TAILS ● TIMING 15–30 min

(i) Convert LTQ-Orbitrap RAW data to mzXML format in profile mode (not centroid) using the ReAdW (http://tools.proteomecenter.org/wiki/index.php?title=Software:ReAdW) or msconvert (http://proteowizard.sourceforge.net/tools/msconvert.html) tools in the TPP. It is possible to convert to mzML format, which is the new standard MS format set by HUPO, but the use of this format for TAILS data analysis has not been tested yet.

(ii) Place the mzXML files in the directory created in Step 43.

(iii) For Mascot searches, convert the mzXML files to Mascot generic format (.mgf extension) using the MzXML2Search (http://tools.proteomecenter.org/wiki/index.php?title=Formats:mzXML#MzXML2Search) tool in the TPP (use defaults).

(B) QStar: suitable for iTRAQ-TAILS ● TIMING 4–24 h

(i) Convert QStar XL wiff data to mzXML in profile mode using mzWiff provided with the TPP.

(ii) Place the mzXML files for data from each SCX fraction (Step 41B) in the directory created in Step 43.

(iii) For Mascot searches, convert the mzXML files to Mascot generic format (.mgf extension) using the MzXML2Search (http://tools.proteomecenter.org/wiki/index.php?title=Formats:mzXML#MzXML2Search) tool in the TPP (use defaults).
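
When many RAW files have to be converted, the msconvert call of option A(i) can be assembled in a script. The sketch below only builds the command line and does not run it. It assumes the ProteoWizard msconvert executable is on the PATH and uses its --mzXML and -o (output directory) options; no peak-picking filter is added, so spectra acquired in profile mode should not be centroided during conversion. The helper name msconvert_command is hypothetical, and the options should be checked against the installed msconvert version.

```python
from pathlib import Path


def msconvert_command(raw_file, out_dir):
    """Build an msconvert command that writes mzXML into out_dir.

    No peak-picking filter is passed, so spectra are not centroided
    by the conversion itself (assumption: msconvert default behavior).
    """
    return ["msconvert", str(Path(raw_file)), "--mzXML", "-o", str(Path(out_dir))]
```

The returned list can be passed to subprocess.run(command, check=True) once the paths have been verified.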

Mascot database search analysis

45| Search Mascot with the appropriate .mgf files generated in Step 44. TPP quantitative analysis of Mascot (http://tools.proteomecenter.org/wiki/index.php?title=TPP:Mascot_and_the_TPP) search data for dimethylation-TAILS requires running two separate searches, one for only heavy-labeled peptides and one for only light-labeled peptides. Similar analyses of SILAC-TAILS depend on the type of heavy amino acids used. For example, analysis of only arginine-labeled samples can be done by a single search, whereas analysis of lysine- and arginine-labeled samples requires two separate searches. For clarity and simplicity, we describe only the option of separate searches (option A). As quantitative information is retrieved from reporter ion intensities in MS2 spectra, iTRAQ-TAILS analyses require only a single database search (option B).

(A) Mascot search for dimethylation-TAILS and SILAC-TAILS ● TIMING 15 min–1 h

(i) Perform a Mascot search for only light dimethylated peptides using the .mgf file as input against an appropriate species UniProt-SwissProt database using the following search parameters: Semi-ArgC cleavage specificity; up to two missed cleavages; precursor ion mass tolerance 10 p.p.m.; fragment mass tolerance 0.8 Da; fixed modifications: cysteine carbamidomethylation (+57.021464), peptide N-terminal and lysine dimethylation (+28.031300); variable modifications: methionine oxidation (+15.994915), asparagine deamidation (+0.984016) and glutamine deamidation (+0.984016); and scoring scheme ESI-TRAP. A corresponding Mascot parameter file is available at http://www.clip.ubc.ca/resources/cliptails.html.

(ii) Repeat the search for heavy dimethylated peptides or dimethylated heavy SILAC peptides, as appropriate, by changing the fixed modifications. For dimethylation-TAILS, change the fixed modification of peptide N-terminal and lysine residues to heavy dimethylation (+34.063117). For SILAC-TAILS, add fixed modifications for heavy arginine (+6.020129 for 13C6-Arg, or +10.008269 for 13C6 15N4-Arg) and heavy lysine (+4.028203 for D4-Lys) if used. If more amino acids were used for metabolic labeling, fixed modifications with their relative mass differences should be added. ! CAUTION Usage of D4-Lys can introduce a deuterium isotope effect48 that shifts the elution times of the heavy and light lysine-containing peptides. This effect must be accounted for in the quantitative analysis (see the CRITICAL STEP note in Step 45A(vi) below).

(iii) (Optional) If analysis of the labeling efficiency is to be done on the ‘before pullout’ sample, perform a single search (heavy and light together) setting cysteine carbamidomethylation (+57.021464) as a fixed modification and the remaining modifications as variable: light peptide N-terminal dimethylation (+28.031300), light lysine dimethylation (+28.031300), heavy peptide N-terminal dimethylation (+34.063117), heavy lysine dimethylation (+34.063117), methionine oxidation (+15.994915), asparagine deamidation (+0.984016) and glutamine deamidation (+0.984016). If SILAC-TAILS was used, do not use the heavy dimethylation parameters (both on the N terminus and on lysine), and add the heavy amino acids as variable modifications. CRITICAL STEP If lysine was used for SILAC labeling, the heavy lysine residues will be dimethylated (addition of +28.031300) in addition to carrying the SILAC label (addition of +6.020129 for 13C6-Lys, or +8.014199 for 13C6 15N2-Lys). The database search parameter that defines such heavy lysine should be the combination of both modifications; otherwise, the search engine will look for different combinations and will likely pick more false-positive hits. Accordingly, in Mascot the dimethylated heavy lysine should be defined as dimethyl 13C6 SILAC label (+34.051429) (http://www.unimod.org/modifications_view.php?editid1=986) or dimethyl 13C6 15N2 SILAC label (+36.045499) (http://www.unimod.org/modifications_view.php?editid1=987).

(iv) Import the search result files (.dat extension) from the Mascot server to the analysis directory and name them according to the input file name (i.e., data1.dat for data1.mgf, and so on).

(v) Convert the .dat files to pepXML files using the Mascot2XML (http://tools.proteomecenter.org/wiki/index.php?title=Software:Mascot2XML) tool. Do so for both the light- and heavy-labeled searches.

(vi) Merge the heavy and light search results, then analyze and validate the peptide MS/MS identifications and quantification using the Xinteract, PeptideProphet and XPRESS (http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#Peptide_Level_Analysis) tools of the TPP, respectively. At this step, we recommend checking the following PeptideProphet options: ‘Use accurate mass binning’, ‘Do not use the NTT model’ and ‘Use decoy hits to pin down the negative distribution’, and providing the correct identifier for decoy proteins. The output of this step is an interact pepXML file that includes all the peptides and their related information (database search score, PeptideProphet score, relative abundance and so on). All of these steps can be executed in a single step through the TPP Petunia GUI. CRITICAL STEP Check for a potential deuterium isotope effect by comparing the elution profiles of identical peptides identified in light-labeled and heavy-labeled forms. If the elution profiles do not match exactly, check the option ‘Heavy labeled peptide elutes before light labeled partner’ in the XPRESS options.
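
The combined modification masses quoted in the CRITICAL STEP of Step 45A(iii) are simply the sum of the light dimethyl mass and the SILAC label mass. The short sketch below reproduces that arithmetic with the values given in the protocol; the dictionary and function names are hypothetical.

```python
# Monoisotopic mass shifts as quoted in Step 45A.
DIMETHYL_LIGHT = 28.031300
SILAC_LYS = {"13C6": 6.020129, "13C6 15N2": 8.014199}


def dimethylated_heavy_lysine(label):
    """Mass shift of a SILAC-heavy lysine that has also been dimethylated."""
    # Rounded to six decimals, the precision used in the protocol.
    return round(DIMETHYL_LIGHT + SILAC_LYS[label], 6)
```

Calling the function with "13C6" or "13C6 15N2" should reproduce the two combined values listed in the protocol.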

(B) Mascot database search analysis for iTRAQ-TAILS ● TIMING 1–12 h

(i) Perform a Mascot search for each SCX fraction using the .mgf file as input against an appropriate database using the following search parameters: Semi-ArgC cleavage specificity; up to two missed cleavages; precursor ion mass tolerance 0.4 Da; fragment mass tolerance 0.4 Da; fixed modifications: cysteine carbamidomethylation (+57.021464) and peptide lysine iTRAQ (+144.102063); variable modifications: peptide N-terminal iTRAQ (+144.102063), peptide N-terminal acetylation (+42.010565), methionine oxidation (+15.994915), asparagine deamidation (+0.984016) and glutamine deamidation (+0.984016); and scoring scheme ESI-QUAD-TOF. A corresponding Mascot parameter file is available at http://www.clip.ubc.ca/resources/cliptails.html. Note that QStar instrument mass accuracy is better than the suggested search mass tolerance; however, it was shown that for high-accuracy mass spectrometers, such as Q-TOF instruments, PeptideProphet modeling is improved by using wider mass tolerance windows that allow better discrimination of true positive IDs72.

(ii) Import the search result files (.dat extension) from the Mascot server to the analysis directory and name them according to the input file name (i.e., data1.dat for data1.mgf and so on).

(iii) Convert the .dat files to pepXML files using the Mascot2XML tool (http://tools.proteomecenter.org/wiki/index.php?title=Software:Mascot2XML), specifying the same database used in the search and ‘clostripain’ (the alternative name for ArgC) as the enzyme.

(iv) Merge the pepXML files from all SCX fractions and quantify reporter ion intensities using the Xinteract, PeptideProphet and Libra tools of the TPP, respectively. At this step, we recommend checking the following PeptideProphet options: ‘Do not use the NTT model’ and ‘Use decoy hits to pin down the negative distribution’, and providing the correct identifier for decoy proteins. The output of this step is an interact pepXML file that includes all the peptides and their related information (database search score, PeptideProphet score, reporter ion intensities and so on). All of these steps can be executed in a single step through the TPP Petunia interface.

X! Tandem database search analysis

46| Search X! Tandem with the appropriate mzXML files generated in Step 44. Use option A for dimethylation-TAILS or SILAC-TAILS and option B for iTRAQ-TAILS.

(A) X! Tandem search for dimethylation-TAILS and SILAC-TAILS ● TIMING 15 min–1 h

(i) Perform an X! Tandem database search for light-labeled peptides directly from the Petunia interface (http://tools.proteomecenter.org/wiki/index.php?title=TPP:Using_Petunia) of the TPP using the k-score option (X! Tandem searches are done with mzXML as input). Use the same parameters as for the Mascot light search: Semi-ArgC cleavage specificity; up to two missed cleavages; precursor ion mass tolerance 10 p.p.m.; fragment mass tolerance 0.8 Da. Fixed modifications: cysteine carbamidomethylation (+57.021464) and peptide N-terminal and lysine dimethylation (+28.031300). Variable modifications: methionine oxidation (+15.994915), asparagine deamidation (+0.984016) and glutamine deamidation (+0.984016).

(ii) The X! Tandem search will generate a .tandem file (which includes the search results) in the same directory as the mzXML file used for the search. An X! Tandem search through the TPP is run from an input file that contains the required search parameters. Appropriate input parameter files for dimethylation-TAILS are available at http://www.clip.ubc.ca/resources/cliptails.html.


(iii) Repeat the search for heavy dimethylated peptides or dimethylated heavy SILAC peptides. For dimethylation-TAILS, only change the fixed modifications of peptide N-terminal and lysine residues to heavy dimethylation (+34.063117). For SILAC-TAILS, only add fixed modifications for heavy arginine (+6.020129 for 13C6-Arg or +10.008269 for 13C6 15N4-Arg) and heavy lysine (+4.028203 for D4-Lys) if used. If more amino acids were used for metabolic labeling, fixed modifications with their relative mass differences should be added. ! CAUTION Usage of D4-Lys can introduce a deuterium isotope effect48 that shifts the elution times of the heavy and light lysine-containing peptides. This effect must be accounted for in the quantitative analysis. CRITICAL STEP If lysine was used for SILAC labeling, the heavy lysine residues will be dimethylated (addition of +28.031300) in addition to carrying the SILAC label (addition of +6.020129 for 13C6-Lys, or +8.014199 for 13C6 15N2-Lys). The database search parameter that defines such heavy lysine should be the combination of both modifications; otherwise, the search engine will look for different combinations and will likely pick more false-positive hits. As for the Mascot search, dimethylated heavy lysine should be defined as dimethyl 13C6 SILAC label (+34.051429) (http://www.unimod.org/modifications_view.php?editid1=986) or dimethyl 13C6 15N2 SILAC label (+36.045499) (http://www.unimod.org/modifications_view.php?editid1=987).

(iv) Convert the .tandem files to pepXML using the Tandem2XML tool (http://tools.proteomecenter.org/wiki/index.php?title=Software:Tandem2XML) for both the light- and heavy-labeled searches.

(v) Merge the heavy and light search results, then analyze and validate the peptide MS/MS identifications and quantification using the Xinteract, PeptideProphet and XPRESS tools of the TPP (http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#Peptide_Level_Analysis), respectively. At this step, we recommend checking the following PeptideProphet options: ‘Use accurate mass binning’, ‘Do not use the NTT model’ and ‘Use decoy hits to pin down the negative distribution’, and providing the correct identifier for decoy proteins. The output of this step is an interact pepXML file that includes all the peptides and their related information (database search score, PeptideProphet score, relative abundance and so on). All of these steps can also be executed in a single step through the TPP Petunia interface (GUI; http://tools.proteomecenter.org/wiki/index.php?title=TPP:Using_Petunia).

(B) iTRAQ-TAILS X! Tandem database search analysis ● TIMING 1–12 h

(i) Subject the mzXML files for each SCX fraction to an X! Tandem database search using the following parameters: Semi-ArgC cleavage specificity; up to two missed cleavages; precursor ion mass tolerance 0.4 Da; fragment mass tolerance 0.4 Da; fixed modifications: cysteine carbamidomethylation (+57.021464), peptide lysine iTRAQ (+144.102063) and peptide N-terminal acetylation (+42.010565). Variable modifications: peptide N-terminal iTRAQ (+102.0915), methionine oxidation (+15.994915), asparagine deamidation (+0.984016) and glutamine deamidation (+0.984016). Searching for both N-terminal acetylation and iTRAQ labeling is achieved by setting acetylation as a fixed modification and the difference between iTRAQ and acetylation (102.0915 = 144.102063 − 42.010565) as a variable modification. CRITICAL STEP Although X! Tandem searches of data from a high mass accuracy instrument such as the LTQ-Orbitrap perform much better in subsequent PeptideProphet analysis when using the k-score plug-in, this is not the case for data from a QStar XL instrument with somewhat lower mass accuracy. Therefore, we recommend using native X! Tandem scoring for lower mass accuracy data. Decoy sequences are not needed in the database if the built-in statistical model in X! Tandem is used with the expectation value as output and modeled with PeptideProphet. However, one has to extract a taxon-specific Uniprot_sprot.fasta for the species of interest because, unlike Mascot, X! Tandem does not provide a taxon parameter for UniProt databases.

(ii) The X! Tandem search will generate a .tandem file with the search results in the same directory as the mzXML file used for the search. An appropriate input parameter file for iTRAQ-TAILS is available at http://www.clip.ubc.ca/resources/cliptails.html. CRITICAL STEP QStar instrument mass accuracy is better than the suggested search mass tolerance; however, it was shown that for high-accuracy mass spectrometers such as Q-TOF instruments, PeptideProphet modeling is improved by using wider mass tolerance windows that allow better discrimination of true positive IDs72.

(iii) Convert the .tandem files for all SCX fractions to pepXML using the Tandem2XML tool (http://tools.proteomecenter.org/wiki/index.php?title=Software:Tandem2XML).

(iv) Merge the pepXML files from all SCX fractions and quantify reporter ion intensities using the Xinteract, PeptideProphet and Libra tools of the TPP, respectively. At this step, we recommend checking the following PeptideProphet option: ‘Do not use the NTT model’. The output of this step is an interact pepXML file that includes all the peptides and their related information (database search score, PeptideProphet score, reporter ion intensities and so on).

? TROUBLESHOOTING
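
The fixed-plus-variable encoding used in Step 46B(i) relies on a simple mass difference: with acetylation set as fixed, an iTRAQ-labeled N terminus is found as acetylation plus a variable shift. The sketch below recomputes that shift from the masses given in the protocol; the function name variable_shift is hypothetical.

```python
ITRAQ = 144.102063   # iTRAQ label, as quoted in Step 46B(i)
ACETYL = 42.010565   # N-terminal acetylation, as quoted in Step 46B(i)


def variable_shift(alternative_mass, fixed_mass):
    """Variable modification mass that turns the fixed one into the alternative."""
    return round(alternative_mass - fixed_mass, 6)


# Shift to enter as the variable N-terminal modification.
ITRAQ_MINUS_ACETYL = variable_shift(ITRAQ, ACETYL)
```

Rounded to four decimals, ITRAQ_MINUS_ACETYL corresponds to the 102.0915 entered in the search parameters. The same subtraction applies if acetylation is replaced by another fixed N-terminal modification, as suggested for the analysis of natural N termini.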


Secondary validation and selection of high-confidence peptides ● TIMING 2 h–3 d (dependent on the manual verification required in Step 51)

47| Combine the pepXML interact files of Mascot and X! Tandem using the iProphet tool of the TPP (http://tools.proteomecenter.org/wiki/index.php?title=TPP_Demo2009#8._Further_peptide-level_validation_iProphet). This will generate a pepXML file with a combined list of identified and quantified peptides.

48| Open the resulting iProphet pepXML file (using the PepXML Viewer).

49| Determine the iProphet probability score that corresponds to a false discovery rate of 1% by using the ‘calculate stats’ option under the ‘other options’ tab.

50| Select peptides with an iProphet probability score above the threshold determined in Step 49, which corresponds to a false discovery rate of 1%. For iTRAQ-TAILS, go directly to Step 52.

51| Required only for dimethylation-TAILS and SILAC-TAILS: Manually verify the quantification data (extracted ion chromatograms from XPRESS, http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#XPRESS_Results) of peptides with undefined ratios and of heavy and light singletons, and correct if required. CRITICAL STEP This step is crucial to the successful outcome of a dimethylation-TAILS or SILAC-TAILS experiment. The light and heavy scan ranges used for XPRESS should be carefully monitored and adjusted to the exact peak area around the collision-induced dissociation event of the quantified peptide.

52| Export the final list of peptides to a Microsoft Excel sheet using the ‘export spreadsheet’ option under the ‘other options’ tab.

53| (Optional) When using high mass accuracy mass spectrometers, we also validate peptide identifications orthogonally by selecting only those peptides with a precursor mass error <5 p.p.m. (experimental versus theoretical). These can be easily sorted once the data are in Excel.
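
The two filters of Steps 49–50 and Step 53 can be illustrated outside the TPP. The sketch below is not the iProphet implementation: it assumes well-calibrated peptide probabilities and estimates the false discovery rate of an accepted set as the mean of (1 − probability), which is one common way to relate a probability cutoff to an error rate. The function names are hypothetical.

```python
def probability_cutoff_for_fdr(probabilities, max_fdr=0.01):
    """Return the lowest probability cutoff whose estimated FDR is <= max_fdr.

    The FDR of the peptides accepted at a cutoff is estimated as the mean
    of (1 - p) over those peptides (assumes calibrated probabilities).
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(probabilities, reverse=True)
    cutoff = None
    error_sum = 0.0
    for i, p in enumerate(ranked):
        error_sum += 1.0 - p
        n = i + 1
        # Evaluate only after the last member of a group of tied scores.
        last_of_tie = n == len(ranked) or ranked[n] < p
        if last_of_tie and error_sum / n <= max_fdr:
            cutoff = p
    return cutoff


def ppm_error(experimental_mass, theoretical_mass):
    """Precursor mass error in parts per million."""
    return (experimental_mass - theoretical_mass) / theoretical_mass * 1e6


def passes_mass_filter(experimental_mass, theoretical_mass, max_ppm=5.0):
    """True if the absolute precursor mass error is below max_ppm (Step 53)."""
    return abs(ppm_error(experimental_mass, theoretical_mass)) < max_ppm
```

In practice the cutoff reported by the ‘calculate stats’ option should be used for Step 50; the first function is only meant to make explicit what such a cutoff represents.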

Analysis of natural N termini of proteins

54| Analysis of natural N-terminal peptides of proteins using TAILS requires only changing the database search parameters. Carry out the analysis as described in Steps 45–53 using the required N-terminal modifications. For analysis of acetylated peptides when using dimethylation-TAILS or SILAC-TAILS, change the search parameters listed above by replacing the fixed dimethylation of peptide N termini (+28.031300 or +34.063117) with acetylation (+42.010565) and with the ‘Gln to pyroglutamate’ modification (−17.026549; http://www.unimod.org/modifications_view.php?editid1=28). Accordingly, for iTRAQ-TAILS, additional variable N-terminal modifications can be added to the Mascot parameter files, and acetylation can be replaced by other modifications in X! Tandem searches. The other search parameters should remain the same. X! Tandem default search parameters include automatic checks for N-terminal glutamine and glutamate converted to pyroglutamate, and for N-terminal cysteine converted to 5-oxothiomorpholine-3-carboxylic acid. Therefore, these modifications do not need to be specified in X! Tandem searches for naturally blocked N-terminal peptides.

Positional annotation of N-terminal peptides

55| Use TAILS-ANNOTATOR to directly process the Excel export files from the PepXML Viewer and add the corresponding annotation and information on ‘double validation’ for each peptide. CRITICAL STEP TAILS-ANNOTATOR only queries UniProt-SwissProt databases, as UniProt-TrEMBL databases lack positional annotation.

? TROUBLESHOOTING

Detection of protease-generated peptides

Data centroiding and correction

56| From the TAILS-ANNOTATOR output file (Step 55), extract all peptides whose starting position in the unprocessed precursor (‘Start_pre’ column in the TAILS-ANNOTATOR output) and/or in the processed, mature protein (‘Start_mat’ column in the TAILS-ANNOTATOR output) is smaller than 3. Selecting a number <3 reveals natural N termini, as many SwissProt annotations lack correct assignments for removed and acetylated methionine, resulting in a shift in annotating the true start position4.


57| For dimethylation-TAILS and SILAC-TAILS, calculate log2 of the values in the ‘XPRESS’ column for the peptides extracted in Step 56. For iTRAQ-TAILS, calculate log2(libra_protease/libra_control) values using the corresponding columns for the peptides extracted in Step 56.

58| Calculate the mean of the log2 ratios obtained in Step 57 and subtract this value from the log2(protease/control) ratio calculated for each peptide in the data set. The resulting differences are the corrected log2 ratios for the peptide abundances in the protease and control samples.

59| Plot a histogram of the log2 ratios from Step 58 to check for normal distribution and deviation from the expected center at 0. This can be easily done by using Wessa.net73 (http://www.wessa.net/rwasp_fitdistrnorm.wasp).

? TROUBLESHOOTING
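
Steps 56–58 amount to centering all log2 ratios on the mean log2 ratio of the natural N termini. The following sketch shows the calculation on a list of peptide records. The ‘Start_pre’ and ‘Start_mat’ keys follow the TAILS-ANNOTATOR column names given in Step 56; the ‘ratio’ key and the function names are hypothetical, and every record is assumed to carry numeric start positions and a defined, positive protease/control ratio (singletons and undefined ratios must be handled beforehand).

```python
import math
from statistics import mean


def natural_n_termini(peptides):
    """Peptides starting before position 3 in the precursor and/or mature protein."""
    return [p for p in peptides if p["Start_pre"] < 3 or p["Start_mat"] < 3]


def corrected_log2_ratios(peptides, ratio_key="ratio"):
    """Center log2(protease/control) on the mean of the natural N termini."""
    # Step 57: log2 ratios of the natural N termini; Step 58: their mean.
    offset = mean(math.log2(p[ratio_key]) for p in natural_n_termini(peptides))
    # Step 58: subtract the mean from the log2 ratio of every peptide.
    return [math.log2(p[ratio_key]) - offset for p in peptides]
```

The returned list is in the same order as the input and can be passed directly to a histogram function for the check described in Step 59.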

Detection of potential protease-generated substrate hits

60| Calculate the standard deviation of the corrected log2 ratios for natural N termini obtained in Step 58.

61| Extract peptides from the TAILS-ANNOTATOR output with corrected log2(protease/control) ratios that are higher than three times the standard deviation (obtained in Step 60). These neo-N-terminal peptides are considered to identify high-probability substrates of the protease of interest, with the cleavage site defined by the neo-N-terminal peptide. Peptides with log2 ratios lower than −3 times the standard deviation that also meet the criteria for natural N termini (as specified in Step 56) represent natural N termini of proteins with internal cleavage sites for the test protease, and they are thus depleted in the protease-treated sample.

62| Screen the natural N-terminal peptide tables for low-ratio peptides (<1). These may be the result of proteolytic cleavage close to the protein’s natural N terminus and should be considered as indirect evidence of a potential substrate. CRITICAL STEP The observed ratio change in such cases might be much more moderate than that observed for neo-N termini. For example, if 5% of a certain protein is cleaved near the original N terminus by the protease, this should be reflected in a ≥3-fold ratio or a singleton for the neo-N-terminal peptide, but in a ratio of only ~0.95 for the original N-terminal peptide.

63| A useful tool for data visualization is two-, three- or four-way Venn diagram analysis. For this we use the free tool Venny59 (http://bioinfogp.cnb.csic.es/tools/venny/index.html). Paste columns of data from Excel spreadsheets into 2–4 fields for two-, three- or four-way Venn diagram analysis, respectively. For example, we routinely do this for peptides, proteins and high-ratio peptides found by the different search engines and in the different biological or technical replicates; this readily shows the overlap in identifications at these different levels of validation.
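
The cutoff of Steps 60–61 can be expressed in a few lines. The sketch below assumes that the corrected log2 ratios from Step 58 and a matching list of flags marking natural N termini are available; it uses the sample standard deviation, which is an assumption because the protocol does not specify sample versus population. The function name classify_by_cutoff is hypothetical.

```python
from statistics import stdev


def classify_by_cutoff(corrected_log2, is_natural, n_sd=3.0):
    """Return (cutoff, hit indices, depleted natural N-terminus indices).

    corrected_log2: corrected log2(protease/control) ratios (Step 58).
    is_natural:     booleans, True for natural N termini (Step 56).
    """
    # Step 60: standard deviation of the natural N termini only.
    sd = stdev(x for x, nat in zip(corrected_log2, is_natural) if nat)
    cutoff = n_sd * sd
    # Step 61: peptides above +3 s.d. are candidate cleavage products.
    hits = [i for i, x in enumerate(corrected_log2) if x > cutoff]
    # Step 61: natural N termini below -3 s.d. are depleted by the protease.
    depleted = [
        i
        for i, (x, nat) in enumerate(zip(corrected_log2, is_natural))
        if nat and x < -cutoff
    ]
    return cutoff, hits, depleted
```

Because strongly depleted natural N termini are included when the standard deviation is calculated, they widen the cutoff; inspecting the histogram from Step 59 helps to judge whether this matters for a given data set.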

Further substrate validation

64| We consider high-confidence, high-ratio, double-identified neo-N-terminal peptides as representing bona fide cleavage sites that do not require further in vitro validation. However, evidence for in vivo cleavage might still be needed because of the axiom ‘just because it can does not mean it does’17.

65| For neo-N-terminal peptides that do not meet our identification criteria, e.g., those identified only once but with a very high ratio, the spectra must be manually inspected and can also be manually sequenced to ascertain the quality of the identification. Biochemical validation of cleavage would then be required to reliably classify the protein as a substrate.

66| High-confidence-identified neo-N-terminal peptides that do not meet the 3 s.d. cutoff (Step 61) might still be derived from substrates, perhaps those that were cut very slowly under the assay conditions, those that were not discriminated because of confounding spectral noise, or those that were subject to background proteolysis by other proteases in the sample before or during the assay. These low-confidence candidate substrates definitely require further biochemical validation. Examples of such substrates that would warrant this extra effort are those that are either biologically interesting or in the same biochemical pathway as other known substrates of the protease, those that are known substrates of other members of the protease family, or those that are family members of a known substrate.

? TROUBLESHOOTING
Troubleshooting advice can be found in Table 2.


TABLE 2 | Troubleshooting table.

Step | Problem | Possible cause | Solution
7A | Incomplete protein labeling: dimethylation | Presence of primary amines in the reaction | See below
7A | Incomplete protein labeling: dimethylation | Degraded sodium cyanoborohydride | Use freshly prepared and properly stored sodium cyanoborohydride
7B | Incomplete protein labeling: iTRAQ | Hydrolysis of iTRAQ reagent in the presence of water traces or moisture | Prepare fresh working solutions of iTRAQ reagents in DMSO immediately before use in the labeling reaction. Store stock reagents strictly following the manufacturer’s recommendations
7B | Incomplete protein labeling: iTRAQ | Incomplete protein denaturation/reduction/alkylation prevents label access to some protein amino groups | Determine accurate protein concentrations and use adequate concentrations of denaturing/reducing/alkylating reagents
7B | Incomplete protein labeling: iTRAQ | iTRAQ reagent depletion by primary amines present in the sample or buffer (e.g., Tris buffer) | Use only amine-free buffers; ensure that no primary amines (Tris buffer, ammonium bicarbonate, free amino acids and so on) were introduced during the procedure; verify that the test protease and isolated proteins are in amine-free buffer; concentrate/wash samples using microconcentrators to remove free amino acids and amine-containing metabolites from conditioned medium (or perform TCA precipitation for cell and tissue lysates (use cold TCA, 10–15% final, 30 min on ice, collect by centrifugation, wash with methanol, resuspend using 8.0 M GuHCl and adjust the volume with 1.0 M HEPES, pH 8.0, stock solution))
7C | Incomplete protein N-terminal blocking: SILAC | Delay in adding sodium cyanoborohydride can lead to cross-linking | Add immediately after formaldehyde addition
7C | Incomplete protein N-terminal blocking: SILAC | High pH of the protein solution | Adjust pH to 6–7 after addition of the formaldehyde and cyanoborohydride
| Low proteome coverage (Step 50) | Incomplete labeling/blocking of primary amino groups results in the loss of such peptides during the polymer-coupling step | See Troubleshooting for incomplete protein labeling
26 | Incomplete trypsin digestion | pH is not optimal | Always check solution pH before adding trypsin; adjust to pH 8 if necessary, add fresh aliquot of trypsin, incubate for 5–12 h
26 | Incomplete trypsin digestion | High concentrations of carried-over GuHCl from insufficient methanol washes | Dilute the sample further with 50 mM HEPES, pH 8.0, add fresh aliquot of trypsin, incubate for an additional 5–12 h. In the future, increase the number/volume of methanol washes after acetone/methanol precipitation
30 | Carryover of tryptic peptides after polymer pullout | Instability of sodium cyanoborohydride solution | Use only fresh sodium cyanoborohydride solution. Store stock reagents strictly following the manufacturer’s recommendations
30 | Carryover of tryptic peptides after polymer pullout | pH is not optimal | Always check solution pH before adding the polymer; adjust to pH 6–7, if necessary
30 | Carryover of tryptic peptides after polymer pullout | Insufficient coupling time | Allow for at least 5 h coupling reaction


| | | High trypsin or trypsin-like activity in the original biological sample | Do not use trypsin for cell culture propagation (i.e., use EDTA only instead); add broad specificity protease inhibitors immediately on sample collection |
| | | Insufficient amount of polymer used | Increase polymer amount for the negative selection |
| | | Oxidized polymer | Replace air by argon in polymer aliquots and store at −80 °C |
| 42A, 42B | Poor spectra | Polymers leaching from laboratory plasticware affect ionization | Use powder-free gloves; use mass spectrometry–compatible grades of plastic tubes, tips and so on as indicated throughout the protocol; avoid/minimize unnecessary exposure of plasticware to organic solvents, especially at higher temperatures |
| | | Incompatible buffers present in the sample affect ionization | Perform additional desalting of the final peptide-containing sample using C18 cartridges/tips |
| | | Too much or too little sample | Use serial dilutions to determine optimal amount of the sample for the type of MS instrument and LC gradient used. Titrate peptides by measuring optical density at 280 nm using a Nanodrop spectrometer |
| | | Sample of high complexity | Further fractionate/simplify the sample, i.e., by additional rounds of HPLC chromatography or using StageTips; collect more fractions by reducing gradient elution |
| 46 | Protein ID scores are low; poor data quality after MS | Insufficient cells used or insufficient amount of protein collected | Determine correct cell number. Use fivefold more cells to obtain a good MS signal for low-abundant proteins. Collect higher amounts of starting material |
| | | Improper MS analysis | Ensure that the MS instrument was properly tuned and calibrated prior to sample analysis |
| | | | Verify that the correct database, organism taxonomy, peptide modifications, labeled amino acids and enzymes were selected during the data analysis |
| | | Loss of proteome at peptide level | Do not concentrate the peptides to dryness after trypsin treatment. If the peptides are completely dried, the peptides become difficult to resuspend, resulting in loss of peptides |
| 46 | A low number of peptides are identified by the database search | Poor MS data quality due to contamination | Verify the quality of the MS data (mzXML) using the Pep3D visualization tool of the TPP |
| | | Modified peptides are unidentified | Change database search parameters to include potential modifications |
| | | Inactive test protease | Validate the tested protease activity (as suggested in ‘Test protease cleavage of collected proteome’) |
| | | | Increase the protease/proteome ratio or incubation time |
| | | Incomplete isotope labeling | Analyze samples collected prior to the enrichment step by LC-MS/MS and use database search parameters allowing the identification of all tryptic peptides with lysine. Over 95% of lysine should be labeled |
| | | Wrong analysis pipeline | Use protease of known specificity such as GluC or caspases to set the analysis conditions before using the test protease |
| | | Resulting peptide properties prevent MS/MS identification (length is either too long or too short; or other characteristics such as hydrophobicity and so on will mean that not all N-terminal peptides can be identified as N-terminal tryptic peptides) | Replace trypsin with GluC or another digesting enzyme to generate peptides with different properties that will be identified |
| 55 | No output from TAILS-ANNOTATOR | Dependencies not installed | Install BioPerl and the Text::CSV Perl module |
| | | Input file has Mac-formatted line endings | Save Excel export from PepXML Viewer (.xls) as ‘Windows formatted text’ |
| | | No peptides found in input file | Ensure that the column containing peptide entries has header ‘peptide’ |
| | | Wrong database | Use the same FASTA database used for search |
| 59 | Offset in peptide quantifications, as identified by a large shift of original N-terminal peptides from expected ratio of 1 | Non-equal amounts of proteins used for the two samples | Verify protein concentrations after the initial desalting/buffer exchange steps |
| | | Instability or insufficient amounts of one of the labeling reagents results in incomplete labeling of one of the samples | See Troubleshooting for incomplete protein labeling. Redo labeling |
| | | Impurities present in only one of the samples interfere with labeling | See Troubleshooting for incomplete protein labeling |
| 59 | The relative abundance between most of the heavy- and light-labeled proteins is not 1:1 (SILAC) | Error in mixing cells (to verify that this is the cause, check that the H/L ratios of the before-pullout sample (Step 24) differ from 1) | Count cells prior to mixing and adjust the number of cells harvested to ensure the cells from two populations are mixed in a 1:1 ratio by cell number. Be sure to use log-phase cells with 90% viability |
| | | Incomplete incorporation of SILAC heavy amino acids (if this is the cause, light partner peptides will be found on inspecting MS spectra of heavy proteome of SILAC starting stock (REAGENT SETUP, Box 1, Option B)) | Perform the metabolic labeling for at least seven generations to ensure complete incorporation of the heavy amino acids. Always use log-phase cells with 90% viability |
| | | Additional supplements added to the medium may contain amino acids | Always use dialyzed serum to prepare medium. Do not use any other media supplements that may contain free amino acids |
| | | Amino acids prepared in complete medium | Prepare amino acid solutions using unsupplemented medium |
| | | Serum proteins leftover | If possible, adapt cells to the minimum required amount of serum. Extensively wash cells with PBS before switching to serum-free medium |
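
Two of the Step 55 entries in the troubleshooting table (Mac-formatted line endings and a missing ‘peptide’ column header) can be checked before running TAILS-ANNOTATOR. The following Python sketch is not part of the published pipeline; the function names, the tab delimiter and the example rows are assumptions made for illustration, and only the required ‘peptide’ header comes from the table.

```python
import csv
import io


def normalize_newlines(text: str) -> str:
    """Convert Windows (CRLF) and classic Mac (CR) line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def check_peptide_table(text: str, delimiter: str = "\t") -> dict:
    """Report basic problems in a delimited peptide export.

    Assumption: the export is tab-delimited text with one header row.
    """
    cleaned = normalize_newlines(text)
    reader = csv.DictReader(io.StringIO(cleaned), delimiter=delimiter)
    header = reader.fieldnames or []
    # Count only rows that actually carry a peptide entry.
    rows = [row for row in reader if (row.get("peptide") or "").strip()]
    return {
        # A bare CR that is not part of CRLF indicates Mac-style endings.
        "had_cr_only_endings": "\r" in text.replace("\r\n", ""),
        "has_peptide_column": "peptide" in header,
        "n_peptides": len(rows),
    }


# Hypothetical export saved with Mac-formatted (CR-only) line endings.
example = (
    "spectrum\tpeptide\tprotein\r"
    "scan1\tAEEGSLR\tP12345\r"
    "scan2\tLDVSAR\tQ67890\r"
)
report = check_peptide_table(example)
print(report)
```

A report with `has_peptide_column` set to `False` or `n_peptides` equal to 0 points to the header or input problems listed for Step 55.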

● TIMING
Steps 1–6, Proteolysis by test protease of collected proteome: 3–24 h
Step 7A, Isotopic labeling and blocking of primary amines by dimethylation: 9.5–24 h
Step 7A(i–viii), Sample denaturation, reduction and alkylation: 1.5 h
Step 7A(ix–xvii), Isotopic dimethylation: 8–18 h
Step 7B, Isotopic labeling and blocking of primary amines by iTRAQ: 2.5–3 h
Step 7B(i–v), Sample denaturation, reduction and alkylation: 1.5 h
Step 7B(vi), Prepare working stocks of iTRAQ reagents in DMSO: 15 min
Step 7B(vii–xii), Whole-protein iTRAQ labeling: 30 min for four-plex, 1 h for eight-plex
Step 7C, Amine blocking of SILAC-labeled samples: 9–24 h
Steps 8–19, Labeling and blocking reagents cleanup: 6–24 h
Steps 20–24, Tryptic digestion of labeled samples: 18–24 h
Steps 25–26, Quality control for labeling and trypsin digestion: 4–12 h
Steps 27–33, Polymer negative selection: 6–12 h
Steps 34–40, Recovery of unbound blocked and labeled peptides: up to 3 h
Step 41A, Desalting of blocked and labeled peptide solution without SCX prefractionation: 3 h (strongly depends on SpeedVac speed)
Step 41B, Sample prefractionation by offline SCX high performance liquid chromatography: 3–6 h (depending on SpeedVac speed and number of samples/fractions)
Step 41B(i–viii), SCX HPLC prefractionation: 3–6 h (depending on SpeedVac and number of samples)
Step 41B(ix–xvi), Desalting of peptide solution after SCX prefractionation: ~1 h (depending on SpeedVac efficiency)
Step 42, Inline high performance liquid chromatography and mass spectrometry analysis: 2–3 h per fraction (depends on the gradient length)
Step 43, Preparing file directories for analysis: a few seconds
Step 44A, Orbitrap (suitable for dimethylation-TAILS and SILAC-TAILS): 15–30 min
Step 44B, QSTAR (suitable for iTRAQ-TAILS): 4–24 h
Step 45A, Mascot search for dimethylation-TAILS and SILAC-TAILS: 15 min–1 h
Step 45B, Mascot database search analysis for iTRAQ-TAILS: 1–12 h
Step 46A, X! Tandem search for dimethylation-TAILS and SILAC-TAILS: 15 min–1 h
Step 46B, iTRAQ-TAILS X! Tandem database search analysis: 1–12 h
Steps 47–66, Secondary validation and selection of high confidence peptides: 2 h–3 d (this stage is highly dependent on the manual verification required in Step 51)

TAILS is a highly streamlined procedure. Every condition has been carefully selected and optimized, and many of the steps are performed in a specific way to remain compatible with mass spectrometry at later stages. Do not vary the parameters and conditions until satisfactory results are obtained in your hands. Be sure to: monitor and adjust pH where indicated; cool samples to room temperature before alkylating cysteine with iodoacetamide; avoid using urea for sample resolubilization; in dimethylation, add NaBH3CN solution immediately after formaldehyde, and at the end of the reaction completely quench excess formaldehyde labeling reagent with ammonium bicarbonate before mixing heavy- and light-labeled samples and performing the tryptic digest; block polymer functional groups with ammonium bicarbonate after completion of binding of the internal tryptic and C-terminal peptides to the polymer in the negative selection step of blocked peptides; and perform all labeling steps in a fume hood. CRITICAL Do not use buffers or reagents with primary amines prior to completion of labeling and the last step of polymer negative selection.

ANTICIPATED RESULTS
Anywhere from 20 to thousands of cleavage sites for the tested protease and ~200 to thousands of original free and blocked N-terminal peptides will be identified. However, not all N-terminal peptides can be detected by MS/MS in TAILS or, indeed, by other N terminome procedures; they may be too long (particularly when lysine residues are blocked), too short (being lost in the desalting steps or having a redundant sequence) or too hydrophobic. Because each mature protein N terminus and each neo-N-terminal cleavage site is represented by a single peptide, there is only one chance to identify it, and sites carried by such undetectable peptides are technically unlikely to be characterized. Greater coverage can therefore be obtained by routinely using two or more digesting proteases, e.g., GluC or chymotrypsin in addition to trypsin.

The numbers of N-terminal peptides identified also depend on the size, complexity and characteristics of the tested proteome, the presence of substrates of the test protease and the activity of the protease in the assay or cell culture conditions. For example, applying TAILS to test GluC proteolytic processing using proteins secreted by mouse fibroblasts resulted in the high-confidence identification of 860 peptides, of which 64% (551) were GluC neo-N-terminal peptides; 12% (104) were mature or original N-terminal peptides; 19% (167) were peptides from other parts of the proteins; and 5% (42) were internal tryptic peptides (ref. 1). However, using a similar experimental setup to study MMP-2 proteolytic processing led to the identification of 2,096 peptides, of which ~40% (836) were mature or original N-terminal, ~5% (123) were peptides with cyclized N termini, 3% (57) internal tryptic peptides and 35% (739) potential MMP-2-generated neo-N-terminal peptides (with a protease/control ratio >3) (ref. 1). After hierarchical substrate winnowing, 288 peptides were defined as MMP-2 neo-N-terminal peptides (ref. 1).
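
The ratio cutoff mentioned above (protease/control ratio >3 for potential neo-N-terminal peptides) can be illustrated with a short Python sketch. The record type, the example peptides and ratios, and the reciprocal cutoff used to flag depleted peptides are assumptions made for illustration; only the >3 cutoff comes from the text, and the hierarchical substrate winnowing itself is not reproduced here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class QuantifiedPeptide:
    sequence: str  # peptide sequence (hypothetical examples below)
    ratio: float   # protease/control abundance ratio, non-log scale


def classify_by_ratio(peptide: QuantifiedPeptide, cutoff: float = 3.0) -> str:
    """Label a peptide from its protease/control ratio.

    Only the >3 cutoff is taken from the text; the reciprocal cutoff for
    depleted peptides is an assumption of this sketch.
    """
    if peptide.ratio > cutoff:
        return "candidate neo-N-terminal"
    if peptide.ratio < 1.0 / cutoff:
        return "depleted in protease sample"
    return "unchanged"


peptides = [
    QuantifiedPeptide("AEEGSLR", 1.05),
    QuantifiedPeptide("LDVSAR", 8.40),
    QuantifiedPeptide("TPEELR", 0.20),
    QuantifiedPeptide("GSPAVDR", 3.00),  # exactly at the cutoff, so not >3
]

labels = [classify_by_ratio(p) for p in peptides]
counts = Counter(labels)
for label, n in counts.items():
    print(f"{label}: {n} of {len(peptides)}")
```

Peptides flagged as candidates by such a cutoff would still need the validation and winnowing steps described in the PROCEDURE before being reported as cleavage products.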

The accuracy of quantification depends on the peptide ion intensity, which relates to abundance and signal-to-noise ratio of the peptide pairs. This can be as good as a few percent for SILAC (refs. 1, 3, 4 and 44). As determined by 1:1 SILAC label swap experiments, the typical SILAC peptide pair ratio distribution in pre-pullout TAILS samples (Step 54) shows a standard deviation of 0.2 using non-log ratios, which can therefore be read as a percentage (refs. 1, 3, 4 and 44). The application of the TAILS polymer N-terminal peptide enrichment step to the procedure increases the standard deviation by ~10 percentage points (i.e., to 0.3), which is still within the expected proteomic experimental error. For iTRAQ-TAILS, a similarly low standard deviation of 0.18 was calculated for ratio distributions of the enriched N-terminal peptides that were not affected by proteolysis and had an expected mean of 1.0 (refs. 3 and 4). Methods to experimentally derive an intensity-dependent quantification confidence factor are described in detail in auf dem Keller et al. (refs. 3 and 4).

TAILS success relies on labeling efficiency and thorough removal of internal tryptic peptides generated during the digestion step. After the polymer negative selection, only very few peptides with two tryptic termini should be in the sample (<5%). For quality control, database searches of samples after HPG-ALD polymer negative selection, without specifying any TAILS-related modifications (i.e., no N-terminal or lysine modification) and using trypsin as the cleaving enzyme, are expected to yield only a very small number of peptides. The same is true of database searches that include only variable modifications (for example, setting both heavy and light dimethylation on peptide N-terminal and lysine residues) and use trypsin as the cleaving enzyme. In such cases, most identified peptides are expected to be original N-terminal peptides. However, changing the digesting enzyme definition in such a search (with only variable modifications) to semi-trypsin (or semi-ArgC or non-specific) should lead to identification of many fully labeled semi-tryptic peptides with C-terminal arginine, in which both the N-terminal and the lysine residues have the same type of labeling (i.e., both light or both heavy).
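
The two quality-control figures discussed above, the fraction of peptides with two tryptic termini after negative selection (<5%) and the standard deviation of non-log ratios in a 1:1 experiment, can be sketched in Python as follows. The function names and example values are hypothetical, and the definition of a tryptic terminus is deliberately simplified (cleavage after K or R, ignoring the proline rule and protein termini); this is an illustration under those assumptions, not the analysis used by the authors.

```python
import statistics

TRYPTIC_RESIDUES = frozenset("KR")


def is_fully_tryptic(preceding_residue: str, sequence: str) -> bool:
    """True if both peptide termini arise from cleavage after K or R.

    Simplification: the proline rule and protein N/C termini are ignored.
    """
    return (
        preceding_residue in TRYPTIC_RESIDUES
        and sequence[-1] in TRYPTIC_RESIDUES
    )


def fraction_fully_tryptic(peptides) -> float:
    """peptides: iterable of (preceding_residue, sequence) pairs."""
    peptides = list(peptides)
    if not peptides:
        return 0.0
    hits = sum(is_fully_tryptic(prev, seq) for prev, seq in peptides)
    return hits / len(peptides)


def ratio_spread(ratios):
    """Mean and sample standard deviation of non-log heavy/light ratios."""
    return statistics.mean(ratios), statistics.stdev(ratios)


# Hypothetical post-pullout identifications: (residue before the peptide, sequence).
post_pullout = [
    ("M", "AEEGSLR"),
    ("G", "LDVSAR"),
    ("K", "TPEELR"),   # tryptic at both termini: should be rare after pullout
    ("A", "GSPAVDR"),
]
fraction = fraction_fully_tryptic(post_pullout)
passes_qc = fraction < 0.05  # <5% threshold from the text

# Hypothetical ratios from a 1:1 mixture; the expected mean is 1.0.
mean_ratio, sd_ratio = ratio_spread([0.8, 1.0, 1.2])
print(f"fully tryptic fraction: {fraction:.2f}, passes QC: {passes_qc}")
print(f"ratio mean: {mean_ratio:.2f}, standard deviation: {sd_ratio:.2f}")
```

In this toy input one of four peptides is fully tryptic, so the sample would fail the <5% check and point back to the polymer negative selection entries in the troubleshooting table.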

AUTHOR CONTRIBUTIONS O.K. developed dimethylation-TAILS and drafted the manuscript. A.D. participated in the development and optimization of dimethylation-TAILS and participated in the manuscript writing. A.P. developed iTRAQ-TAILS and revised the manuscript. U.a.d.K. participated in the development of analysis tools of iTRAQ-TAILS, wrote TAILS-ANNOTATOR and revised the manuscript. M.G. developed SILAC-TAILS and revised the manuscript. J.N.K. engineered the HPG-ALD polymer series and participated in methods development and manuscript writing. C.M.O. conceived the TAILS concept, projects and design, and was responsible for project supervision, data interpretation, manuscript writing and providing grant support.

COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests.

Published online at http://www.natureprotocols.com/. Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Kleifeld, O. et al. Isotopic labeling of terminal amines in complex samples identifies protein N termini and protease cleavage products. Nat. Biotechnol. 28, 281–288 (2010).

2. Kleifeld, O., Doucet, A., Kizhakkedathu, J.N. & Overall, C.M. System-wide proteomic identification of protease cleavage products by terminal amine isotopic labeling of substrates. Protoc. Exchange published online, doi:10.1038/nprot.2010.30 (2010).

3. Prudova, A., auf dem Keller, U., Butler, G.S. & Overall, C.M. Multiplex N terminome analysis of MMP-2 and MMP-9 substrate degradomes by iTRAQ-TAILS quantitative proteomics. Mol. Cell. Proteomics 9, 894–911 (2010).

4. auf dem Keller, U., Prudova, A., Gioia, M., Butler, G.S. & Overall, C.M. A statistics-based platform for quantitative N terminome analysis and identification of protease cleavage products. Mol. Cell. Proteomics 9, 912–927 (2010).

5. Hegde, R.S. & Bernstein, H.D. The surprising complexity of signal sequences. Trends Biochem. Sci. 31, 563–571 (2006).

6. McQuibban, G.A. et al. Inflammation dampened by gelatinase A cleavage of monocyte chemoattractant protein-3. Science 289, 1202–1206 (2000).

7. Overall, C.M. & Lopez-Otin, C. Strategies for MMP inhibition in cancer: innovations for the post-trial era. Nat. Rev. Cancer 2, 657–672 (2002).

8. Vergote, D. et al. Proteolytic processing of SDF-1alpha reveals a change in receptor specificity mediating HIV-associated neurodegeneration. Proc. Natl. Acad. Sci. USA 103, 19182–19187 (2006).

9. Overall, C.M. Molecular determinants of metalloproteinase substrate specificity: matrix metalloproteinase substrate binding domains, modules, and exosites. Mol. Biotechnol. 22, 51–86 (2002).

10. Meinnel, T., Serero, A. & Giglione, C. Impact of the N-terminal amino acid on targeted protein degradation. Biol. Chem. 387, 839–851 (2006).

11. Gevaert, K. et al. Exploring proteomes and analyzing protein processing by mass spectrometric identification of sorted N-terminal peptides. Nat. Biotechnol. 21, 566–569 (2003).

12. McDonald, L., Robertson, D.H., Hurst, J.L. & Beynon, R.J. Positional proteomics: selective recovery and analysis of N-terminal proteolytic peptides. Nat. Methods 2, 955–957 (2005).

13. Kuhn, K. et al. Isolation of N-terminal protein sequence tags from cyanogen bromide cleaved proteins as a novel approach to investigate hydrophobic proteins. J. Proteome Res. 2, 598–609 (2003).

14. McDonald, L. & Beynon, R.J. Positional proteomics: preparation of amino-terminal peptides as a strategy for proteome simplification and characterization. Nat. Protoc. 1, 1790–1798 (2006).

15. Mahrus, S. et al. Global sequencing of proteolytic cleavage sites in apoptosis by specific labeling of protein N termini. Cell 134, 866–876 (2008).

16. Van Damme, P., Arnesen, T. & Gevaert, K. Protein alpha-N-acetylation studied by N-terminomics. FEBS J. published online, doi:10.1111/j.1742-4658.2011.08230.x (7 July 2011).

17. Overall, C.M. & Blobel, C.P. In search of partners: linking extracellular proteases to substrates. Nat. Rev. Mol. Cell Biol. 8, 245–257 (2007).

18. Doucet, A., Butler, G.S., Rodriguez, D., Prudova, A. & Overall, C.M. Metadegradomics: toward in vivo quantitative degradomics of proteolytic post-translational modifications of the cancer proteome. Mol. Cell. Proteomics 7, 1925–1951 (2008).

19. Polevoda, B. & Sherman, F. Nα-terminal acetylation of eukaryotic proteins. J. Biol. Chem. 275, 36479–36482 (2000).

20. Puente, X.S., Sanchez, L.M., Overall, C.M. & Lopez-Otin, C. Human and mouse proteases: a comparative genomic approach. Nat. Rev. Genet. 4, 544–558 (2003).

21. Turk, B. Targeting proteases: successes, failures and future prospects. Nat. Rev. Drug Discov. 5, 785–799 (2006).

22. Lopez-Otin, C. & Overall, C.M. Protease degradomics: a new challenge for proteomics. Nat. Rev. Mol. Cell Biol. 3, 509–519 (2002).

23. Ji, C., Guo, N. & Li, L. Differential dimethyl labeling of N termini of peptides after guanidination for proteome analysis. J. Proteome Res. 4, 2099–2108 (2005).

24. Dormeyer, W., Mohammed, S., Breukelen, B., Krijgsveld, J. & Heck, A.J. Targeted analysis of protein termini. J. Proteome Res. 6, 4634–4645 (2007).

25. Schilling, O. & Overall, C.M. Proteomic discovery of protease substrates. Curr. Opin. Chem. Biol. 11, 36–45 (2007).

26. Timmer, J.C. et al. Profiling constitutive proteolytic events in vivo. Biochem. J. 407, 41–48 (2007).

27. Enoksson, M. et al. Identification of proteolytic cleavage sites by quantitative proteomics. J. Proteome Res. 6, 2850–2858 (2007).

28. Guo, L. et al. A proteomic approach for the identification of cell-surface proteins shed by metalloproteases. Mol. Cell. Proteomics 1, 30–36 (2002).

29. Dix, M.M., Simon, G.M. & Cravatt, B.F. Global mapping of the topography and magnitude of proteolytic events in apoptosis. Cell 134, 679–691 (2008).

30. Staes, A. et al. Improved recovery of proteome-informative, protein N-terminal peptides by combined fractional diagonal chromatography (COFRADIC). Proteomics 8, 1362–1370 (2008).

31. Van Damme, P. et al. Caspase-specific and nonspecific in vivo protein processing during Fas-induced apoptosis. Nat. Methods 2, 771–777 (2005).

32. Van Damme, P. et al. Analysis of protein processing by N-terminal proteomics reveals novel species-specific substrate determinants of granzyme B orthologs. Mol. Cell. Proteomics 8, 258–272 (2008).

33. Vande Walle, L. et al. Proteome-wide identification of HtrA2/Omi substrates. J. Proteome Res. 6, 1006–1015 (2007).

34. Wold, F. In vivo chemical modification of proteins (post-translational modification). Annu. Rev. Biochem. 50, 783–814 (1981).

35. Dean, R.A. & Overall, C.M. Proteomics discovery of metalloproteinase substrates in the cellular context by iTRAQ labeling reveals a diverse MMP-2 substrate degradome. Mol. Cell. Proteomics 6, 611–623 (2007).

36. Dean, R.A. et al. Identification of candidate angiogenic inhibitors processed by matrix metalloproteinase 2 (MMP-2) in cell based proteomic screens: disruption of vascular endothelial growth factor (VEGF)/heparin Affin regulatory peptide (Pleiotrophin) and VEGF/connective tissue growth factor angiogenic inhibitory complexes by MMP-2 proteolysis. Mol. Cell Biol. 27, 8454–8465 (2007).

37. Butler, G.S., Dean, R.A., Tam, E.M. & Overall, C.M. Pharmacoproteomics of a metalloproteinase hydroxamate inhibitor in breast cancer cells: dynamics of matrix metalloproteinase-14 (MT1-MMP) mediated membrane protein shedding. Mol. Cell Biol. 28, 4896–4914 (2008).

38. Hsu, J.L., Huang, S.Y., Chow, N.H. & Chen, S.H. Stable-isotope dimethyl labeling for quantitative proteomics. Anal. Chem. 75, 6843–6852 (2003).

39. Metz, B. et al. Identification of formaldehyde-induced modifications in proteins: reactions with model peptides. J. Biol. Chem. 279, 6235–6243 (2004).

40. Boersema, P.J., Raijmakers, R., Lemeer, S., Mohammed, S. & Heck, A.J. Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nat. Protoc. 4, 484–494 (2009).

41. Pichler, P. et al. Peptide labeling with isobaric tags yields higher identification rates using iTRAQ 4-plex compared to TMT 6-plex and iTRAQ 8-plex on LTQ Orbitrap. Anal. Chem. 82, 6549–6558 (2010).

42. Thompson, A.J. et al. Characterization of protein phosphorylation by mass spectrometry using immobilized metal ion affinity chromatography with on-resin beta-elimination and Michael addition. Anal. Chem. 75, 3232–3243 (2003).

43. Molina, H. et al. Temporal profiling of the adipocyte proteome during differentiation using a five-plex SILAC based strategy. J. Proteome Res. 8, 48–58 (2009).

44. Gioia, M., Foster, L.J. & Overall, C.M. Cell-based identification of natural substrates and cleavage sites for extracellular proteases by SILAC proteomics. Methods Mol. Biol. 539, 131–153 (2009).

45. Heinecke, N.L., Pratt, B.S., Vaisar, T. & Becker, L. PepC: proteomics software for identifying differentially expressed proteins based on spectral counting. Bioinformatics 26, 1574–1575 (2010).

46. Lu, P., Vogel, C., Wang, R., Yao, X. & Marcotte, E.M. Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat. Biotechnol. 25, 117–124 (2007).

47. Schilling, O. & Overall, C.M. Proteome-derived, database-searchable peptide libraries for identifying protease cleavage sites. Nat. Biotechnol. 26, 685–694 (2008).

48. Zhang, R. & Regnier, F.E. Minimizing resolution of isotopically coded peptides in comparative proteomics. J. Proteome Res. 1, 139–147 (2002).

49. Guo, K., Ji, C. & Li, L. Stable-isotope dimethylation labeling combined with LC-ESI MS for quantification of amine-containing metabolites in biological samples. Anal. Chem. 79, 8631–8638 (2007).

50. Higdon, R. & Kolker, E. A predictive model for identifying proteins by a single peptide match. Bioinformatics 23, 277–280 (2007).

51. Keller, A., Eng, J., Zhang, N., Li, X.J. & Aebersold, R. A uniform proteomics MS/MS analysis platform using open XML file formats. Mol. Syst. Biol. 1, 2005.0017 (2005).

52. Perkins, D.N., Pappin, D.J., Creasy, D.M. & Cottrell, J.S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567 (1999).

53. Craig, R. & Beavis, R.C. TANDEM: matching proteins with tandem mass spectra. Bioinformatics 20, 1466–1467 (2004).

54. Elias, J.E., Haas, W., Faherty, B.K. & Gygi, S.P. Comparative evaluation of mass spectrometry platforms used in large-scale proteomics investigations. Nat. Methods 2, 667–675 (2005).

55. Searle, B.C., Turner, M. & Nesvizhskii, A.I. Improving sensitivity by probabilistically combining results from multiple MS/MS search methodologies. J. Proteome Res. 7, 245–253 (2008).

56. Shteynberg, D. et al. iProphet: Multi-level integrative analysis of shotgun proteomic data improves peptide and protein identification rates and error estimates. Mol. Cell. Proteomics published online, doi:10.1074/mcp.M111.007690 (2011).

57. Elias, J.E. & Gygi, S.P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 4, 207–214 (2007).

58. Choi, H. & Nesvizhskii, A.I. Semisupervised model-based validation of peptide identifications in mass spectrometry-based proteomics. J. Proteome Res. 7, 254–265 (2008).

59. Oliveros, J.C. VENNY. An interactive tool for comparing lists with Venn Diagrams. http://bioinfogp.cnb.csic.es/tools/venny/index.html (2007).

60. Butler, G.S. & Overall, C.M. Proteomic identification of multitasking proteins in unexpected locations complicates drug targeting. Nat. Rev. Drug Discov. 8, 935–948 (2009).

61. Butler, G.S. & Overall, C.M. Updated biological roles for matrix metalloproteinases and new ‘intracellular’ substrates revealed by degradomics. Biochemistry 48, 10830–10845 (2009).

62. Lange, P.F. & Overall, C.M. TopFIND, a knowledgebase linking protein termini with function. Nat. Methods 8, 703–704 (2011).

63. Chevallet, M., Luche, S. & Rabilloud, T. Silver staining of proteins in polyacrylamide gels. Nat. Protoc. 1, 1852–1858 (2006).

64. Rappsilber, J., Ishihama, Y. & Mann, M. Stop and go extraction tips for matrix-assisted laser desorption/ionization, nanoelectrospray, and LC/MS sample pretreatment in proteomics. Anal. Chem. 75, 663–670 (2003).

65. Rappsilber, J., Mann, M. & Ishihama, Y. Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat. Protoc. 2, 1896–1906 (2007).

66. Boja, E.S. & Fales, H.M. Overalkylation of a protein digest with iodoacetamide. Anal. Chem. 73, 3576–3582 (2001).

67. Nielsen, M.L. et al. Iodoacetamide-induced artifact mimics ubiquitination in mass spectrometry. Nat. Methods 5, 459–460 (2008).

68. Gidley, M.J. & Sanders, J.K. Reductive methylation of proteins with sodium cyanoborohydride. Identification, suppression and possible uses of N-cyanomethyl by-products. Biochem. J. 203, 331–334 (1982).

69. Jentoft, N. & Dearborn, D.G. Labeling of proteins by reductive methylation using sodium cyanoborohydride. J. Biol. Chem. 254, 4359–4365 (1979).

70. Hwang, S.I. et al. Direct cancer tissue proteomics: a method to identify candidate cancer biomarkers from formalin-fixed paraffin-embedded archival tissues. Oncogene 26, 65–76 (2007).

71. Fu, Q. & Li, L. De novo sequencing of neuropeptides using reductive isotopic methylation and investigation of ESI QTOF MS/MS fragmentation pattern of neuropeptides with N-terminal dimethylation. Anal. Chem. 77, 7783–7795 (2005).

72. Ding, Y., Choi, H. & Nesvizhskii, A.I. Adaptive discriminant function analysis and reranking of MS/MS database search results for improved peptide identification in shotgun proteomics. J. Proteome Res. 7, 4878–4889 (2008).

73. Wessa, P. Free Statistics Software, Office for Research Development and Education, version 1.1.23-r6, http://www.wessa.net/ (2010).

74. Shevchenko, A., Tomas, H., Havlis, J., Olsen, J.V. & Mann, M. In-gel digestion for mass spectrometric characterization of proteins and proteomes. Nat. Protoc. 1, 2856–2860 (2006).

75. Ong, S.E. & Mann, M. A practical recipe for stable isotope labeling by amino acids in cell culture (SILAC). Nat. Protoc. 1, 2650–2660 (2006).

76. Ishihama, Y., Rappsilber, J. & Mann, M. Modular stop and go extraction tips with stacked disks for parallel and multidimensional peptide fractionation in proteomics. J. Proteome Res. 5, 988–994 (2006).

