※ GPS INTRODUCTION:
Protein phosphorylation, catalyzed by protein kinases (PKs), is one of the most important post-translational modifications (PTMs) and is involved in regulating almost all biological processes.
Here, we report an updated server, Group-based Prediction System (GPS) 6.0, for the prediction of PK-specific phosphorylation sites (p-sites) in eukaryotes. First, we pre-trained a general model using 490,762 non-redundant p-sites in 71,407 proteins. Then, transfer learning was conducted to obtain 577 PK-specific predictors at the group, family and single PK levels, using a well-curated data set of 30,043 known site-specific kinase-substrate relations (ssKSRs) in 7,041 proteins. Ten types of sequence features were extracted and integrated by three machine learning algorithms: penalized logistic regression (PLR), deep neural network (DNN), and Light Gradient Boosting Machine (LightGBM). Using a newly collected data set of 326 ssKSRs, we compared GPS 6.0 with other existing tools and found that it achieved much higher accuracy for a number of well-studied PKs. By incorporating evolutionary information, GPS 6.0 can hierarchically predict PK-specific p-sites for 44,046 PKs in 185 species. Users can submit one or multiple protein sequences in FASTA format, and the output is shown as a tabular list. Besides the basic statistics, we also integrated knowledge from 22 public resources to annotate the prediction results, including experimental evidence, physical interactions, sequence logos, and p-sites in sequences and 3D structures.
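As an illustration of the FASTA input described above, the following Python sketch parses FASTA text and lists candidate phosphorylatable residues with their flanking sequence. It is not the GPS 6.0 code: the function names `parse_fasta` and `candidate_sites`, the restriction to S/T/Y residues, the default flank of 7 residues, and the `-` padding character are all assumptions made for this example.

```python
from typing import Dict, Iterator, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}.

    The identifier is the first whitespace-delimited token after '>'.
    """
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


def candidate_sites(
    sequence: str, flank: int = 7, residues: str = "STY"
) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, residue, peptide window) for each candidate.

    The window holds `flank` residues on each side of the site and is
    padded with '-' at the protein termini (an assumed convention).
    """
    for i, aa in enumerate(sequence):
        if aa in residues:
            left = sequence[max(0, i - flank):i].rjust(flank, "-")
            right = sequence[i + 1:i + 1 + flank].ljust(flank, "-")
            yield i + 1, aa, left + aa + right


if __name__ == "__main__":
    demo = ">P1 hypothetical demo protein\nMASTK\nAY\n"
    for name, seq in parse_fasta(demo).items():
        for pos, aa, window in candidate_sites(seq, flank=2):
            print(name, pos, aa, window)
```

Each window produced this way is the kind of fixed-length peptide from which sequence features could be computed; the actual features and window length used by GPS 6.0 are not specified in this introduction.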
For help with GPS 6.0 and a tutorial, please refer to the USER GUIDE page.
For the source code of GPS 6.0, please visit the GitHub page.
▼ Example