Publication Details

SoluProt: prediction of soluble protein expression in Escherichia coli

HON Jiří, MARUŠIAK Martin, MARTÍNEK Tomáš, KUNKA Antonín, ZENDULKA Jaroslav, BEDNÁŘ David and DAMBORSKÝ Jiří. SoluProt: prediction of soluble protein expression in Escherichia coli. Bioinformatics, vol. 37, no. 1, 2021, pp. 23-28. ISSN 1367-4803.
Czech title
SoluProt: predikce rozpustné exprese proteinů v Escherichia coli
Type
journal article
Language
English
Authors
Hon Jiří, Ing., Ph.D. (DIFS FIT BUT)
Marušiak Martin, Ing. (FIT BUT)
Martínek Tomáš, doc. Ing., Ph.D. (DCSY FIT BUT)
Kunka Antonín, Mgr., Ph.D. (LL)
Zendulka Jaroslav, doc. Ing., CSc. (DIFS FIT BUT)
Bednář David, Mgr. (LL)
Damborský Jiří, prof. Mgr., Dr. (LL)
Keywords

protein solubility, machine learning

Abstract

Motivation: Poor protein solubility hinders the production of many therapeutic and industrially useful proteins. Experimental efforts to increase solubility are plagued by low success rates and often reduce biological activity. Computational prediction of protein expressibility and solubility in Escherichia coli using only sequence information could reduce the cost of experimental studies by enabling prioritisation of highly soluble proteins.
Results: A new tool for sequence-based prediction of soluble protein expression in Escherichia coli, SoluProt, was created using the gradient boosting machine technique with the TargetTrack database as a training set. When evaluated against a balanced independent test set derived from the NESG database, SoluProt's accuracy of 58.4% and AUC of 0.60 exceeded those of a suite of alternative solubility prediction tools. There is also evidence that it could significantly increase the success rate of experimental protein studies. SoluProt is freely available as a standalone program and a user-friendly webserver at https://loschmidt.chemi.muni.cz/soluprot/.

Published
2021
Pages
23-28
Journal
Bioinformatics, vol. 37, no. 1, ISSN 1367-4803
Publisher
Oxford University Press
DOI
10.1093/bioinformatics/btaa1102
UT WoS
000649437800004
BibTeX
@ARTICLE{FITPUB12368,
   author = "Ji\v{r}\'{i} Hon and Martin Maru\v{s}iak and Tom\'{a}\v{s} Mart\'{i}nek and Anton\'{i}n Kunka and Jaroslav Zendulka and David Bedn\'{a}\v{r} and Ji\v{r}\'{i} Damborsk\'{y}",
   title = "SoluProt: prediction of soluble protein expression in Escherichia coli",
   pages = "23--28",
   journal = "Bioinformatics",
   volume = 37,
   number = 1,
   year = 2021,
   ISSN = "1367-4803",
   doi = "10.1093/bioinformatics/btaa1102",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12368"
}