Protein Secondary Structure Prediction Using Super-chains in PDB

Faruk Berat Akcesme; Mehmet Can

doi:10.21533/scjournal.v5i1.101

Abstract

This study examines whether the protein structures currently held in the Protein Data Bank (PDB) are complete enough to support secondary structure prediction for proteins of unknown structure. To address this question, several batches of 1000 protein chains each are drawn at random from the PDB. Each chain in a batch is taken as a query, and the PDB chains that contain the query as a subsequence are identified and called super-chains. The secondary structure of the query is then predicted from the corresponding subsequences of the secondary structure sequences of these super-chains. The procedure is also repeated for the well-known datasets CB513, FC699, 640, 25PDB, SCOP, and 1189. The sequences of around 18% of the proteins in a batch are found in other chains of the PDB, and the average prediction accuracy of the method is 80%. An unknown protein therefore has a chance of roughly 20% of having a super-chain in the PDB, and when it does, its secondary structure can be predicted with around 80% accuracy.
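The super-chain lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several points are assumptions made only for illustration: "subsequence" is read as a contiguous segment, multiple super-chains are combined by a per-residue majority vote (the abstract does not say how they are combined), accuracy is measured as the fraction of matching residues, and the names `find_super_chains`, `predict_from_super_chains`, `q3_accuracy`, and `toy_library` are hypothetical. The sequences and structure strings are invented toy data, not PDB entries.

```python
from collections import Counter

# Toy library (invented data): chain id -> (amino acid sequence, secondary structure string).
# H = helix, E = strand, C = coil; both strings of a chain have the same length.
toy_library = {
    "A": ("MKTAYIAKQRQISFVK", "CCHHHHHHHHHCEEEC"),
    "B": ("GGMKTAYIAKQRGG", "CCCCHHHHHHHCCC"),
    "C": ("PPPPPPPP", "CCCCCCCC"),
    "D": ("TAYIAKQRLL", "HHHHHHHCCC"),
}


def find_super_chains(query, library, exclude_id=None):
    """Return (chain_id, offset) for every occurrence of query inside a library chain.

    Assumption: a super-chain contains the query as a contiguous segment.
    exclude_id lets the query's own chain be skipped.
    """
    hits = []
    for chain_id, (sequence, _structure) in library.items():
        if chain_id == exclude_id:
            continue
        start = sequence.find(query)
        while start != -1:
            hits.append((chain_id, start))
            start = sequence.find(query, start + 1)
    return hits


def predict_from_super_chains(query, library, exclude_id=None):
    """Predict the query's secondary structure, or return None if no super-chain exists.

    Assumption: the aligned structure segments are combined by majority vote per residue.
    """
    hits = find_super_chains(query, library, exclude_id)
    if not hits:
        return None
    segments = [library[cid][1][off:off + len(query)] for cid, off in hits]
    # zip(*segments) yields, for each residue position, the states seen in all super-chains.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*segments))


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    prediction = predict_from_super_chains("TAYIAKQR", toy_library)
    print(prediction)
    print(q3_accuracy(prediction, "HHHHHHCC"))
```

In this toy run the query occurs in chains A, B, and D, so three structure segments vote on each residue. A query with no super-chain returns `None`, which corresponds to the large share of proteins that the abstract reports as having no super-chain in the PDB.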

Keywords

Protein Secondary Structure Prediction; PDB; Super chains

This work is licensed under a Creative Commons Attribution 4.0 International License

Southeast Europe Journal of Soft Computing
