USP Electronic Research Repository

An enhanced genetic algorithm for Ab initio protein structure prediction

Rashid, Mahmood and Khatib, F. and Hoque, T. and Sattar, A. (2015) An enhanced genetic algorithm for Ab initio protein structure prediction. IEEE Transactions on Evolutionary Computation, PP (99). NA-NA. ISSN 1089-778X

PDF - Submitted Version (1MB). Restricted to Repository staff only.

Abstract

In vitro methods for protein structure determination are time-consuming, costly, and failure-prone. Because of these drawbacks, alternative computer-based predictive methods have emerged. Predicting a protein’s three-dimensional structure from only its amino acid sequence, also known as ab initio protein structure prediction, is computationally demanding because the search space is astronomically large and the energy models are extremely complex. Predictive methods have achieved some success, but only for small proteins (around 100 amino acids); thus, developing efficient algorithms, reducing the search space, and designing effective search guidance heuristics are necessary to study large proteins. An on-lattice model can be a better ground for rapidly developing and measuring the performance of a new algorithm, and hence we consider this model for larger proteins (>150 amino acids) to enhance the genetic algorithm framework. In this paper, we formulate protein structure prediction as a combinatorial optimization problem that uses three-dimensional face-centered-cubic lattice coordinates to reduce the search space and a hydrophobic-polar energy model to guide the search. The whole optimization process is controlled by an enhanced genetic algorithm framework with four enhancements: i) an exhaustive generation approach to diversify the search; ii) a novel hydrophobic core-directed macro-mutation operator to intensify the search; iii) a per-generation duplication elimination strategy to prevent early convergence; and iv) a random-walk technique to recover from stagnation. On a set of standard benchmark proteins, our algorithm significantly outperforms state-of-the-art algorithms. We also show experimentally that our algorithm is robust, producing very similar results across different parameter settings.
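
To make the problem formulation concrete, the following is a minimal sketch of how a conformation on the face-centered-cubic (FCC) lattice might be scored under the hydrophobic-polar (HP) model. It is not the authors' implementation. It assumes the common conventions that FCC neighbors differ by a vector with two coordinates of ±1 and one of 0 (12 neighbors per site), and that each pair of non-consecutive H residues on neighboring sites contributes -1 to the energy. The names `FCC_MOVES`, `decode`, `is_self_avoiding`, and `hp_energy` are hypothetical and chosen for illustration only.

```python
from itertools import combinations

# The 12 FCC unit moves: exactly two coordinates are +/-1 and one is 0.
FCC_MOVES = [
    (dx, dy, dz)
    for dx in (-1, 0, 1)
    for dy in (-1, 0, 1)
    for dz in (-1, 0, 1)
    if abs(dx) + abs(dy) + abs(dz) == 2
]


def decode(moves):
    """Turn a list of indices into FCC_MOVES into lattice coordinates.

    The first residue is placed at the origin (an illustrative assumption).
    """
    coords = [(0, 0, 0)]
    for m in moves:
        x, y, z = coords[-1]
        dx, dy, dz = FCC_MOVES[m]
        coords.append((x + dx, y + dy, z + dz))
    return coords


def is_self_avoiding(coords):
    """A valid conformation never places two residues on the same site."""
    return len(set(coords)) == len(coords)


def hp_energy(sequence, coords):
    """Return -1 per pair of non-consecutive H residues on neighboring sites."""
    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i > 1 and sequence[i] == "H" and sequence[j] == "H":
            # Squared distance 2 means the two sites are FCC neighbors.
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 == 2:
                energy -= 1
    return energy


# Three residues folded into a triangle: residues 0 and 2 are in contact.
triangle = [(0, 0, 0), (1, 1, 0), (1, 0, 1)]
print(hp_energy("HHH", triangle))  # -1
```

A search procedure such as a genetic algorithm would then look for the self-avoiding move sequence that minimizes this energy; the pairwise loop here is quadratic in sequence length and is kept simple for readability rather than speed.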

Item Type: Journal Article
Additional Information: Bioinformatics, Computational Biology, Genetic Algorithms, Artificial Intelligence
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Q Science > QH Natural history > QH301 Biology
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
Divisions: Faculty of Science, Technology and Environment (FSTE) > School of Computing, Information and Mathematical Sciences
Depositing User: Mahmood Rashid
Date Deposited: 07 Apr 2016 03:31
Last Modified: 27 May 2016 01:48
URI: https://repository.usp.ac.fj/id/eprint/8790
