An efficient computational intelligence technique for classification of protein sequences

Many artificial intelligence techniques have been developed to extract meaningful information from the constantly increasing volume of data. Accurately annotating an unknown protein by classifying its sequence into an existing superfamily is considered a critical and challenging task in bioinformatics and computational biology.

Bibliographic Details
Main Authors: Iqbal, M.J., Faye, I., Said, A.M., Samir, B.B.
Format: Conference or Workshop Item
Published: Institute of Electrical and Electronics Engineers Inc. 2014
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84938777193&doi=10.1109%2fICCOINS.2014.6868352&partnerID=40&md5=2676071139cc5d5754f5f9bfa3ea6f99
http://eprints.utp.edu.my/31229/
id my.utp.eprints.31229
record_format eprints
spelling my.utp.eprints.31229 2022-03-25T09:03:33Z An efficient computational intelligence technique for classification of protein sequences Iqbal, M.J. Faye, I. Said, A.M. Samir, B.B. Institute of Electrical and Electronics Engineers Inc. 2014 Conference or Workshop Item NonPeerReviewed Iqbal, M.J. and Faye, I. and Said, A.M. and Samir, B.B. (2014) An efficient computational intelligence technique for classification of protein sequences. In: UNSPECIFIED. http://eprints.utp.edu.my/31229/
institution Universiti Teknologi Petronas
building UTP Resource Centre
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Petronas
content_source UTP Institutional Repository
url_provider http://eprints.utp.edu.my/
description Many artificial intelligence techniques have been developed to extract meaningful information from the constantly increasing volume of data. Accurately annotating an unknown protein by classifying its sequence into an existing superfamily is considered a critical and challenging task in bioinformatics and computational biology. Such classification helps in analyzing and modeling unknown proteins to determine their structure and function. In this paper, the proposed framework uses a frequency-based feature encoding technique to represent the amino acids of a protein's primary sequence: the technique considers the occurrence frequency of each amino acid in a sequence. Popular classification algorithms (decision tree, naive Bayes, neural network, random forest and support vector machine) were employed to evaluate the effectiveness of the encoding method. The results indicate that the decision tree classifier performs significantly better than the others in terms of classification accuracy, specificity, sensitivity and F-measure, among other metrics. A classification accuracy of 88.7% was achieved on yeast protein sequence data taken from the UniProtKB database. © 2014 IEEE.
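The frequency-based encoding mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that the occurrence frequency of each amino acid is considered, so the 20-letter alphabet ordering, the normalization by sequence length, the function name `frequency_encode` and the toy sequence are all assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
# The ordering is an arbitrary choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_encode(sequence, normalize=True):
    """Map a protein sequence to a 20-element occurrence-frequency vector.

    Characters outside the 20 standard amino acids are ignored
    (an assumption; the record does not say how they are handled).
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        # Empty sequence or no recognized residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    if normalize:
        # Relative frequency of each amino acid among recognized residues.
        return [counts[aa] / total for aa in AMINO_ACIDS]
    # Raw occurrence counts.
    return [float(counts[aa]) for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not taken from the yeast data set.
features = frequency_encode("MKVLAAGIVK")
```

Each sequence, whatever its length, becomes a fixed-length vector of 20 values, which is the kind of input that classifiers such as a decision tree or a support vector machine expect. Whether the original framework used raw counts or relative frequencies is not stated in the record; `normalize=False` returns raw counts.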