Preface
The definition of a hypothetical protein varies among researchers, depending on their perspective. The definitions adopted by our in-house HPAS (Hypothetical Protein Analysis System) software include:
- A protein computationally predicted by translating a coding sequence (CDS) or open reading frame (ORF) from a nucleotide sequence.
- A protein or peptide classified under the 'Domain of Unknown Function (DUF)' category in the Pfam database (DUFs: families in search of function).
- A protein or peptide lacking experimental proof of its in vivo existence (expression in a living organism).
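To make the first definition concrete, the following is a minimal sketch of how a peptide can be predicted from a nucleotide sequence. It is not the method used by HPAS, which this preface does not describe. The names `CODON_TABLE`, `find_orfs` and `min_len` are hypothetical, and the sketch assumes the standard genetic code, forward-strand reading frames only, and ORFs that begin with ATG and end with a stop codon.

```python
# Standard genetic code, built from the compact TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def find_orfs(seq, min_len=2):
    """Return (frame, peptide) pairs for complete ORFs on the forward strand.

    Hypothetical helper: an ORF here starts at ATG and ends at the first
    in-frame stop codon. ORFs without a stop codon are ignored.
    """
    seq = seq.upper()
    peptides = []
    for frame in range(3):
        current = None  # amino acids of the ORF being read, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon, "X")  # X marks an ambiguous codon
            if current is None:
                if codon == "ATG":
                    current = ["M"]
            elif aa == "*":
                if len(current) >= min_len:
                    peptides.append((frame, "".join(current)))
                current = None
            else:
                current.append(aa)
    return peptides


print(find_orfs("CCATGAAATTTTAGGG"))
```

A peptide predicted this way has no experimental support by itself, which is why such a translation counts as a hypothetical protein under the definitions above. Real sequencing reads are often truncated, so a practical tool would also need to handle ORFs that lack a start or stop codon.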
High-throughput sequencing technologies have produced large amounts of nucleotide reads (DNA, mRNA and small RNA). Molecular biologists apply a variety of CDS and ORF prediction tools to study their transcriptomes at the protein level. Because of the nature of sequencing technology, the majority of the resulting sequences are not full length, hence the term "peptide fragment" for such incomplete protein sequences. The sheer number of these peptide fragments requires large-scale analysis, which motivated the development of the HPAS software.
HPAS is an integrated platform designed to analyze, annotate and characterize proteins and peptides through computational automation and interactive output, helping molecular biologists understand their protein datasets.
Loke Kok Keong
HPAS Developer
[email protected]