Bioinformatics Exercises
  A. Questions. Sequence Analysis
(For those who have taken the courses and want to submit answers for evaluation, please read the instructions linked on the table of contents page. Most of the questions below have straightforward answers in the material of the corresponding courses, although a few require some further study that still builds on the course material.)

  1. What is the difference between sequence homology and sequence similarity?
  2. Provide an example of an orthologous and a paralogous relationship, using a protein family of your choice.
  3. In which circumstances is global alignment used?
  4. Why is local alignment a better method for finding similarity between protein sequences?
  5. What can the dotplot method do that other alignment methods cannot?
  6. Why are scoring matrices necessary in protein sequence alignment?
  7. Each family of scoring matrices, such as BLOSUM, contains more than one matrix. Why?
  8. Why do BLOSUM matrices serve as better weight matrices than PAM matrices?
  9. The affine gap penalty is the most commonly used type of gap penalty. Why has it become so?
  10. Provide a biological example in which TBLASTX should be used.
  11. What are the major reasons that BLAST is faster than Smith-Waterman alignment?
  12. However, we are still using Smith-Waterman alignment. What is the most important reason for this? In other words, what is the weak point of BLAST compared to Smith-Waterman alignment?
  13. Explain the meaning of the P-value as you would to your biologist colleagues.
  14. What is the difference between P-value and E-value?
  15. Why do we need to remove low-complexity regions from query sequences before a sequence database search?
  16. It is recommended that PSI-BLAST be used as the default method for protein sequence database searches. Why?
  17. PSI-BLAST may be the most powerful tool for sequence similarity search. However, there is one obvious pitfall that you have to guard against when using PSI-BLAST. What is it, and what would you do to avoid it?
  18. What is the computational complexity of multiple alignment using dynamic programming?
  19. What is the computational complexity of multiple alignment using a progressive alignment method such as ClustalW?
  20. Briefly explain the importance of the order and weight of sequences in progressive alignment.
  21. Between the sequence and structure of two homologous proteins, which one is more conserved?
  22. As biologists, we are often reluctant to use software based on probabilistic algorithms. Why is that?
  23. Briefly describe the main reason for the strength of hidden Markov models in modeling biological sequences.