Goal: Learn to combine tools encountered in the previous COOs to answer specific biological questions.
Remarks:
Although a couple of hints are given at the bottom of this page, the questions are less straightforward than those in the previous COOs, and answering them will require some trial and error. Do not hesitate to ask your assistant if you feel you are not making any progress (but please read the hints first!). Note that you can answer question 2 without fully answering question 1, so you can continue working if you get stuck while your assistant is temporarily unavailable.
Format:
You have to hand in your answers to the questions of this COO on paper. The answers should be approximately one A4 page. You can include figures or (parts of) alignments, but please do not copy-paste complete pages of BLAST output; a summary like "a blastp search of protein x against the nr database gives hits in species y and z" will do. If you are using web servers (e.g. for a BLAST search or a clustalw alignment), don't forget to write down the settings you are using if they differ from the default settings of the page. Make the report a story, not just a list of answers. You can find a more elaborate guide here. You can hand in your report to your COO assistant, or bring it to Can Kesmir (Kruyt Z527) or Jos Boekhorst (Kruyt Z532). The deadline is March 29th. You need to hand in a reasonable report to earn the bonus point!
HIV is the principal cause of acquired immunodeficiency syndrome (AIDS). HIV/AIDS remains one of the most severe health concerns, as it affects the lives of more than 33 million people around the world.
The background article by Ho & Bieniasz gives a good overview of HIV/AIDS.
1.
In the article by Ho & Bieniasz you can read about the origins of HIV. The virus was transmitted to humans from a non-human primate. However, when (and from which organism) HIV was transmitted to humans is still under investigation. Other immunodeficiency viruses related to HIV have been found in many species.
In the first assignment you will try to confirm the origins of HIV. Use the protein with accession number Q79665.3 as the start of your analysis and try to identify this primate. Did HIV cross the species boundary only a single time, or can you find evidence for multiple events? If you have time, repeat the same analysis starting from a different protein and see if you get the same results.
Another relevant paper on the origins of HIV is here.
|
2.
Mutations in HIV proteins allow the virus to escape the human immune response. What can the effect of an immune escape be on viral load?
HIV has 9 genes (explained in more detail here). The genetic makeup of a patient (especially the major histocompatibility molecules) determines which fragments of HIV proteins evoke an immune response. The Pol gene is one of the most immunogenic genes of HIV. Identify the regions in the Pol gene where immune escapes are less likely to occur. Can you find supporting evidence for your results in the function and structure of the regions you identified? If a patient has a strong immune response targeting one of these regions, how fast would this patient progress to AIDS? Use the protein with accession number CAB86375 as the start of your analysis.
If clustalw is taking a very long time, try to use fewer sequences or a different server.
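If you would rather inspect a finished alignment with a script than by eye, the following minimal Python sketch may help. It assumes the alignment has been saved as aligned FASTA text (other output formats would need converting first). The function names `read_aligned_fasta`, `column_identity` and `windows_at_least`, as well as the window size and threshold, are illustrative choices and not part of the course material. The script only reports how variable each alignment column is; interpreting the result is still up to you.

```python
from collections import Counter


def read_aligned_fasta(text):
    """Parse aligned FASTA text into a {name: sequence} dict."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the identifier
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def column_identity(seqs):
    """Per column: fraction of sequences that share the most common residue.

    Gaps ('-') are never counted as the shared residue, so gappy columns
    score low. This is a simple identity measure (an assumption of this
    sketch), not a substitution-matrix-based conservation score.
    """
    rows = list(seqs.values())
    if not rows:
        raise ValueError("no sequences found")
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("sequences must be aligned to equal length")
    scores = []
    for i in range(length):
        counts = Counter(r[i] for r in rows if r[i] != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(rows))
    return scores


def windows_at_least(scores, window=10, threshold=0.9):
    """Return merged (start, end) column ranges, 1-based and inclusive,
    covered by windows whose mean score is at least `threshold`."""
    regions = []
    for start in range(0, len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            s, e = start + 1, start + window
            if regions and s <= regions[-1][1] + 1:
                regions[-1] = (regions[-1][0], e)  # extend previous region
            else:
                regions.append((s, e))
    return regions


if __name__ == "__main__":
    # Toy alignment, made up purely for illustration.
    toy = ">a\nMKVLAAGT\n>b\nMKVLSCGT\n>c\nMKVL-DGT\n"
    scores = column_identity(read_aligned_fasta(toy))
    print(windows_at_least(scores, window=4, threshold=1.0))
```

The reported positions are alignment columns, not residue numbers of any single protein, so you need to map them back to your query sequence before looking up function or structure.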
Many of the hints of the previous COOs could prove useful: COO1, COO2, COO3
- You can get a fasta file with the amino acid sequence from a selection of BLAST hits by selecting the checkbox in front of your hits of interest in the alignment part of NCBI BLAST output, followed by clicking the "get selected sequences" link at the bottom of the page. On the next page, change the Display pull-down menu to "fasta" and select "Text" or "File" from the Send-to menu.
- If you want to make a phylogenetic tree of a protein with a lot of blast hits, don't simply take the top hits, but consider including hits with varying degrees of similarity.
- If you find too many BLAST hits for HIV-1 proteins in the nr database, try changing the database to swissprot. The virus name (e.g. HIV, SIV) is already visible in the swissprot identifier. To find out the host of a virus you have to click on the gene identifier and read the NCBI entry. Often there is a "source" item under FEATURES which states which strain of the virus you found and (if known) what the host is.
- You can select how BLAST output is sorted by clicking on the different headers of the summary table. Sorting by "Query_coverage" can be helpful in getting rid of partial hits and synthetic constructs (which are often very short).
- On the query page of NCBI BLAST, you can use Entrez queries to limit the results; the query "feline"[descr] will mostly give you hits from the feline immunodeficiency virus, while all[filter] NOT "Human immunodeficiency virus 1"[descr] will, for example, exclude BLAST hits from HIV-1. You can also change the hits presented in the BLAST output through the link "formatting options" at the top of the results page.
- When you are selecting BLAST hits to put into a multiple sequence alignment, try to select proteins which are similar to the BLAST query over their entire length.
- Making multiple sequence alignments of long proteins like Pol can take quite a while, so it is probably a good idea to limit the number of sequences to align (max 15).
- Jalview is a nice tool for identifying specific positions and conserved regions in multiple sequence alignments; you can find a link to Jalview near the top of the EBI clustalw output.
- An easy way to include pictures in your report is to press the print-screen button, followed by pasting in a program like MS Word or Open Office Writer.
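The hints on choosing BLAST hits for an alignment (varying similarity, good query coverage, at most 15 sequences) can also be applied with a short script. The sketch below assumes you downloaded the hits as a tab-separated hit table with the standard 12 BLAST tabular columns (query, subject, % identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, e-value, bit score) and that you know the length of your query protein. The names `parse_hit_table` and `select_spread` and the 80% coverage cut-off are hypothetical choices made for this example. Coverage is computed per table row, which is a simplification when a hit has several aligned segments.

```python
def parse_hit_table(text):
    """Read a tab-separated BLAST hit table, skipping '#' comment lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append({
            "subject": f[1],
            "pident": float(f[2]),  # percent identity
            "qstart": int(f[6]),
            "qend": int(f[7]),
        })
    return hits


def select_spread(hits, query_length, min_coverage=0.8, max_hits=15):
    """Keep hits covering most of the query, then pick up to `max_hits`
    of them spread evenly from the most to the least similar."""
    best = {}
    for h in hits:
        coverage = (h["qend"] - h["qstart"] + 1) / query_length
        if coverage < min_coverage:
            continue  # partial hit, e.g. a fragment or short construct
        seen = best.get(h["subject"])
        if seen is None or h["pident"] > seen["pident"]:
            best[h["subject"]] = h  # one row per subject sequence
    ranked = sorted(best.values(), key=lambda h: h["pident"], reverse=True)
    if len(ranked) <= max_hits or max_hits < 2:
        return ranked[:max_hits]
    # Evenly spaced positions over the ranked list, first and last included.
    step = (len(ranked) - 1) / (max_hits - 1)
    return [ranked[round(i * step)] for i in range(max_hits)]


if __name__ == "__main__":
    # Made-up hit table for a query of length 100, for illustration only.
    rows = [
        f"query\ts{i}\t{99 - 3 * i}\t100\t0\t0\t1\t100\t1\t100\t1e-50\t200"
        for i in range(20)
    ]
    for hit in select_spread(parse_hit_table("\n".join(rows)), 100, max_hits=5):
        print(hit["subject"], hit["pident"])
```

The selected identifiers can then be used to tick the corresponding hits in the BLAST output and download their sequences as described in the first hint.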