Cluster Analysis of Gene Expression Profiles

Purpose: To gain insight into yeast gene expression during the cell cycle using hierarchical clustering methods.

We will use the NCBI GEO server for both data and analysis tools.

Identify cell cycle related genes with the aid of cluster analysis.

  1. Search for dataset GDS124. Read about the experiment and try to understand the experimental setup. How many time points are available? The details of these experiments are given in Spellman et al.
  2. Only a limited number of clustering methods are available for this data set on the GEO server. First use UPGMA clustering (under the analysis options) of the genes, with uncentered correlation as the distance measure. Before you start the analysis, think about how you would decide from their profiles which genes are cell cycle related.
  3. Select some of the genes that you identified (using the orange rectangular tool) and plot their profiles. How do their expression levels change over time? Do all selected genes behave similarly? Get the profiles of these genes individually (by clicking Get profiles in Entrez-GEO). Now you have access to the annotations of the selected genes. Do you see any that, according to the annotation, should be cell cycle related? Write down the names of these genes.
  4. Look at single linkage clustering as well. How many cell cycle genes can you identify using the uncentered correlation distance measure? Why? Do your results improve when you use other distance measures?
  5. Search for other data sets involving cell cycle experiments in yeast as well. Using GEO Profiles in Entrez, check whether some of the genes you identified above show similar behaviour in other cell cycle experiments.
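
To make the difference between UPGMA (average linkage) and single linkage concrete, the clustering in steps 2 and 4 can be sketched offline on a handful of profiles. This is a minimal illustration, not the GEO implementation: the function names, the toy gene names and the toy profiles are all hypothetical, and the distance is assumed to be 1 minus the uncentered correlation.

```python
import math

def uncentered_correlation(x, y):
    """Like Pearson correlation, but the profile means are NOT subtracted."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def distance(x, y):
    # Assumption: distance is defined as 1 - uncentered correlation.
    return 1.0 - uncentered_correlation(x, y)

def agglomerate(profiles, n_clusters, linkage="average"):
    """Naive agglomerative clustering of a dict {gene name: profile}.

    linkage="average" is UPGMA (mean of all pairwise distances between
    two clusters); linkage="single" uses the smallest pairwise distance.
    """
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pair_d = [distance(profiles[a], profiles[b])
                          for a in clusters[i] for b in clusters[j]]
                if linkage == "single":
                    d = min(pair_d)
                else:
                    d = sum(pair_d) / len(pair_d)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical toy profiles over 8 time points (not taken from GDS124):
# geneA/geneB oscillate in phase, geneC/geneD are shifted by a quarter period.
steps = range(8)
toy = {
    "geneA": [math.sin(2 * math.pi * k / 8) for k in steps],
    "geneB": [0.8 * math.sin(2 * math.pi * k / 8) + 0.05 for k in steps],
    "geneC": [math.cos(2 * math.pi * k / 8) for k in steps],
    "geneD": [0.9 * math.cos(2 * math.pi * k / 8) for k in steps],
}

upgma_clusters = agglomerate(toy, 2, linkage="average")
single_clusters = agglomerate(toy, 2, linkage="single")
print(upgma_clusters)
print(single_clusters)
```

On clean toy data like this both linkages group the genes by phase. On real, noisy profiles single linkage tends to chain genes together into one large cluster, which is worth keeping in mind when answering step 4.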

Expression analysis at the tissue level.

  1. Search for dataset GDS596. Read about the experiment and try to understand the experimental setup. How many tissues/cell lines are tested? The details of these experiments are given in Su et al.
  2. Which type of microarray is used here?
  3. This type of microarray often uses probe sequences that are 25 nucleotides long. Is it possible, with a standard microarray chip like the one used here, to measure the difference between the expression levels of ubiquitin and polyubiquitin?
  4. You can search for the human ubiquitin expression profile using the keyword M26880. Is the tissue expression profile of ubiquitin what you expected?
  5. Try to identify genes that are co-expressed with ubiquitin. You can do this with the Profile neighbors facility located above the expression profile. Are the co-expressed genes also related to protein degradation?
  6. Now search for expression of TLR1, TLR2 and TLR7. Are TLRs co-expressed? What does this suggest?
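
The idea of co-expression in steps 5 and 6 can be sketched as ranking genes by how well their tissue profiles correlate with a query profile. This is only an illustration: it assumes Pearson correlation as the similarity score, which may differ from what GEO uses internally, and the gene names and expression values below are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def neighbors(query, profiles, top=2):
    """Return the `top` genes most correlated with `query`, best first."""
    scores = [(pearson(profiles[query], profile), name)
              for name, profile in profiles.items() if name != query]
    scores.sort(reverse=True)
    return scores[:top]

# Hypothetical expression values across five tissues (not from GDS596).
toy_tissues = {
    "query_gene": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_x": [2.0, 4.0, 6.0, 8.0, 10.5],  # rises with the query
    "gene_y": [5.0, 4.0, 3.0, 2.0, 1.0],   # mirror image of the query
    "gene_z": [3.0, 3.0, 4.0, 3.0, 3.0],   # unrelated to the query
}

for score, name in neighbors("query_gene", toy_tissues, top=3):
    print(name, round(score, 3))
```

A high correlation only shows that two genes vary together across the tested tissues. Whether they are functionally related is the question the exercise asks you to check against the annotations.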