ApiDB/EuPathDB Workshop

Data retrieval, SRT, download, Orthologs Exercises

Brian Brunk
Monday, June 9th - 3:30 pm

Exercise 1: Id list and downloads

  • In PlasmoDB, retrieve the genes for the following list of identifiers, which you determined via a microarray experiment to be downregulated at least 2-fold compared to 3D7 at 48 hrs in response to knockout of the pfRh2b gene.
    • MAL13P1.273, PFA0010c, MAL8P1.163, PFI0025c, PF10_0293, PF10_0306, PF10_0255, PFF1360w, PFL2080c, PFL2520w, PF10_0282, PFL0010c, PFA0260c, PFL1700c, PF10_0289, PFI0945w, PF10_0343, PFL0590c, PFL0030c, PFE1100w, PFF1365c, PFI1005w, PFE1285w, PFB0100c, PFB0687c, PF10_0295, PFI0100c, PFB0105c, PFB0680w, PF11_0461, PF10_0347, PF10_0337, PF10_0022, PFL1465c, MAL13P1.22, PFI1370c, PFI1180w, PFF1180w, PFA0020w, MAL8P1.109, PFC0735w, PFL0585w, PFE1350c, PFE0625w, PF14_0607, PFE1050w, PFC0920w, PF13_0301, MAL7P1.125, PFF0995c, PFF0610c, PF10_0283, PFD0665c, PF10_0344, PF11_0075, PF14_0018, MAL13P1.2, PFA0630c, PF14_0373, PF14_0443, PF14_0500, MAL13P1.264, PFC0371w, PF13_0063, PF10_0203, MAL13P1.260, MAL8P1.72
  • Can you identify commonalities between members of this list that help elucidate the biology of this knockout?
  • Using the tab-delimited format, download a report containing attributes that may help answer the previous question.
  • Using the detailed report, download additional attributes that may help (such as metabolic pathways). What advantages and disadvantages does the detailed report have compared to the simple report?
  • Download a FASTA report containing the putative promoter regions of the genes (-1000 to +0 relative to the start).
  • You want to include comparative genomics in your transcription factor analysis. Download a FASTA report of the promoter regions of all the P. vivax and P. berghei orthologs of these genes.
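
For reference, the "-1000 to +0" promoter window can be sketched in a few lines of Python. This is only an illustration of the coordinate arithmetic, not how PlasmoDB builds its FASTA report. The function name `upstream_region`, the 1-based inclusive coordinates, and the toy sequence are all assumptions made for the example. Note that for a gene on the reverse strand the upstream region lies past the gene's higher coordinate and has to be reverse complemented.

```python
# Hypothetical helper: extract the region upstream of a gene start.
# Assumes 1-based, inclusive coordinates with start <= end.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq: str, start: int, end: int,
                    strand: str, length: int = 1000) -> str:
    """Return up to `length` bases immediately upstream of the gene.

    The window is clipped at the chromosome ends, so genes close to
    an end yield a shorter sequence.
    """
    if strand == "+":
        lo = max(0, start - 1 - length)
        return chrom_seq[lo:start - 1]
    if strand == "-":
        hi = min(len(chrom_seq), end + length)
        return reverse_complement(chrom_seq[end:hi])
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


if __name__ == "__main__":
    toy_chromosome = "ACGTACGGTT"  # made-up sequence for illustration
    print(upstream_region(toy_chromosome, 5, 7, "+", length=3))
    print(upstream_region(toy_chromosome, 3, 5, "-", length=3))
```

Comparing the output of such a sketch with the downloaded FASTA report for one forward-strand and one reverse-strand gene is a quick way to confirm that the report was configured as intended.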

Exercise 2: Identify as many P. falciparum genes containing signal peptides as possible.

  • How many P. falciparum genes are annotated with signal peptides (inclusion score 3)?
  • How many P. vivax genes are annotated with signal peptides (inclusion score 3)?
  • How many genes on these two lists are in common (hint: use the ortholog query to transform between organisms)?
  • How many P. falciparum orthologs of P. vivax genes with signal peptides do not themselves contain signal peptides? Why might this be the case? Look at a couple of these using the synteny viewer to generate some hypotheses.
  • Use PlasmoDB to generate the most comprehensive list of P. falciparum genes that may contain signal peptides (inclusion score 3). How many did you find?
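
The questions in this exercise correspond to set operations on gene lists: an ortholog transform, followed by an intersection, a difference, and a union. The Python sketch below shows that logic on made-up data. The gene identifiers, the `pv_to_pf` ortholog map, and the function name `transform_by_orthology` are hypothetical placeholders, not PlasmoDB identifiers or API calls.

```python
# Hypothetical illustration of combining gene lists with set operations.

def transform_by_orthology(gene_ids, ortholog_map):
    """Map each gene to its orthologs in the other organism.

    Genes without an entry in `ortholog_map` are silently dropped.
    """
    result = set()
    for gene in gene_ids:
        result.update(ortholog_map.get(gene, ()))
    return result


# Made-up example data (not real annotations).
pf_signal = {"pf_gene_A", "pf_gene_B", "pf_gene_C"}
pv_signal = {"pv_gene_1", "pv_gene_2", "pv_gene_4"}
pv_to_pf = {
    "pv_gene_1": {"pf_gene_A"},
    "pv_gene_2": {"pf_gene_D"},
    "pv_gene_3": {"pf_gene_C"},
}

# P. falciparum orthologs of P. vivax genes with signal peptides.
pf_from_pv = transform_by_orthology(pv_signal, pv_to_pf)

in_common = pf_signal & pf_from_pv      # annotated in both species
ortholog_only = pf_from_pv - pf_signal  # ortholog has a signal peptide, gene does not
comprehensive = pf_signal | pf_from_pv  # most inclusive candidate list

print(len(in_common), len(ortholog_only), len(comprehensive))
```

In the PlasmoDB interface the same combinations are made by joining query results with intersect, minus, and union operations.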

Exercise 3: Apicoplast-targeted genes in T. gondii.

  • Identify putative nuclear genes in T. gondii that are targeted to the apicoplast.
  • How many of these have signal peptides?
  • Is the percentage of these apicoplast-targeted genes with signal peptides higher than the percentage of all Toxoplasma genes with signal peptides?
  • How does the percentage compare to the percentage of P. falciparum apicoplast-targeted genes containing signal peptides?
  • Do you think these results indicate a valid approach to identifying putative Toxoplasma apicoplast-targeted genes?
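
Once you have the four counts from the queries above, the comparison is simple arithmetic. The Python sketch below computes the two percentages and, as one possible way to judge whether the difference is larger than expected by chance, a hypergeometric tail probability. The choice of test is an assumption of this sketch, not something the exercise prescribes, and the counts in the usage example are invented placeholders that you should replace with your own query results.

```python
from math import comb


def percent(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def hypergeom_tail(population: int, successes: int,
                   draws: int, observed: int) -> float:
    """P(X >= observed) when drawing `draws` genes without replacement
    from `population` genes, of which `successes` have the feature."""
    upper = min(draws, successes)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return numerator / comb(population, draws)


if __name__ == "__main__":
    # Invented placeholder counts, for illustration only.
    all_genes, all_with_sp = 8000, 1200
    apicoplast_genes, apicoplast_with_sp = 150, 90

    print(f"genome-wide: {percent(all_with_sp, all_genes):.1f}%")
    print(f"apicoplast:  {percent(apicoplast_with_sp, apicoplast_genes):.1f}%")
    p = hypergeom_tail(all_genes, all_with_sp,
                       apicoplast_genes, apicoplast_with_sp)
    print(f"P(at least this many by chance) = {p:.3g}")
```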

Exercise 4: Orthology profile

  • Identify P. falciparum genes that are conserved among all apicomplexans but not present in mammals. How many are there?
  • How does this compare to the number of P. falciparum genes conserved among all Plasmodium species but not present in mammals?
  • Now extend the previous query to also require conservation with T. gondii, still excluding mammals.
  • Why might these sorts of queries be very useful when analyzing a eukaryotic pathogen?
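
An orthology profile query can be thought of as a filter over the set of taxa in which each gene has an ortholog: every required taxon must be present and no excluded taxon may be present. The Python sketch below shows that filter on made-up data. The taxon labels, gene names, and the function name `matches_profile` are hypothetical and do not reflect the actual codes or data used by the database.

```python
# Hypothetical illustration of an orthology (phyletic) profile filter.

def matches_profile(taxa_present, required, excluded):
    """True if all `required` taxa are present and no `excluded` taxon is."""
    return required <= taxa_present and not (excluded & taxa_present)


# Made-up profiles: gene -> taxa in which an ortholog was found.
profiles = {
    "gene_1": {"P_falciparum", "P_vivax", "T_gondii"},
    "gene_2": {"P_falciparum", "P_vivax", "T_gondii", "H_sapiens"},
    "gene_3": {"P_falciparum", "P_vivax"},
}

plasmodium = {"P_falciparum", "P_vivax"}
mammals = {"H_sapiens", "M_musculus"}

# Conserved in Plasmodium, absent from mammals.
plasmodium_only = sorted(
    gene for gene, taxa in profiles.items()
    if matches_profile(taxa, plasmodium, mammals)
)

# Same query, additionally requiring T. gondii.
with_toxoplasma = sorted(
    gene for gene, taxa in profiles.items()
    if matches_profile(taxa, plasmodium | {"T_gondii"}, mammals)
)

print(plasmodium_only)
print(with_toxoplasma)
```

Adding a required taxon can only shrink the result, which is a useful sanity check when comparing the counts from the queries above.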