Software for finding statistically over-represented Gene Ontology groups in microarray experiments (Lewin and Grieve)

Gene Ontology (GO) terms are often used to assess the results of microarray experiments. The most common way to do this is to perform Fisher's exact tests to find GO terms which are over-represented amongst the genes declared to be differentially expressed in the analysis of the microarray experiment. However, due to the high degree of dependence between GO terms, statistical testing is conservative, and interpretation is difficult.
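To make the test above concrete, here is a minimal sketch of a one-sided Fisher's exact test for over-representation of a single GO term. It is written in Python for illustration and is not taken from the PoGO Perl scripts; the function name and argument names are assumptions.

```python
from math import comb


def fisher_over_representation(n_total, n_de, n_term, n_de_term):
    """One-sided Fisher's exact test p-value for over-representation.

    n_total   : number of probesets on the chip
    n_de      : number of differentially expressed probesets
    n_term    : number of probesets annotated to the GO term
    n_de_term : number of probesets that are both DE and annotated

    Returns P(X >= n_de_term), where X is hypergeometric.
    """
    upper = min(n_de, n_term)
    denom = comb(n_total, n_de)
    # Sum the hypergeometric probabilities of the observed overlap
    # and of every more extreme (larger) overlap.
    tail = sum(
        comb(n_term, k) * comb(n_total - n_term, n_de - k)
        for k in range(n_de_term, upper + 1)
    )
    return tail / denom


# Illustrative numbers only: 12000 probesets, 300 DE, 80 annotated, 10 in both.
p_value = fisher_over_representation(12000, 300, 80, 10)
```

A small p-value suggests that the term is annotated to more differentially expressed probesets than would be expected by chance. When many terms are tested, the resulting p-values are dependent because GO terms share genes.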

PoGO tests groups of GO terms rather than individual terms, in order to increase statistical power, reduce dependence between tests and improve the interpretation of results. It uses the publicly available package POSOC to group the terms.
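The idea of testing a group rather than a single term can be sketched as follows. This Python example assumes that a probeset belongs to a group if it is annotated to any term in that group; that rule, the function name and the toy identifiers are assumptions made for illustration, not a description of the PoGO scripts or of POSOC output.

```python
def group_counts(term_to_probesets, group_terms, full, de):
    """Return the four counts needed for a group-level Fisher's exact test.

    term_to_probesets : dict mapping a GO term to the probesets annotated to it
    group_terms       : the GO terms that make up one group
    full              : all probesets on the chip
    de                : the differentially expressed probesets
    """
    full = set(full)
    de = set(de) & full  # ignore DE identifiers that are not on the chip
    members = set()
    for term in group_terms:
        # Assumed rule: the group's probesets are the union over its terms.
        members |= set(term_to_probesets.get(term, ()))
    members &= full
    return len(full), len(de), len(members), len(members & de)


# Toy data; the identifiers are invented.
annotations = {
    "GO:A": ["p1", "p2"],
    "GO:B": ["p2", "p3"],
    "GO:C": ["p4"],
}
counts = group_counts(
    annotations,
    ["GO:A", "GO:B"],
    full=["p1", "p2", "p3", "p4", "p5"],
    de=["p2", "p3", "p5"],
)
```

The returned counts (chip size, number of DE probesets, group size, overlap) are exactly the inputs to a one-sided Fisher's exact test, so each group needs one test instead of one test per term.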

PoGO is currently implemented as a series of Perl scripts, written by Ian Grieve. It is described in Lewin and Grieve, BMC Bioinformatics 2006, 7:426.

PoGO requires:
-> A Perl installation.
-> POSOC, or a suitable POSOC clustering output file. POSOC requires Java and the Java OpenJGraph library.
-> CPAN module Math::Pari (required by the Fisher's Exact Test module).
-> CPAN module List::Util (required for permutation testing).

PoGO input files:
-> MTX2: The name of the MTX2 file that was used by POSOC to generate the clusters. If this file does not already exist, it should be generated with affyfile_converter.pl from the annotation CSV file obtained from the Affymetrix website (NetAffx resource, e.g. MG_U74Av2_annot.csv).
-> FULL: The full list of probesets on the chip, in the form of Affymetrix IDs, one per line.
-> DE: The list of probesets considered to be differentially expressed, or otherwise of interest for comparison against the baseline, again in the form of Affymetrix IDs, one per line.
-> CLUSTERS: The clustering output produced by POSOC.
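Before running an analysis, it can be useful to check the FULL and DE lists against each other. The following is a minimal Python sketch that assumes only the one-ID-per-line layout described above; the function names and the file names in the usage note are hypothetical, and the check is not part of PoGO.

```python
def parse_id_lines(lines):
    """Return the non-empty identifiers from an iterable of lines."""
    ids = []
    for line in lines:
        identifier = line.strip()
        if identifier:  # skip blank lines
            ids.append(identifier)
    return ids


def de_not_on_chip(full_ids, de_ids):
    """Return the DE identifiers that do not appear in the FULL list."""
    full = set(full_ids)
    return [identifier for identifier in de_ids if identifier not in full]


# Toy input standing in for the contents of the FULL and DE files.
full_ids = parse_id_lines(["probe_a\n", "probe_b\n", "\n", "probe_c\n"])
de_ids = parse_id_lines(["probe_b\n", "probe_x\n"])
missing = de_not_on_chip(full_ids, de_ids)
```

With real files, the same functions can be applied to open file handles, for example `parse_id_lines(open("full.txt"))`, where `full.txt` is a placeholder name. Any identifier returned by `de_not_on_chip` is in the DE list but not on the chip, which usually points to a mismatch between the two lists.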

Download PoGO files here
