ErmineJ is distributed as two jar files: ermineJ.jar and baseCode.jar. The latter contains code shared with other projects, while ermineJ.jar contains the classes specific to the gene set scoring task. ErmineJ also depends on a number of third-party libraries, including Commons Configuration. Use of the ermineJ API is covered by the GNU Lesser General Public License (LGPL).
While the internal use of the ermineJ API is fairly complex, most of that complexity is not needed for use by third parties. The minimal requirements for an analysis are:
The inputs are plain java.util.List objects so that third parties can easily create data structures that ermineJ can handle. It is the programmer's responsibility to make sure the Lists are in the same order, so that position i in every List refers to the same probe. While ermineJ will detect some types of problems with the input data structures, it cannot tell that you put the probe IDs in a different order than the gene symbols.
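One way to keep the Lists aligned is to build all of them in a single pass over one row-per-probe record, rather than filling each List separately. The following is a minimal sketch using only java.util; the AlignedInputs and Row classes are hypothetical helpers (not part of ermineJ), and the probe, gene and GO identifiers are illustrative placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of ermineJ: builds four parallel Lists so
// that index i in each List always refers to the same probe.
class AlignedInputs {

    /** One input row: a probe, its gene, its GO terms, and its score. */
    static final class Row {
        final String probe;
        final String gene;
        final Collection<String> goTerms;
        final double score;

        Row(String probe, String gene, Collection<String> goTerms, double score) {
            this.probe = probe;
            this.gene = gene;
            this.goTerms = goTerms;
            this.score = score;
        }
    }

    final List<String> probes = new ArrayList<>();
    final List<String> genes = new ArrayList<>();
    final List<Collection<String>> goAssociations = new ArrayList<>();
    final List<Double> geneScores = new ArrayList<>();

    /** Appends every field of a row in the same loop iteration, so the Lists cannot drift apart. */
    static AlignedInputs fromRows(List<Row> rows) {
        AlignedInputs in = new AlignedInputs();
        for (Row r : rows) {
            in.probes.add(r.probe);
            in.genes.add(r.gene);
            in.goAssociations.add(r.goTerms);
            in.geneScores.add(r.score);
        }
        return in;
    }

    public static void main(String[] args) {
        // Placeholder data: two probes map to the same gene (many-to-one).
        List<Row> rows = Arrays.asList(
            new Row("probe_1", "GENE_A", Arrays.asList("GO:0000001", "GO:0000002"), 0.0004),
            new Row("probe_2", "GENE_A", Arrays.asList("GO:0000001"), 0.2),
            new Row("probe_3", "GENE_B", Arrays.asList("GO:0000003"), 0.03));
        AlignedInputs in = fromRows(rows);
        for (int i = 0; i < in.probes.size(); i++) {
            System.out.println(in.probes.get(i) + " -> " + in.genes.get(i)
                + " " + in.goAssociations.get(i) + " " + in.geneScores.get(i));
        }
    }
}
```

The resulting probes, genes, goAssociations and geneScores Lists have the element types described in the snippet further down (identifiers, Collections of GO terms, Doubles), so they could be handed on as they are.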
Once the above are assembled, the analysis proceeds in three phases:
The results can then be obtained with a simple method call.
The following code snippets demonstrate how to implement these steps.
List probes = null; // List of identifiers to be analyzed
List genes = null; // List of genes corresponding to the probes; encodes the many-to-one mapping of probes to genes.
List goAssociations = null; // List of Collections of GO terms for the probes.
List geneScores = null; // List of Doubles
/* code to initialize data structures omitted */
ClassScoreSimple css = new ClassScoreSimple( probes, genes, goAssociations );
// in our raw data, smaller values are better (like pvalues, unlike fold
// change)
css.setBigGeneScoreIsBetter( false );
// set range of sizes of gene sets to consider.
css.setMaxGeneSetSize( 100 );
css.setMinGeneSetSize( 5 );
// use this pvalue threshold for selecting genes. (before taking logs)
css.setGeneScoreThreshold( 0.001 );
// use over-representation analysis.
css.setClassScoreMethod( Settings.ORA );
/* ... etc. Reasonable defaults (?) are set for all parameters if you don't set them. */
css.run( geneScores ); // might want to run in a separate thread.
// You should iterate over your tested gene sets.
double fooPvalue = css.getGeneSetPvalue( "foo" );
double barPvalue = css.getGeneSetPvalue( "bar" );
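The comments above suggest running the analysis in a separate thread and iterating over the tested gene sets. A minimal sketch of both ideas with java.util.concurrent follows. Because it has to compile without ermineJ on the classpath, the analysis is passed in as a plain Runnable and the p-value lookup as a function; BackgroundAnalysis and submit are hypothetical names, not part of the ermineJ API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of ermineJ.
class BackgroundAnalysis {

    /**
     * Runs the analysis on the executor's thread, then looks up one p-value
     * per gene set ID. The returned Future completes when both steps are done.
     */
    static Future<Map<String, Double>> submit(ExecutorService executor,
                                              Runnable analysis,
                                              List<String> geneSetIds,
                                              ToDoubleFunction<String> pvalueLookup) {
        Callable<Map<String, Double>> task = () -> {
            analysis.run();
            // LinkedHashMap keeps the p-values in the order the IDs were given.
            Map<String, Double> pvalues = new LinkedHashMap<>();
            for (String id : geneSetIds) {
                pvalues.put(id, pvalueLookup.applyAsDouble(id));
            }
            return pvalues;
        };
        return executor.submit(task);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Stand-ins for the real analysis and lookup (assumptions for this demo).
            Runnable fakeAnalysis = () -> System.out.println("analysis running...");
            ToDoubleFunction<String> fakeLookup = id -> id.equals("foo") ? 0.01 : 0.5;

            Future<Map<String, Double>> result =
                submit(executor, fakeAnalysis, List.of("foo", "bar"), fakeLookup);
            // The calling thread is free to do other work here; get() blocks until done.
            System.out.println(result.get());
        } finally {
            executor.shutdown();
        }
    }
}
```

Assuming run and getGeneSetPvalue have the signatures used in the snippet above, the stand-ins could be replaced with something like () -> css.run( geneScores ) and css::getGeneSetPvalue. Whether ClassScoreSimple is safe to query from another thread while it is running is not stated here, so the sketch only reads p-values after run has returned.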