Customizing EASE requires knowledge of the directory structure of EASE and the format of the files therein. The majority of files are text files with lines in the "standard" format:
This schema allows for many-to-many relationships between genes and pieces of information.
In the LocusLink-centric schema created by the automated update process, a line in these files might look like this:
If this occurred in a file called "GO Biological Process.txt" in the \Data\Class\ directory, it would map LocusLink number 10 to the gene category "apoptosis" in the "GO Biological Process" system of classifying genes. If it occurred in a file called "Gene name.txt" in the \Data\ directory, it would map LocusLink number 10 to "apoptosis" for the "Gene name" annotation field. Hence, by knowing the purpose of the various subdirectories of the EASE directory, a user can create files with database or spreadsheet software to fit the format expected by EASE and have EASE use the files accordingly. Below are descriptions of the subfolders in the EASE directory:
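As an illustration of the many-to-many mapping described above, here is a minimal Python sketch that reads two-field lines into a lookup table. It is not part of EASE. The tab delimiter, the field order (gene identifier first), the function name parse_standard_lines, and the sample values other than "10" and "apoptosis" are all assumptions made for the example.

```python
from collections import defaultdict

def parse_standard_lines(lines, delimiter="\t"):
    """Build a gene -> set-of-values mapping from two-field lines.

    Assumption: fields are tab-delimited with the gene identifier first.
    """
    mapping = defaultdict(set)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) < 2:
            continue  # skip lines that do not have two fields
        gene, value = fields[0].strip(), fields[1].strip()
        mapping[gene].add(value)  # one gene may map to many values
    return mapping

# Hypothetical sample lines; only "10" -> "apoptosis" comes from the text.
sample = ["10\tapoptosis", "10\tcell death", "11\tapoptosis"]
genes = parse_standard_lines(sample)
```

Because each gene maps to a set, the same gene can carry several values and the same value can appear under several genes.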
\_Inline\ contains system files. Nothing to customize here.
\Data\ contains "annotation field" files in the standard format that map genes to annotation corresponding to the field.
\Data\Class\ contains gene classification "system" files in the standard format that map genes to their classification in the corresponding classification system. These files can optionally contain a third field listing the PMIDs of MEDLINE articles that support the classification of the gene into that category. The PMIDs are separated by semicolons within the field. These files can also be used as fields in annotation tables.
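The optional third field can be handled as in the following Python sketch. It is illustrative only: the tab delimiter, the function name parse_class_line, and the PMID values shown are assumptions, not taken from EASE or from any real file.

```python
def parse_class_line(line, delimiter="\t"):
    """Split one classification line into (gene, category, pmids).

    Assumption: tab-delimited fields; the third field, when present,
    holds PMIDs separated by semicolons.
    """
    fields = line.rstrip("\r\n").split(delimiter)
    gene, category = fields[0].strip(), fields[1].strip()
    pmids = []
    if len(fields) > 2 and fields[2].strip():
        # Drop empty pieces so a trailing semicolon is harmless.
        pmids = [p.strip() for p in fields[2].split(";") if p.strip()]
    return gene, category, pmids

# Placeholder PMIDs, for illustration only.
with_pmids = parse_class_line("10\tapoptosis\t111;222")
without_pmids = parse_class_line("10\tapoptosis")
```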
\Data\Class\Implies\ contains files that supplement files of the same name in the \Data\Class\ directory. These map the gene classifications explicit in the \Data\Class\ version to other classifications implied, in the format:
This is useful for systems such as those of the Gene Ontology, wherein a gene being classified under a certain term implies that it is also classified under every superordinate (parent) category of the ontology. Although you could directly map a given LocusLink number to every possible parental Gene Ontology term in the appropriate file in \Data\Class\, this is very inefficient and leads to excessive loading times. To speed data loading for such systems, only direct "child implies parent" relationships need to be specified in the \Data\Class\Implies\ supplemental file. All grandparent and more distant ancestor relationships are loaded recursively. If a file from this directory is selected during annotation field selection, that field will list all explicit and implicit categories for a given gene.
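The way direct "child implies parent" pairs expand into the full set of ancestors can be sketched in Python as follows. This is not EASE's own loader: it uses an iterative traversal rather than recursion, and the function name expand_categories and the parent terms in the sample are hypothetical.

```python
def expand_categories(explicit, implies):
    """Return the explicit categories plus every category they imply.

    explicit: iterable of categories assigned directly to a gene.
    implies:  dict mapping a category to its direct parent categories.
    """
    seen = set()
    stack = list(explicit)
    while stack:
        term = stack.pop()
        if term in seen:
            continue  # already expanded; also guards against cycles
        seen.add(term)
        stack.extend(implies.get(term, ()))  # follow direct parents only
    return seen

# Hypothetical parent chain for illustration.
implies = {"apoptosis": {"cell death"}, "cell death": {"death"}}
all_terms = expand_categories(["apoptosis"], implies)
```

Only the two direct pairs are stored, yet the gene ends up listed under all three categories, which is the saving the supplemental file is meant to provide.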
\Data\Class\URL data\ contains configuration files for specifying URLs to hyperlink gene categories to definitions. The files are matched to system files in the \Data\Class\ directory, and contain two lines of URLs. The first line defines the URL to the definition of the system itself. The second line defines a template of the URL for a specific category within that classification system. EASE makes the URL specific to a classification term by replacing the string [*TERM*] in the template with the actual classification term. For URLs that require some conversion from classification term to some tag before URL generation, the template contains the string [*TAG*]. In this case, another file with the same name will occur in the \Data\Class\URL data\Tags\ directory as mentioned below.
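The template substitution described above might look like the following Python sketch. The function name category_url, the example.org URLs, and the tag value are hypothetical, and whether EASE applies any URL encoding to the term is not stated here, so none is applied.

```python
def category_url(template, term, tags=None):
    """Fill a category URL template for one classification term.

    If the template contains [*TAG*], the term is first converted to a
    tag through the tags dict (term -> tag); otherwise the term itself
    replaces [*TERM*].
    """
    if "[*TAG*]" in template:
        if tags is None or term not in tags:
            raise KeyError("no tag conversion available for " + repr(term))
        return template.replace("[*TAG*]", tags[term])
    return template.replace("[*TERM*]", term)

# Hypothetical templates and tag, for illustration only.
by_term = category_url("http://example.org/term?name=[*TERM*]", "apoptosis")
by_tag = category_url("http://example.org/id/[*TAG*]", "apoptosis",
                      {"apoptosis": "T0001"})
```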
\Data\Class\URL data\Tags\ contains the tag conversion files for the URL creation mentioned above. These files are in the format:
\Data\Convert\ contains files in the standard format that map gene identifiers to some accession system for referring to that gene. Files in this directory are named for the type of accession number being linked. For example, the "Genbank.txt" file installed by the automated update links LocusLink numbers to Genbank accessions with lines like:
These files can also be selected during the annotation field selection process to include a column listing all accessions within this system that refer to the gene of a given row in the output table.
\Enhance\ contains files to define genes that should "share" annotation when using the "Enhance" function of EASE. These files map all genes in a pair-wise fashion in the standard format:
\Help\ contains help files (like this one!) in plain text.
\Links\ contains link definition files containing templates for URLs to online tools for analyzing gene lists. EASE will detect two sets of one of the following strings in any given URL template:
... and construct the URL accordingly, replacing the [*EaseID*] string with the standard gene identifiers of the list to complete the URL. For example, if the URL template looks like:
EASE initializes the URL with:
and then concatenates all LocusLink numbers in the list with a semicolon and adds them to the end of the completed URL. In the cases of [*DATA*], [*CONVERT*], and [*CLASS*], EASE will first convert the list of genes to tags using a file specified on the second line of the link definition file located in the \Data\, \Data\Convert\, or \Data\Class\ directory respectively. For example, the
\Links\View Abstracts of PMIDs from LocusLink.txt
file contains the following lines:
To link a given list of genes to this link named "View Abstracts of PMIDs from LocusLink", EASE first converts all LocusLink numbers of the gene list to PMID numbers using the file "Known PMIDs.txt" in the \Data\ directory. Then EASE initializes the URL with:
and then concatenates all of the PMID numbers with a comma and adds them to the end of the URL.
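The two cases described above (appending the gene identifiers directly, or converting them to tags first) can be sketched in Python as follows. This is an illustration under assumptions, not EASE's implementation: the function name build_link_url, the example.org URL, the identifier values, and the removal of duplicate tags are all hypothetical, and the separator is passed in explicitly rather than read from a template.

```python
def build_link_url(url_prefix, gene_ids, separator, conversion=None):
    """Append a separator-joined list of identifiers to a URL prefix.

    conversion: optional dict mapping a gene identifier to a set of tags
    (for example PMIDs). When given, genes are converted to tags first.
    Assumption: duplicate tags are dropped, keeping first occurrence.
    """
    if conversion is None:
        items = list(gene_ids)
    else:
        items = []
        for gene in gene_ids:
            for tag in sorted(conversion.get(gene, ())):
                if tag not in items:
                    items.append(tag)
    return url_prefix + separator.join(items)

# Hypothetical prefix and identifiers, for illustration only.
direct = build_link_url("http://example.org/q?ids=", ["10", "11"], ";")
converted = build_link_url("http://example.org/q?ids=", ["10", "11"], ",",
                           {"10": {"222", "111"}, "11": {"111"}})
```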