CisMols Analyzer Help

Contents

Logging in

Managing projects

Managing gene lists

Searching for cis-clusters

Creating a CisMolGram

Modifying and analyzing a CisMolGram

Exporting a CisMolGram

Saving your search

This guide will assist you in using CisMols Analyzer, a program that identifies compositionally predicted cis-clusters, which we call CisMols (Cis-regulatory Modules). It looks for these clusters in groups of co-regulated genes, within the evolutionarily conserved cis-regulatory regions of each gene's ortholog pair. CisMols Analyzer is based on the hypothesis that the presence of a cluster of ortholog-conserved known cis-acting elements in all or many of a set of co-expressed or functionally related genes predicts that the genes are co-regulated by these elements. Designed to search for regulatory clusters not just in the upstream region of co-expressed genes but also in the non-coding intronic and 5' and 3' flanking genomic regions, CisMols Analyzer could lead to the discovery of probes for genome-wide identification of regulatory regions.

To assist you in using CisMols Analyzer, we have divided this help file into eight sections, listed under Contents above, that reflect the main steps in using the system.

If you need additional support at any time, contact the CisMols administrator.

Logging in

To create projects and gene lists, you need a login account. You can request one by emailing the CisMols administrator. Once you have an account, you can log in by completing the following steps:

1. If the login screen does not already appear, go to the left-hand menu, and click Login:

2. Type your User Name and Password.

3. Click Login:

You should now see a screen for selecting a data source. If you don't see this screen or receive an error message, try logging in again, remembering that user names and passwords are case-sensitive. If you still are unable to log in, contact the CisMols administrator.

Managing projects

Projects are used to group gene lists for analysis. They can contain public gene lists, private gene lists or both (for more information about gene lists, see Managing gene lists). When projects contain public gene lists, anyone can access them and search their public gene lists without obtaining a login account. If you are only interested in searching public gene lists in existing projects, you may disregard the instructions in this section and the next section, and go to the instructions for searching for cis-clusters.

If you are interested in creating your own projects, you need to obtain a login account. Once your account is set up, you can create multiple projects and assign relevant gene lists to each. Available gene lists include those you have created as well as those that other users have created and made publicly available. For more information about gene list creation and management, see Managing gene lists.

This section gives you step-by-step instructions for performing project management tasks. To complete any of these tasks, you need to access the Create or Edit Project screen by completing the following steps:

1. Log in if you have not done so already.

2. Select a data source:

3. Click CisMols Analyzer:

4. Click Projects:

The Create or Edit Project screen opens. From this screen, you can complete the following tasks:

Creating a project

1. In the upper-left section of the Create or Edit Project screen, click Create New Project:

2. Type the Project Name and a short Description.

3. Click Create:

The Create or Edit Project screen again appears. Your project should now be listed under My Projects. To assign one or more gene lists to the project, follow the steps in the next section.

Assigning gene lists

To assign one or more gene lists to a project, complete the following steps.

Important:
Only the owner(s) of a project can assign gene lists to it. For more information, see Managing owners and users.

1. From the Create or Edit Project screen, select the project by going to My Projects and clicking the project name:

2. Do one of the following:

3. Enter as many criteria as you wish, and click Search.

The View Gene Lists screen appears, displaying all private gene lists matching your search criteria.

Note:
To display public gene lists, go to the upper-right corner of the screen, and click Show Public.

4. Go to the table of available gene lists, and select the list(s) that you want to add to your project.

5. Click Add:

The Create or Edit Project screen again appears, indicating that the selected gene list(s) are now included in your project.

6. To add more gene lists to the project, repeat steps 2-5.

Removing gene lists

To remove one or more gene lists from a project, complete the following steps.

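Important:
Only the owner(s) of a project can remove gene lists from it. For more information, see Managing owners and users.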

1. From the Create or Edit Project screen, select the project by going to My Projects and clicking the project name:

2. Under My Genes, select the check box next to the gene(s) you want to remove.

3. Click Remove:

4. To confirm that you want to remove the selected gene(s), click OK.

Managing owners and users

If a project contains one or more public gene lists, any user can access the project and search its public gene list(s) without obtaining a login account. However, each project can have multiple owners and users with special privileges. Owners and users are able to view private gene lists in the project. In addition, owners are able to modify and delete projects. By default, you become the sole owner of each project you create. If you wish, you can add and remove owners and users by completing the following steps:

1. From the Create or Edit Project screen, select the project by going to My Projects and clicking the project name:

2. In the upper-right section of the screen, click Manage Users:

A screen appears listing associated and non-associated users.

3. Do one of the following:

4. To exit the screen, go to the upper-right section, and click Done:
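
The access rules described at the start of this section can be summarized in a short Python sketch. The Project class, its field names and the two helper functions are hypothetical and serve only as an illustration; they are not part of CisMols Analyzer, and the account and gene list names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in for a CisMols project (illustration only)."""
    name: str
    owners: set = field(default_factory=set)
    users: set = field(default_factory=set)
    public_lists: set = field(default_factory=set)
    private_lists: set = field(default_factory=set)


def visible_gene_lists(project, account=None):
    """Gene lists a visitor can see; account=None means not logged in."""
    visible = set(project.public_lists)
    if account in project.owners or account in project.users:
        # Owners and users can also view the private gene lists.
        visible |= project.private_lists
    return visible


def can_modify(project, account=None):
    """Only owners can modify or delete a project."""
    return account in project.owners


project = Project("demo", owners={"alice"}, users={"bob"},
                  public_lists={"heart"}, private_lists={"liver"})
```

In this sketch, a visitor without an account sees only the public list, the user bob sees both lists, and only the owner alice passes can_modify.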

Renaming a project

To change the name of a project, complete the following steps.

Important:
Only owners can rename projects. For more information, see Managing owners and users.

1. From the Create or Edit Project screen, select the project by going to My Projects and clicking the project name:

2. For Name, type a different name for the project.

3. Click Rename:

Deleting a project

To remove a project completely from the system, complete the following steps.

Important:
Only owners can delete projects. For more information, see Managing owners and users.

1. From the Create or Edit Project screen, select the project by going to My Projects and clicking the project name:

2. Click Delete:

3. To confirm that you want to delete the selected project, click OK.

Managing gene lists

CisMols Analyzer works with groups of co-expressed or related genes called gene lists, which further can be grouped into projects (see Managing projects). By default, the system contains a number of public gene lists that can be searched and analyzed without a login account. If you are only interested in searching public gene lists, you may disregard the information in this section and go to the instructions for searching for cis-clusters.

If you are interested in creating your own gene lists, you need to obtain a login account. Once your account is set up, you can follow the instructions in this section to create and manage gene lists.

To complete all tasks in this section, you need to open the Welcome screen by completing the following steps:

1. Log in if you have not done so already.

2. Select a data source:

3. Click CisMols Analyzer:

The Welcome screen appears:

From this screen, you can complete the following tasks:

Creating a gene list

1. At the Welcome screen, click Create Gene List.

2. Enter any of the following criteria:

3. Click Search and Add Genes:

4. Go to the table of search results, and under Select, select the check box next to each gene that you want to add to your list. Or if you wish to add all the genes in the table to your list, go to the top of the table and select the Select All Genes check box:

Note:
By clicking the accession number of any gene in the table, you can view a regulogram, or Cis-element Hit Density Image, which depicts a moving-window average of the number of shared cis-elements occurring in phylogenetically conserved regions. For more information, see Viewing a regulogram.

5. Click Add to Gene Cart:

6. Repeat steps 2-5 as many times as you wish to add other genes to the list.

7. When you are ready to finalize the contents of your list, click View Genes and Proceed:

8. Verify the contents of your gene list, and make any last-minute adjustments by using the Remove Selected Genes and Add More Genes buttons.

Caution:
Once a gene list is created, its contents cannot be modified.

9. Once you have finalized the contents of your gene list, go to the top of the screen, and complete the following fields:

10. Click Submit Gene List:

You will receive several emails as your gene list is being created and uploaded. Once your list is ready, you can search the list for clusters and create a CisMolGram based on your search results.

Modifying the access level

When you create a gene list, you indicate whether it should be private or public. At any time, you can modify the access level of a gene list you have created by completing the following steps:

1. At the Welcome screen, click Search for Gene List.

2. Enter as many search criteria as you wish, and click Search:

3. Select one of your gene lists by clicking the Gene List Name:

Important:
If you select a gene list that you do not own, you will not be able to complete any steps beyond this point. If you are unsure if you are the owner of a gene list, go to the last column of the My Gene Lists table. If you see a trash can icon in the same row as the gene list, you have the ability to delete the gene list and therefore are the owner. If you see No Permission, you do not have the ability to delete the gene list and therefore are not the owner.

4. Go to the top-right section of the screen, and select Private or Public.

5. Click Change Access Level:

The access level of your gene list is now modified.

Deleting associated saved searches

When users search gene lists for cis-clusters, they can save their searches (see Saving your search). Once a search is saved, it can be retrieved by its owner and other users at any time (see Retrieving a saved search). When you log in and search for a gene list, all saved searches associated with this gene list appear at the bottom of the screen. If you are the owner of a saved search -- that is, if you are the user who created it originally -- you can delete it by completing the following steps.

Important:
If you are the owner of a gene list, you are not necessarily the owner of all saved searches associated with the gene list. To own a saved search, you must be the user who created it originally. If you did not create a saved search, you cannot delete it, even if you own the gene list with which it is associated.

1. At the Welcome screen, click Search for Gene List.

2. Enter as many search criteria as you wish, and click Search:

3. Select one of your gene lists by clicking the Gene List Name:

4. Under Saved Searches, select a search that you created.

5. In the same row, under Delete, click the trash can icon:

Note:
If you see No Permission instead of a trash can icon, you are not the owner of the search and therefore do not have the ability to delete it.

The saved search is now removed from the system.

Renaming a gene list

You can rename a gene list you have created by completing the following steps:

1. At the Welcome screen, click Search for Gene List.

2. Enter as many search criteria as you wish, and click Search:

3. Select one of your gene lists by clicking the Gene List Name:

Important:
If you select a gene list that you do not own, you will not be able to complete any steps beyond this point. If you are unsure if you are the owner of a gene list, go to the last column of the My Gene Lists table. If you see a trash can icon in the same row as the gene list, you have the ability to delete the gene list and therefore are the owner. If you see No Permission, you do not have the ability to delete the gene list and therefore are not the owner.

4. For Name, type a new name for the gene list.

5. Click Rename:

The name of your gene list is now changed.

Deleting a gene list

You can delete a gene list you have created by completing the following steps:

1. At the Welcome screen, click Search for Gene List.

2. Enter as many search criteria as you wish, and click Search:

3. Go to the gene list you want to delete, and under Delete, click the trash can icon:

Note:
If you see No Permission instead of a trash can icon, you are not the owner of the gene list and therefore do not have the ability to delete it.

The gene list is now removed from the system, but all genes in the list remain in the database.

Searching for cis-clusters

Whether you are searching an existing gene list or a list you created, CisMols Analyzer enables you to customize the parameters of each search you perform. Specifically, you can use Boolean logic to restrict your search to known clusters and/or a combination of individual binding sites. You also can specify the minimum number of binding sites that must appear in each cluster, and the minimum number of genes in which each cluster must appear.
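
As a rough illustration of the two minimum thresholds, the following Python sketch filters a list of clusters. The Cluster class, the function name and the gene names are hypothetical, and applying both thresholds together is an assumption; the sketch does not show how CisMols Analyzer is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical cluster record (illustration only)."""
    sites: frozenset  # binding sites that make up the cluster
    genes: frozenset  # genes in which the cluster appears


def passes_minimums(cluster, min_sites=1, min_genes=1):
    """Assumption: a cluster must meet both minimums to be reported."""
    return len(cluster.sites) >= min_sites and len(cluster.genes) >= min_genes


clusters = [
    Cluster(frozenset({"V$AARF", "V$CART"}),
            frozenset({"geneA", "geneB", "geneC"})),
    Cluster(frozenset({"V$AARF", "V$CART", "V$EREF"}),
            frozenset({"geneA"})),
]

# Keep clusters with at least 2 sites that appear in at least 2 genes.
kept = [c for c in clusters if passes_minimums(c, min_sites=2, min_genes=2)]
```

Here only the first cluster is kept, because the second appears in a single gene.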

Continue reading this section for step-by-step instructions on selecting a gene list and performing one of the following tasks:

Selecting a gene list

To select a gene list for analysis, complete the following steps:

1. Log in if you want to search private gene lists in addition to public gene lists.

2. From the homepage, select a data source:

3. Click CisMols Analyzer:

4. Click Search for Gene List:

Note:
Instead of searching for an existing gene list, you can click Create Gene List and follow the instructions for creating a gene list. Once your gene list is fully uploaded, repeat the steps in this section to select the gene list before performing a new search.

5. Enter as many criteria as you wish, and click Search.

The View Gene Lists screen appears.

Note:
If you are not logged in, only public gene lists are available. If you are logged in, private gene lists are displayed by default. To display public gene lists, go to the upper-right corner of the screen, and click Show Public.

6. Go to the table of available gene lists, and select the list that you want to search.

You now are ready to perform a new search or retrieve a saved search.

Performing a new search

To search a gene list from scratch, complete the following steps:

1. Select a gene list if you have not done so already.

2. Go to the bottom of the Edit Gene List Properties screen, and click View CisMols Clusters:

3. Under Min Genes and Sites in a Cluster, enter the following information:

4. For Order top 100 clusters by, indicate whether you want the system to rank results by the number of genes in which each cluster appears or by the number of binding sites in each cluster.

Note:
The system may return more than 100 results. However, to make analysis easier, only 100 are displayed. It is possible, then, that a cluster would rank in the top 100 according to one criterion (for example, number of genes) but would not rank in the top 100 according to the other criterion.

5. For View Region, type a sequence range if you wish to restrict your search of ALL genes to a range other than the default range of 10 kb upstream and downstream.

Example:
Typing 30000 for From and 40000 for To would restrict your search to the 10 kb upstream region of genes with their first exon at 40001.

6. Under Select individual sites, indicate whether or not search results must contain specific binding sites.

7. Under Select from known Modules, indicate that search results must contain a known cis-module by selecting the check box next to the module. To select all modules, select the Select from known Modules check box.

Note:
For each known module, clicking the name takes you to the PubMed citation for a publication related to its discovery.

8. Between the individual sites and known modules tables, select OR if search results should satisfy the conditions in EITHER table, or select AND if search results must satisfy the conditions in BOTH tables.

Example:
Assume that under Select individual sites, you selected And for V$AARF, Or for V$CART and Not for V$DEAF. Also assume that under Select from known Modules, you selected V$EREF V$VBPF. Finally, assume that between the two tables, you selected And. Combined, these selections mean that all search results must contain binding site V$AARF and known module V$EREF V$VBPF, must not contain binding site V$DEAF, and may or may not contain binding site V$CART.

9. When you are finished defining your search criteria, click CisMols Search:

A screen appears for creating a CisMolGram.
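
The Boolean example in step 8 can be expressed as a short Python sketch. All function and parameter names are hypothetical. The handling of a table with only Or sites, and of several selected modules, is an assumption that this guide does not confirm; the sketch also assumes that at least one selection is made in each table.

```python
def sites_ok(cluster, must=(), optional=(), exclude=()):
    """Individual sites table: And -> must, Or -> optional, Not -> exclude."""
    if any(site in cluster for site in exclude):
        return False
    if must:
        # As in the example, Or sites may or may not be present.
        return all(site in cluster for site in must)
    # Assumption: with only Or sites selected, at least one must be present.
    return not optional or any(site in cluster for site in optional)


def modules_ok(cluster, modules=()):
    """Known modules table: assumed satisfied if any selected module is fully present."""
    return any(set(module) <= cluster for module in modules)


def matches(cluster, between="AND", *, must=(), optional=(), exclude=(), modules=()):
    """Combine both tables with the AND/OR choice made between them."""
    a = sites_ok(cluster, must, optional, exclude)
    b = modules_ok(cluster, modules)
    return (a and b) if between == "AND" else (a or b)


# Selections from the example in step 8.
criteria = dict(must=["V$AARF"], optional=["V$CART"], exclude=["V$DEAF"],
                modules=[("V$EREF", "V$VBPF")])

hit = {"V$AARF", "V$EREF", "V$VBPF"}   # satisfies both tables
miss = hit | {"V$DEAF"}                 # contains the excluded site
partial = {"V$AARF", "V$CART"}          # satisfies the sites table only
```

With AND between the tables, hit matches while miss and partial do not. With OR, partial also matches because it satisfies the individual sites table.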

Retrieving a saved search

Once users finish searching a gene list, they can save their search so that they and other users can retrieve the search at any time, perform it again or use it as the basis of a new search. To retrieve a saved search, complete the following steps:

1. Select a gene list if you have not done so already.

2. At the bottom of the Edit Gene List Properties screen, under Saved Searches, select the search you wish to perform.

3. Click Launch Search:

The CisMols search screen appears with all of the user's original selections and data. Click CisMols Search to proceed to the next screen, which shows how the user originally configured the CisMolGram, and from this screen, click View Peak Regions of Clusters to view the originally generated CisMolGram. Or modify the user's original selections and use them as the basis of your own search. Once you're finished, you can save this search under a new name, but you cannot save it under the same name as the original search.

Note:
You also can select saved searches from the CisMols Search screen by clicking the Saved Searches button:

Creating a CisMolGram

Once your search results are returned, you can view the data not only in tabular form but also graphically. We call these images CisMolGrams. They depict the location of clusters within gene regions. In this section, you will learn how to create a CisMolGram -- specifically, how to complete the following tasks:

Selecting Genes

By default, all genes in the gene list are selected for inclusion in the CisMolGram. You can remove one or more genes by going to the top-left corner of the table, selecting On All But Selected and selecting the check boxes next to all genes that should NOT be included in the CisMolGram:

Selecting Clusters

The total number of clusters matching your search criteria is indicated in the top-left corner of the screen. You can get a complete list in Excel format by clicking Download All:

In the table, only the top 100 clusters are displayed according to the number of genes in which they appear. By going to the table and clicking No. of Sites, you can sort clusters according to the number of sites they contain. You also can display a range other than the top 100 by going to the From and To text boxes in the upper-right corner of the screen, typing a range that's within the total number of clusters, and clicking Get Cluster Numbers:
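
The following Python sketch shows why the displayed clusters depend on the ordering criterion. The cluster names and counts are made up, the function name is hypothetical, and a cutoff of 2 stands in for the real cutoff of 100 to keep the example small.

```python
def top_n(clusters, key, n=100):
    """Return the n highest-ranked clusters for one ordering criterion."""
    return sorted(clusters, key=key, reverse=True)[:n]


# (name, number of genes, number of sites); values are made up
clusters = [("c1", 12, 2), ("c2", 3, 6), ("c3", 9, 3), ("c4", 2, 5)]

by_genes = [c[0] for c in top_n(clusters, key=lambda c: c[1], n=2)]
by_sites = [c[0] for c in top_n(clusters, key=lambda c: c[2], n=2)]
```

In this example, by_genes contains c1 and c3, while by_sites contains c2 and c4, so no cluster is displayed under both orderings.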

Once you have finalized the contents of the clusters table, you must select at least one cluster. There are three ways to select clusters:

Selecting the first X clusters in the table

Go to the drop-down menu labeled Select No. of Clusters, and select the number of clusters to include in the CisMolGram. For example, to select the first 5 clusters in the table, select 5:

Selecting a range of clusters from the table

Next to the Select No. of Clusters drop-down menu, go to the From and To text boxes, type the range of clusters to include in the CisMolGram, and click Select Clusters. For example, to select the clusters in rows 10 through 25 in the table, type 10 in the first box and 25 in the second box:

Selecting individual clusters

To select individual clusters, go to the table and select the check box next to each cluster that you want to include in the CisMolGram:
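
The three selection methods above correspond to three simple ways of picking rows from the clusters table. The Python sketch below is illustrative only; the function names are hypothetical, and the table is modeled as a plain list of made-up cluster names.

```python
def first_n(table, n):
    """Select No. of Clusters: the first n rows of the table."""
    return table[:n]


def row_range(table, start, end):
    """From and To boxes: rows start..end, counted from 1 and inclusive."""
    return table[start - 1:end]


def checked(table, rows):
    """Check boxes: individually chosen rows, counted from 1."""
    return [table[i - 1] for i in rows]


table = [f"cluster{i}" for i in range(1, 31)]

top_five = first_n(table, 5)            # cluster1 .. cluster5
ten_to_25 = row_range(table, 10, 25)    # cluster10 .. cluster25
picked = checked(table, [2, 7, 30])     # cluster2, cluster7, cluster30
```

Note that the range is inclusive at both ends, matching the example of rows 10 through 25 above.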

Generating the image

Once you have specified which genes and clusters to include in the CisMolGram, you are ready to generate the image by clicking View Clusters:

A screen appears that graphically displays peaks, or regions containing one or more clusters, according to the conditions you specified. See the next section to learn how to manipulate and analyze your CisMolGram.

Modifying and analyzing a CisMolGram

Once your CisMolGram is generated, you will see an image similar to the following, where each symbol represents a cluster:

This section outlines several options for modifying your CisMolGram and aiding your analysis.

Modifying peak maximums

A peak is a region of a gene containing one or more cis-clusters. By default, the system is set to display up to five clusters at any peak. If a peak includes more than five clusters, the following symbol appears:

Simply click the symbol to view all clusters in this peak.

To adjust the maximum number of clusters per peak, go to the Display options directly below the CisMolGram. For Max Clusters to display at a peak, type a whole number. Then click Refresh.
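
The display limit can be pictured with the Python sketch below. The function name is hypothetical, and which clusters are drawn first when a peak exceeds the limit is an assumption (here, simply the first ones in the list).

```python
def clusters_to_draw(peak_clusters, max_clusters=5):
    """Return (clusters shown, overflow flag) for one peak.

    The overflow flag stands for the symbol that appears when a peak
    holds more clusters than the display limit.
    """
    shown = peak_clusters[:max_clusters]
    return shown, len(peak_clusters) > max_clusters


crowded_peak = list("abcdefg")   # seven made-up clusters at one peak
shown, overflow = clusters_to_draw(crowded_peak)
```

With the default limit of five, the sketch shows five of the seven clusters and sets the overflow flag; raising max_clusters to 7 shows all of them without the flag.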

Modifying the viewable region

By default, the CisMolGram includes the region 10kb up- and downstream of the gene -- unless you specified a different region when you entered your search criteria (see Performing a new search).

At this point, you cannot modify the CisMolGram to include more clusters than specified by the gene list parameters or your original search criteria. However, you can zoom in on the existing CisMolGram by going to the Display options directly below the CisMolGram. For Viewable Region, type a region in base pairs. Then click Refresh. For all genes, peaks are now displayed for the region you specified.
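
One way to picture this restriction is as an intersection of two regions: the region that was searched and the region you ask to view. The Python sketch below is a simplified illustration; the function name, the peak positions and the coordinate values are hypothetical.

```python
def visible_peaks(peaks, searched, requested):
    """Peaks inside the requested region, limited to the region that was searched."""
    start = max(searched[0], requested[0])
    end = min(searched[1], requested[1])
    return [p for p in peaks if start <= p <= end]


peaks = [31000, 36000, 39500]     # made-up peak positions in base pairs
searched = (30000, 40000)         # region used in the original search
requested = (35000, 45000)        # region typed for Viewable Region

zoomed = visible_peaks(peaks, searched, requested)
```

Here zoomed holds the peaks at 36000 and 39500. Asking for positions beyond 40000 adds nothing, because that part of the sequence was never searched.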

Switching to ortholog sequence view

To view peaks in orthologs of the human genes (i.e., mouse genes), go to the Display options directly below the graph. Click Show Ortholog Sequence View.

Viewing a TraFaC image

For each peak, you can view a TraFaC image, or depiction of the shared transcription factor binding sites that constitute the peak. To view a TraFaC image, click any cluster in the peak:

In the pop-up window, click View Trafac Image. Here is an example of the screen that appears, along with numbered annotations explaining its parts:

  1. The two gray vertical bars are the two genes that are compared. The numbers represent the nucleotide positions with respect to the sequences used.
  2. The TF binding sites occurring in both the genes are highlighted as colored bars drawn across the two genes. Click the image to zoom in on a site of interest. The TraFaC image can be viewed based upon the individual matrices of the TF binding sites or the matrix families.
  3. Indicates the names of the TF matrices. Click them to learn more about them. Note that these links work only if you have an account with Genomatix (http://www.genomatix.de).
  4. A table describing the putative sites displayed in the image.
  5. Click Show Only Parallel Sites to display "Ordered Hits." Ordered hits limit the shared cis-elements to only those that are positionally conserved (or are almost evenly spread and equidistant in both orthologous genes). This feature helps filter out cluttered or complex regions. Cis-clusters whose constituent cis-elements occur in parallel in the orthologous genes frequently tend to be involved in regulatory function.
  6. Click Show Query Parameters to view or modify the query parameters.
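
The idea behind "Ordered Hits" in item 5 can be approximated with the Python sketch below. It keeps the longest chain of shared sites whose order is the same in both sequences. This is only one possible reading of "positionally conserved"; the criterion actually used by the software is not described in this guide, and the function name and positions are hypothetical.

```python
def ordered_hits(hits):
    """Longest chain of (pos1, pos2) hits whose order agrees in both sequences."""
    hits = sorted(hits)
    if not hits:
        return []
    best = [1] * len(hits)    # best[i]: length of the longest chain ending at hit i
    prev = [-1] * len(hits)   # prev[i]: index of the previous hit in that chain
    for i, (p1, p2) in enumerate(hits):
        for j in range(i):
            q1, q2 = hits[j]
            if q1 < p1 and q2 < p2 and best[j] + 1 > best[i]:
                best[i], prev[i] = best[j] + 1, j
    # Walk back from the end of the longest chain.
    i = max(range(len(hits)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(hits[i])
        i = prev[i]
    return chain[::-1]


# (position in gene 1, position in gene 2) for each shared site; made-up values
hits = [(100, 120), (250, 900), (300, 310), (450, 480), (700, 200)]
parallel = ordered_hits(hits)
```

In this example the hits at (250, 900) and (700, 200) cross the others and are dropped, leaving three hits that would be drawn as roughly parallel bars.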

Viewing a regulogram

For each peak, you can view a regulogram, or a cis-element hit density image, which depicts a moving-window average of the number of shared cis-elements occurring in phylogenetically conserved regions. To view a regulogram, click any cluster in the peak:

In the pop-up window, click View regulogram. Here is an example of the screen that appears, along with numbered annotations explaining its parts:

  1. The grey horizontal bars are the nucleotide sequences of the two orthologous genes compared. The numbers represent the coordinates of the sequences used. The red bars are the exons.
  2. The green blocks plotted parallel to the genomic sequences are the repeat regions identified by RepeatMasker.
  3. The different-colored polygons stretching from one sequence to the other indicate the sequence similarity regions between the two genes.
  4. The Hits scale on the lower-left side refers to the number of shared cis-elements between the two sequences occurring in a sequence-conserved region.
  5. The TF BS Freq in the upper half of the left side refers to the frequencies of the binding sites in both the sequences separately.
  6. Percent Identical refers to the percent similarity between the two sequences.
  7. To view hits by matrix family instead of by individual transcription factor binding site matrices, select Combine unordered same-family matrices, and click Refresh.
  8. To modify the default size of 850 X 412, type a different value for the width, and click Refresh.
  9. To zoom in for more clarity, select the radio button next to the Zoom drop-down menu, select a different value from the drop-down menu (by default, 10x magnification is selected), and click the image window.
  10. To look at the actual TF binding sites (constituent elements of hits), select the radio button next to the drop-down menu to the top-left of the regulogram, select a value other than the default window size of 200 bp if you wish, and click any point on the hits graph. The TraFaC image for this point is displayed.
  11. A newer feature is the option to plot "Ordered Hits." Ordered hits limit the shared cis-elements to only those that are positionally conserved (or are almost evenly spread and equidistant in both orthologous genes). This feature helps filter out cluttered or complex regions. Cis-clusters whose constituent cis-elements occur in parallel in the orthologous genes frequently tend to be involved in regulatory function.
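
The moving-window idea behind the Hits graph can be sketched in Python as follows. The function name, the step size and the hit positions are hypothetical; only the 200 bp window size comes from this guide. The sketch reports a count per window, and dividing each count by the window size would give a per-base average.

```python
def hit_density(hit_positions, seq_length, window=200, step=50):
    """Count shared cis-element hits in each window along the sequence."""
    profile = []
    for start in range(0, max(seq_length - window, 0) + 1, step):
        count = sum(start <= pos < start + window for pos in hit_positions)
        profile.append((start, count))
    return profile


hit_positions = [20, 60, 150, 210, 640]   # made-up hit coordinates in base pairs
profile = hit_density(hit_positions, seq_length=800, window=200, step=200)
```

With these values, the first window (positions 0 to 199) contains three hits, which would appear as the highest point on the graph.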

Sorting items in the legend

Below the CisMolGram is a legend indicating (a) the number of genes containing each cluster and (b) the sites within each cluster. You can sort the items in this legend by clicking the buttons above it (grouped into three categories -- Gene Sorting, Site Sorting and Cluster Sorting). You also can sort items by clicking the row and column labels in the table, as outlined below:

  1. Unsorted: view genes in the legend in the same order as displayed in the CisMolGram (default)
  2. Gene Frequency: sort genes by the total number of clusters they contain
  3. Alpha: sort site names alphabetically (default)
  4. Site Frequency: sort sites by the number of times they appear in clusters
  5. Gene Count: sort clusters by the number of genes in which they appear
  6. Site Count: sort sites by the number of clusters in which they appear
  7. Gene Presence: sort clusters by their presence in the selected gene
  8. Site Presence: sort clusters by the presence of the selected site in them
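
Several of these sort orders amount to counting and then sorting, as the Python sketch below illustrates. The data layout and all names are hypothetical, and sorting from highest to lowest count (with ties broken alphabetically) is an assumption.

```python
from collections import Counter

# Made-up clusters with the sites they contain and the genes in which they appear
clusters = {
    "c1": {"sites": {"V$AARF", "V$CART"}, "genes": {"geneA", "geneB"}},
    "c2": {"sites": {"V$AARF", "V$EREF"}, "genes": {"geneA"}},
    "c3": {"sites": {"V$AARF", "V$CART", "V$VBPF"},
           "genes": {"geneA", "geneB", "geneC"}},
}

# Gene Count: clusters ordered by the number of genes in which they appear
by_gene_count = sorted(clusters, key=lambda c: len(clusters[c]["genes"]),
                       reverse=True)

# Site Count: sites ordered by the number of clusters in which they appear
site_counts = Counter(s for c in clusters.values() for s in c["sites"])
by_site_count = sorted(site_counts, key=lambda s: (-site_counts[s], s))

# Gene Frequency: genes ordered by the total number of clusters they contain
gene_counts = Counter(g for c in clusters.values() for g in c["genes"])
by_gene_frequency = sorted(gene_counts, key=lambda g: (-gene_counts[g], g))
```

In this example, cluster c3 comes first under Gene Count, V$AARF first under Site Count and geneA first under Gene Frequency.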

Exporting a CisMolGram

At any time when modifying and analyzing a CisMolGram, you can export the image and legend to several common file formats. Simply go to the bottom-left corner of the screen, and for Export to, select a file format:

Saving your search

Before you exit the CisMols Analyzer, you can save your search, including all your selections on every screen, by completing the following steps.

Important:
To save a search, you must have an account and be logged in. For more information, see Logging in.

1. Go to the bottom of the screen, and click Save Search:

2. For Search Name, type a one-word name for your search.

3. For Description, type a short description of your search.

4. Click Save:

A message should appear indicating your search has been saved. If it does not, go back to the CisMolGram and repeat these steps to try saving your search again.

For instructions on opening saved searches, see Retrieving a saved search.

Go to the Biomedical Informatics web site.