Frequently Asked Questions

1. What are the operating system requirements for CisGenome?

The CisGenome core programs can be compiled on multiple platforms, including MS Windows, Mac OS, Linux, and other operating systems. However, the CisGenome GUI and the CisGenome Browser can only be used on MS Windows.

2. How can I use CisGenome to analyze Agilent/NimbleGen tiling array data?

First, prepare a tab-delimited text file. The first line of the file should specify the sample names, and the second line should give numerical group ids (e.g., 1 for IP, 2 for Control). Both of these lines should start with ‘#’. Starting from the third line, provide the data in the following format:

Chromosome[tab]Position[tab]value1[tab]value2[tab]…

Here is an example file (columns are separated by tabs):

#chr pos IP1 IP2 IP3 CT1 CT2 CT3

#chr pos 1 1 1 2 2 2

chr21 1000 9.2 8.7 8.9 4.5 3.4 2.3

chr21 1035 6.7 6.5 7.0 3.4 5.6 2.5

…

Once you have such a file, use “Tiling Array > Normalization > Quantile (TXT)” to perform normalization.

Once you have the normalized data in a tab-delimited text file, use “File > Load Data > Tiling Array Dataset > Import from TXT” to convert the text file to a BAR tiling array project.

You can then use “Tiling Array > Peak Detection (TileMap)” to find peaks. Because the default parameter settings are chosen for Affymetrix high-density tiling arrays, you need to set the optional parameters to match your array design.
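For custom tiling array data, the tab-delimited input layout (two ‘#’ header lines followed by one row per probe) can be produced with a short script. The following is only a sketch: the function name write_tiling_txt and the in-memory rows are illustrative assumptions, not part of CisGenome.

```python
import io


def write_tiling_txt(handle, sample_names, group_ids, rows):
    """Write probe data as tab-delimited text with two '#' header lines.

    rows: iterable of (chromosome, position, [value1, value2, ...]).
    This helper is hypothetical; it only mirrors the layout described
    in the FAQ text.
    """
    if len(sample_names) != len(group_ids):
        raise ValueError("need exactly one group id per sample")
    # Line 1: sample names. Line 2: numerical group ids. Both start with '#'.
    handle.write("\t".join(["#chr", "pos", *sample_names]) + "\n")
    handle.write("\t".join(["#chr", "pos", *map(str, group_ids)]) + "\n")
    for chrom, pos, values in rows:
        if len(values) != len(sample_names):
            raise ValueError(f"{chrom}:{pos} has {len(values)} values")
        handle.write("\t".join([chrom, str(pos), *map(str, values)]) + "\n")


buf = io.StringIO()
write_tiling_txt(
    buf,
    sample_names=["IP1", "IP2", "IP3", "CT1", "CT2", "CT3"],
    group_ids=[1, 1, 1, 2, 2, 2],
    rows=[
        ("chr21", 1000, [9.2, 8.7, 8.9, 4.5, 3.4, 2.3]),
        ("chr21", 1035, [6.7, 6.5, 7.0, 3.4, 5.6, 2.5]),
    ],
)
text = buf.getvalue()
print(text, end="")
```

To write to disk instead, pass a file opened with open(path, "w") in place of the StringIO buffer.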

3. I want to use MAT background correction instead of quantile normalization in the tiling array analysis. Does CisGenome support MAT?

Currently, you can first run MAT to get background-corrected values. These values will be saved to BAR files, which CisGenome can take as input. You just need to prepare a text file as follows:

[item1]

type=bartilingexp

item_name=REST_norm

bar_folder=C:\Users\jihk\rest\analysis\

lib_num=7

sample_num=4

group_num=2

sample_alias_1=IP1

sample_group_1=1

bar_file_1_1=Johnson_IP364AMP(120506)_Hs35b_P01R_v01.CEL.bar

bar_file_1_2=Johnson_IP364AMP(120506)_Hs35b_P02R_v01.CEL.bar

bar_file_1_3=Johnson_IP364AMP(120506)_Hs35b_P03R_v01.CEL.bar

bar_file_1_4=Johnson_IP364AMP(120506)_Hs35b_P04R_v01.CEL.bar

bar_file_1_5=Johnson_IP364AMP(120506)_Hs35b_P05R_v01.CEL.bar

bar_file_1_6=Johnson_IP364AMP(120506)_Hs35b_P06R_v01.CEL.bar

bar_file_1_7=Johnson_IP364AMP(120506)_Hs35b_P07R_v01.CEL.bar

sample_alias_2=IP2

sample_group_2=1

bar_file_2_1=Johnson_IP369AMP(120506)_Hs35b_P01R_v01.CEL.bar

bar_file_2_2=Johnson_IP369AMP(120506)_Hs35b_P02R_v01.CEL.bar

bar_file_2_3=Johnson_IP369AMP(120506)_Hs35b_P03R_v01.CEL.bar

bar_file_2_4=Johnson_IP369AMP(120506)_Hs35b_P04R_v01.CEL.bar

bar_file_2_5=Johnson_IP369AMP(120506)_Hs35b_P05R_v01.CEL.bar

bar_file_2_6=Johnson_IP369AMP(120506)_Hs35b_P06R_v01.CEL.bar

bar_file_2_7=Johnson_IP369AMP(120506)_Hs35b_P07R_v01.CEL.bar

sample_alias_3=CT1

sample_group_3=2

bar_file_3_1=Johnson_Jurkat-Mock-10_Hs35b_P01R_v01.CEL.bar

bar_file_3_2=Johnson_Jurkat-Mock-10_Hs35b_P02R_v01.CEL.bar

bar_file_3_3=Johnson_Jurkat-Mock-10_Hs35b_P03R_v01.CEL.bar

bar_file_3_4=Johnson_Jurkat-Mock-10_Hs35b_P04R_v01.CEL.bar

bar_file_3_5=Johnson_Jurkat-Mock-10_Hs35b_P05R_v01.CEL.bar

bar_file_3_6=Johnson_Jurkat-Mock-10_Hs35b_P06R_v01.CEL.bar

bar_file_3_7=Johnson_Jurkat-Mock-10_Hs35b_P07R_v01.CEL.bar

sample_alias_4=CT2

sample_group_4=2

bar_file_4_1=Johnson_Jurkat-Mock-7_Hs35b_P01R_v01.CEL.bar

bar_file_4_2=Johnson_Jurkat-Mock-7_Hs35b_P02R_v01.CEL.bar

bar_file_4_3=Johnson_Jurkat-Mock-7_Hs35b_P03R_v01.CEL.bar

bar_file_4_4=Johnson_Jurkat-Mock-7-Rerun_Hs35b_P04R_v01.CEL.bar

bar_file_4_5=Johnson_Jurkat-Mock-7_Hs35b_P05R_v01.CEL.bar

bar_file_4_6=Johnson_Jurkat-Mock-7_Hs35b_P06R_v01.CEL.bar

bar_file_4_7=Johnson_Jurkat-Mock-7_Hs35b_P07R_v01.CEL.bar

Save it as a *.cgw file (e.g., restmat.cgw), and load it into CisGenome using the menu “File > Load Data > Tiling Array Dataset > Import from CGW”. You will then see a tiling array data set in the Project Explorer window.

Now you can analyze the data set by clicking the menu “Tiling Array > Peak Detection (TileMap)”. From this point on, the data set will be treated in the same way as any data set quantile-normalized within CisGenome.

You can also directly visualize the BAR files generated by the MAT program in the CisGenome Browser.

In addition, we are planning to add a menu item to the CisGenome GUI to directly support MAT background correction in the future.
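Writing a *.cgw project file for MAT-generated BAR files by hand is error-prone when there are many samples, so it may help to generate it. The sketch below uses only the keys that appear in the example file. Treating lib_num as the number of BAR files per sample and group_num as the number of distinct group ids is an inference from that example, and the function name, folder, and file names are hypothetical.

```python
def build_cgw(item_name, bar_folder, samples):
    """Return the text of a *.cgw project file (hypothetical helper).

    samples: list of (alias, group_id, [bar file names]) tuples.
    Assumption: every sample lists the same number of BAR files, and
    that count is what lib_num records.
    """
    lib_counts = {len(bar_files) for _, _, bar_files in samples}
    if len(lib_counts) != 1:
        raise ValueError("all samples need the same number of BAR files")
    lib_num = lib_counts.pop()
    # Assumption: group_num is the number of distinct group ids.
    group_num = len({group for _, group, _ in samples})

    lines = [
        "[item1]",
        "type=bartilingexp",
        f"item_name={item_name}",
        f"bar_folder={bar_folder}",
        f"lib_num={lib_num}",
        f"sample_num={len(samples)}",
        f"group_num={group_num}",
    ]
    # Samples and BAR files are numbered from 1, as in the example file.
    for i, (alias, group, bar_files) in enumerate(samples, start=1):
        lines.append(f"sample_alias_{i}={alias}")
        lines.append(f"sample_group_{i}={group}")
        for j, name in enumerate(bar_files, start=1):
            lines.append(f"bar_file_{i}_{j}={name}")
    return "\n".join(lines) + "\n"


# Hypothetical folder and file names, for illustration only.
cgw_text = build_cgw(
    item_name="REST_norm",
    bar_folder="C:\\data\\bar\\",
    samples=[
        ("IP1", 1, ["IP1_lib1.bar", "IP1_lib2.bar"]),
        ("CT1", 2, ["CT1_lib1.bar", "CT1_lib2.bar"]),
    ],
)
print(cgw_text, end="")
```

The returned text can be saved with a .cgw extension and loaded through the “Import from CGW” menu item.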

4. I obtained peak calling results from other tools. Can I use CisGenome to visualize my results and perform downstream analysis?

Sure you can. To perform downstream analysis:

You just need to save the binding regions you have into a COD file, which is a tab-delimited text file with the following format:

name[tab]chr[tab]start[tab]end[tab]strand

For example:

1 chr1 1000 2000 +

2 chr1 5000 6000 -

You can also format the regions as a BED file; CisGenome can convert BED files to COD files (“File > File Format Conversion > BED->COD”).

You can then use the COD file to retrieve sequences, get gene annotations, map motifs, etc.
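If the other tool reports regions in its own format, a COD file with the five columns above can be written with a few lines of code. This is a minimal sketch: the function name format_cod is hypothetical, regions are numbered sequentially to fill the name column as in the example, and coordinates are passed through unchanged (no coordinate-system conversion is attempted).

```python
def format_cod(regions):
    """Return COD-formatted text: name, chr, start, end, strand (tab-delimited).

    regions: iterable of (chrom, start, end, strand) tuples.
    Hypothetical helper; names are assigned as 1, 2, 3, ...
    """
    lines = []
    for name, (chrom, start, end, strand) in enumerate(regions, start=1):
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand {strand!r}")
        if start > end:
            raise ValueError(f"start > end for {chrom}:{start}-{end}")
        lines.append("\t".join([str(name), chrom, str(start), str(end), strand]))
    return "\n".join(lines) + "\n"


cod_text = format_cod([
    ("chr1", 1000, 2000, "+"),
    ("chr1", 5000, 6000, "-"),
])
print(cod_text, end="")
```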

For visualization:

If the other programs output signals into BAR files, you can use CisGenome Browser to visualize these BAR files.

If the other programs export signals into a text file in the following format:

Chr[tab]coordinate[tab]signal_value

You can also visualize the signals in the CisGenome Browser.

Otherwise, you need to convert the signals into one of the formats above before you can visualize them in the genomic context.

5. How can I convert a BAR file to a text file?

Use the menu “File > File Format Conversion > BAR->TXT”.

6. When I analyze ChIP-seq data, I want to first shift reads towards the DNA fragment center before detecting peaks. How can I do it?

You are right. In some analyses, shifting reads may increase statistical power, since the forward and reverse strand reads can be counted in a more appropriate manner. To shift reads and use the shifted reads in ChIP-seq peak detection, follow the steps below.

(1) Use “Sequencing > Alignment->BAR” to convert alignment files (TXT) to BAR files.

(2) Perform data exploration and peak detection as described in the Tutorials. Apply the “boundary refinement” and “single strand filtering” options, but do not check the “Use shifted reads” box. Based on the peak detection results, find the mean or median peak length L. Let H=L/2.

(3) Use “Sequencing > Shift Alignment” to shift reads, setting the offset parameter to H. This function automatically shifts forward strand reads H bp to the right and reverse complement strand reads H bp to the left. The input files for this step are the BAR files generated in step (1), and the output files are BAR files with the new suffix *_C.bar.

(4) Rerun data exploration and peak detection. This time, check the “Use shifted reads” option. Note: The input files for this step are the BAR files generated in step (1). DO NOT use the BAR files generated in step (3) with the *_C.bar suffix, because the program automatically adds the suffix to the input file name when the “Use shifted reads” option is checked.
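The offset H from step (2) and the direction of the shift in step (3) can be illustrated with a small calculation. This is a sketch under stated assumptions: peak coordinates are taken to be inclusive (length = end - start + 1), H is truncated to a whole number of bp, and the function names and peak coordinates are made up for the example.

```python
import statistics


def half_peak_length(peaks, use_median=True):
    """Return H = L/2, where L is the median (or mean) peak length.

    peaks: iterable of (start, end) pairs.
    Assumption: coordinates are inclusive, so length = end - start + 1.
    """
    lengths = [end - start + 1 for start, end in peaks]
    if not lengths:
        raise ValueError("no peaks given")
    typical = statistics.median(lengths) if use_median else statistics.mean(lengths)
    return int(typical // 2)  # truncate to whole base pairs


def shift_read(position, strand, offset):
    """Shift a forward read right and a reverse read left by offset bp."""
    if strand == "+":
        return position + offset
    if strand == "-":
        return position - offset
    raise ValueError(f"unexpected strand {strand!r}")


# Made-up peaks with lengths 200, 300, and 200 bp.
peaks = [(1000, 1199), (5000, 5299), (9000, 9199)]
H = half_peak_length(peaks)
print("H =", H)
print(shift_read(1000, "+", H), shift_read(1300, "-", H))
```

After shifting, a forward read and a reverse read from the two ends of the same fragment land closer to the fragment center, which is why counts from both strands reinforce each other.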

For command line users: the function for shifting reads is

> hts_alnshift2bar -i [input bar file] -s [shift offset H]
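When several BAR files need the same offset, the command line above can be assembled in a loop. The sketch below only builds and prints the commands, using the two options shown above (-i and -s); the function name and BAR file names are hypothetical, and nothing is executed.

```python
import shlex


def alnshift_commands(bar_files, offset):
    """Build one hts_alnshift2bar argument list per input BAR file.

    Only the -i and -s options shown in the FAQ are used.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return [
        ["hts_alnshift2bar", "-i", str(path), "-s", str(offset)]
        for path in bar_files
    ]


# Hypothetical input files; H = 100 bp is an example offset.
commands = alnshift_commands(["ip1.bar", "ct1.bar"], 100)
for cmd in commands:
    print(shlex.join(cmd))
```

Each argument list can be passed to subprocess.run(cmd, check=True) once hts_alnshift2bar is on the PATH.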