Q: I just received my differential analysis results package, what is the purpose of each file?
A: Your differential results are contained in the file *.results.txt. This file is in paired-end (bedpe) format: the first 6 fields give the genomic positions of the left and right anchors of each interaction, and the remaining fields hold the differential analysis statistics. The most relevant are the “p.adj” and “adj.m” fields, which hold the FDR-adjusted p value and the fold change, respectively. For a full description of the other fields, please see this vignette.
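Because later steps refer to fields by position, it can help to confirm which column number holds each field in your own file. The following is a minimal sketch, assuming the file is tab-separated with a single header line. The function name and the abbreviated demo header are made up for illustration; a real results file has more fields.

```shell
# List the header fields of a tab-separated file, one per line,
# each prefixed with its 1-based column number.
list_columns() {
  head -n 1 "$1" | tr '\t' '\n' | awk '{ print NR "\t" $0 }'
}

# Tiny made-up file for illustration only (hypothetical, abbreviated header).
demo_cols=$(mktemp)
printf 'chr1\tstart1\tend1\tchr2\tstart2\tend2\tadj.m\tp.adj\n' > "$demo_cols"
printf 'chr1\t0\t50000\tchr1\t100000\t150000\t1.8\t0.01\n' >> "$demo_cols"

list_columns "$demo_cols"
```

On your own data you would call list_columns on your *.results.txt file and note the numbers printed next to “adj.m” and “p.adj”.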
Q: The file containing the differential results is very large. Where should I start?
A: Prioritize regions by fold change and p value. The stringency for filtering will depend on your individual experiment. One way to easily prioritize the most significant interacting regions is to filter all results to only interactions with a p.adj < 0.05, and then sort the interactions by the absolute magnitude of fold change. For example, using awk:
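{ IFS= read -r header; printf "%s\n" "$header"; awk -F'\t' '$18 != "NA" && $18+0 < 0.05 {abs = ($13 < 0 ? -$13 : $13); print abs "\t" $0}' | sort -t$'\t' -k1,1nr | cut -f2-; } < results.txt > filtered.results.txt
In this example, the 13th field contains the fold change, and the 18th field contains the FDR p value. The command keeps the header line, prepends the absolute fold change to each passing row, sorts on it in descending order, and then removes it again. Rows whose p value is NA are skipped, because awk would otherwise read NA as 0 and let them pass the filter. We use the absolute fold change so that the ranking is agnostic of the sign of the fold change (i.e. positive = more interactions in sample A, negative = more interactions in sample B). The $'\t' syntax assumes a bash-compatible shell.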
While this is a great starting point for most analyses, individual results may require more conservative or more permissive filters.
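If you want to adjust the stringency, one option is to make both thresholds parameters. The following is a minimal sketch, not part of the results package: it assumes a tab-separated file whose header contains columns named exactly “p.adj” and “adj.m”, and it finds them by name rather than by position. The function name, default values, and demo rows are hypothetical.

```shell
# Keep the header plus rows with p.adj below MAX_PADJ and |adj.m| of at least MIN_ABS_FC.
# Columns are located by header name; rows with NA in either column are skipped.
# Usage: filter_results FILE [MAX_PADJ] [MIN_ABS_FC]
filter_results() {
  awk -F'\t' -v max_p="${2:-0.05}" -v min_fc="${3:-0}" '
    NR == 1 {
      for (i = 1; i <= NF; i++) { if ($i == "p.adj") p = i; if ($i == "adj.m") m = i }
      if (!p || !m) { print "p.adj or adj.m column not found" > "/dev/stderr"; exit 1 }
      print; next
    }
    $p != "NA" && $m != "NA" {
      fc = ($m < 0 ? -$m : $m)                     # absolute fold change
      if ($p + 0 < max_p && fc >= min_fc) print
    }
  ' "$1"
}

# Made-up rows for illustration only (hypothetical, abbreviated header).
demo_filter=$(mktemp)
printf 'chr1\tstart1\tend1\tchr2\tstart2\tend2\tadj.m\tp.adj\n' > "$demo_filter"
printf 'chr1\t0\t50000\tchr1\t100000\t150000\t1.8\t0.01\n' >> "$demo_filter"
printf 'chr1\t0\t50000\tchr1\t200000\t250000\t-0.4\t0.001\n' >> "$demo_filter"
printf 'chr2\t0\t50000\tchr2\t300000\t350000\t-2.5\t0.2\n' >> "$demo_filter"

filter_results "$demo_filter" 0.05 1
```

With the thresholds shown, only the first demo row survives: the second has a small fold change and the third does not pass the p value cutoff. The output is not sorted; it can be piped through the same sort step as the command above if a ranking is needed.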
Q: How can I visualize the top differentially interacting regions?
A: There are several tools for visualizing regions of differential topology. In most cases, the differential analysis will produce a list of regions of interest. Visualization then generally involves plotting the interaction matrices of the two samples (or groups) being compared across these regions. You may also have orthogonal data types (e.g. ChIP-seq) that you wish to plot on the same genomic coordinates as separate tracks. Our favourite tools for generating linear genomic plots that support multiple data types are:
plotgardener: for full control over the many stylistic options and track types; great for publication-quality figures.
FAN-C: a simpler plotting tool for quick figure generation. Easier to use, but with fewer features than plotgardener.
Juicebox: for interactive browsing of Hi-C matrices in square matrix format. You can load .hic files as a "sample" and a "control" and perform basic visualization.
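Whichever tool you choose, you will need genomic windows to plot. The following is a minimal sketch for turning the top rows of a filtered results file into chr:start-end strings. It assumes a header line, the six leading position fields described earlier, and intra-chromosomal interactions. The function name, the default padding, and the demo rows are hypothetical, and the exact region syntax each tool expects should be checked in its own documentation.

```shell
# Print one "chr:start-end" window per interaction for the first N usable rows,
# spanning both anchors and padded on each side by PAD_BP.
# Usage: top_regions FILE [N] [PAD_BP]
top_regions() {
  awk -F'\t' -v n="${2:-10}" -v pad="${3:-100000}" '
    NR == 1  { next }                       # skip the header line
    $1 != $4 { next }                       # skip inter-chromosomal interactions
    {
      lo = ($2 < $5 ? $2 : $5) - pad; if (lo < 0) lo = 0
      hi = ($3 > $6 ? $3 : $6) + pad
      printf "%s:%d-%d\n", $1, lo, hi
      if (++count >= n) exit
    }
  ' "$1"
}

# Made-up rows for illustration only.
demo_regions=$(mktemp)
printf 'chr1\tstart1\tend1\tchr2\tstart2\tend2\n' > "$demo_regions"
printf 'chr1\t1000000\t1050000\tchr1\t1500000\t1550000\n' >> "$demo_regions"
printf 'chr1\t20000\t70000\tchr2\t300000\t350000\n' >> "$demo_regions"
printf 'chr2\t50000\t100000\tchr2\t400000\t450000\n' >> "$demo_regions"

top_regions "$demo_regions" 5
```

Because the filtered file is already sorted by absolute fold change, the first N rows correspond to the strongest changes. The padding simply adds context around the two anchors; a smaller or larger value may suit your bin size better.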
Q: My epigenetics results included calls for AB compartments, TADs, and loops, but the differential analysis results are different and do not make use of this output. Why?
A: Great question. There are multiple ways to compare two or more conditions in a chromatin topology experiment. Our standard epigenetics pipeline performs feature calling at the sample level, so each individual sample in the experiment has its own set of high-quality calls. Customers may then use standard genomics tools (e.g. bedtools) to compare samples at the feature level, for example to identify loops that are unique to or shared between samples or conditions. The bedtools suite has a very useful function for this, pairToPair, which compares bedpe files and identifies the features that are unique to or shared between them.
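As an illustration of the feature-level comparison, the following is a minimal sketch that splits the loops of one sample into those shared with a second sample and those unique to the first. It assumes bedtools is installed, both inputs are tab-separated bedpe files without header lines, and bedtools echoes the columns of the first file unchanged. The function name and output file names are hypothetical.

```shell
# Split the loops in A into those shared with B (both anchors overlap a loop in B)
# and those unique to A. Requires bedtools on the PATH.
# Usage: split_loops A.bedpe B.bedpe OUT_PREFIX
split_loops() {
  a=$1; b=$2; prefix=$3
  ncol=$(awk -F'\t' 'NR == 1 { print NF; exit }' "$a")    # number of columns in A
  # -type both reports A/B pairs in which both ends overlap; -is ignores strand.
  bedtools pairtopair -a "$a" -b "$b" -type both -is |
    cut -f "1-$ncol" | LC_ALL=C sort -u > "$prefix.shared.bedpe"
  # Any loop in A that never appeared in a both-ends match is treated as unique to A.
  LC_ALL=C sort -u "$a" | LC_ALL=C comm -23 - "$prefix.shared.bedpe" > "$prefix.unique.bedpe"
}

# Example call (file names are placeholders):
#   split_loops sampleA.loops.bedpe sampleB.loops.bedpe sampleA_vs_B
```

Running the function a second time with the two inputs swapped gives the loops unique to the second sample. Requiring both anchors to overlap is a deliberately strict definition of "shared"; pairToPair offers other overlap types if your analysis calls for a looser one.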
For a more robust, genome-wide, region-level comparison between conditions, we provide a differential analysis solution that is agnostic of feature type: it identifies the top differentially interacting regions regardless of whether they correspond to AB compartments, TADs, or loops.
Q: I'm stuck. I have identified differentially interacting regions, AB compartments, loops, and TADs, but I’m not sure where to go next.
A: We understand that every project is unique. Now that you’ve mapped the structural changes, the next step is to connect them to biological meaning. Here are a few common directions you could explore:
1) Link 3D changes to gene regulation
2) Connect chromatin structure to genetic variation
3) Plan functional follow-up experiments
4) Summarize and communicate your findings