Title: Basic ChIP-seq read processing
By: John Urban
For: Anyone interested (Spradling Lab, Carnegie Emb, or otherwise)
In advance: apologies for the presentation/poor grammar/typos. This was really just "brain streaming".
Not just peak calling: Below I present one way to process ChIP-seq data. Peak calling is optional here, since it is only a small part of what one usually wants to do. This is not the only way to process ChIP-seq reads, but it is a straightforward one that will be adequate in most ChIP-seq scenarios for people who do not consider themselves experts in bioinformatics. The outputs of this approach are nevertheless publishable: “experts” could easily follow this route and be comfortable with the results. Overall, it is simpler to learn fewer programs, and this route keeps that number to a minimum. Aside from the initial read mapping steps, where bowtie2/bwa and samtools are used, I map out a way to process ChIP-seq and other genomic data using only MACS2 to process the reads and obtain informative bedGraphs, including coverage, SPMR (signal per million reads), fold enrichment (FE), -log10 p-values, etc. At each step, I also try to give some alternatives to explore.
Some of the covered topics:
- Read mapping
- Scaling replicates and conditions to each other
- Generating bedGraphs of coverage, SPMR, FE, subtraction, p-values, q-values, log likelihoods, etc.
- Peak calling
- Calling subpeaks/summits within broader peak/enriched regions
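To give a feel for what two of the bedGraph tracks listed above mean, here is a minimal Python sketch of the arithmetic behind SPMR and fold enrichment. It is only an illustration under stated assumptions: the function names, the toy pileup values, the read totals, and the pseudocount of 1.0 are all made up for this example. It is not MACS2's code and does not claim to reproduce MACS2's exact defaults.

```python
# Illustrative arithmetic only: hypothetical helper names and toy numbers,
# not MACS2's implementation.

def spmr(pileup, total_reads):
    """Scale raw per-bin pileup values to signal per million reads (SPMR)."""
    scale = 1_000_000 / total_reads
    return [value * scale for value in pileup]


def fold_enrichment(treatment, control, pseudocount=1.0):
    """Per-bin ratio of treatment signal to control signal.

    The pseudocount is an assumption of this sketch; it keeps bins with
    zero control signal from causing a division by zero.
    """
    return [
        (t + pseudocount) / (c + pseudocount)
        for t, c in zip(treatment, control)
    ]


if __name__ == "__main__":
    # Toy pileups over four genomic bins (made-up values).
    chip_pileup = [4, 12, 40, 8]
    input_pileup = [5, 6, 8, 5]

    # Made-up library sizes: 20 million ChIP reads, 10 million input reads.
    chip_spmr = spmr(chip_pileup, total_reads=20_000_000)
    input_spmr = spmr(input_pileup, total_reads=10_000_000)

    # Scaling both tracks to "per million reads" makes them comparable
    # before taking the ratio.
    print(fold_enrichment(chip_spmr, input_spmr))
```

The point of scaling first is that libraries sequenced to different depths are put on a common footing before they are compared, which is the same idea behind scaling replicates and conditions to each other.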
For the rest of this tutorial, please use the Google Document described below.
I am leaving this here as a blog post in case that makes it easier for beginners in bioinformatics to find.