Genome arithmetic toolkit for genomic intervals in BED/GFF/GTF/VCF/BAM: intersections, coverage, closest features, and format conversion.
mamba install -c bioconda bedtools
Intersect two interval files:
bedtools intersect -a peaks.bed -b promoters.bed > peaks.in_promoters.bed
Keep every record in A and annotate whether it overlaps anything in B:
bedtools intersect -a variants.bed -b genes.bed -loj > variants.with_genes.tsv
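With -loj, A records that overlap nothing in B are still reported, and their B columns are filled with a null record ("." for the chromosome, -1 for start and end). The following is a minimal sketch of splitting such a result into hits and non-hits. It assumes both inputs are 3-column BED, so that B's chromosome lands in column 4; the file loj.demo.tsv is a hand-written stand-in for real bedtools output, and all names and coordinates in it are made up.

```shell
# Hypothetical stand-in for `bedtools intersect -loj` output with 3-column A and B.
# The second row is an A record with no overlap in B, so its B columns are the null record.
printf 'chr1\t100\t101\tchr1\t50\t500\n'  > loj.demo.tsv
printf 'chr1\t900\t901\t.\t-1\t-1\n'     >> loj.demo.tsv
printf 'chr2\t10\t11\tchr2\t1\t40\n'     >> loj.demo.tsv

# Column 4 is B's chromosome under the 3-column assumption; "." marks "no hit".
awk -F '\t' '$4 != "."' loj.demo.tsv > hits.tsv
awk -F '\t' '$4 == "."' loj.demo.tsv > no_hit.tsv
```

If A has more than three columns, the column index of B's chromosome shifts accordingly.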
Compute coverage for each interval:
bedtools coverage -a targets.bed -b sample.bam > target_coverage.tsv
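In its default mode (no -d, -hist, -mean, or -counts), bedtools coverage appends four columns to each A record: the number of overlapping B features, the number of covered bases, the length of the A interval, and the covered fraction. The sketch below filters targets whose covered fraction falls under a threshold. The demo file is a hand-written stand-in for real output, and the 0.9 cutoff is an arbitrary example value.

```shell
# Hypothetical stand-in for default `bedtools coverage` output on 3-column targets:
# chrom, start, end, overlapping features, covered bases, target length, covered fraction.
printf 'chr1\t100\t200\t12\t100\t100\t1.0000000\n'  > cov.demo.tsv
printf 'chr1\t500\t700\t3\t80\t200\t0.4000000\n'   >> cov.demo.tsv
printf 'chr2\t10\t60\t0\t0\t50\t0.0000000\n'       >> cov.demo.tsv

# The fraction is the last column, so $NF works however many columns A has.
# min=0.9 is an example threshold, not a recommended value.
awk -F '\t' -v min=0.9 '$NF + 0 < min' cov.demo.tsv > poorly_covered.tsv
```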
Convert BAM to BED:
bedtools bamtobed -i alignments.bam > alignments.bed
Generate genome-wide coverage in bedGraph format:
bedtools genomecov -ibam sample.bam -bg > sample.bedgraph
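intersect: interval intersection; the most commonly used subcommand.
coverage: depth and fraction of each A interval covered by B.
closest: find the nearest interval in B for each interval in A.
merge: merge overlapping or adjacent (book-ended) intervals.
slop: extend intervals, using a genome file of chromosome lengths so the extended coordinates stay within chromosome bounds.
-sorted: significantly reduces memory usage when the inputs are already sorted. Requires both A and B to be sorted by chromosome and then by start position.
Chromosome names must be consistent across files: do not mix chr1 and 1.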
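The sorting and naming requirements for -sorted can be checked before launching a large job. The following is a minimal sketch, assuming tab-separated BED input without header lines. The file names and coordinates are made up for illustration, and the bedtools call is guarded so the script still runs on a machine where bedtools is not installed.

```shell
# Hypothetical demo inputs: A uses "chr1"-style names, B uses "1"-style names.
printf 'chr2\t50\t80\nchr1\t300\t400\nchr1\t100\t200\n' > a.demo.bed
printf '2\t60\t70\n1\t150\t250\n' > b.demo.bed

# Give B the same naming style as A by prefixing "chr" (assumes no header lines in B).
sed 's/^/chr/' b.demo.bed > b.renamed.bed

# Sort both files by chromosome, then by numeric start, as -sorted expects.
LC_ALL=C sort -k1,1 -k2,2n a.demo.bed > a.sorted.bed
LC_ALL=C sort -k1,1 -k2,2n b.renamed.bed > b.sorted.bed

# List chromosome names present in A but absent from B; an empty file means the names agree.
awk 'NR == FNR { seen[$1] = 1; next } !($1 in seen) { print $1 }' \
    b.sorted.bed a.sorted.bed | sort -u > a_only_chroms.txt

# Run the sorted intersection only if bedtools is available.
if command -v bedtools >/dev/null 2>&1; then
    bedtools intersect -a a.sorted.bed -b b.sorted.bed -sorted > ab.intersect.bed
fi
```

A non-empty a_only_chroms.txt is a hint that the two files use different naming styles or different reference builds, which is worth resolving before trusting an empty intersection.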