Bio Commands: A Comprehensive Bioinformatics Command Reference
    • Bowtie2 Alignment - Fast short-read aligner, often used for ChIP-seq, ATAC-seq, metagenomics, and small genomes.
    • BWA Alignment - Classic reference-genome aligner for short DNA reads; BWA-MEM is widely used in WGS/WES pipelines.
    • minimap2 Alignment - High-performance aligner for long reads, assemblies, and spliced RNA/cDNA alignments.
    • samtools Alignment - Core toolkit for manipulating SAM/BAM/CRAM alignment files.
    • SPAdes Assembly - Popular de novo assembler for bacterial genomes, single-cell data, metagenomes, and related datasets.
    • QUAST Assembly QC - Genome assembly quality assessment tool that reports N50, total length, misassemblies, gene coverage, and related metrics.
    • MACS3 ChIP-seq/ATAC-seq - Commonly used peak caller for ChIP-seq and ATAC-seq; successor to MACS2.
    • BUSCO Completeness QC - Assesses the completeness of a genome, transcriptome, or protein set using single-copy ortholog sets.
    • mosdepth Coverage - Fast tool for computing BAM/CRAM coverage, depth distribution, and target-region coverage statistics.
    • SRA Toolkit Data download - Toolkit for downloading NCBI SRA runs and converting them to FASTQ, commonly via `prefetch` and `fasterq-dump`.
    • deepTools Epigenomics - Toolkit for coverage calculation, matrix generation, and visualization for ChIP-seq, ATAC-seq, RNA-seq, and related assays.
    • Cutadapt FASTQ preprocessing - Removes adapters, primers, and low-quality sequence from FASTQ reads.
    • fastp FASTQ preprocessing - All-in-one FASTQ tool for quality control, filtering, adapter trimming, and HTML/JSON reporting.
    • Trim Galore FASTQ preprocessing - Widely used wrapper around Cutadapt and FastQC for FASTQ adapter trimming, quality filtering, and report generation.
    • FastQC FASTQ QC - Generates sequencing quality reports for FASTQ files; commonly used to check both raw and cleaned reads.
    • Prokka Genome annotation - Rapid prokaryotic genome annotation tool that produces GFF, GenBank, protein FASTA, and tabular outputs from an assembly FASTA.
    • bedtools Genomic intervals - Genome arithmetic toolkit for intersections, coverage, closest features, and format conversion on BED/GFF/GTF/VCF/BAM intervals.
    • Kraken2 Metagenomics - k-mer based metagenomic read classifier, commonly used for microbial taxonomic profiling.
    • MAFFT Multiple sequence alignment - Widely used multiple sequence alignment tool for nucleotide and protein sequences.
    • IQ-TREE 2 Phylogeny - Maximum-likelihood phylogenetic inference tool with integrated model selection, bootstrap, and tree testing methods.
    • PLINK 2 Population genetics - Common command-line toolkit for large-scale genotype QC, conversion, and association analysis.
    • MultiQC QC summary - Aggregates reports from tools such as FastQC, fastp, STAR, and samtools into one unified QC report.
    • featureCounts RNA-seq - Assigns aligned RNA-seq reads to genes, exons, or other features defined in an annotation file.
    • kallisto RNA-seq - Fast transcript-level RNA-seq quantification tool based on pseudoalignment.
    • Salmon RNA-seq - Fast transcript-level RNA-seq quantification tool using lightweight mapping, commonly used in alignment-free or quasi-mapping workflows.
    • StringTie RNA-seq - Assembles transcripts and estimates expression from RNA-seq alignments.
    • HISAT2 RNA-seq alignment - Fast splice-aware aligner for RNA-seq reads, also usable for some DNA read alignment scenarios.
    • STAR RNA-seq alignment - High-performance spliced aligner for RNA-seq, widely used to map bulk RNA-seq reads to a genome.
    • BLAST+ Sequence search - NCBI BLAST+ command-line tools for nucleotide and protein sequence similarity search.
    • DIAMOND Sequence search - Fast protein aligner, commonly used as a high-speed alternative to BLASTP/BLASTX.
    • MMseqs2 Sequence search - High-performance sequence search and clustering suite for large protein or nucleotide datasets.
    • SeqKit Sequence toolkit - High-performance FASTA/FASTQ command-line toolkit for statistics, filtering, sampling, format conversion, and sequence operations.
    • seqtk Sequence toolkit - Lightweight FASTA/FASTQ command-line toolkit for subsampling, extraction, format conversion, and basic sequence operations.
    • Ensembl VEP Variant annotation - Ensembl Variant Effect Predictor; annotates the potential effects of variants on transcripts, proteins, and regulatory regions.
    • SnpEff Variant annotation - Fast variant effect annotation tool that adds gene, transcript, and predicted consequence annotations to VCF files.
    • FreeBayes Variant calling - Haplotype-based variant caller for SNPs, indels, and complex variants from BAM/CRAM alignments.
    • GATK Variant calling - Broad Institute toolkit for variant analysis, commonly used for germline calling, GVCF joint genotyping, and variant filtering.
    • bcftools VCF/BCF - Toolkit for viewing, filtering, normalizing, merging, querying, and annotating VCF/BCF variant files.
    • Nextflow Workflow - Workflow engine for reproducible bioinformatics pipelines, commonly used with nf-core, Conda, and Docker/Singularity.
    • Snakemake Workflow - Pythonic workflow engine for reproducible bioinformatics pipelines on local machines, HPC, and cloud environments.
    Add command | Command list | Formats | Sources | AutoBA | tldr-pages | Awesome Bioinformatics
    GitHub | Bioconda | nf-core
    Covers 40 frequently used bioinformatics commands and 27 file formats.