llvm

mohitmishra786/low-level-dev-skills · updated Apr 8, 2026

$ npx skills add https://github.com/mohitmishra786/low-level-dev-skills --skill llvm

skill.md

LLVM IR and Tooling

Purpose

Guide agents through the LLVM IR pipeline: generating IR, running optimisation passes with opt, lowering to assembly with llc, and inspecting IR for debugging or performance work.

Triggers

  • "Show me the LLVM IR for this function"
  • "How do I run an LLVM optimisation pass?"
  • "What does this LLVM IR instruction mean?"
  • "How do I write a custom LLVM pass?"
  • "Why isn't auto-vectorisation happening in LLVM?"

Workflow

1. Generate LLVM IR

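# Emit textual IR (.ll). At -O0 clang marks every function optnone, which makes
# opt skip it; -disable-O0-optnone keeps the IR optimisable in step 2.
clang -O0 -Xclang -disable-O0-optnone -emit-llvm -S src.c -o src.ll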

# Emit bitcode (.bc)
clang -O2 -emit-llvm -c src.c -o src.bc

# Disassemble bitcode to text (separate name, so the -O0 src.ll is not overwritten)
llvm-dis src.bc -o src.O2.ll
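
As a minimal sketch of the commands above, the following script writes a small source file, emits textual IR for it, and prints the function definitions it finds. The file name `square.c` and its contents are illustrative assumptions, not part of this skill. The script assumes `clang` is on `PATH` and skips the compile step otherwise.

```shell
# Hypothetical example input; any C source file works the same way.
cat > square.c <<'EOF'
int square(int x) { return x * x; }
EOF

if command -v clang >/dev/null 2>&1; then
  # Unoptimised textual IR; -disable-O0-optnone leaves the functions
  # eligible for later opt passes.
  clang -O0 -Xclang -disable-O0-optnone -emit-llvm -S square.c -o square.ll
  # Every function with a body shows up as a 'define' line in the IR.
  grep '^define' square.ll
else
  echo "clang not found; skipping IR generation" >&2
fi
```

The `define` line shows the return type, the mangled or plain symbol name, and the parameter types, which is usually the quickest way to find a function in a large `.ll` file.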

2. Run optimisation passes with opt

# Apply specific passes, in the order listed
opt -passes='mem2reg,instcombine,simplifycfg' src.ll -S -o out.ll

# Standard optimisation pipelines
opt -passes='default<O2>' src.ll -S -o out.ll
opt -passes='default<O3>' src.ll -S -o out.ll

# List available passes
opt --print-passes 2>&1 | less

# Print IR before and after a pass
opt -passes='instcombine' --print-before=instcombine --print-after=instcombine src.ll -S -o out.ll 2>&1 | less
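
To see what a single pass does, it can help to run it on a tiny hand-written module. The sketch below is an illustration under assumptions: the file `slot.ll` is hypothetical, the IR mimics the stack-slot pattern clang emits at `-O0`, and the opaque `ptr` type assumes LLVM 15 or newer.

```shell
# Hand-written IR: the argument is spilled to a stack slot and reloaded.
cat > slot.ll <<'EOF'
define i32 @square(i32 %x) {
entry:
  %x.addr = alloca i32, align 4
  store i32 %x, ptr %x.addr, align 4
  %v = load i32, ptr %x.addr, align 4
  %mul = mul nsw i32 %v, %v
  ret i32 %mul
}
EOF

if command -v opt >/dev/null 2>&1; then
  opt -passes=mem2reg slot.ll -S -o slot.opt.ll
  # Count the allocas left after promotion; grep exits non-zero on a
  # count of zero, hence the '|| true'.
  grep -c 'alloca' slot.opt.ll || true
else
  echo "opt not found; skipping" >&2
fi
```

If the pass behaves as described, the count should be zero and the multiply in `slot.opt.ll` should use `%x` directly. Comparing `slot.ll` with `slot.opt.ll` in a diff tool gives the same information as `--print-before` and `--print-after` for a single pass.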

3. Lower IR to assembly with llc

# Compile IR to object file
llc -filetype=obj src.ll -o src.o

# Compile to assembly (Intel syntax; x86 targets only)
llc -filetype=asm -x86-asm-syntax=intel src.ll -o src.s

# Target a specific CPU
llc -mcpu=skylake -mattr=+avx2 src.ll -o src.s

# Show available targets
llc --version
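
The lowering step can be tried on a module with no dependencies. This is a sketch under assumptions: `add.ll` and its function are hypothetical, and because the module names no target triple, `llc` falls back to its default target.

```shell
# Minimal module: one function, no target triple or data layout.
cat > add.ll <<'EOF'
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add nsw i32 %a, %b
  ret i32 %sum
}
EOF

if command -v llc >/dev/null 2>&1; then
  # Assembly is the default output type; request an object file explicitly.
  llc add.ll -o add.s
  llc -filetype=obj add.ll -o add.o
  # Show where the function's symbol appears in the generated assembly.
  grep -n 'add' add.s | head
else
  echo "llc not found; skipping" >&2
fi
```

Adding `-mcpu` or `-mattr` to the first `llc` call and diffing the resulting `add.s` files is a quick way to check whether a CPU feature changes the generated code.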

4. Inspect IR

Key IR constructs to understand:

| Construct | Meaning |
|-----------|---------|
| alloca | Stack allocation (pre-SSA; mem2reg promotes to registers) |
| load/store | Memory access |
| getelementptr (GEP) | Pointer arithmetic / field access |
| phi | SSA φ-node: merges values from predecessor blocks |
| call/invoke | Function call (invoke has exception edges) |
| icmp/fcmp | Integer/float comparison |
| br | Branch (conditional or unconditional) |
| ret | Return |
| bitcast | Reinterpret bits (no-op in codegen) |
| ptrtoint/inttoptr | Pointer↔integer (avoid where possible) |
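
Several of these constructs are easier to recognise in context. The following sketch writes a small annotated module and runs the IR verifier over it. The file name, the struct type, and both functions are hypothetical, and the `ptr` type assumes LLVM 15 or newer.

```shell
cat > constructs.ll <<'EOF'
%struct.Point = type { i32, i32 }

; icmp + br + phi: return the larger of two integers
define i32 @max(i32 %a, i32 %b) {
entry:
  %cmp = icmp sgt i32 %a, %b                 ; signed a > b
  br i1 %cmp, label %then, label %else       ; conditional branch
then:
  br label %merge                            ; unconditional branch
else:
  br label %merge
merge:
  %r = phi i32 [ %a, %then ], [ %b, %else ]  ; picks a value per predecessor
  ret i32 %r
}

; getelementptr + load: read the second field of a struct
define i32 @get_y(ptr %p) {
entry:
  %y.addr = getelementptr inbounds %struct.Point, ptr %p, i32 0, i32 1
  %y = load i32, ptr %y.addr, align 4
  ret i32 %y
}
EOF

if command -v opt >/dev/null 2>&1; then
  # The verifier rejects malformed IR, such as a phi that omits a predecessor.
  opt -passes=verify -disable-output constructs.ll \
    && echo "constructs.ll verifies" \
    || echo "verification failed" >&2
else
  echo "opt not found; skipping" >&2
fi
```

Note that `getelementptr` only computes an address; the memory access happens in the `load` that follows it.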

5. Key passes

| Pass | Effect |
|------|--------|
| mem2reg | Promote alloca to SSA registers |
| instcombine | Instruction combining / peephole |
| simplifycfg | CFG cleanup, dead block removal |
| loop-vectorize | Loop auto-vectorisation |
| slp-vectorizer | Superword-level parallelism (straight-line vectorisation) |
| inline | Function inlining |
| gvn | Global value numbering (common subexpression elimination) |
| licm | Loop-invariant code motion |
| loop-unroll | Loop unrolling |
| argpromotion | Promote pointer args to values |
| sroa | Scalar Replacement of Aggregates |

6. Debugging missed optimisations

# Why was a loop not vectorised? (-c skips linking; remarks go to stderr)
clang -O2 -c -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize src.c

# Dump the pass pipeline as it runs (new pass manager;
# -mllvm -debug-pass=Structure only lists the legacy codegen passes)
clang -O2 -fdebug-pass-manager -c src.c -o /dev/null 2>&1 | less

# Print IR after each pass (very verbose; dumps go to stderr)
opt -passes='default<O2>' -print-after-all -disable-output src.ll 2>&1 | less
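
To practise reading vectoriser remarks, a loop with a loop-carried dependence makes a convenient test case. In this sketch the file `prefix.c` and its function are hypothetical, and `-Rpass=loop-vectorize` is added so that successful vectorisation would also be reported. The exact remark wording varies between LLVM versions, so none is quoted here.

```shell
cat > prefix.c <<'EOF'
/* Each iteration reads the element written by the previous one,
   so the loop carries a dependence across iterations. */
void prefix_sum(float *a, int n) {
    for (int i = 1; i < n; i++)
        a[i] += a[i - 1];
}
EOF

if command -v clang >/dev/null 2>&1; then
  # Remarks are diagnostics on stderr; keep a copy for later inspection.
  clang -O2 -c prefix.c -o /dev/null \
    -Rpass=loop-vectorize \
    -Rpass-missed=loop-vectorize \
    -Rpass-analysis=loop-vectorize 2>&1 | tee remarks.txt
else
  echo "clang not found; skipping" >&2
fi
```

Each remark is prefixed with the source file, line, and column of the loop it refers to, so `remarks.txt` can be matched back to the code directly.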

7. Useful llvm tools

| Tool | Purpose |
|------|---------|
| llvm-dis | Bitcode → textual IR |
| llvm-as | Textual IR → bitcode |
| llvm-link | Link multiple bitcode files |
| llvm-lto | Standalone LTO |
| llvm-nm | Symbols in bitcode/object |
| llvm-objdump | Disassemble objects |
| llvm-profdata | Merge/show PGO profiles |
| llvm-cov | Coverage reporting |
| llvm-mca | Machine code analyser (throughput/latency) |

For binutils equivalents, see skills/binaries/binutils.

Related skills

  • Use skills/compilers/clang for source-level Clang flags
  • Use skills/binaries/linkers-lto for LTO at link time
  • Use skills/profilers/linux-perf combined with llvm-mca for micro-architectural analysis