llvm
mohitmishra786/low-level-dev-skills · updated Apr 8, 2026
$ npx skills add https://github.com/mohitmishra786/low-level-dev-skills --skill llvm
skill.md
LLVM IR and Tooling
Purpose
Guide agents through the LLVM IR pipeline: generating IR, running optimisation passes with opt, lowering to assembly with llc, and inspecting IR for debugging or performance work.
Triggers
- "Show me the LLVM IR for this function"
- "How do I run an LLVM optimisation pass?"
- "What does this LLVM IR instruction mean?"
- "How do I write a custom LLVM pass?"
- "Why isn't auto-vectorisation happening in LLVM?"
Workflow
1. Generate LLVM IR
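# Emit textual IR (.ll); -disable-O0-optnone stops clang marking functions optnone,
# which would otherwise make later opt passes skip them
clang -O0 -Xclang -disable-O0-optnone -emit-llvm -S src.c -o src.ll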
# Emit bitcode (.bc)
clang -O2 -emit-llvm -c src.c -o src.bc
# Disassemble bitcode to text
llvm-dis src.bc -o src.ll
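As a minimal end-to-end sketch of this step, assuming `clang` is on `PATH`: the file name `square.c` and the scratch directory are hypothetical, chosen only for illustration.

```shell
# Hypothetical scratch directory and source file, used only for this sketch.
workdir=$(mktemp -d)
cat > "$workdir/square.c" <<'EOF'
int square(int x) { return x * x; }
EOF

if command -v clang >/dev/null 2>&1; then
  # -O0 IR keeps locals in allocas; -disable-O0-optnone leaves the
  # function eligible for later opt passes.
  clang -O0 -Xclang -disable-O0-optnone -emit-llvm -S \
    "$workdir/square.c" -o "$workdir/square.ll"
  # Show only the function definition line of the generated IR.
  grep '^define' "$workdir/square.ll"
else
  echo "clang not found; skipping IR generation" >&2
fi
```

The exact attributes on the `define` line vary with the clang version and target, so treat the output as something to read rather than to match exactly.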
2. Run optimisation passes with opt
# Apply specific passes (run in the order listed)
opt -passes='mem2reg,instcombine,simplifycfg' src.ll -S -o out.ll
# Standard optimisation pipelines
opt -passes='default<O2>' src.ll -S -o out.ll
opt -passes='default<O3>' src.ll -S -o out.ll
# List available passes
opt --print-passes 2>&1 | less
# Print IR before and after a pass
opt -passes='instcombine' --print-before=instcombine --print-after=instcombine src.ll -S -o out.ll 2>&1 | less
3. Lower IR to assembly with llc
# Compile IR to object file
llc -filetype=obj src.ll -o src.o
# Compile to assembly (Intel syntax; x86 targets only)
llc -filetype=asm --x86-asm-syntax=intel src.ll -o src.s
# Target a specific CPU
llc -mcpu=skylake -mattr=+avx2 src.ll -o src.s
# Show available targets
llc --version
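A small sketch of the lowering step, assuming `llc` is on `PATH`. The module below is hand-written and hypothetical, and because it names no target triple, `llc` falls back to its default target (usually the host).

```shell
workdir=$(mktemp -d)
cat > "$workdir/square.ll" <<'EOF'
define i32 @square(i32 %x) {
entry:
  %mul = mul nsw i32 %x, %x
  ret i32 %mul
}
EOF

if command -v llc >/dev/null 2>&1; then
  # Emit both textual assembly and an object file from the same IR.
  llc -filetype=asm "$workdir/square.ll" -o "$workdir/square.s"
  llc -filetype=obj "$workdir/square.ll" -o "$workdir/square.o"
  # The symbol should show up in the assembly (some platforms add a
  # leading underscore).
  grep -n 'square' "$workdir/square.s"
else
  echo "llc not found; skipping lowering" >&2
fi
```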
4. Inspect IR
Key IR constructs to understand:
| Construct | Meaning |
|---|---|
| `alloca` | Stack allocation (pre-SSA; mem2reg promotes to registers) |
| `load` / `store` | Memory access |
| `getelementptr` (GEP) | Pointer arithmetic / field access |
| `phi` | SSA φ-node: merges values from predecessor blocks |
| `call` / `invoke` | Function call (`invoke` has exception edges) |
| `icmp` / `fcmp` | Integer/float comparison |
| `br` | Branch (conditional or unconditional) |
| `ret` | Return |
| `bitcast` | Reinterpret bits (no-op in codegen) |
| `ptrtoint` / `inttoptr` | Pointer↔integer (avoid where possible) |
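One rough way to get oriented in an unfamiliar `.ll` file is to tally how often each of the constructs listed above appears. The sketch below is grep-based and therefore approximate; the helper `count_op` and the sample function `@sum` are hypothetical, and the IR assumes LLVM 15 or newer (opaque `ptr`).

```shell
workdir=$(mktemp -d)
ir="$workdir/sum.ll"
# A hand-written loop that sums n ints; it uses phi, icmp, br,
# getelementptr and load.
cat > "$ir" <<'EOF'
define i32 @sum(ptr %a, i32 %n) {
entry:
  %empty = icmp sle i32 %n, 0
  br i1 %empty, label %exit, label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  %idx = sext i32 %i to i64
  %p = getelementptr inbounds i32, ptr %a, i64 %idx
  %v = load i32, ptr %p, align 4
  %acc.next = add nsw i32 %acc, %v
  %i.next = add nsw i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop

exit:
  %res = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  ret i32 %res
}
EOF

# Count lines whose instruction is OPCODE, either after "= " or at the
# start of the line (for instructions that produce no value).
count_op() {  # usage: count_op OPCODE FILE
  grep -c -E "(= |^[[:space:]]*)$1 " "$2" || true
}

for op in alloca load store getelementptr phi icmp br ret; do
  printf '%-15s %s\n' "$op" "$(count_op "$op" "$ir")"
done

# Optionally confirm the module is well-formed, if opt is available.
if command -v opt >/dev/null 2>&1; then
  opt -passes=verify -disable-output "$ir"
fi
```

A file with many `alloca`, `load` and `store` lines and no `phi` is usually unoptimised output; after `mem2reg` the balance shifts towards `phi`, as in this hand-written example.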
5. Key passes
| Pass | Effect |
|---|---|
| `mem2reg` | Promote alloca to SSA registers |
| `instcombine` | Instruction combining / peephole |
| `simplifycfg` | CFG cleanup, dead block removal |
| `loop-vectorize` | Auto-vectorisation |
| `slp-vectorize` | Superword-level parallelism (straight-line vectorisation) |
| `inline` | Function inlining |
| `gvn` | Global value numbering (common subexpression elimination) |
| `licm` | Loop-invariant code motion |
| `loop-unroll` | Loop unrolling |
| `argpromotion` | Promote pointer args to values |
| `sroa` | Scalar Replacement of Aggregates |
6. Debugging missed optimisations
# Why was a loop not vectorised?
clang -O2 -Rpass-missed=loop-vectorize -Rpass-analysis=loop-vectorize src.c
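# Dump the pass pipeline (new pass manager, the default in recent LLVM)
clang -O2 -fdebug-pass-manager -c src.c -o /dev/null 2>&1 | less
# Dump the legacy pass structure (on recent LLVM this covers codegen passes only)
clang -O2 -mllvm -debug-pass=Structure -c src.c -o /dev/null 2>&1 | less
# Print IR after each pass (very verbose)
opt -passes='default<O2>' -print-after-all src.ll -disable-output 2>&1 | less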
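A sketch of collecting vectorisation remarks for one loop, assuming `clang` is on `PATH`. The function `saxpy` and the file names are hypothetical, and the wording of the remarks differs between clang versions, so the sketch captures them to a file rather than matching on text.

```shell
workdir=$(mktemp -d)
# The pointers are not marked restrict, so the vectoriser has to reason
# about possible overlap between x and y.
cat > "$workdir/saxpy.c" <<'EOF'
void saxpy(float *y, const float *x, float a, int n) {
  for (int i = 0; i < n; ++i)
    y[i] += a * x[i];
}
EOF

if command -v clang >/dev/null 2>&1; then
  # Remarks are diagnostics, so they arrive on stderr.
  clang -O2 -c "$workdir/saxpy.c" -o "$workdir/saxpy.o" \
    -Rpass=loop-vectorize \
    -Rpass-missed=loop-vectorize \
    -Rpass-analysis=loop-vectorize \
    2> "$workdir/remarks.txt"
  cat "$workdir/remarks.txt"
else
  echo "clang not found; skipping remarks" >&2
fi
```

`-Rpass=loop-vectorize` reports loops that were vectorised, which makes it easy to tell a missed loop from one that was simply never reported.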
7. Useful LLVM tools
| Tool | Purpose |
|---|---|
| `llvm-dis` | Bitcode → textual IR |
| `llvm-as` | Textual IR → bitcode |
| `llvm-link` | Link multiple bitcode files |
| `llvm-lto` | Standalone LTO |
| `llvm-nm` | Symbols in bitcode/object |
| `llvm-objdump` | Disassemble objects |
| `llvm-profdata` | Merge/show PGO profiles |
| `llvm-cov` | Coverage reporting |
| `llvm-mca` | Machine code analyser (throughput/latency) |
For binutils equivalents, see skills/binaries/binutils.
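As a quick check that a few of these tools work together, the sketch below round-trips a hand-written module through bitcode and lists its symbols. It assumes `llvm-as`, `llvm-dis` and `llvm-nm` are on `PATH`; the module and file names are hypothetical.

```shell
workdir=$(mktemp -d)
cat > "$workdir/square.ll" <<'EOF'
define i32 @square(i32 %x) {
entry:
  %mul = mul nsw i32 %x, %x
  ret i32 %mul
}
EOF

if command -v llvm-as >/dev/null 2>&1 &&
   command -v llvm-dis >/dev/null 2>&1 &&
   command -v llvm-nm >/dev/null 2>&1; then
  # Textual IR -> bitcode -> textual IR.
  llvm-as "$workdir/square.ll" -o "$workdir/square.bc"
  llvm-dis "$workdir/square.bc" -o "$workdir/roundtrip.ll"
  # llvm-nm reads bitcode directly, so no object file is needed.
  llvm-nm "$workdir/square.bc"
  grep '^define' "$workdir/roundtrip.ll"
else
  echo "LLVM tools not found; skipping round trip" >&2
fi
```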
Related skills
- Use `skills/compilers/clang` for source-level Clang flags
- Use `skills/binaries/linkers-lto` for LTO at link time
- Use `skills/profilers/linux-perf` combined with `llvm-mca` for micro-architectural analysis