qoala-opt¶

qoala-opt is the MLIR-based driver for analyses, optimizations, and lowering. It registers:

All built-in MLIR dialects (registerAllDialects) and passes (registerAllPasses).
The five qoala dialects: qnet, qmem, netqasm, qoalahost, qremote.
Every dialect-specific pass declared under Dialect/.../Passes.td and the conversion passes under Conversion/.../*.td — see the full list in the Passes reference.
Twelve cost-model cl::opt flags (table below).
The mlir-print-op-generic, --debug, --print-ir-* and --view-op-graph flags inherited from MlirOptMain.

Source: tools/qoala-opt/qoala-opt.cpp.

Cost-model flags¶

These flags configure the durations and error rates used by analyses (gate count, qubit lifetime, ESP, qmem-efficiency) and by the MILP-based block reorderer (qoalahost-reorder-blocks). They have no effect on lowering correctness.

All durations are in ticks. A tick is the compiler's discrete unit of time — a single timestep, defined as one nanosecond. This matches the resolution at which the Qoala runtime tracks wall-clock time, and lets the deadline-estimation and reordering MILPs encode every start time, duration, and gap as an integer SCIP variable rather than a continuous one.

Flag	Type	Default	Description
`--qoala-opt-single-gate-duration`	`uint32`	10	Duration of a single-qubit gate.
`--qoala-opt-single-gate-error`	`float`	0.01	Error probability of a single-qubit gate (used by the ESP analysis).
`--qoala-opt-two-gate-duration`	`uint32`	50	Duration of a two-qubit gate.
`--qoala-opt-dual-gate-error`	`float`	0.05	Error probability of a two-qubit gate (used by the ESP analysis).
`--qoala-opt-latency`	`uint32`	100	Per-receive base latency for classical communication operations. The duration of a `qoalahost.recv_int`/`recv_float` is `qoalaOptLatency + qoalaOptHostPeerLatency` (for tensor variants `recv_ints`/`recv_floats`, multiplied by the number of values). Note: no `desc()` string is registered for this flag in the source, so it appears without a description in `qoala-opt --help`.
`--qoala-opt-link-duration`	`uint64`	1000	Time taken by a single entanglement-generation attempt (a `qnet.eprs` / `qmem.eprs` / `netqasm.eprs` link).
`--qoala-opt-host-instr-time`	`uint32`	1	Time taken by a single classical instruction executed on the host (CPS) side. Combined with operation arity in `getDuration()` to model durations of `qoalahost.*` ops. (The string registered as `desc()` in the source erroneously refers to the QNPU.)
`--qoala-opt-host-peer-latency`	`uint32`	1	Per-message network latency for classical communication. Together with `--qoala-opt-latency`, sets the duration of receive operations (see the row above for the formula).
`--qoala-opt-qnos-instr-time`	`uint32`	1	Time taken by a single classical instruction executed on the QNPU/QPS side (the embedded classical processor inside the quantum stack). Used by NetQASM classical sub-instructions.
`--qoala-opt-qubit-lifetime`	`uint64`	500	Maximum allowed lifetime of a qubit (`L_max` in the deadline MILP). The same value is reused as the dephasing time `T₂` by the fidelity-estimate analysis.
`--qoala-opt-group-ent-reqs`	`bool`	`false`	If true, the functionization pass groups entanglement requests targeting the same remote into a single request routine. Reflects the current near-term-hardware constraint of one outstanding request routine per remote per program.
`--qoala-opt-program-horizon`	`uint32`	0	Upper bound on total execution time used by the deadline MILP. The default `0` is a sentinel meaning "derive automatically from the program"; the deadline MILP then uses `H = 2 × Σ duration(op)`. A user-supplied positive value below `Σ duration(op)` is rejected with a warning and the default is used instead.

There is also one hidden flag:

Flag	Type	Default	Description
`--qoala-opt-unoptimize`	`bool`	`false`	Run the "anti-optimization" passes used to compare worst-vs-best-case program runs. (`ReallyHidden`.)

Pass invocation¶

Pass names come from the mnemonic strings in the Passes.td files. For example:

qoala-opt program.mlir \
  --qnet-peephole-optimizations \
  --qnet-dead-code-elimination=with-classical-awareness=true \
  --lower-qoala-hir-to-mir \
  --lower-qoala-mir-to-lir=use-online-scheduler=true,max-ops-per-group=4

The wrapper passes (lower-qoala-mir-to-lir, lower-qmem-to-lower-dialects, etc.) accept their inner pass options embedded in the pass argument, not as separate top-level flags. --use-online-scheduler=true on its own will be rejected.

See Architecture / Pass pipeline for the recommended end-to-end ordering.

Standard MLIR knobs¶

qoala-opt inherits all flags from MlirOptMain (LLVM's opt tool). The most notable ones are mentioned here:

Flag	Use
`--debug`	Enable `LLVM_DEBUG` prints across the whole tool.
`--debug-only=<filter[,filter]>`	Restrict debug prints to specific `DEBUG_TYPE` filters.
`--mlir-print-op-generic`	Print ops in generic form, bypassing custom verifiers (handy in verifier-debug loops).
`--mlir-print-assume-verified`	Skip the verifier on print, allowing custom printers without verifier infinite loops.
`--mlir-disable-threading`	Serialize pass execution. Useful for deterministic debug output.
`--mlir-print-value-users`	Annotate printed IR with value-user comments.
`--print-ir-after=<pass>`, `--print-after-all`	Dump IR after a named pass / every pass.
`--print-ir-before=<pass>`, `--print-before-all`	Dump IR before a named pass / every pass.
`--view-op-graph`	Emit a Graphviz `.gv` representation of the op tree on stderr.

A typical graph-debug invocation looks like:

qoala-opt program.mlir --view-op-graph 2>&1 >/dev/null | tee program.gv
xdot program.gv

(The exact incantation also appears in the repo's README.md.)