HOLTrace
HOLTrace: Format

A HOLTrace file records a sequence of HOL Light proofs and theorems. This document specifies version 1 of the file format. Ths document is aimed at implementors writing tools to read or write HOLTrace files.

Files in HOLTrace format conventionally have filenames ending .50, although this is not required.

This document allows multiletter italicized variable names. Concatenation here does not mean multiplication.

Overall structure

The file is a sequence of lines. Each line consists of printable ASCII characters (bytes 32 through 126) followed by LF (byte 10). There is always a terminating LF. This is a binary file format: LF does not mean CR LF.

The first line is HOLTrace 1 (followed by LF). Each subsequent line begins with a 1-character tag and continues with a sequence of zero or more arguments. Arguments are encoded nonnegative integers, except that some tags also have text as a final argument.

Integers are encoded in a way that often appears negative or non-numeric. Specifically:

For example, for encodings relative to n = 10000: argument 0 means 0; 9 means 9; A means 10; V means 31; 10 means 32; VV means 1023; 100 means 1024; 8OG means 8976; -VV means 8977; -10 means 9968; -V means 9969; -B means 9989; ) means 9990; ! means 9999; 9OG means 10000; 9OH means 10001.

Spaces between arguments are suppressed in the file: e.g., tag b with arguments 0 and bool is written as b0bool. Exception: if an encoded argument ends with one of the characters 0123456789ABCDEFGHIJKLMNOPQRSTUV and the next encoded argument begins with one of those characters (or a space) then a space is inserted: e.g., tag e with arguments 17 and 14 is written e17 14.

Order

Some of the line definitions below say that a line refers to other lines in the file. Some references are via explicit arguments: for example, the line b2!!fun specifies a type by reference to the previously specified type. Some references are implicit: for example, a theorem is always proven by the most recent inference.

These references must be to lines earlier in the file than the current line. In other words, circular references are prohibited, and lines must appear in topological order.

A file writer normally deduplicates objects (for example, there is only one b0bool line), but a file reader does not rely on this. For example, correctly implementing the semantics of the S tag relies on recognizing whether various objects are equal.

Types

Each line with tag a or b is a "type line" specifying a type. Each type is assigned an identifying number (a "type ID") for future reference, counting from 0 upwards: the first type line specifies type 0, i.e., the type having type ID 0; the next type line specifies type 1; the next type line after that specifies type 2; etc. To track these references, the file reader mantains a type counter that starts at 0 and that increments by 1 immediately after each type line. (The file writer also does this.)

Specifically, type lines have the following meanings:

For example, the lines b0bool and b2!!fun and b2@!fun (in that order) specify types bool and fun(bool,bool) and fun(bool,fun(bool,bool)), i.e., bool and bool->bool and bool->bool->bool.

Terms

Each line with tag c or d or e or f is a "term line" specifying a term. Each term is assigned an identifying number (a "term ID") for future reference, counting from 0 upwards: the first term line specifies term 0, i.e., the term having term ID 0; the next term line specifies term 1; the next term line after that specifies term 2; etc. To track these references, the file reader mantains a term counter that starts at 0 and that increments by 1 immediately after each term line.

The term counter is separate from the type counter. For example, if the first five lines have tags b, b, b, c, and d respectively, then those lines specify type 0, type 1, type 2, term 0, and term 1 respectively.

Specifically, term lines have the following meanings:

Theorems

Each line with tag t is a "theorem line" specifying a theorem. Each theorem is assigned an identifying number (a "theorem ID") for future reference, counting from 0 upwards: the first theorem line specifies theorem 0, i.e., the theorem having theorem ID 0; the next theorem line specifies theorem 1; the next theorem line after that specifies theorem 2; etc. To track these references, the file reader mantains a theorem counter that starts at 0 and that increments by 1 immediately after each theorem line. The theorem counter is separate from the term counter and the type counter.

Specifically, theorem lines have the following meaning:

(Warning: Often the literature on logic says "sequent" rather than "theorem", reserving "theorem" for the case of no hypotheses, although the conclusion can still be a term of the form P implies Q. "Sequent" can also refer to a more general concept allowing a disjunction of any number of conclusions.)

Some readers that check inferences expect the hypotheses of each theorem to be sorted in a particular order. The order is outside the scope of this document (but is connected to the reason for variables being labeled d instead of v).

As a complete example of a theorem, here are annotations explaining the first lines with lowercase-letter tags in a trace of hol.ml:

line specifies as
b0bool type 0 type bool
b2!!fun type 1 type bool->bool
b2@!fun type 2 type bool->bool->bool
c!= term 0 constant = of type bool->bool->bool
d#T term 1 variable T of type bool
e@@! term 2 value =(T) of type bool
b2@#fun type 3 type (bool->bool)->bool
b2#!fun type 4 type (bool->bool)->(bool->bool)->bool
c!= term 3 constant = of type (bool->bool)->(bool->bool)->bool
d%p term 4 variable p of type bool
f$!! term 5 function p↦p of type bool->bool
e@#! term 6 value =(p↦p) of type (bool->bool)->bool
e%!@ term 7 value =(p↦p)(p↦p) of type bool
e%^! term 8 value =(T)(=(p↦p)(p↦p)) of type bool
c%T term 9 constant T of type bool
e$)! term 10 value =(T) of type bool->bool
e%!$ term 11 value =(T)(=(p↦p)(p↦p)) of type bool
t! theorem 0 theorem =(T)(=(p↦p)(p↦p))

This theorem has no hypotheses. Its conclusion is that the constant T is equal to the equality of p↦p and p↦p, i.e., that T is true if and only if p↦p equals p↦p.

Theorem names

Lines with tag / assign names to theorems. Specifically, these lines have the following meaning:

A theorem can have multiple names, but multiple theorems cannot have the same name.

Inferences

Each line with an uppercase letter tag (within the list of tags below) is an "inference line" specifying an inference (a way to prove a theorem). Each inference is assigned an identifying number (an "inference ID") for future reference, counting from 0 upwards: the first inference line defines inference 0, i.e., the inference having inference ID 0; the next inference line defines inference 1; the next inference line after that defines inference 2; etc. To track these references, the file reader mantains an inference counter that starts at 0 and that increments by 1 immediately after each inference line. The inference counter is separate from the theorem counter, the term counter, and the type counter.

For example, in a trace of hol.ml, the first lines include the lines shown above (b0bool etc.) with lowercase-letter tags, and also one inference line that proves theorem 0, namely a C!T line right before the c%T line.

Some readers of HOLTrace files ignore inferences; inference lines can be simply removed from the file without affecting those readers. However, file readers that check inferences expect each theorem to be preceded in the file by at least one inference and to be proven specifically by the last of those inferences. (Non-inference lines can appear between that inference and the theorem. There can also be further inferences that prove the same theorem in different ways.)

Except for the arguments listed as text below, all arguments on inference lines are nonnegative integers. Arguments labeled type are type IDs, encoded relative to the current type counter; arguments labeled term are term IDs, encoded relative to the current term counter; arguments labeled th (or TH) are theorem IDs, encoded relative to the current theorem counter. For example, ! refers to the most recently specified type or term or theorem respectively.

Ten of the inference tags are HOL Light's ten primitive inference rules:

The other inference tags are as follows:

As an example of a type definition, take type variables A and B, and consider the set of functions of the form (aba=x and b=y) having type A->B->bool, where x and y range over A and B respectively. Types in HOL Light are nonempty, so this set is nonempty. Assume that the most recent theorem proves that this set is nonempty; that the 430th term before now is this set, with type (A->B->bool)->bool; and that type 49 is the type A->B->bool. Then the inference A1H-DE!]prod]ABS_prod]REP_prod]A]B] returns the theorem ABS_prod(REP_prod(a)) = a, where ABS_prod maps A->B->bool to a new type prod(A,B) and REP_prod maps the other way.

All constant names must be distinct strings, where constant names mean the following: =; @; text in each C line; absname in each A line; repname in each A line. All type names must be distinct strings, where type names mean the following: bool; fun; ind; abstypename in each A line.

ProofTrace IDs

The tags , and ; are used to enable exact regeneration of a ProofTrace file given an appropriate HOLTrace file. Exact regeneration means a byte-for-byte match. This often requires multiple copies of the same theorem and multiple proofs of the same theorem; ProofTrace has its own set of IDs distinguishing these. A file reader that handles ProofTrace-related tags maintains a "ProofTrace counter" that starts at 0 and that increments by 1 immediately after each line with , or ; as a tag. The ProofTrace counter is separate from the other counters mentioned above.

ProofTrace lines have the following meanings:

Outside the context of generating ProofTrace outputs (to check against ProofTrace or to interoperate with tools that expect ProofTrace format), lines with tag , (or ;) are not useful and can be removed, typically saving more than half of the space in a HOLTrace file (and speeding up further processing). File readers not using ProofTrace tags also do not need the inference counter, although an "inference initialized" flag is useful for checking each theorem against the most recent inference.

A ProofTrace-to-HOLTrace converter generates / before , when it sees an ID that has a theorem name. A HOLTrace-to-ProofTrace converter treats / as applying to the current ProofTrace ID.


Version: This is version 2025.06.16 of the "Format" web page.