19.1 The Extraction module

19.1 The `Extraction` module

This section explains how to import ML objects, to realize axioms and finally to generate ML code from the extracted programs of F_w.

These features do not belong to the core system, and appear as an independent module called Extraction.v (which is compiled during the installation of the system). So the first thing to do is to load this module:

Coq < Require Extraction.

19.1.1 Generating executable ML code

The Coq commands to generate ML code are:

`Write Caml File "`string`" [` ident₁ `...` ident_n`]options.`	(for Objective Caml)
`Write CamlLight File "`string`" [` ident₁ `...` ident_n`]options.`
`Write Haskell File "`string`" [` ident₁ `...` ident_n`]options.`

where string is the name given to the file to be produced (the suffix .ml is added if necessary), and ident₁ ... ident_n the names of the constants to be extracted. This list does not need to be exhaustive: it is automatically completed into a complete and minimal environment. Remaining axioms are translated into exceptions, and a warning is printed in that case. In particular, this will be the case for False_rec. (We will see below how to realize axioms).

Optimizations.

Since Caml Light and Objective Caml are strict languages, the extracted code has to be optimized in order to be efficient (for instance, when using induction principles we do not want to compute all the recursive calls but only the needed ones). So an optimization routine will be called each time the user want to generate Caml programs. Essentially, it performs constants expansions and reductions. Therefore some constants will not appear in the resulting Caml program (A warning is printed for each such constant). To avoid this, just put the constant name in the previous list ident₁ ... ident_n and it will not be expanded. Moreover, three options allow the user to control the expansion strategy :

noopt: : specifies not to do any optimization.
exact: : specifies to extract exactly the given objects (no recursivity).
expand [ ident₁ ... ident_n ]: : forces the expansion of the constants ident₁ ... ident_n (when it is possible).

19.1.2 Realizing axioms

It is possible to assume some axioms while developing a proof. Since these axioms can be any kind of proposition or object type, they may perfectly well have some computational content. But a program must be a closed term, and of course the system cannot guess the program which realizes an axiom. Therefore, it is possible to tell the system what program (an F_w term actually) corresponds to a given Coq axiom. The command is Link and the syntax:

Link ident := Fwterm.

where ident is the name of the axiom to realize and Fwterm the term which realizes it. The system checks that this term has the same type as the axiom ident, and returns an error if not. This command attaches a body to an axiom, and can be seen as a transformation of an axiom into a constant.

These semantical attachments have to be done before generating the ML code. All type variables must be realized, and term variables which are not realized will be translated into exceptions.
Example: Let us illustrate this feature on a small example. Assume that you have a type variable A of type Set:

Coq < Parameter A : Set.

and that your specification proof assumes that there is an order relation inf over that type (which has no computational content), and that this relation is total and decidable:

Coq < Parameter inf : A -> A -> Prop.

Coq < Axiom inf_total : (x,y:A) {(inf x y)}+{(inf y x)}.

Now suppose that we want to use this specification proof on natural numbers; this means A has to be instantiated by nat and the axiom inf_total will be realized, for instance, using the order relation le on that type and the decidability lemma le_lt_dec. Here is how to proceed:

Coq < Require Compare_dec.

Coq < Link A := nat.
Constant A linked to nat

Coq < Link inf_total := le_lt_dec.
Constant inf_total linked to le_lt_dec

Warning: There is no rollback on the command Link, that is the semantical attachments are not forgotten when doing a Reset, or a Restore State command. This will be corrected in a later version.

19.1.3 Importing ML objects

In order to realize axioms and to instantiate programs on real data types, like int, string, ... or more complicated data structures, one want to import existing ML objects in the F_w environment. The system provides such features, through the commands ML Import Constant and ML Import Inductive. The first one imports an ML object as a new axiom and the second one adds a new inductive definition corresponding to an ML inductive type.

Warning.

In the case of Caml dialects, the system would be able to check the correctness of the imported objects by looking into the interfaces files of Caml modules (.mli files), but this feature is not yet implemented. So one must be careful when declaring the types of the imported objects.

Caml names.

When referencing a Caml object, you can use strings instead of identifiers. Therefore you can use the double underscore notation module__name (Caml Light objects) or the dot notation module.name (Objective Caml objects) to precise the module in which lies the object.

19.1.4 Importing inductive types

The Coq command to import an ML inductive type is:

ML Import Inductive ident [ident₁ ... ident_n] == <Inductive Definition>.

where ident is the name of the ML type, ident₁ ... ident_n the name of its constructors, and <Inductive Definition> the corresponding Coq inductive definition (see 1.3.3 in the Reference Manual for the syntax of inductive definitions).

This command inserts the <Inductive Definition> in the F_w environment, without elimination principles. From that moment, it is possible to use that type like any other F_w object, and in particular to use it to realize axioms. The names ident ident₁ ... ident_n may be different from the names given in the inductive definition, in order to avoid clash with previous constants, and are restored when generating the ML code.

One can also import mutual inductive types with the command:

`ML Import Inductive`	ident₁ `[`ident₁¹ `...` ident_n_₁¹`]`
	...
	ident_k `[`ident₁^k `...` ident_n_{_k}^k`]`
	`==` `<Mutual Inductive Definition>.`

Examples:

Let us show for instance how to import the type bool of Caml Light booleans:

Coq < ML Import Inductive bool [ true false ] ==
Coq < Inductive BOOL : Set := TRUE : BOOL
Coq < | FALSE : BOOL.
ML inductive type(s) bool imported.

Here we changed the names because the type bool is already defined in the initial state of Coq.
Assuming that one defined the mutual inductive types tree and forest in a Caml Light module, one can import them with the command:

Coq < ML Import Inductive tree [node] forest [empty cons] ==
Coq <     Mutual [A:Set] Inductive
Coq <        tree : Set := node : A -> (forest A) -> (tree A)
Coq <     with
Coq <        forest : Set := empty : (forest A)
Coq <                     | cons : (tree A) -> (forest A) -> (forest A).
ML inductive type(s) tree,
forest imported.
One can import the polymorphic type of Caml Light lists with the command:
Coq < ML Import Inductive list [nil cons] ==
Coq < Inductive list [A:Set] : Set := nil : (list A)
Coq < | cons : A->(list A)->(list A).
ML inductive type(s) list imported.

Remark: One would have to re-define nil and cons at the top of its program because these constructors have no name in Caml Light.

19.1.5 Importing terms and abstract types

The other command to import an ML object is:

ML Import Constant ident_ML == ident : Fwterm.

where ident_ML is the name of the ML object and Fwterm its type in F_w. This command defines an axiom in F_w of name ident and type Fwterm.
Example: To import the type int of Caml Light integers, and the < binary relation on this type, just do

Coq < ML Import Constant int == int : Set.
int imported.

Coq < ML Import Constant lt_int == lt_int : int -> int -> BOOL.
lt_int imported.

assuming that the Caml Light type bool is already imported (with the name BOOL, as above).

19.1.6 Direct use of ML objects

Sometimes the user do not want to extract Coq objects to new ML code but wants to use already existing ML objects. For instance, it is the case for the booleans, which already exist in ML: the user do not want to extract the Coq inductive type bool to a new type for booleans, but wants to use the primitive boolean of ML.

The command Extract fulfills this requirement. It allows the user to declare constant and inductive types which will not be extracted but replaced by ML objects. The syntax is the following

Extract Constant ident => ident'.

Extract Inductive ident=>ident' [ ident'₁ ...ident'_n ].

where ident is the name of the Coq object and the prime identifiers the name of the corresponding ML objects (the names between brackets are the names of the constructors). Mutually recursive types are declared one by one, in any order.
Example: Typical examples are the following:

Coq < Extract Inductive unit => unit [ "()" ].

Coq < Extract Inductive bool => bool [ true false ].

Coq < Extract Inductive sumbool => bool [ true false ].

19.1.7 Differences between Coq and ML type systems

ML types that are not F_w types

Some ML recursive types have no counterpart in the type system of Coq, like types using the record construction, or non positive types like

# type T = C of T->T;;

In that case, you cannot import those types as inductive types, and the only way to do is to import them as abstract types (with ML Import) together with the corresponding building and de-structuring functions (still with ML Import Constant).

Programs that are not ML-typable

On the contrary, some extracted programs in F_w are not typable in ML. There are in fact two cases which can be problematic:

If some part of the program is very polymorphic, there may be no ML type for it. In that case the extraction to ML works all right but the generated code may be refused by the ML type-checker. A very well known example is the distr-pair function:
Definition dp := [A,B:Set][x:A][y:B][f:(C:Set)C->C](f A x,f B y).
In Caml Light, for instance, the extracted term is let dp x y f = pair((f x),(f y)) and has type
dp : 'a -> 'a -> ('a -> 'b) -> ('b,'b) prod
which is not its original type, but a restriction.
Some definitions of F_w may have no counterpart in ML. This happens when there is a quantification over types inside the type of a constructor; for example:
Inductive anything : Set := dummy : (A:Set)A->anything.
which corresponds to the definition of ML dynamics.

The first case is not too problematic: it is still possible to run the programs by switching off the type-checker during compilation. Unless you misused the semantical attachment facilities you should never get any message like ``segmentation fault'' for which the extracted code would be to blame. To switch off the Caml type-checker, use the function obj__magic which gives the type 'a to any object; but this implies changing a little the extracted code by hand.

The second case is fatal. If some inductive type cannot be translated to ML, one has to change the proof (or possibly to ``cheat'' by some low-level manipulations we would not describe here).

We have to say, though, that in most ``realistic'' programs, these problems do not occur. For example all the programs of the library are accepted by Caml type-checker except Higman.v¹.