Recognition of tail calls: correctness proof
Require Import Coqlib Maps Integers AST Linking.
Require Import Values Memory Events Globalenvs Smallstep.
Require Import Op Registers RTL Conventions Tailcall.
Syntactic properties of the code transformation
Measuring the number of instructions eliminated
The return_measure c pc function counts the number of instructions
eliminated by the code transformation, where pc is the successor
of a call turned into a tailcall. This is the length of the
move/nop/return sequence recognized by the is_return boolean function.
Fixpoint return_measure_rec (
n:
nat) (
c:
code) (
pc:
node)
{
struct n}:
nat :=
match n with
|
O =>
O
|
S n' =>
match c!
pc with
|
Some(
Inop s) =>
S(
return_measure_rec n'
c s)
|
Some(
Iop op args dst s) =>
S(
return_measure_rec n'
c s)
|
_ =>
O
end
end.
Definition return_measure (
c:
code) (
pc:
node) :=
return_measure_rec niter c pc.
Lemma return_measure_bounds:
forall f pc, (
return_measure f pc <=
niter)%
nat.
Proof.
intro f.
assert (
forall n pc, (
return_measure_rec n f pc <=
n)%
nat).
induction n;
intros;
simpl.
lia.
destruct (
f!
pc);
try lia.
destruct i;
try lia.
generalize (
IHn n0).
lia.
generalize (
IHn n0).
lia.
intros.
unfold return_measure.
apply H.
Qed.
Remark return_measure_rec_incr:
forall f n1 n2 pc,
(
n1 <=
n2)%
nat ->
(
return_measure_rec n1 f pc <=
return_measure_rec n2 f pc)%
nat.
Proof.
induction n1; intros; simpl.
lia.
destruct n2. extlia. assert (n1 <= n2)%nat by lia.
simpl. destruct f!pc; try lia. destruct i; try lia.
generalize (IHn1 n2 n H0). lia.
generalize (IHn1 n2 n H0). lia.
Qed.
Lemma is_return_measure_rec:
forall f n n'
pc r,
is_return n f pc r =
true -> (
n <=
n')%
nat ->
return_measure_rec n f.(
fn_code)
pc =
return_measure_rec n'
f.(
fn_code)
pc.
Proof.
induction n;
simpl;
intros.
congruence.
destruct n'.
extlia.
simpl.
destruct (
fn_code f)!
pc;
try congruence.
destruct i;
try congruence.
decEq.
apply IHn with r.
auto.
lia.
destruct (
is_move_operation o l);
try congruence.
destruct (
Reg.eq r r1);
try congruence.
decEq.
apply IHn with r0.
auto.
lia.
Qed.
Relational characterization of the code transformation
The is_return_spec characterizes the instruction sequences
recognized by the is_return boolean function.
Inductive is_return_spec (
f:
function):
node ->
reg ->
Prop :=
|
is_return_none:
forall pc r,
f.(
fn_code)!
pc =
Some(
Ireturn None) ->
is_return_spec f pc r
|
is_return_some:
forall pc r,
f.(
fn_code)!
pc =
Some(
Ireturn (
Some r)) ->
is_return_spec f pc r
|
is_return_nop:
forall pc r s,
f.(
fn_code)!
pc =
Some(
Inop s) ->
is_return_spec f s r ->
(
return_measure f.(
fn_code)
s <
return_measure f.(
fn_code)
pc)%
nat ->
is_return_spec f pc r
|
is_return_move:
forall pc r r'
s,
f.(
fn_code)!
pc =
Some(
Iop Omove (
r::
nil)
r'
s) ->
is_return_spec f s r' ->
(
return_measure f.(
fn_code)
s <
return_measure f.(
fn_code)
pc)%
nat ->
is_return_spec f pc r.
Lemma is_return_charact:
forall f n pc rret,
is_return n f pc rret =
true -> (
n <=
niter)%
nat ->
is_return_spec f pc rret.
Proof.
The transf_instr_spec predicate relates one instruction in the
initial code with its possible transformations in the optimized code.
Inductive transf_instr_spec (
f:
function):
instruction ->
instruction ->
Prop :=
|
transf_instr_tailcall:
forall sig ros args res s,
f.(
fn_stacksize) = 0 ->
is_return_spec f s res ->
transf_instr_spec f (
Icall sig ros args res s) (
Itailcall sig ros args)
|
transf_instr_default:
forall i,
transf_instr_spec f i i.
Lemma transf_instr_charact:
forall f pc instr,
f.(
fn_stacksize) = 0 ->
transf_instr_spec f instr (
transf_instr f pc instr).
Proof.
Lemma transf_instr_lookup:
forall f pc i,
f.(
fn_code)!
pc =
Some i ->
exists i', (
transf_function f).(
fn_code)!
pc =
Some i' /\
transf_instr_spec f i i'.
Proof.
Semantic properties of the code transformation
The ``less defined than'' relation between register states
A call followed by a return without an argument can be turned
into a tail call. In this case, the original function returns
Vundef, while the transformed function can return any value.
We account for this situation by using the ``less defined than''
relation between values and between memory states. We need to
extend it pointwise to register states.
Lemma regs_lessdef_init_regs:
forall params vl vl',
Val.lessdef_list vl vl' ->
regs_lessdef (
init_regs vl params) (
init_regs vl'
params).
Proof.
induction params;
intros.
simpl.
red;
intros.
rewrite Regmap.gi.
constructor.
simpl.
inv H.
red;
intros.
rewrite Regmap.gi.
constructor.
apply set_reg_lessdef.
auto.
auto.
Qed.
Proof of semantic preservation
Definition match_prog (
p tp:
RTL.program) :=
match_program (
fun cu f tf =>
tf =
transf_fundef f)
eq p tp.
Lemma transf_program_match:
forall p,
match_prog p (
transf_program p).
Proof.
Section PRESERVATION.
Variable prog tprog:
program.
Hypothesis TRANSL:
match_prog prog tprog.
Let ge :=
Genv.globalenv prog.
Let tge :=
Genv.globalenv tprog.
Lemma symbols_preserved:
forall (
s:
ident),
Genv.find_symbol tge s =
Genv.find_symbol ge s.
Proof (
Genv.find_symbol_transf TRANSL).
Lemma functions_translated:
forall (
v:
val) (
f:
RTL.fundef),
Genv.find_funct ge v =
Some f ->
Genv.find_funct tge v =
Some (
transf_fundef f).
Proof (
Genv.find_funct_transf TRANSL).
Lemma funct_ptr_translated:
forall (
b:
block) (
f:
RTL.fundef),
Genv.find_funct_ptr ge b =
Some f ->
Genv.find_funct_ptr tge b =
Some (
transf_fundef f).
Proof (
Genv.find_funct_ptr_transf TRANSL).
Lemma senv_preserved:
Senv.equiv ge tge.
Proof (
Genv.senv_transf TRANSL).
Lemma sig_preserved:
forall f,
funsig (
transf_fundef f) =
funsig f.
Proof.
Lemma stacksize_preserved:
forall f,
fn_stacksize (
transf_function f) =
fn_stacksize f.
Proof.
Lemma find_function_translated:
forall ros rs rs'
f,
find_function ge ros rs =
Some f ->
regs_lessdef rs rs' ->
find_function tge ros rs' =
Some (
transf_fundef f).
Proof.
Consider an execution of a call/move/nop/return sequence in the
original code and the corresponding tailcall in the transformed code.
The transition sequences are of the following form
(left: original code, right: transformed code).
f is the calling function and
fd the called function.
State stk f (Icall instruction) State stk' f' (Itailcall)
Callstate (frame::stk) fd args Callstate stk' fd' args'
. .
. .
. .
Returnstate (frame::stk) res Returnstate stk' res'
State stk f (move/nop/return seq)
.
.
.
State stk f (return instr)
Returnstate stk res
The simulation invariant must therefore account for two kinds of
mismatches between the transition sequences:
-
The call stack of the original program can contain more frames
than that of the transformed program (e.g. frame in the example above).
-
The regular states corresponding to executing the move/nop/return
sequence must all correspond to the single Returnstate stk' res'
state of the transformed program.
We first define the simulation invariant between call stacks.
The first two cases are standard, but the third case corresponds
to a frame that was eliminated by the transformation.
Inductive match_stackframes:
list stackframe ->
list stackframe ->
Prop :=
|
match_stackframes_nil:
match_stackframes nil nil
|
match_stackframes_normal:
forall stk stk'
res sp pc rs rs'
f,
match_stackframes stk stk' ->
regs_lessdef rs rs' ->
match_stackframes
(
Stackframe res f (
Vptr sp Ptrofs.zero)
pc rs ::
stk)
(
Stackframe res (
transf_function f) (
Vptr sp Ptrofs.zero)
pc rs' ::
stk')
|
match_stackframes_tail:
forall stk stk'
res sp pc rs f,
match_stackframes stk stk' ->
is_return_spec f pc res ->
f.(
fn_stacksize) = 0 ->
match_stackframes
(
Stackframe res f (
Vptr sp Ptrofs.zero)
pc rs ::
stk)
stk'.
Here is the invariant relating two states. The first three
cases are standard. Note the ``less defined than'' conditions
over values and register states, and the corresponding ``extends''
relation over memory states.
Inductive match_states:
state ->
state ->
Prop :=
|
match_states_normal:
forall s sp pc rs m s'
rs'
m'
f
(
STACKS:
match_stackframes s s')
(
RLD:
regs_lessdef rs rs')
(
MLD:
Mem.extends m m'),
match_states (
State s f (
Vptr sp Ptrofs.zero)
pc rs m)
(
State s' (
transf_function f) (
Vptr sp Ptrofs.zero)
pc rs'
m')
|
match_states_call:
forall s f args m s'
args'
m',
match_stackframes s s' ->
Val.lessdef_list args args' ->
Mem.extends m m' ->
match_states (
Callstate s f args m)
(
Callstate s' (
transf_fundef f)
args'
m')
|
match_states_return:
forall s v m s'
v'
m',
match_stackframes s s' ->
Val.lessdef v v' ->
Mem.extends m m' ->
match_states (
Returnstate s v m)
(
Returnstate s'
v'
m')
|
match_states_interm:
forall s sp pc rs m s'
m'
f r v'
(
STACKS:
match_stackframes s s')
(
MLD:
Mem.extends m m'),
is_return_spec f pc r ->
f.(
fn_stacksize) = 0 ->
Val.lessdef (
rs#
r)
v' ->
match_states (
State s f (
Vptr sp Ptrofs.zero)
pc rs m)
(
Returnstate s'
v'
m').
The last case of
match_states corresponds to the execution
of a move/nop/return sequence in the original code that was
eliminated by the transformation:
State stk f (move/nop/return seq) ~~ Returnstate stk' res'
.
.
.
State stk f (return instr) ~~ Returnstate stk' res'
To preserve non-terminating behaviors, we need to make sure
that the execution of this sequence in the original code cannot
diverge. For this, we introduce the following complicated
measure over states, which will decrease strictly whenever
the original code makes a transition but the transformed code
does not.
Definition measure (
st:
state) :
nat :=
match st with
|
State s f sp pc rs m => (
List.length s * (
niter + 2) +
return_measure f.(
fn_code)
pc + 1)%
nat
|
Callstate s f args m => 0%
nat
|
Returnstate s v m => (
List.length s * (
niter + 2))%
nat
end.
Ltac TransfInstr :=
match goal with
|
H: (
PTree.get _ (
fn_code _) =
_) |-
_ =>
destruct (
transf_instr_lookup _ _ _ H)
as [
i' [
TINSTR TSPEC]];
inv TSPEC
end.
Ltac EliminatedInstr :=
match goal with
|
H: (
is_return_spec _ _ _) |-
_ =>
inv H;
try congruence
|
_ =>
idtac
end.
The proof of semantic preservation, then, is a simulation diagram
of the ``option'' kind.
Lemma transf_step_correct:
forall s1 t s2,
step ge s1 t s2 ->
forall s1' (
MS:
match_states s1 s1'),
(
exists s2',
step tge s1'
t s2' /\
match_states s2 s2')
\/ (
measure s2 <
measure s1 /\
t =
E0 /\
match_states s2 s1')%
nat.
Proof.
Lemma transf_initial_states:
forall st1,
initial_state prog st1 ->
exists st2,
initial_state tprog st2 /\
match_states st1 st2.
Proof.
Lemma transf_final_states:
forall st1 st2 r,
match_states st1 st2 ->
final_state st1 r ->
final_state st2 r.
Proof.
intros. inv H0. inv H. inv H5. inv H3. constructor.
Qed.
The preservation of the observable behavior of the program then
follows.
Theorem transf_program_correct:
forward_simulation (
RTL.semantics prog) (
RTL.semantics tprog).
Proof.
End PRESERVATION.