Translation of parallel moves into sequences of individual moves
The ``parallel move'' problem, also known as ``parallel assignment'',
is the following. We are given a list of (source, destination) pairs
of locations. The goal is to find a sequence of elementary
moves (
loc <- loc assignments) such that, at the end of the sequence,
location
dst contains the value of location
src at the beginning of
the sequence, for each (
src,
dst) pairs in the given problem.
Moreover, other locations should keep their values, except one register
of each type, which can be used as temporaries in the generated sequences.
The parallel move problem is trivial if the sources and destinations do
not overlap. For instance,
(R1, R2) <- (R3, R4) becomes R1 <- R3; R2 <- R4
However, arbitrary overlap is allowed between sources and destinations.
This requires some care in ordering the individual moves, as in
(R1, R2) <- (R3, R1) becomes R2 <- R1; R1 <- R3
Worse, cycles (permutations) can require the use of temporaries, as in
(R1, R2, R3) <- (R2, R3, R1) becomes T <- R1; R1 <- R2; R2 <- R3; R3 <- T;
An amazing fact is that for any parallel move problem, at most one temporary
(or in our case one integer temporary and one float temporary) is needed.
The development in this section was designed by Laurence Rideau and
Bernard Serpette. It is described in the paper
``Tilting at windmills with Coq: Formal verification of
a compilation algorithm for parallel moves'',
http://hal.inria.fr/inria-00176007
Require Import Relations.
Require Import Axioms.
Require Import Coqlib.
Require Import Recdef.
Section PARMOV.
Registers, moves, and their semantics
The development is parameterized by the type of registers,
equipped with a decidable equality.
Variable reg:
Type.
Variable reg_eq :
forall (
r1 r2:
reg), {
r1=
r2} + {
r1<>
r2}.
The temp function maps every register r to the register that
should be used to save the value of r temporarily while performing
a cyclic assignment involving r. In the simplest case, there is
only one such temporary register rtemp and temp is the constant
function fun r => rtemp. In more realistic cases, different temporary
registers can be used to hold different values. In the case of Compcert,
we have two temporary registers, one for integer values and the other
for floating-point values, and temp returns the appropriate temporary
depending on the type of its argument.
Variable temp:
reg ->
reg.
Definition is_temp (
r:
reg):
Prop :=
exists s,
r =
temp s.
A set of moves is a list of pairs of registers, of the form
(source, destination).
Definition moves := (
list (
reg *
reg))%
type.
Definition srcs (
m:
moves) :=
List.map (@
fst reg reg)
m.
Definition dests (
m:
moves) :=
List.map (@
snd reg reg)
m.
Semantics of moves
The dynamic semantics of moves is given in terms of environments.
An environment is a mapping of registers to their current values.
Variable val:
Type.
Definition env :=
reg ->
val.
Lemma env_ext:
forall (
e1 e2:
env),
(
forall r,
e1 r =
e2 r) ->
e1 =
e2.
Proof (@
extensionality reg val).
The main operation over environments is update: it assigns
a value v to a register r and preserves the values of other
registers.
Definition update (
r:
reg) (
v:
val) (
e:
env) :
env :=
fun r' =>
if reg_eq r'
r then v else e r'.
Lemma update_s:
forall r v e,
update r v e r =
v.
Proof.
unfold update;
intros.
destruct (
reg_eq r r).
auto.
congruence.
Qed.
Lemma update_o:
forall r v e r',
r' <>
r ->
update r v e r' =
e r'.
Proof.
unfold update;
intros.
destruct (
reg_eq r'
r).
congruence.
auto.
Qed.
Lemma update_ident:
forall r e,
update r (
e r)
e =
e.
Proof.
Lemma update_commut:
forall r1 v1 r2 v2 e,
r1 <>
r2 ->
update r1 v1 (
update r2 v2 e) =
update r2 v2 (
update r1 v1 e).
Proof.
Lemma update_twice:
forall r v e,
update r v (
update r v e) =
update r v e.
Proof.
A list of moves (src1, dst1), ..., (srcN, dstN) can be given
three different semantics, as environment transformers.
The first semantics corresponds to a parallel assignment:
dst1, ..., dstN are set to the initial values of src1, ..., srcN.
Fixpoint exec_par (
m:
moves) (
e:
env) {
struct m}:
env :=
match m with
|
nil =>
e
| (
s,
d) ::
m' =>
update d (
e s) (
exec_par m'
e)
end.
The second semantics corresponds to a sequence of individual
assignments: first, dst1 is set to the initial value of src1;
then, dst2 is set to the current value of src2 after the first
assignment; etc.
Fixpoint exec_seq (
m:
moves) (
e:
env) {
struct m}:
env :=
match m with
|
nil =>
e
| (
s,
d) ::
m' =>
exec_seq m' (
update d (
e s)
e)
end.
The third semantics is also sequential, but executes the moves
in reverse order, starting with srcN, dstN and finishing with
src1, dst1.
Fixpoint exec_seq_rev (
m:
moves) (
e:
env) {
struct m}:
env :=
match m with
|
nil =>
e
| (
s,
d) ::
m' =>
let e' :=
exec_seq_rev m'
e in
update d (
e'
s)
e'
end.
Specification of the non-deterministic algorithm
Rideau and Serpette's parallel move compilation algorithm is first
specified as a non-deterministic set of rewriting rules operating
over triples (mu, sigma, tau) of moves. We define these rules
as an inductive predicate transition relating triples of moves,
and its reflexive transitive closure transitions.
Inductive state:
Type :=
State (
mu:
moves) (
sigma:
moves) (
tau:
moves) :
state.
Definition no_read (
mu:
moves) (
d:
reg) :
Prop :=
~
In d (
srcs mu).
Inductive transition:
state ->
state ->
Prop :=
|
tr_nop:
forall mu1 r mu2 sigma tau,
transition (
State (
mu1 ++ (
r,
r) ::
mu2)
sigma tau)
(
State (
mu1 ++
mu2)
sigma tau)
|
tr_start:
forall mu1 s d mu2 tau,
transition (
State (
mu1 ++ (
s,
d) ::
mu2)
nil tau)
(
State (
mu1 ++
mu2) ((
s,
d) ::
nil)
tau)
|
tr_push:
forall mu1 d r mu2 s sigma tau,
transition (
State (
mu1 ++ (
d,
r) ::
mu2) ((
s,
d) ::
sigma)
tau)
(
State (
mu1 ++
mu2) ((
d,
r) :: (
s,
d) ::
sigma)
tau)
|
tr_loop:
forall mu sigma s d tau,
transition (
State mu (
sigma ++ (
s,
d) ::
nil)
tau)
(
State mu (
sigma ++ (
temp s,
d) ::
nil) ((
s,
temp s) ::
tau))
|
tr_pop:
forall mu s0 d0 s1 d1 sigma tau,
no_read mu d1 ->
d1 <>
s0 ->
transition (
State mu ((
s1,
d1) ::
sigma ++ (
s0,
d0) ::
nil)
tau)
(
State mu (
sigma ++ (
s0,
d0) ::
nil) ((
s1,
d1) ::
tau))
|
tr_last:
forall mu s d tau,
no_read mu d ->
transition (
State mu ((
s,
d) ::
nil)
tau)
(
State mu nil ((
s,
d) ::
tau)).
Definition transitions:
state ->
state ->
Prop :=
clos_refl_trans state transition.
Well-formedness properties of states
A state
mu, sigma, tau is well-formed if the following properties hold:
-
mu concatenated with sigma is a ``mill'', that is, no destination
appears twice (predicate is_mill below).
-
No temporary register appears in mu (predicate move_no_temp).
-
No temporary register appears in sigma except perhaps as the source
of the last move in sigma (predicate temp_last).
-
sigma is a ``path'', that is, the source of a move is the destination
of the following move.
Definition is_mill (
m:
moves) :
Prop :=
list_norepet (
dests m).
Definition is_not_temp (
r:
reg) :
Prop :=
forall d,
r <>
temp d.
Definition move_no_temp (
m:
moves) :
Prop :=
forall s d,
In (
s,
d)
m ->
is_not_temp s /\
is_not_temp d.
Definition temp_last (
m:
moves) :
Prop :=
match List.rev m with
|
nil =>
True
| (
s,
d) ::
m' =>
is_not_temp d /\
move_no_temp m'
end.
Definition is_first_dest (
m:
moves) (
d:
reg) :
Prop :=
match m with
|
nil =>
True
| (
s0,
d0) ::
_ =>
d =
d0
end.
Inductive is_path:
moves ->
Prop :=
|
is_path_nil:
is_path nil
|
is_path_cons:
forall s d m,
is_first_dest m s ->
is_path m ->
is_path ((
s,
d) ::
m).
Inductive state_wf:
state ->
Prop :=
|
state_wf_intro:
forall mu sigma tau,
is_mill (
mu ++
sigma) ->
move_no_temp mu ->
temp_last sigma ->
is_path sigma ->
state_wf (
State mu sigma tau).
Some trivial properties of srcs and dests.
Lemma dests_append:
forall m1 m2,
dests (
m1 ++
m2) =
dests m1 ++
dests m2.
Proof.
Lemma dests_decomp:
forall m1 s d m2,
dests (
m1 ++ (
s,
d) ::
m2) =
dests m1 ++
d ::
dests m2.
Proof.
Lemma srcs_append:
forall m1 m2,
srcs (
m1 ++
m2) =
srcs m1 ++
srcs m2.
Proof.
Lemma srcs_decomp:
forall m1 s d m2,
srcs (
m1 ++ (
s,
d) ::
m2) =
srcs m1 ++
s ::
srcs m2.
Proof.
intros.
unfold srcs.
rewrite map_app.
reflexivity.
Qed.
Lemma srcs_dests_combine:
forall s d,
List.length s =
List.length d ->
srcs (
List.combine s d) =
s /\
dests (
List.combine s d) =
d.
Proof.
induction s; destruct d; simpl; intros.
tauto.
discriminate.
discriminate.
elim (IHs d); intros. split; congruence. congruence.
Qed.
Some properties of is_mill and dests_disjoint.
Definition dests_disjoint (
m1 m2:
moves) :
Prop :=
list_disjoint (
dests m1) (
dests m2).
Lemma dests_disjoint_sym:
forall m1 m2,
dests_disjoint m1 m2 <->
dests_disjoint m2 m1.
Proof.
Lemma dests_disjoint_cons_left:
forall m1 s d m2,
dests_disjoint ((
s,
d) ::
m1)
m2 <->
dests_disjoint m1 m2 /\ ~
In d (
dests m2).
Proof.
unfold dests_disjoint,
list_disjoint.
simpl;
intros;
split;
intros.
split.
auto.
firstorder.
destruct H.
elim H0;
intro.
red;
intro;
subst.
contradiction.
apply H;
auto.
Qed.
Lemma dests_disjoint_cons_right:
forall m1 s d m2,
dests_disjoint m1 ((
s,
d) ::
m2) <->
dests_disjoint m1 m2 /\ ~
In d (
dests m1).
Proof.
Lemma dests_disjoint_append_left:
forall m1 m2 m3,
dests_disjoint (
m1 ++
m2)
m3 <->
dests_disjoint m1 m3 /\
dests_disjoint m2 m3.
Proof.
Lemma dests_disjoint_append_right:
forall m1 m2 m3,
dests_disjoint m1 (
m2 ++
m3) <->
dests_disjoint m1 m2 /\
dests_disjoint m1 m3.
Proof.
Lemma is_mill_cons:
forall s d m,
is_mill ((
s,
d) ::
m) <->
is_mill m /\ ~
In d (
dests m).
Proof.
unfold is_mill,
dests_disjoint;
intros.
simpl.
split;
intros.
inversion H;
tauto.
constructor;
tauto.
Qed.
Lemma is_mill_append:
forall m1 m2,
is_mill (
m1 ++
m2) <->
is_mill m1 /\
is_mill m2 /\
dests_disjoint m1 m2.
Proof.
Some properties of move_no_temp.
Lemma move_no_temp_append:
forall m1 m2,
move_no_temp m1 ->
move_no_temp m2 ->
move_no_temp (
m1 ++
m2).
Proof.
intros;
red;
intros.
elim (
in_app_or _ _ _ H1);
intro.
apply H;
auto.
apply H0;
auto.
Qed.
Lemma move_no_temp_rev:
forall m,
move_no_temp (
List.rev m) ->
move_no_temp m.
Proof.
intros;
red;
intros.
apply H.
rewrite <-
List.In_rev.
auto.
Qed.
Some properties of temp_last.
Lemma temp_last_change_last_source:
forall s d s'
sigma,
temp_last (
sigma ++ (
s,
d) ::
nil) ->
temp_last (
sigma ++ (
s',
d) ::
nil).
Proof.
Lemma temp_last_push:
forall s1 d1 s2 d2 sigma,
temp_last ((
s2,
d2) ::
sigma) ->
is_not_temp s1 ->
is_not_temp d1 ->
temp_last ((
s1,
d1) :: (
s2,
d2) ::
sigma).
Proof.
unfold temp_last;
intros.
simpl.
simpl in H.
destruct (
rev sigma);
simpl in *.
intuition.
red;
simpl;
intros.
elim H;
intros.
inversion H4.
subst;
tauto.
tauto.
destruct p as [
sN dN].
intuition.
red;
intros.
elim (
in_app_or _ _ _ H);
intros.
apply H3;
auto.
elim H4;
intros.
inversion H5;
subst;
tauto.
elim H5.
Qed.
Lemma temp_last_pop:
forall s1 d1 sigma s2 d2,
temp_last ((
s1,
d1) ::
sigma ++ (
s2,
d2) ::
nil) ->
temp_last (
sigma ++ (
s2,
d2) ::
nil).
Proof.
intros until d2.
change ((
s1,
d1) ::
sigma ++ (
s2,
d2) ::
nil)
with ((((
s1,
d1) ::
nil) ++
sigma) ++ ((
s2,
d2) ::
nil)).
unfold temp_last.
repeat rewrite rev_unit.
intuition.
simpl in H1.
red;
intros.
apply H1.
apply in_or_app.
auto.
Qed.
Some properties of is_path.
Lemma is_path_pop:
forall s d m,
is_path ((
s,
d) ::
m) ->
is_path m.
Proof.
intros. inversion H; subst. auto.
Qed.
Lemma is_path_change_last_source:
forall s s'
d sigma,
is_path (
sigma ++ (
s,
d) ::
nil) ->
is_path (
sigma ++ (
s',
d) ::
nil).
Proof.
induction sigma; simpl; intros.
constructor. red; auto. constructor.
inversion H; subst; clear H.
constructor.
destruct sigma as [ | [s1 d1] sigma']; simpl; simpl in H2; auto.
auto.
Qed.
Lemma path_sources_dests:
forall s0 d0 sigma,
is_path (
sigma ++ (
s0,
d0) ::
nil) ->
List.incl (
srcs (
sigma ++ (
s0,
d0) ::
nil))
(
s0 ::
dests (
sigma ++ (
s0,
d0) ::
nil)).
Proof.
induction sigma;
simpl;
intros.
red;
simpl;
tauto.
inversion H;
subst;
clear H.
simpl.
assert (
In s (
dests (
sigma ++ (
s0,
d0) ::
nil))).
destruct sigma as [ | [
s1 d1]
sigma'];
simpl;
simpl in H2;
intuition.
apply incl_cons.
simpl;
tauto.
apply incl_tran with (
s0 ::
dests (
sigma ++ (
s0,
d0) ::
nil)).
eapply IHsigma;
eauto.
red;
simpl;
tauto.
Qed.
Lemma no_read_path:
forall d1 sigma s0 d0,
d1 <>
s0 ->
is_path (
sigma ++ (
s0,
d0) ::
nil) ->
~
In d1 (
dests (
sigma ++ (
s0,
d0) ::
nil)) ->
no_read (
sigma ++ (
s0,
d0) ::
nil)
d1.
Proof.
intros.
generalize (
path_sources_dests _ _ _ H0).
intro.
intro.
elim H1.
elim (
H2 _ H3);
intro.
congruence.
auto.
Qed.
To benefit from some proof automation, we populate a rewrite database
with some of the properties above.
Lemma notin_dests_cons:
forall x s d m,
~
In x (
dests ((
s,
d) ::
m)) <->
x <>
d /\ ~
In x (
dests m).
Proof.
intros. simpl. intuition auto.
Qed.
Lemma notin_dests_append:
forall d m1 m2,
~
In d (
dests (
m1 ++
m2)) <-> ~
In d (
dests m1) /\ ~
In d (
dests m2).
Proof.
Hint Rewrite is_mill_cons is_mill_append
dests_disjoint_cons_left dests_disjoint_cons_right
dests_disjoint_append_left dests_disjoint_append_right
notin_dests_cons notin_dests_append:
pmov.
Transitions preserve well-formedness
Lemma transition_preserves_wf:
forall st st',
transition st st' ->
state_wf st ->
state_wf st'.
Proof.
Lemma transitions_preserve_wf:
forall st st',
transitions st st' ->
state_wf st ->
state_wf st'.
Proof.
Transitions preserve semantics
We define the semantics of states as follows.
For a triple mu, sigma, tau, we perform the moves tau in
reverse sequential order, then the moves mu ++ sigma in parallel.
Definition statemove (
st:
state) (
e:
env) :=
match st with
|
State mu sigma tau =>
exec_par (
mu ++
sigma) (
exec_seq_rev tau e)
end.
In preparation to showing that transitions preserve semantics,
we prove various properties of the parallel and sequential interpretations
of moves.
Lemma exec_par_outside:
forall m e r, ~
In r (
dests m) ->
exec_par m e r =
e r.
Proof.
induction m;
simpl;
intros.
auto.
destruct a as [
s d].
rewrite update_o.
apply IHm.
tauto.
simpl in H.
intuition.
Qed.
Lemma exec_par_lift:
forall m1 s d m2 e,
~
In d (
dests m1) ->
exec_par (
m1 ++ (
s,
d) ::
m2)
e =
exec_par ((
s,
d) ::
m1 ++
m2)
e.
Proof.
induction m1;
simpl;
intros.
auto.
destruct a as [
s0 d0].
simpl in H.
rewrite IHm1.
simpl.
apply update_commut.
tauto.
tauto.
Qed.
Lemma exec_par_ident:
forall m1 r m2 e,
is_mill (
m1 ++ (
r,
r) ::
m2) ->
exec_par (
m1 ++ (
r,
r) ::
m2)
e =
exec_par (
m1 ++
m2)
e.
Proof.
Lemma exec_seq_app:
forall m1 m2 e,
exec_seq (
m1 ++
m2)
e =
exec_seq m2 (
exec_seq m1 e).
Proof.
induction m1; simpl; intros. auto.
destruct a as [s d]. rewrite IHm1. auto.
Qed.
Lemma exec_seq_rev_app:
forall m1 m2 e,
exec_seq_rev (
m1 ++
m2)
e =
exec_seq_rev m1 (
exec_seq_rev m2 e).
Proof.
induction m1; simpl; intros. auto.
destruct a as [s d]. rewrite IHm1. auto.
Qed.
Lemma exec_seq_exec_seq_rev:
forall m e,
exec_seq_rev m e =
exec_seq (
List.rev m)
e.
Proof.
induction m;
simpl;
intros.
auto.
destruct a as [
s d].
rewrite exec_seq_app.
simpl.
rewrite IHm.
auto.
Qed.
Lemma exec_seq_rev_exec_seq:
forall m e,
exec_seq m e =
exec_seq_rev (
List.rev m)
e.
Proof.
Lemma exec_par_update_no_read:
forall s d m e,
no_read m d ->
~
In d (
dests m) ->
exec_par m (
update d (
e s)
e) =
update d (
e s) (
exec_par m e).
Proof.
unfold no_read;
induction m;
simpl;
intros.
auto.
destruct a as [
s0 d0];
simpl in *.
rewrite IHm.
rewrite update_commut.
f_equal.
f_equal.
apply update_o.
tauto.
tauto.
tauto.
tauto.
Qed.
Lemma exec_par_append_eq:
forall m1 m2 m3 e2 e3,
exec_par m2 e2 =
exec_par m3 e3 ->
(
forall r,
In r (
srcs m1) ->
e2 r =
e3 r) ->
exec_par (
m1 ++
m2)
e2 =
exec_par (
m1 ++
m3)
e3.
Proof.
induction m1; simpl; intros.
auto. destruct a as [s d]. f_equal; eauto.
Qed.
Lemma exec_par_combine:
forall e sl dl,
List.length sl =
List.length dl ->
list_norepet dl ->
let e' :=
exec_par (
combine sl dl)
e in
List.map e'
dl =
List.map e sl /\
(
forall l, ~
In l dl ->
e'
l =
e l).
Proof.
induction sl;
destruct dl;
simpl;
intros;
try discriminate.
split;
auto.
inversion H0;
subst;
clear H0.
injection H;
intro;
clear H.
destruct (
IHsl dl H0 H4)
as [
A B].
set (
e' :=
exec_par (
combine sl dl)
e)
in *.
split.
decEq.
apply update_s.
rewrite <-
A.
apply list_map_exten;
intros.
rewrite update_o.
auto.
congruence.
intros.
rewrite update_o.
apply B.
tauto.
intuition.
Qed.
env_equiv is an equivalence relation between environments
capturing the fact that two environments assign the same values to
non-temporary registers, but can disagree on the values of temporary
registers.
Definition env_equiv (
e1 e2:
env) :
Prop :=
forall r,
is_not_temp r ->
e1 r =
e2 r.
Lemma env_equiv_refl:
forall e,
env_equiv e e.
Proof.
Lemma env_equiv_refl':
forall e1 e2,
e1 =
e2 ->
env_equiv e1 e2.
Proof.
unfold env_equiv;
intros.
rewrite H.
auto.
Qed.
Lemma env_equiv_sym:
forall e1 e2,
env_equiv e1 e2 ->
env_equiv e2 e1.
Proof.
unfold env_equiv;
intros.
symmetry;
auto.
Qed.
Lemma env_equiv_trans:
forall e1 e2 e3,
env_equiv e1 e2 ->
env_equiv e2 e3 ->
env_equiv e1 e3.
Proof.
unfold env_equiv;
intros.
transitivity (
e2 r);
auto.
Qed.
Lemma exec_par_env_equiv:
forall m e1 e2,
move_no_temp m ->
env_equiv e1 e2 ->
env_equiv (
exec_par m e1) (
exec_par m e2).
Proof.
unfold move_no_temp;
induction m;
simpl;
intros.
auto.
destruct a as [
s d].
red;
intros.
unfold update.
destruct (
reg_eq r d).
apply H0.
elim (
H s d);
tauto.
apply IHm;
auto.
Qed.
The proof that transitions preserve semantics (up to the values of
temporary registers) follows.
Lemma transition_preserves_semantics:
forall st st'
e,
transition st st' ->
state_wf st ->
env_equiv (
statemove st'
e) (
statemove st e).
Proof.
Lemma transitions_preserve_semantics:
forall st st'
e,
transitions st st' ->
state_wf st ->
env_equiv (
statemove st'
e) (
statemove st e).
Proof.
Lemma state_wf_start:
forall mu,
move_no_temp mu ->
is_mill mu ->
state_wf (
State mu nil nil).
Proof.
intros.
constructor.
rewrite <-
app_nil_end.
auto.
auto.
red.
simpl.
auto.
constructor.
Qed.
The main correctness result in this section is the following:
if we can transition repeatedly from an initial state mu, nil, nil
to a final state nil, nil, tau, then the sequential execution
of the moves rev tau is semantically equivalent to the parallel
execution of the moves mu.
Theorem transitions_correctness:
forall mu tau,
move_no_temp mu ->
is_mill mu ->
transitions (
State mu nil nil) (
State nil nil tau) ->
forall e,
env_equiv (
exec_seq (
List.rev tau)
e) (
exec_par mu e).
Proof.
Determinisation of the transition relation
We now define a deterministic variant dtransition of the
transition relation transition. dtransition is deterministic
in the sense that at most one transition applies to a given state.
Inductive dtransition:
state ->
state ->
Prop :=
|
dtr_nop:
forall r mu tau,
dtransition (
State ((
r,
r) ::
mu)
nil tau)
(
State mu nil tau)
|
dtr_start:
forall s d mu tau,
s <>
d ->
dtransition (
State ((
s,
d) ::
mu)
nil tau)
(
State mu ((
s,
d) ::
nil)
tau)
|
dtr_push:
forall mu1 d r mu2 s sigma tau,
no_read mu1 d ->
dtransition (
State (
mu1 ++ (
d,
r) ::
mu2) ((
s,
d) ::
sigma)
tau)
(
State (
mu1 ++
mu2) ((
d,
r) :: (
s,
d) ::
sigma)
tau)
|
dtr_loop_pop:
forall mu s r0 d sigma tau,
no_read mu r0 ->
dtransition (
State mu ((
s,
r0) ::
sigma ++ (
r0,
d) ::
nil)
tau)
(
State mu (
sigma ++ (
temp r0,
d) ::
nil) ((
s,
r0) :: (
r0,
temp r0) ::
tau))
|
dtr_pop:
forall mu s0 d0 s1 d1 sigma tau,
no_read mu d1 ->
d1 <>
s0 ->
dtransition (
State mu ((
s1,
d1) ::
sigma ++ (
s0,
d0) ::
nil)
tau)
(
State mu (
sigma ++ (
s0,
d0) ::
nil) ((
s1,
d1) ::
tau))
|
dtr_last:
forall mu s d tau,
no_read mu d ->
dtransition (
State mu ((
s,
d) ::
nil)
tau)
(
State mu nil ((
s,
d) ::
tau)).
Definition dtransitions:
state ->
state ->
Prop :=
clos_refl_trans state dtransition.
Every deterministic transition corresponds to a sequence of
non-deterministic transitions.
Lemma transition_determ:
forall st st',
dtransition st st' ->
state_wf st ->
transitions st st'.
Proof.
Lemma transitions_determ:
forall st st',
dtransitions st st' ->
state_wf st ->
transitions st st'.
Proof.
The semantic correctness of deterministic transitions follows.
Theorem dtransitions_correctness:
forall mu tau,
move_no_temp mu ->
is_mill mu ->
dtransitions (
State mu nil nil) (
State nil nil tau) ->
forall e,
env_equiv (
exec_seq (
List.rev tau)
e) (
exec_par mu e).
Proof.
The compilation function
We now define a function that takes a well-formed parallel move mu
and returns a sequence of elementary moves tau that is semantically
equivalent. We start by defining a number of auxiliary functions.
Fixpoint split_move (
m:
moves) (
r:
reg) {
struct m} :
option (
moves *
reg *
moves) :=
match m with
|
nil =>
None
| (
s,
d) ::
tl =>
if reg_eq s r
then Some (
nil,
d,
tl)
else match split_move tl r with
|
None =>
None
|
Some (
before,
d',
after) =>
Some ((
s,
d) ::
before,
d',
after)
end
end.
Fixpoint is_last_source (
r:
reg) (
m:
moves) {
struct m} :
bool :=
match m with
|
nil =>
false
| (
s,
d) ::
nil =>
if reg_eq s r then true else false
| (
s,
d) ::
tl =>
is_last_source r tl
end.
Fixpoint replace_last_source (
r:
reg) (
m:
moves) {
struct m} :
moves :=
match m with
|
nil =>
nil
| (
s,
d) ::
nil => (
r,
d) ::
nil
|
s_d ::
tl =>
s_d ::
replace_last_source r tl
end.
Definition final_state (
st:
state) :
bool :=
match st with
|
State nil nil _ =>
true
|
_ =>
false
end.
Definition parmove_step (
st:
state) :
state :=
match st with
|
State nil nil _ =>
st
|
State ((
s,
d) ::
tl)
nil l =>
if reg_eq s d
then State tl nil l
else State tl ((
s,
d) ::
nil)
l
|
State t ((
s,
d) ::
b)
l =>
match split_move t d with
|
Some (
t1,
r,
t2) =>
State (
t1 ++
t2) ((
d,
r) :: (
s,
d) ::
b)
l
|
None =>
match b with
|
nil =>
State t nil ((
s,
d) ::
l)
|
_ =>
if is_last_source d b
then State t (
replace_last_source (
temp d)
b) ((
s,
d) :: (
d,
temp d) ::
l)
else State t b ((
s,
d) ::
l)
end
end
end.
Here are the main correctness properties of these functions.
Lemma split_move_charact:
forall m r,
match split_move m r with
|
Some (
before,
d,
after) =>
m =
before ++ (
r,
d) ::
after /\
no_read before r
|
None =>
no_read m r
end.
Proof.
unfold no_read.
induction m;
simpl;
intros.
-
tauto.
-
destruct a as [
s d].
destruct (
reg_eq s r).
+
subst s.
auto.
+
specialize (
IHm r).
destruct (
split_move m r)
as [[[
before d']
after] | ].
*
destruct IHm.
subst m.
simpl.
intuition.
*
simpl;
intuition.
Qed.
Lemma is_last_source_charact:
forall r s d m,
if is_last_source r (
m ++ (
s,
d) ::
nil)
then s =
r
else s <>
r.
Proof.
induction m;
simpl.
destruct (
reg_eq s r);
congruence.
destruct a as [
s0 d0].
case_eq (
m ++ (
s,
d) ::
nil);
intros.
generalize (
app_cons_not_nil m nil (
s,
d)).
congruence.
rewrite <-
H.
auto.
Qed.
Lemma replace_last_source_charact:
forall s d s'
m,
replace_last_source s' (
m ++ (
s,
d) ::
nil) =
m ++ (
s',
d) ::
nil.
Proof.
induction m;
simpl.
auto.
destruct a as [
s0 d0].
case_eq (
m ++ (
s,
d) ::
nil);
intros.
generalize (
app_cons_not_nil m nil (
s,
d)).
congruence.
rewrite <-
H.
congruence.
Qed.
Lemma parmove_step_compatible:
forall st,
final_state st =
false ->
dtransition st (
parmove_step st).
Proof.
intros st NOTFINAL.
destruct st as [
mu sigma tau].
unfold parmove_step.
case_eq mu; [
intros MEQ |
intros [
ms md]
mtl MEQ].
case_eq sigma; [
intros SEQ |
intros [
ss sd]
stl SEQ].
subst mu sigma.
simpl in NOTFINAL.
discriminate.
simpl.
case_eq stl; [
intros STLEQ |
intros xx1 xx2 STLEQ].
apply dtr_last.
red;
simpl;
auto.
elim (@
exists_last _ stl). 2:
congruence.
intros sigma1 [[
ss1 sd1]
SEQ2].
rewrite <-
STLEQ.
clear STLEQ xx1 xx2.
generalize (
is_last_source_charact sd ss1 sd1 sigma1).
rewrite SEQ2.
destruct (
is_last_source sd (
sigma1 ++ (
ss1,
sd1) ::
nil)).
intro.
subst ss1.
rewrite replace_last_source_charact.
apply dtr_loop_pop.
red;
simpl;
auto.
intro.
apply dtr_pop.
red;
simpl;
auto.
auto.
case_eq sigma; [
intros SEQ |
intros [
ss sd]
stl SEQ].
destruct (
reg_eq ms md).
subst.
apply dtr_nop.
apply dtr_start.
auto.
generalize (
split_move_charact ((
ms,
md) ::
mtl)
sd).
case (
split_move ((
ms,
md) ::
mtl)
sd); [
intros [[
before r]
after] |
idtac].
intros [
MEQ2 NOREAD].
rewrite MEQ2.
apply dtr_push.
auto.
intro NOREAD.
case_eq stl; [
intros STLEQ |
intros xx1 xx2 STLEQ].
apply dtr_last.
auto.
elim (@
exists_last _ stl). 2:
congruence.
intros sigma1 [[
ss1 sd1]
SEQ2].
rewrite <-
STLEQ.
clear STLEQ xx1 xx2.
generalize (
is_last_source_charact sd ss1 sd1 sigma1).
rewrite SEQ2.
destruct (
is_last_source sd (
sigma1 ++ (
ss1,
sd1) ::
nil)).
intro.
subst ss1.
rewrite replace_last_source_charact.
apply dtr_loop_pop.
auto.
intro.
apply dtr_pop.
auto.
auto.
Qed.
The compilation function is obtained by iterating the parmov_step
function. To show that this iteration always terminates, we introduce
the following measure over states.
Open Scope nat_scope.
Definition measure (
st:
state) :
nat :=
match st with
|
State mu sigma tau => 2 *
List.length mu +
List.length sigma
end.
Lemma measure_decreasing_1:
forall st st',
dtransition st st' ->
measure st' <
measure st.
Proof.
Lemma measure_decreasing_2:
forall st,
final_state st =
false ->
measure (
parmove_step st) <
measure st.
Proof.
Compilation function for parallel moves.
Function parmove_aux (
st:
state) {
measure measure st} :
moves :=
if final_state st
then match st with State _ _ tau =>
tau end
else parmove_aux (
parmove_step st).
Proof.
Lemma parmove_aux_transitions:
forall st,
dtransitions st (
State nil nil (
parmove_aux st)).
Proof.
Definition parmove (
mu:
moves) :
moves :=
List.rev (
parmove_aux (
State mu nil nil)).
This is the main correctness theorem: the sequence of elementary
moves parmove mu is semantically equivalent to the parallel move
mu if the latter is well-formed.
Theorem parmove_correctness:
forall mu,
move_no_temp mu ->
is_mill mu ->
forall e,
env_equiv (
exec_seq (
parmove mu)
e) (
exec_par mu e).
Proof.
Here is an alternate formulation of parmove, where the
parallel move problem is given as two separate lists of sources
and destinations.
Definition parmove2 (
sl dl:
list reg) :
moves :=
parmove (
List.combine sl dl).
Theorem parmove2_correctness:
forall sl dl,
List.length sl =
List.length dl ->
list_norepet dl ->
(
forall r,
In r sl ->
is_not_temp r) ->
(
forall r,
In r dl ->
is_not_temp r) ->
forall e,
let e' :=
exec_seq (
parmove2 sl dl)
e in
List.map e'
dl =
List.map e sl /\
forall r, ~
In r dl ->
is_not_temp r ->
e'
r =
e r.
Proof.
Additional syntactic properties
We now show an additional property of the sequence of moves
generated by parmove mu: any such move s -> d is either
already present in mu, or of the form temp s -> d or s -> temp s
for some s -> d in mu. This syntactic property is useful
to show that parmove preserves register classes, for instance.
Section PROPERTIES.
Variable initial_move:
moves.
Inductive wf_move:
reg ->
reg ->
Prop :=
|
wf_move_same:
forall s d,
In (
s,
d)
initial_move ->
wf_move s d
|
wf_move_temp_left:
forall s d,
wf_move s d ->
wf_move (
temp s)
d
|
wf_move_temp_right:
forall s d,
wf_move s d ->
wf_move s (
temp s).
Definition wf_moves (
m:
moves) :
Prop :=
forall s d,
In (
s,
d)
m ->
wf_move s d.
Lemma wf_moves_cons:
forall s d m,
wf_moves ((
s,
d) ::
m) <->
wf_move s d /\
wf_moves m.
Proof.
unfold wf_moves;
intros;
simpl.
firstorder.
inversion H0;
subst s0 d0.
auto.
Qed.
Lemma wf_moves_append:
forall m1 m2,
wf_moves (
m1 ++
m2) <->
wf_moves m1 /\
wf_moves m2.
Proof.
unfold wf_moves;
intros.
split;
intros.
split;
intros;
apply H;
apply in_or_app;
auto.
destruct H.
elim (
in_app_or _ _ _ H0);
intro;
auto.
Qed.
Hint Rewrite wf_moves_cons wf_moves_append:
pmov.
Inductive wf_state:
state ->
Prop :=
|
wf_state_intro:
forall mu sigma tau,
wf_moves mu ->
wf_moves sigma ->
wf_moves tau ->
wf_state (
State mu sigma tau).
Lemma dtransition_preserves_wf_state:
forall st st',
dtransition st st' ->
wf_state st ->
wf_state st'.
Proof.
Lemma dtransitions_preserve_wf_state:
forall st st',
dtransitions st st' ->
wf_state st ->
wf_state st'.
Proof.
End PROPERTIES.
Lemma parmove_wf_moves:
forall mu,
wf_moves mu (
parmove mu).
Proof.
Lemma parmove2_wf_moves:
forall sl dl,
wf_moves (
List.combine sl dl) (
parmove2 sl dl).
Proof.
As a corollary, we show that all sources of parmove mu
are sources of mu or temporaries,
and likewise all destinations of parmove mu are destinations of mu
or temporaries.
Remark wf_move_initial_reg_or_temp:
forall mu s d,
wf_move mu s d ->
(
In s (
srcs mu) \/
is_temp s) /\ (
In d (
dests mu) \/
is_temp d).
Proof.
induction 1.
split;
left.
change s with (
fst (
s,
d)).
unfold srcs.
apply List.in_map;
auto.
change d with (
snd (
s,
d)).
unfold dests.
apply List.in_map;
auto.
split.
right.
exists s;
auto.
tauto.
split.
tauto.
right.
exists s;
auto.
Qed.
Lemma parmove_initial_reg_or_temp:
forall mu s d,
In (
s,
d) (
parmove mu) ->
(
In s (
srcs mu) \/
is_temp s) /\ (
In d (
dests mu) \/
is_temp d).
Proof.
Remark in_srcs:
forall mu s,
In s (
srcs mu) ->
exists d,
In (
s,
d)
mu.
Proof.
Remark in_dests:
forall mu d,
In d (
dests mu) ->
exists s,
In (
s,
d)
mu.
Proof.
Lemma parmove_srcs_initial_reg_or_temp:
forall mu s,
In s (
srcs (
parmove mu)) ->
In s (
srcs mu) \/
is_temp s.
Proof.
Lemma parmove_dests_initial_reg_or_temp:
forall mu d,
In d (
dests (
parmove mu)) ->
In d (
dests mu) \/
is_temp d.
Proof.
As a second corollary, we show that parmov preserves register
classes, in the sense made precise below.
Section REGISTER_CLASSES.
Variable class:
Type.
Variable regclass:
reg ->
class.
Hypothesis temp_preserves_class:
forall r,
regclass (
temp r) =
regclass r.
Definition is_class_compatible (
mu:
moves) :
Prop :=
forall s d,
In (
s,
d)
mu ->
regclass s =
regclass d.
Lemma parmove_preserves_register_classes:
forall mu,
is_class_compatible mu ->
is_class_compatible (
parmove mu).
Proof.
End REGISTER_CLASSES.
Extension to partially overlapping registers
We now extend the previous results to the case where distinct
registers can partially overlap, so that assigning to one register
changes the value of the other. We asuume given a disjointness relation
disjoint between registers.
Variable disjoint:
reg ->
reg ->
Prop.
Hypothesis disjoint_sym:
forall r1 r2,
disjoint r1 r2 ->
disjoint r2 r1.
Hypothesis disjoint_not_equal:
forall r1 r2,
disjoint r1 r2 ->
r1 <>
r2.
Two registers partially overlap if they are different and not disjoint.
For the Coq development, it is easier to define the complement:
two registers do not partially overlap if they are identical or disjoint.
Definition no_overlap (
r1 r2:
reg) :
Prop :=
r1 =
r2 \/
disjoint r1 r2.
Lemma no_overlap_sym:
forall r1 r2,
no_overlap r1 r2 ->
no_overlap r2 r1.
Proof.
intros. destruct H. left; auto. right; auto.
Qed.
We axiomatize the effect of assigning a value to a register over an
execution environment. The target register is set to the given value
(property weak_update_s), and registers disjoint from the target
keep their previous values (property weak_update_d). The values of
other registers are undefined after the assignment.
Variable weak_update:
reg ->
val ->
env ->
env.
Hypothesis weak_update_s:
forall r v e,
weak_update r v e r =
v.
Hypothesis weak_update_d:
forall r1 v e r2,
disjoint r1 r2 ->
weak_update r1 v e r2 =
e r2.
Fixpoint weak_exec_seq (
m:
moves) (
e:
env) {
struct m}:
env :=
match m with
|
nil =>
e
| (
s,
d) ::
m' =>
weak_exec_seq m' (
weak_update d (
e s)
e)
end.
Definition disjoint_list (
r:
reg) (
l:
list reg) :
Prop :=
forall r',
In r'
l ->
disjoint r r'.
Inductive pairwise_disjoint:
list reg ->
Prop :=
|
pairwise_disjoint_nil:
pairwise_disjoint nil
|
pairwise_disjoint_cons:
forall r l,
disjoint_list r l ->
pairwise_disjoint l ->
pairwise_disjoint (
r ::
l).
Definition disjoint_temps (
r:
reg) :
Prop :=
forall t,
is_temp t ->
disjoint r t.
Section OVERLAP.
We consider a parallel move problem mu that satisfies the following
conditions, which are stronger, overlap-aware variants of the
move_no_temp mu and is_mill mu conditions used previously.
Variable mu:
moves.
Sources and destinations are disjoint from all temporary registers.
Hypothesis mu_no_temporaries_src:
forall s,
In s (
srcs mu) ->
disjoint_temps s.
Hypothesis mu_no_temporaries_dst:
forall d,
In d (
dests mu) ->
disjoint_temps d.
Destinations are pairwise disjoint.
Hypothesis mu_dest_pairwise_disjoint:
pairwise_disjoint (
dests mu).
Sources and destinations do not partially overlap.
Hypothesis mu_src_dst_no_overlap:
forall s d,
In s (
srcs mu) ->
In d (
dests mu) ->
no_overlap s d.
Distinct temporaries do not partially overlap.
Hypothesis temps_no_overlap:
forall t1 t2,
is_temp t1 ->
is_temp t2 ->
no_overlap t1 t2.
The following lemmas show that mu is a windmill and does not
contain temporary registers.
Lemma disjoint_list_notin:
forall r l,
disjoint_list r l -> ~
In r l.
Proof.
intros.
red;
intro.
assert (
r <>
r).
apply disjoint_not_equal.
apply H;
auto.
congruence.
Qed.
Lemma pairwise_disjoint_norepet:
forall l,
pairwise_disjoint l ->
list_norepet l.
Proof.
Lemma disjoint_temps_not_temp:
forall r,
disjoint_temps r ->
is_not_temp r.
Proof.
Lemma mu_is_mill:
is_mill mu.
Proof.
Lemma mu_move_no_temp:
move_no_temp mu.
Proof.
We define the ``adherence'' of the problem mu as the set of
registers that partially overlap with one of the registers
possibly assigned by the parallel move: destinations and temporaries.
Again, we define the complement of the ``adherence'' set, which is
more convenient for Coq reasoning.
Definition no_adherence (
r:
reg) :
Prop :=
forall x,
In x (
dests mu) \/
is_temp x ->
no_overlap x r.
As a consequence of the hypotheses on mu, none of the destination
registers, source registers, and temporaries belong to the adherence.
Lemma no_overlap_pairwise:
forall r1 r2 m,
pairwise_disjoint m ->
In r1 m ->
In r2 m ->
no_overlap r1 r2.
Proof.
induction 1;
intros.
elim H.
simpl in *.
destruct H1;
destruct H2.
left.
congruence.
subst.
right.
apply H.
auto.
subst.
right.
apply disjoint_sym.
apply H.
auto.
auto.
Qed.
Lemma no_adherence_dst:
forall d,
In d (
dests mu) ->
no_adherence d.
Proof.
Lemma no_adherence_src:
forall s,
In s (
srcs mu) ->
no_adherence s.
Proof.
Lemma no_adherence_tmp:
forall t,
is_temp t ->
no_adherence t.
Proof.
The relation env_match holds between two environments e1 and e2
if they assign the same values to all registers not in the adherence set.
Definition env_match (
e1 e2:
env) :
Prop :=
forall r,
no_adherence r ->
e1 r =
e2 r.
The following lemmas relate the effect of executing moves
using normal, overlap-unaware update and weak, overlap-aware update.
Lemma weak_update_match:
forall e1 e2 s d,
(
In s (
srcs mu) \/
is_temp s) ->
(
In d (
dests mu) \/
is_temp d) ->
env_match e1 e2 ->
env_match (
update d (
e1 s)
e1)
(
weak_update d (
e2 s)
e2).
Proof.
Lemma weak_exec_seq_match:
forall m e1 e2,
(
forall s,
In s (
srcs m) ->
In s (
srcs mu) \/
is_temp s) ->
(
forall d,
In d (
dests m) ->
In d (
dests mu) \/
is_temp d) ->
env_match e1 e2 ->
env_match (
exec_seq m e1) (
weak_exec_seq m e2).
Proof.
induction m;
intros;
simpl.
auto.
destruct a as [
s d].
simpl in H.
simpl in H0.
apply IHm;
auto.
apply weak_update_match;
auto.
Qed.
End OVERLAP.
These lemmas imply the following correctness theorem for the parmove2
function, taking partial register overlap into account.
Theorem parmove2_correctness_with_overlap:
forall sl dl,
List.length sl =
List.length dl ->
(
forall r,
In r sl ->
disjoint_temps r) ->
(
forall r,
In r dl ->
disjoint_temps r) ->
pairwise_disjoint dl ->
(
forall s d,
In s sl ->
In d dl ->
no_overlap s d) ->
(
forall t1 t2,
is_temp t1 ->
is_temp t2 ->
no_overlap t1 t2) ->
forall e,
let e' :=
weak_exec_seq (
parmove2 sl dl)
e in
List.map e'
dl =
List.map e sl /\
forall r,
disjoint_list r dl ->
disjoint_temps r ->
e'
r =
e r.
Proof.
End PARMOV.