Proof of Functional Correctness¶
Section author: Claire Dross, Yannick Moy
This section is dedicated to the functional correctness of programs. It presents advanced proof features that you may need to use for the specification and verification of your program's complex properties.
Beyond Program Integrity¶
When we speak about the correctness of a program or subprogram, we mean the extent to which it complies with its specification. Functional correctness is specifically concerned with properties that involve the relations between the subprogram's inputs and outputs, as opposed to other properties such as running time or memory consumption.
For functional correctness, we usually specify stronger properties than those required to just prove program integrity. When we're involved in a certification processes, we should derive these properties from the requirements of the system, but, especially in non-certification contexts, they can also come from more informal sources, such as the program's documentation, comments in its code, or test oracles.
For example, if one of our goals is to ensure that no runtime error is
raised when using the result of the function
Find below, it may be
enough to know that the result is either 0 or in the range of
A. We can
express this as a postcondition of
In this case, it's automatically proved by GNATprove.
However, to be sure that
Find performs the task we expect, we may want
to verify more complex properties of that function. For example, we want to
ensure it returns an index of
E is stored and returns 0
E is nowhere in
A. Again, we can express this as a
This time, GNATprove can't prove this postcondition automatically, but
we'll see later that we can help GNATprove by providing a loop invariant,
which is checked by GNATprove and allows it to automatically prove the
Writing at least part of your program's specification in the form of contracts has many advantages. You can execute those contracts during testing, which improves the maintainability of the code by detecting discrepancies between the program and its specification in earlier stages of development. If the contracts are precise enough, you can use them as oracles to decide whether a given test passed or failed. In that case, they can allow you to verify the outputs of specific subprograms while running a larger block of code. This may, in certain contexts, replace the need for you to perform unit testing, instead allowing you to run integration tests with assertions enabled. Finally, if the code is in SPARK, you can also use GNATprove to formally prove these contracts.
The advantage of a formal proof is that it verifies all possible execution
paths, something which isn't always possible by running test cases. For
example, during testing, the postcondition of the subprogram
below is checked dynamically for the set of inputs for which
called in that test, but just for that set.
Find is formally verified, that verification checks its
postcondition for all possible inputs. During development, you can attempt
such verification earlier than testing since it's performed modularly on a
per-subprogram basis. For example, in the code shown above, you can
Use_Find even before you write the body for subprogram
Contracts for functional correctness are usually more complex than contracts for program integrity, so they more often require you to use the new forms of expressions introduced by the Ada 2012 standard. In particular, quantified expressions, which allow you to specify properties that must hold for all or for at least one element of a range, come in handy when specifying properties of arrays.
As contracts become more complex, you may find it useful to introduce new abstractions to improve the readability of your contracts. Expression functions are a good means to this end because you can retain their bodies in your package's specification.
Finally, some properties, especially those better described as invariants over data than as properties of subprograms, may be cumbersome to express as subprogram contracts. Type predicates, which must hold for every object of a given type, are usually a better match for this purpose. Here's an example.
We can use the subtype
Sorted_Nat_Array as the type of a variable that
must remain sorted throughout the program's execution. Specifying that an
array is sorted requires a rather complex expression involving quantifiers,
so we abstract away this property as an expression function to improve
Is_Sorted's body remains in the package's specification
and allows users of the package to retain a precise knowledge of its
meaning when necessary. (You must use
Nat_Array as the type of the
Is_Sorted. If you use
Sorted_Nat_Array, you'll get
infinite recursion at runtime when assertion checks are enabled since that
function is called to check all operands of type
As the properties you need to specify grow more complex, you may have
entities that are only needed because they are used in specifications
(contracts). You may find it important to ensure that these entities can't
affect the behavior of the program or that they're completely removed from
production code. This concept, having entities that are only used for
specifications, is usually called having ghost code and is supported in
SPARK by the
You can use
Ghost aspects to annotate any entity including
variables, types, subprograms, and packages. If you mark an entity as
Ghost, GNATprove ensures it can't affect the program's
behavior. When the program is compiled with assertions enabled, ghost code
is executed like normal code so it can execute the contracts using it. You
can also instruct the compiler to not generate code for ghost entities.
Consider the procedure
Do_Something below, which calls a complex
function on its input,
X, and wants to check that the initial and
modified values of
X are related in that complex way.
Do_Something stores the initial value of
X in a ghost constant,
X_Init. We reference it in an assertion to check that the computation
performed by the call to
Do_Some_Complex_Stuff modified the value of
X in the expected manner.
X_Init can't be used in normal code, for example to restore
the initial value of
When compiling this example, the compiler flags the use of
as illegal, but more complex cases of interference between ghost and
normal code may sometimes only be detected when you run GNATprove.
Functions used only in specifications are a common occurrence when writing contracts for functional correctness. For example, expression functions used to simplify or factor out common patterns in contracts can usually be marked as ghost.
But ghost functions can do more than improve readability. In real-world programs, it's often the case that some information necessary for functional specification isn't accessible in the package's specification because of abstraction.
Making this information available to users of the packages is generally out of the question because that breaks the abstraction. Ghost functions come in handy in that case since they provide a way to give access to that information without making it available to normal client code.
Let's look at the following example.
Here, the type
Stack is private. To specify the expected behavior of
Push procedure, we need to go inside this abstraction and access
the values of the elements stored in
S. For this, we introduce a
Get_Model that returns an array as a representation of the
stack. However, we don't want code that uses the
Stack package to use
Get_Model in normal code since this breaks our stack's abstraction.
Here's an example of trying to break that abstraction in the subprogram
We see that marking the function as
Ghost achieves this goal: it
ensures that the subprogram
Get_Model is never used in production code.
Global Ghost Variables¶
Though it happens less frequently, you may have specifications requiring you to store additional information in global variables that isn't needed in normal code. You should mark these global variables as ghost, allowing the compiler to remove them when assertions aren't enabled. You can use these variables for any purpose within the contracts that make up your specifications. A common scenario is writing specifications for subprograms that modify a complex or private global data structure: you can use these variables to provide a model for that structure that's updated by the ghost code as the program modifies the data structure itself.
You can also use ghost variables to store information about previous runs
of subprograms to specify temporal properties. In the following example, we
have two procedures, one that accesses a state
A and the other that
accesses a state
B. We use the ghost variable
B can't be accessed twice in a row without accessing
Let's look at another example. The specification of a subprogram's expected behavior is sometimes best expressed as a sequence of actions it must perform. You can use global ghost variables that store intermediate values of normal variables to write this sort of specification more easily.
For example, we specify the subprogram
Do_Two_Things below in two
steps, using the ghost variable
V_Interm to store the intermediate
V between those steps. We could also express this using an
existential quantification on the variable
V_Interm, but it would be
impractical to iterate over all integers at runtime and this can't always
be written in SPARK because quantification is restricted to
for ... loop patterns.
Finally, supplying the value of the variable may help the prover verify the contracts.
Since properties of interest for functional correctness are more complex than those involved in proofs of program integrity, we expect GNATprove to initially be unable to verify them even though they're valid. You'll find the techniques we discussed in Debugging Failed Proof Attempts to come in handy here. We now go beyond those techniques and focus on more ways of improving results in the cases where the property is valid but GNATprove can't prove it in a reasonable amount of time.
In those cases, you may want to try and guide GNATprove to either complete the proof or strip it down to a small number of easily-reviewable assumptions. For this purpose, you can add assertions to break complex proofs into smaller steps.
pragma Assert (Assertion_Checked_By_The_Tool); -- info: assertion proved pragma Assert (Assumption_Validated_By_Other_Means); -- medium: assertion might fail pragma Assume (Assumption_Validated_By_Other_Means); -- The tool does not attempt to check this expression. -- It is recorded as an assumption.
One such intermediate step you may find useful is to try to prove a theoretically-equivalent version of the desired property, but one where you've simplified things for the prover, such as by splitting up different cases or inlining the definitions of functions.
Some intermediate assertions may not be proved by GNATprove either because
it's missing some information or because the amount of information
available is confusing. You can verify these remaining assertions by other
means such as testing (since they're executable) or by review. You can then
choose to instruct GNATprove to ignore them, either by turning them into
assumptions, as in our example, or by using a
pragma Annotate. In
both cases, the compiler generates code to check these assumptions at
runtime when you enable assertions.
Local Ghost Variables¶
You can use ghost code to enhance what you can express inside intermediate
assertions in the same way we did above to enhance our contracts in
specifications. In particular, you'll commonly have local variables or
constants whose only purpose is to be used in assertions. You'll mostly
use these ghost variables to store previous values of variables or
expressions you want to refer to in assertions. They're especially useful
to refer to initial values of parameters and expressions since the
'Old attribute is only allowed in postconditions.
In the example below, we want to help GNATprove verify the postcondition of
P. We do this by introducing a local ghost constant,
represent this value and writing an assertion in both branches of an
if statement that repeats the postcondition, but using
You can also use local ghost variables for more complex purposes such as
building a data structure that serves as witness for a complex property of
a subprogram. In our example, we want to prove that the
doesn't create new elements, that is, that all the elements present in
A after the sort were in
A before the sort. This property isn't
enough to ensure that a call to
Sort produces a value for
a permutation of its value before the call (or that the values are indeed
sorted). However, it's already complex for a prover to verify because it
involves a nesting of quantifiers. To help GNATprove, you may find it
useful to store, for each index
I, an index
J that has the expected
procedure Sort (A : in out Nat_Array) with Post => (for all I in A'Range => (for some J in A'Range => A (I) = A'Old (J))) is Permutation : Index_Array := (1 => 1, 2 => 2, ...) with Ghost; begin ... end Sort;
Ghost procedures can't affect the value of normal variables, so they're mostly used to perform operations on ghost variables or to group together a set of intermediate assertions.
Abstracting away the treatment of assertions and ghost variables inside a ghost procedure has several advantages. First, you're allowed to use these variables in any way you choose in code inside ghost procedures. This isn't the case outside ghost procedures, where the only ghost statements allowed are assignments to ghost variables and calls to ghost procedures.
As an example, the
for loop contained in
appear by itself in normal code.
Using the abstraction also improves readability by hiding complex code that isn't part of the functional behavior of the subprogram. Finally, it can help GNATprove by abstracting away assertions that would otherwise make its job more complex.
In the example below, calling
X as an operand only
P (X) to the proof context instead of the larger set of assertions
required to verify it. In addition, the proof of
P need only be done
once and may be made easier not having any unnecessary information present
in its context while verifying it. Also, if GNATprove can't fully verify
Prove_P, you can review the remaining assumptions more easily since
they're in a smaller context.
procedure Prove_P (X : T) with Ghost, Global => null, Post => P (X);
Handling of Loops¶
When the program involves a loop, you're almost always required to provide additional annotations to allow GNATprove to complete a proof because the verification techniques used by GNATprove doesn't handle cycles in a subprogram's control flow. Instead, loops are flattened by dividing them into several acyclic parts.
As an example, let's look at a simple loop with an exit condition.
Stmt1; loop Stmt2; exit when Cond; Stmt3; end loop; Stmt4;
As shown below, the control flow is divided into three parts.
The first, shown in yellow, starts earlier in the subprogram and enters the loop statement. The loop itself is divided into two parts. Red represents a complete execution of the loop's body: an execution where the exit condition isn't satisfied. Blue represents the last execution of the loop, which includes some of the subprogram following it. For that path, the exit condition is assumed to hold. The red and blue parts are always executed after the yellow one.
GNATprove analyzes these parts independently since it doesn't have a way to track how variables may have been updated by an iteration of the loop. It forgets everything it knows about those variables from one part when entering another part. However, values of constants and variables that aren't modified in the loop are not an issue.
In other words, handling loops in that way makes GNATprove imprecise when verifying a subprogram involving a loop: it can't verify a property that relies on values of variables modified inside the loop. It won't forget any information it had on the value of constants or unmodified variables, but it nevertheless won't be able to deduce new information about them from the loop.
For example, consider the function
Find which iterates over the array
A and searches for an element where
E is stored in
At the end of each loop iteration, GNATprove knows that the value stored at
I``in ``A``must not be ``E. (If it were, the loop wouldn't have
reached the end of the interation.) This proves the second assertion. But
it's unable to aggregate this information over multiple loop iterations to
deduce that it's true for all the indexes smaller than
I, so it can't
prove the first assertion.
To overcome these limitations, you can provide additional information to GNATprove in the form of a loop invariant. In SPARK, a loop invariant is a Boolean expression which holds true at every iteration of the loop. Like other assertions, you can have it checked at runtime by compiling the program with assertions enabled.
The major difference between loop invariants and other assertions is the way it's treated for proofs. GNATprove performs the proof of a loop invariant in two steps: first, it checks that it holds for the first iteration of the loop and then it checks that it holds in an arbitrary iteration assuming it held in the previous iteration. This is called proof by induction.
As an example, let's add a loop invariant to the
Find function stating
that the first element of
A is not
To verify this invariant, GNATprove generates two checks. The first checks
that the assertion holds in the first iteration of the loop. This isn't
verified by GNATprove. And indeed there's no reason to expect the first
A to always be different from
E in this iteration.
However, the second check is proved: it's easy to deduce that if the first
A was not
E in a given iteration it's still not
the next. However, if we move the invariant to the end of the loop, then it
is successfully verified by GNATprove.
Not only do loop invariants allow you to verify complex properties of loops, but GNATprove also uses them to verify other properties, such as the absence of runtime errors over both the loop's body and the statements following the loop. More precisely, when verifying a runtime check or other assertion there, GNATprove assumes that the last occurrence of the loop invariant preceding the check or assertion is true.
Let's look at a version of
Find where we use a loop invariant instead
of an assertion to state that none of the array elements seen so far are
This version is fully verified by GNATprove! This time, it proves that the
loop invariant holds in every iteration of the loop (separately proving
this property for the first iteration and then for the following
iterations). It also proves that none of the elements of
A are equal to
E after the loop exits by assuming that the loop invariant holds in the
last iteration of the loop.
Finding a good loop invariant can turn out to be quite a challenge. To make this task easier, let's review the four good properties of a good loop invariant:
|INIT||It should be provable in the first iteration of the loop.|
|INSIDE||It should allow proving the absence of run-time errors and local assertions inside the loop.|
|AFTER||It should allow proving absence of run-time errors, local assertions, and the subprogram postcondition after the loop.|
|PRESERVE||It should be provable after the first iteration of the loop.|
Let's look at each of these in turn. First, the loop invariant should be provable in the first iteration of the loop (INIT). If your invariant fails to achieve this property, you can debug the loop invariant's initialization like any failing proof attempt using strategies for Debugging Failed Proof Attempts.
Second, the loop invariant should be precise enough to allow GNATprove to prove absence of runtime errors in both statements from the loop's body (INSIDE) and those following the loop (AFTER). To do this, you should remember that all information concerning a variable modified in the loop that's not included in the invariant is forgotten by GNATprove. In particular, you should take care to include in your invariant what's usually called the loop's frame condition, which lists properties of variables that are true throughout the execution of the loop even though those variables are modified by the loop.
Finally, the loop invariant should be precise enough to prove that it's preserved through successive iterations of the loop (PRESERVE). This is generally the trickiest part. To understand why GNATprove hasn't been able to verify the preservation of a loop invariant you provided, you may find it useful to repeat it as local assertions throughout the loop's body to determine at which point it can no longer be proved.
As an example, let's look at a loop that iterates through an array
and applies a function
F to each of its elements.
After the loop, each element of
A should be the result of applying
F to its previous value. We want to prove this. To specify this
property, we copy the value of
A before the loop into a ghost variable,
A_I. Our loop invariant states that the element at each index less than
K has been modified in the expected way. We use the
attribute to refer to the value of
A on entry of the loop instead of
Does our loop invariant have the four properties of a good loop-invariant?
When launching GNATprove, we see that
INIT is fulfilled: the
invariant's initialization is proved. So are
potential runtime errors are reported and the assertion following the loop
is successfully verified.
The situation is slightly more complex for the
property. GNATprove manages to prove that the invariant holds after the
first iteration thanks to the automatic generation of frame conditions. It
was able to do this because it completes the provided loop invariant with
the following frame condition stating what part of the array hasn't been
modified so far:
pragma Loop_Invariant (for all J in K .. A'Last => A (J) = (if J > K then A'Loop_Entry (J)));
GNATprove then uses both our and the internally-generated loop invariants
PRESERVE. However, in more complex cases, the heuristics used
by GNATprove to generate the frame condition may not be sufficient and
you'll have to provide one as a loop invariant. For example, consider a
Map where the result of applying
F to an element at
K is stored at index
You need to uncomment the second loop invariant containing the frame condition in order to prove the assertion after the loop.
Code Examples / Pitfalls¶
This section contains some code examples and pitfalls.
We implement a ring buffer inside an array
Content, where the contents
of a ring buffer of length
Length are obtained by starting at index
First and possibly wrapping around the end of the buffer. We use a
Get_Model to return the contents of the ring buffer for
use in contracts.
This is correct:
Get_Model is used only in contracts. Calls to
Get_Model make copies of the buffer's contents, which isn't efficient,
but is fine because
Get_Model is only used for verification, not in
production code. We enforce this by making it a ghost function. We'll
produce the final production code with appropriate compiler switches (i.e.,
-gnata) that ensure assertions are ignored.
Instead of using a ghost function,
Get_Model, to retrieve the contents
of the ring buffer, we're now using a global ghost variable,
This example isn't correct.
Model, which is a ghost variable, must not
influence the return value of the normal function
Valid_Model is only used in specifications, we should have marked it as
Ghost. Another problem is that
Model needs to be updated inside
Push_Last to reflect the changes to the ring buffer.
Ghost and update
This example is correct. The ghost variable
Model can be referenced
both from the body of the ghost function
Valid_Model and the non-ghost
Push_Last as long as it's only used in ghost statements.
We're now modifying
Push_Last to share the computation of the new
length between the operational and ghost code.
This example isn't correct. We didn't mark local constant
Ghost, so it can't be computed from the value of ghost variable
Model. If we made
New_Length a ghost constant, the compiler would
report the problem on the assignment from
correct solution here is to compute
New_Length from the value of the
Let's move the code updating
Model inside a local ghost procedure,
Update_Model, but still using a local variable,
compute the length.
Everything's fine here.
Model is only accessed inside
itself a ghost procedure, so it's fine to declare local variable
New_Length without the
Ghost aspect: everything inside a ghost
procedure body is ghost. Moreover, we don't need to add any contract to
Update_Model: it's inlined by GNATprove because it's a local procedure
without a contract.
Max_Array takes two arrays of the same length (but not
necessarily with the same bounds) as arguments and returns an array with
each entry being the maximum values of both arguments at that index.
This program is correct, but GNATprove can't prove that
J is always in
the index range of
B (the unproved index check) or even that it's
always within the bounds of its type (the unproved overflow check). Indeed,
when checking the body of the loop, GNATprove forgets everything about the
current value of
J because it's been modified by previous loop
iterations. To get more precise results, we need to provide a loop
Let's add a loop invariant that states that
J stays in the index range
B and let's protect the increment to
J by checking that it's not
already the maximal integer value.
The loop invariant now allows verifying that no runtime error can occur in
the loop's body (property INSIDE seen in section Loop Invariants). Unfortunately, GNATprove fails to verify that the invariant
stays valid after the first iteration of the loop (property
PRESERVE). Indeed, knowing that
J is in
B'Range in a given
iteration isn't enough to prove it'll remain so in the next iteration. We
need a more precise invariant, linking
J to the value of the loop index
J = I - A'First + B'First.
We now consider a version of
Max_Array which takes arguments that have
the same bounds. We want to prove that
Max_Array returns an array of
the maximum values of both its arguments at each index.
Here, GNATprove doesn't manage to prove the loop invariant even for the
first loop iteration (property INIT seen in section Loop Invariants). In fact, the loop invariant is incorrect, as you can see by
executing the function
Max_Array with assertions enabled: at each loop
R contains the maximum of
B only until
1 because the
I'th index wasn't yet handled.
We now consider a procedural version of
Max_Array which updates its
first argument instead of returning a new array. We want to prove that
Max_Array sets the maximum values of both its arguments into each index
in its first argument.
Everything is proved. The first loop invariant states that the values of
A before the loop index contains the maximum values of the arguments of
Max_Array (referring to the input value of
A'Loop_Entry). The second loop invariant states that the values of
A beyond and including the loop index are the same as they were on
entry. This is the frame condition of the loop.
Let's remove the frame condition from the previous example.
Everything is still proved. GNATprove internally generates the frame
condition for the loop, so it's sufficient here to state that
the loop index contains the maximum values of the arguments of