previous up next Previous: 4.1 Equivalence of OCL and
Up: 4. Completeness
 Next: 4.3 Turing Completeness


   
4.2 More than Complete

Data manipulation languages normally have capabilities beyond those of relational calculus, like arithmetic operations, assignment commands, and aggregate functions. Often algebraic expressions must involve some arithmetic operations like a < b + c; notice that e.g. + does not appear in the relational algebra. The assignment of a computed relation to be the value of a relation name is undoubtedly useful, see discussion about Cartesian product or projection in the previous section. Furthermore some operations like sum, average, max, min are also desirable as aggregate functions, that can be applied to columns of a relation to obtain a single quantity. For these reasons a (complete) query language with such capabilities is said to be more than complete [Ull82, p. 175].

OCL includes the following arithmetic and aggregate functions:

type of operands operations
Real =, +, -, *, /, abs, floor, max, min, <, >, <=, >=
Integer =, +, -, *, /, abs, div, mod, max, min
Boolean =, or, xor, and, not, implies, if-then-else4
String =, size, concat, toUpper, toLower, substring
Enumeration =, <>
Notice that Integer is a subclass of Real, that is, for each formal parameter of type Real an actual parameter of type Integer can be used. OCL does not include assignment commands.

Some languages are more than complete even after eliminating the arithmetic and aggregate functions. Such languages allow, for example QBE (Query-by-Example, see [Ull82]), the computation of the transitive closure of a relation. The transitive closure R+ of a relation $R \subseteq A \times A$ is the least transitive relation containing R:

1.
xRy $\Rightarrow$ xR+y;
2.
(x R+ y $\wedge$ y R+ z) $\Rightarrow$ xR+z;
3.
if S satisfies (1) and (2), then R+ $\subseteq$ S.
The transitive closure of a relation cannot be expressed in relational algebra or relational calculus; see [AA93,AU79]. The transitive closure of a relation is needed in our example of Fig. 1 if we want to ensure that no instance of Group is (recursively) an element of itself. In the remainder of this section we will study how to express this constraint in OCL.

In order to write an OCL constraint expressing that any instance of Group is not an element of itself, we need a relation that, as R above, is a subset of the Cartesian product of a set A with itself. In the framework of UML class diagrams, the easiest is to have an association class (and not just an association name) as depicted in Fig. 3.

  
Figure 3: An association class R from class A to class A
\includegraphics{trans.eps}

We therefore lift the association name Contains to an association class as pictured in Fig. 4.
  
Figure 4: The Contains association class
\includegraphics{transitivity.eps}

Notice that the relation Contains might contain pairs of objects of different classes, like an instance of Group containing an instance of (a subclass of the abstract class) GeoFigure. The subset (which we call r) of instances of Contains that contains only the desired pairs, namely an instance of Group that is an element of an instance of Group, can be computed as follows:
    -- algorithm computing r included in Group x Group:
    Contains.allInstances->iterate(
    pair : Contains;
    r : Set(Contains) = Set{}
    |   if pair.elements.oclIsTypeOf(Group)
        then r->including(pair) else r
        endif
    )
Let us remark that, in the above query, every pair is just one pair of an instance of Group and a single instance of GraphicObject, which can be referred to by container resp. elements. The class GraphicObject is abstract, that means, any of its instances must be an instance of any of its concrete subclasses. In the iteration, therefore, we only include those pairs whose GraphicObject-component is of type Group. Now the Warshall's algorithm (see e.g. [Ger93]) can be applied to r and in this way a new set s of instances of Contains is computed that is the transitive closure of r:
    -- algorithm computing s, the transitive closure of r:
    Group.allInstances->iterate(
    g3 : Group ;
    s : Set(Contains) = r
    |   Group.allInstances->iterate(
        g2 : Group ;
        s2 : Set(Contains) = s
        |   Group.allInstances->iterate(
            g1 : Group ;
            s1 : Set(Contains) = s2
            |   if s1->exists(c1,c2 : Contains |
                   (c1.container=g1 and c1.elements=g2) or
                   (c1.container=g1 and c1.elements=g3 and
                    c2.container=g3 and c2.elements=g2)  )
                then s1->including(c) else s1 endif
                -- where:
                -- c : Contains with c.container=g1 and c.elements=g2
    )   )   )
In the above algorithm the sets s1 and s2 had to be declared in order for s to be properly updated. The variant of exists with two iterators used in this algorithm is inexistent in OCL. However, given the equivalences $(\exists x)\varphi \equiv \neg(\forall x)\neg\varphi$and $%
(\exists x)(\exists y)\varphi %
\equiv \neg(\forall x)\neg(\exists y)\varphi ...
...eg\neg(\forall y)\neg\varphi %
\equiv \neg(\forall x)(\forall y)\neg\varphi, %
$
collection->exists(e1,e2 | <Boolean-expr-on-e1-and-e2>)
is the abbreviation we use for
not collection->forAll(e1,e2 | not <Boolean-expr-on-e1-and-e2>).

The OCL constraint we are looking for is the non-reflexivity of the set s:

    not s->exists(d : Contains | d.container=d.elements)
The desired OCL constraint is therefore the following:
    not 
    Group.allInstances->iterate(
    g3 : Group ;
    s : Set(Contains) = Contains.allInstances->iterate(
                        pair : Contains;
                        r : Set(Contains) = Set{}
                        |   if pair.elements.oclIsTypeOf(Group)
                            then r->including(pair) else r
                            endif
                        )
    |   Group.allInstances->iterate(
        g2 : Group ;
        s2 : Set(Contains) = s
        |   Group.allInstances->iterate(
            g1 : Group ;
            s1 : Set(Contains) = s2
            |   if s1->exists(c1,c2 : Contains |
                   (c1.container=g1 and c1.elements=g2) or
                   (c1.container=g1 and c1.elements=g3 and
                    c2.container=g3 and c2.elements=g2)  )
                then s1->including(c) else s1 endif
                -- where:
                -- c : Contains with c.container=g1 and c.elements=g2
    )   )   )->exists(d : Contains | d.container=d.elements)

Here we discover a problem: the instance c of Contains with c.container=g1 and c.elements=g2, necessary when computing s for recording partial paths, cannot be created. This instance is not meant to be added to the actual state of the model, the same as many sets and such that can be calculated using OCL are not meant to be added to the actual state of the model. Furthermore, and differently to what was remarked in Section 4.1 about Cartesian product and projection, we are not trying to generate an instance of an unknown type but of a type present in the class diagram.

An alternative would be to manipulate, instead of a set s of instances of Contains, a set s' of pairs of instances of Group (represented by a sequence of length two) such that, if the sequence {g1,g2} belongs to s', then there is a path in r from g1 to g2. Unfortunately this is impossible since within OCL all collections of collections are automaticaly flattened (although it is not specified what type has the result of flattening a set of sequences).

The third idea that comes to mind, in order to overcome nested collections, is to manipulate a sequence t with an even number of elements such that if g1 and g2 belong to t, g1 is at an odd position i of t, and g2 is at position i+1, then there is a path in r from g1 to g2. This seems to be a satisfactory solution, the problematic sentence

    s1->including(c)
could then be replaced by
    t1->append(g1)->append(g2)
(also in this case we need auxiliary variables t1 and t2). Now, the initial value of t is not r but
    t : Sequence(Group)
    = r->iterate(
      pair : Contains ;
      res : Sequence(Group) = Sequence{}
      |   res->append(pair.container)->append(pair.elements)
      )

Also the algorithm computing the transitive closure of r has to be accordingly adapted wherever s (or s1 or s2) is used. There is just one more place where one of these variables is used, namely when the existence is asked of two pairs c1 and c2 in s1 that satisfy certain property. Now we are not looking for the existence of two elements in t1 but for the existence of two subsequences of length two both beginning at an odd position and satisfying the same property. We access the subsequences using an index that points to their position in t. We replace the test

    s1->exists(c1,c2 : Contains |
    (c1.container=g1 and c1.elements=g2) or
    (c1.container=g1 and c1.elements=g3 and
     c2.container=g3 and c2.elements=g2)  )
by the following
    Sequence{1..(t1->size)/2}->exists(i,j : Integer |
    (t1->at(2*i-1) = g1 and t1->at(2*i) = g2) or
    (t1->at(2*i-1) = g1 and t1->at(2*i) = g3 and
     t1->at(2*j-1) = g3 and t1->at(2*j) = g2)       )
where Sequence{1..(t1->size)/2} is the sequence of integer numbers between 1 and the size of t1 (which we know of even length) divided by two, and the elements of this sequence are used to index the sequence t1.

Finally in a similar way we can test the non-reflexivity of t:

    not Sequence{1..(t->size)/2}->exists(i : Integer
    |   t->at(2*i-1) = t->at(2*i)                  )

There is still a last hurdle to surmount. t is just a mnemonic we have used, that has to be replaced by the algorithm that computes it, since we cannot use a variable. t occurs three times in the above test of non-reflexivity, and we would like to calculate it only once. Moreover, if we replace each one of the three occurrences of t in the test above, then we cannot be sure that each instance of the sequence is equal to another one, and in particular when testing if t->at(2*i) = t->at(2*i+1) it is absolutely necessary that both t's at each side of the equation are equal. The only constructs using new names are the operations iterating on collections, and of them only iterate allows for a variable of an arbitrary type (namely the accumulator) and the assignment of an arbitrary value to this variable. But the last value of the accumulator is the return value of an iterate, and we need a Boolean.

We can therefore just assume that the different occurrences of the algorithm computing t return always the same sequence. The resulting algorithm is as follows:

    not Sequence{1..(
        -- t
        Group.allInstances->iterate(
        g3 : Group ;
        t : Sequence(Group) = -- r
            Contains.allInstances->iterate(
            pair : Contains;
            r : Set(Contains) = Set{}
            |   if pair.elements.oclIsTypeOf(Group)
                then r->including(pair) else r
                endif
            )->iterate(
            pair : Contains ;
            res : Sequence(Group) = Sequence{}
            |   res->append(pair.container)->append(pair.elements)
            )
        |   Group.allInstances->iterate(
            g2 : Group ;
            t2 : Sequence(Group) = t
            |   Group.allInstances->iterate(
                g1 : Group ;
                t1 : Sequence(Group) = t2
                |   if Sequence{1..(t1->size)/2}->exists(i,j : Integer |
                       (t1->at(2*i-1) = g1 and t1->at(2*i) = g2) or
                       (t1->at(2*i-1) = g1 and t1->at(2*i) = g3 and 
                        t1->at(2*j-1) = g3 and t1->at(2*j) = g2)       )
                    then t1->append(g1)->append(g2) else t1 endif
        )   )   )   
        ->size)/2}->exists(i : Integer
    |   -- t
        Group.allInstances->iterate(
        g3 : Group ;
        t : Sequence(Group) = -- r
            Contains.allInstances->iterate(  
            pair : Contains;
            r : Set(Contains) = Set{}
            |   if pair.elements.oclIsTypeOf(Group)
                then r->including(pair) else r
                endif
            )->iterate( 
            pair : Contains ;
            res : Sequence(Group) = Sequence{}
            |   res->append(pair.container)->append(pair.elements)
            )
        |   Group.allInstances->iterate(
            g2 : Group ;
            t2 : Sequence(Group) = t
            |   Group.allInstances->iterate(
                g1 : Group ;
                t1 : Sequence(Group) = t2
                |   if Sequence{1..(t1->size)/2}->exists(i,j : Integer |
                       (t1->at(2*i-1) = g1 and t1->at(2*i) = g2) or
                       (t1->at(2*i-1) = g1 and t1->at(2*i) = g3 and
                        t1->at(2*j-1) = g3 and t1->at(2*j) = g2)       )
                    then t1->append(g1)->append(g2) else t1 endif
        )   )   )
        ->at(2*i-1) =
        -- t
        Group.allInstances->iterate(
        g3 : Group ;
        t : Sequence(Group) = -- r
            Contains.allInstances->iterate(  
            pair : Contains;
            r : Set(Contains) = Set{}
            |   if pair.elements.oclIsTypeOf(Group)
                then r->including(pair) else r
                endif
            )->iterate( 
            pair : Contains ;
            res : Sequence(Group) = Sequence{}
            |   res->append(pair.container)->append(pair.elements)
            )
        |   Group.allInstances->iterate(
            g2 : Group ;
            t2 : Sequence(Group) = t
            |   Group.allInstances->iterate(
                g1 : Group ;
                t1 : Sequence(Group) = t2
                |   if Sequence{1..(t1->size)/2}->exists(i,j : Integer |
                       (t1->at(2*i-1) = g1 and t1->at(2*i) = g2) or
                       (t1->at(2*i-1) = g1 and t1->at(2*i) = g3 and
                        t1->at(2*j-1) = g3 and t1->at(2*j) = g2)       )
                    then t1->append(g1)->append(g2) else t1 endif
        )   )   )
        ->at(2*i)
    )

Although the computation capabilities for computing the transitive closure of a binary relation are in principle present in OCL, here again a concept of tuple functions would have made the above algorithm considerably simpler. At this point we want to remark the notion of relational completeness as formulated in [Ozk86, p. 94]:

In language implementations, the following two operations are needed to assure relational completeness:
(a)
The ability to represent assignments, that is, the ability to create new relations to store the results of relational algebra operations that are also relations. [...]
(b)
The ability to compute transitive closures which enables recursion and/or nesting of relational algebra operations to express expressions of arbitrary complexity. [...]
Also Codd (see [Cod72]) asserted the need for more than complete languages, providing tuple and aggregate functions.