
Sather - A Language Manual - Chapter 6
Parametrized Classes and Arrays


All Sather classes may be parametrized by one or more type parameters. Type parameters are essentially placeholders for actual types; the actual type is known only when the class is used. The array class, which we have already seen, is an example of a parametrized class. Whenever a parameterized type is referred to, its parameters are specified by type specifiers. The class behaves like a non-parameterized version whose body is a textual copy of the original class in which each parameter occurrence is replaced by its specified type.


6.1 Parametrized concrete types

As an example of a parametrized class, consider the class PAIR, which can hold two objects of arbitrary types. We will refer to the types as T1 and T2:

class PAIR{T1,T2} is
   readonly attr first:T1;
   readonly attr second:T2;

   create(a_first:T1, a_second:T2):SAME is
      res ::= new;
      res.first := a_first;
      res.second := a_second;
      return res;
   end;
end;

We can use this class to hold a pair of integers, an integer and a real, and so on.

c ::= #PAIR{INT,INT}(5,5);     -- Holds a  pair of integers
d ::= #PAIR{INT,FLT}(5,5.0);   -- Holds an integer and a FLT
e ::= #PAIR{STR,INT}("this",5);-- A string and an integer
f:INT := e.second;
g:FLT := d.second;

Thus, instead of defining a new class for each different type of pair, we can just parametrize the PAIR class with different parameters.
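
To make the "textual copy" reading concrete, PAIR{INT,FLT} behaves as if a class like the following had been written by hand. This is only a sketch: the class name INT_FLT_PAIR and the variable 'p' are hypothetical and do not appear in the library.

```sather
class INT_FLT_PAIR is            -- hypothetical name, for illustration only
   readonly attr first:INT;      -- T1 replaced by INT
   readonly attr second:FLT;     -- T2 replaced by FLT

   create(a_first:INT, a_second:FLT):SAME is
      res ::= new;
      res.first := a_first;
      res.second := a_second;
      return res;
   end;
end;

p ::= #INT_FLT_PAIR(5,5.0);      -- used just like #PAIR{INT,FLT}(5,5.0)
g:FLT := p.second;
```

The parametrized class saves writing one such copy per combination of component types.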


6.1.1 Why Parametrize?

Parametrization is normally presented as a mechanism for achieving efficiency by specializing code to use particular types. However, parametrization plays an even more important conceptual role in a language with strong typing like Sather.

For instance, we could define a pair to hold $OBs

class OB_PAIR is
   readonly attr first,second:$OB;

   create(a_first, a_second:$OB):SAME is
      res ::= new;
      res.first := a_first;
      res.second := a_second;
      return res;
   end;
end; -- class OB_PAIR

There is no problem with defining OB_PAIR objects; in fact, it looks a little simpler.

c ::= #OB_PAIR(5,5);     -- Holds a  pair of integers
d ::= #OB_PAIR(5,5.0);   -- Holds an integer and a FLT

However, when the time comes to extract the components of the pair, we are in trouble:

-- f:INT := c.second; ILLEGAL! second is declared to be a $OB

We can typecase on the return value:

f_ob:$OB := c.second;
f:INT;
typecase f_ob
when INT then f := f_ob;
end;

The above code has the desired effect, but is extremely cumbersome. Imagine if you had to do this every time you removed an INT from an ARRAY{INT}! Note that the above code would raise an error if the branch in the typecase does not match.
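
If a failed match should be reported in a controlled way rather than as a typecase error, an 'else' branch can be added. The following is a sketch; the variable name 'p' and the error string are chosen for illustration only.

```sather
p ::= #OB_PAIR(5,5);
f_ob:$OB := p.second;
f:INT;
typecase f_ob
when INT then f := f_ob;             -- f_ob has type INT inside this branch
else raise "second is not an INT";   -- reached for any other type
end;
```

This makes the failure explicit, but every extraction still needs the same boilerplate, which the parametrized PAIR avoids.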

The parametrized version of the pair container gets around all these problems by essentially annotating the type of the container with the types of the objects it contains; the types of the contained objects are the type parameters.


6.2 Support for Arrays

Arrays (and, in fact, most container classes) are realized using parametrized classes in Sather. There is language support for the main array class ARRAY{T} in the form of literal expressions such as

a:ARRAY{INT} := |1,2,3|;

In addition to the standard accessing function, arrays provide many operations, ranging from trivial routines that return the size of the array to routines that will sort arbitrary arrays. See the array class in the container library for more details. There are several aspects to supporting arrays:


6.2.1 Array Access

The form 'a[4] := ..' is syntactic sugar for a call of a routine named 'aset' with the array index expressions and the right hand side of the assignment as arguments. Similarly, an expression of the form 'a[4]' is sugar for a call of 'aget' with the index expressions as arguments. In the class TRIO below we have three elements which can be accessed using array notation

class TRIO is
   private attr a,b,c:FLT;

   create:SAME is
      return new
   end;

   aget(i:INT):FLT is
      case i
      when 0 then return a
      when 1 then return b
      when 2 then return c
      else raise "Bad array index!\n";
      end;
   end;

   aset(i:INT, val:FLT) is
      case i
      when 0 then a := val;
      when 1 then b := val;
      when 2 then c := val;
      end;
   end;
end;

The array notation can then be used with objects of type TRIO

trio:TRIO := #TRIO;  -- Calls TRIO::create
trio[2] := 1.0;      -- Calls aset; the value must be a FLT
#OUT + trio[2];      -- Calls aget and prints the stored value
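
Written without the bracket notation, the two statements above correspond to direct calls of the 'aset' and 'aget' routines of TRIO. The following fragment is a sketch of that desugaring; it assumes the 'trio' variable declared above, and the local 'x' is introduced only for illustration.

```sather
trio.aset(2, 1.0);        -- same effect as  trio[2] := 1.0;
x:FLT := trio.aget(2);    -- same effect as  x:FLT := trio[2];
#OUT + x;
```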

See Operator Redefinition, chapter 7, for more details.


6.2.2 Array Classes: Including AREF and calling new();

Sather permits the user to define array classes which support an array portion whose size is determined when the array is created. An object can have an array portion by including AREF{T}.

class POLYGON is
   private include AREF{POINT}
      aget->private old_aget, aset->private old_aset;
         -- Rename aget and aset

   create(n_points:INT):SAME  is
      -- Create a new polygon with 'n_points' points
      res:SAME := new(n_points);  -- Note that new takes
      -- the size of the array as its argument
      return res;
   end;

   aget(i:INT):POINT is
      -- Valid indices run from 0 to asize-1
      if i >= asize then raise "Not enough polygon points!" end;
      return old_aget(i);
   end;

   aset(i:INT, val:POINT) is
      if i >= asize then raise "Not enough polygon points!" end;
      old_aset(i,val);
   end;
end;

Since AREF{T} already defines 'aget' and 'aset' to do the right thing, we can provide wrappers around these routines to, for instance, provide an additional warning message. The above example makes use of the POINT class from Defining Simple Classes, subsection 2.2.1. We could also have used the PAIR class defined in Parametrized concrete types, section 6.1. The following example uses the polygon class to define a triangle.

poly:POLYGON := #POLYGON(3);
poly[0] := #POINT(3,4);
poly[1] := #POINT(5,6);
poly[2] := #POINT(0,0);

AREF defines several useful routines:

asize:INT;                       -- Returns the size of the array
aelt!:T;                         -- Yields successive array elements
aelt!(once beg:INT):T;           -- Yields elements from index 'beg'
aelt!(once beg,once num:INT):T;  -- Yields 'num' elts from index 'beg'
aelt!(once beg,once num,once step:INT):T;
   -- Yields 'num' elements, starting at index 'beg' with a 'step'
   -- ... Analogous versions of aset! ...
acopy(src:SAME);                  -- Copy what fits from 'src' to self
acopy(beg:INT,src:SAME);          -- Start copying into index 'beg'
acopy(beg:INT,num:INT,src:SAME);
   -- Copy 'num' elements into self starting at index 'beg' of self
aind!:INT;                         -- Yields successive array indices

When possible, use the above iterators since they are built-in and can be more efficient than other iterators.
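
As a sketch of how these iterators are used, the hypothetical class INT_VEC below (not part of the library) includes AREF{INT} and uses 'aind!' and 'aelt!' from within its own routines. It assumes the signatures listed above; the names INT_VEC, 'fill' and 'sum' are made up for this example.

```sather
class INT_VEC is                 -- hypothetical class, for illustration only
   private include AREF{INT};    -- gives the object an array portion

   create(n:INT):SAME is
      return new(n);             -- 'n' is the size of the array portion
   end;

   fill(val:INT) is
      -- Store 'val' at every index; the loop ends when aind! quits
      loop
         aset(aind!, val);
      end;
   end;

   sum:INT is
      -- Add up all elements; the loop ends when aelt! quits
      res:INT := 0;
      loop
         res := res + aelt!;
      end;
      return res;
   end;
end;

v ::= #INT_VEC(3);
v.fill(2);
total:INT := v.sum;              -- 2 + 2 + 2
```

Neither loop needs an explicit index variable or termination test, since the iterator itself ends the loop.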


6.2.3 Standard Arrays: ARRAY{T}

The class ARRAY{T} in the standard library is not a primitive data type. It is based on a built-in class AREF{T} which provides objects with an array portion. ARRAY obtains this functionality using an include, but chooses to modify the visibility of some of the methods. It also defines additional methods such as contains, sort etc. The methods aget, aset and asize are defined as private in AREF, but ARRAY redefines them to be public.

class ARRAY{T} is
   private include AREF{T}
      -- Make these public.
      aget->aget,
      aset->aset,
      asize->asize;
   ...

   contains(e:T):BOOL is ... end
   ...
end;

The array portion appears if there is an include path from the type to AREF for reference types or to AVAL for immutable types.

Array Literals

Sather provides support for directly creating arrays from literal expressions:

a:ARRAY{INT} := |2,4,6,8|;
b:ARRAY{STR} := |"apple","orange"|;

The type is taken to be the declared type of the context in which it appears and it must be ARRAY{T} for some type T. An array creation expression may not appear

The type of each expression in the array literal must be a subtype of T. The size of the created array is equal to the number of specified expressions. The expressions in the literal are evaluated left to right and the results are assigned to successive array elements.


6.2.4 Multi-dimensional Arrays

Special support is neither present nor needed for multi-dimensional arrays. The 'aget' and 'aset' routines can take multiple arguments, thus permitting multiple indices. The library does provide ARRAY2 and ARRAY3 classes, which provide the necessary index computation. All standard array classes are addressed in row-major order. However, the MAT class is addressed in column-major order for compatibility with external FORTRAN routines[13]. Multi-dimensional array literals may be expressed by nesting standard array literals

a:ARRAY{ARRAY{INT}} := ||1,2,3|,|3,4,5|,|5,6,7||;
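
To illustrate 'aget' and 'aset' with more than one index, here is a sketch of a user-defined two-dimensional class stored in row-major order. The class GRID and its attribute names are hypothetical; this is not the library's ARRAY2 implementation, only the same idea in miniature.

```sather
class GRID is                    -- hypothetical class, for illustration only
   private include AREF{FLT}
      aget->private flat_aget, aset->private flat_aset;
   readonly attr rows, cols:INT;

   create(n_rows, n_cols:INT):SAME is
      res:SAME := new(n_rows * n_cols);   -- one flat array portion
      res.rows := n_rows;
      res.cols := n_cols;
      return res;
   end;

   aget(i,j:INT):FLT is
      return flat_aget(i * cols + j);     -- row-major index computation
   end;

   aset(i,j:INT, val:FLT) is
      flat_aset(i * cols + j, val);
   end;
end;

g ::= #GRID(2,3);
g[1,2] := 4.5;       -- sugar for g.aset(1,2,4.5)
x:FLT := g[1,2];     -- sugar for g.aget(1,2)
```

Note that this sketch only relies on the bounds of the flat array portion; it does not check 'i' and 'j' individually against 'rows' and 'cols'.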


6.3 Type Bounds

When writing more complex parametrized classes, it is frequently useful to be able to perform operations on variables which are of the type of the parameter. For instance, in writing a sorting algorithm for arrays, you might want to make use of the 'less than' operator on the array elements.

If a parameter declaration is followed by a type constraint clause ('<' followed by a type specifier), then the parameter can only be replaced by subtypes of the constraining type. If a type constraint is not explicitly specified, then '< $OB' is taken as the constraint. A type constraint specifier may not refer to 'SAME'. The body of a parameterized class must be type-correct when the parameters are replaced by any subtype of their constraining types; this allows type-safe independent compilation.

For our example, we will return to employees and managers. Recall that the employee abstraction was defined as:

abstract class $EMPLOYEE is
   name:STR;
   id:INT;
end;

We can now build a container class that holds employees. The container class makes use of a standard library class, a LIST, which is also parametrized over the types of things being held.

class EMPLOYEE_REGISTER{ETP < $EMPLOYEE} is
   private attr emps:LIST{ETP};

   create:SAME is
      res ::= new;
      res.emps := #;
      return res;
   end;

   add_employee(e:ETP) is
      emps.append(e);
   end;

   n_employees:INT is
      return emps.size;
   end;

   longest_name:INT is
    -- Return the length of the longest employee name
      i:INT := 0;
      cur_longest:INT := 0;
      loop
         until!(i=n_employees);
         employee:ETP := emps[i];
         name:STR := employee.name;
         -- The type-bound has the '.name' routine
         if name.size > cur_longest then
            cur_longest := name.size;
         end;
         i := i + 1;   -- Advance to the next employee
      end;
      return cur_longest;
   end;
end;

The routine of interest is 'longest_name'. The use of this routine is not important, but we can imagine that such a routine might be useful in formatting some printout of employee data. In this routine we go through all employees in the list, and for each employee we look at the 'name'. With the typebound on ETP, we know that ETP must be a subtype of $EMPLOYEE. Hence, it must have a routine 'name' which returns a STR.

If we did not have the typebound (there is an implicit typebound of $OB), we could not do anything with the resulting 'employee'; all we could assume is that it was a $OB, which is not very useful.
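
The sorting motivation mentioned at the start of this section follows the same pattern. The sketch below bounds the parameter by the library's $IS_LT{T} abstraction (shown in section 6.3.2) so that '<' may be applied to elements. The class name MAX_FINDER and its routine 'max' are hypothetical, and the routine assumes a non-empty array.

```sather
class MAX_FINDER{T < $IS_LT{T}} is    -- hypothetical class, for illustration only
   max(a:ARRAY{T}):T is
      -- Return the largest element of 'a'; 'a' must not be empty
      res:T := a[0];
      i:INT := 1;
      loop
         until!(i = a.asize);
         -- '<' is legal here because the bound supplies is_lt
         if res < a[i] then res := a[i] end;
         i := i + 1;
      end;
      return res;
   end;
end;

a:ARRAY{INT} := |3,9,4|;
m:INT := MAX_FINDER{INT}::max(a);     -- the largest element, 9
```

Without the bound, 'res < a[i]' could not be type-checked, since nothing would be known about T beyond $OB.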


6.3.1 Why have typebounds?

The purpose of the type bound is to permit type checking of a parametrized class over all possible instantiations. Note that the current compiler does not do this, thus permitting some possibly illegal code to go unchecked until an instantiation is attempted.


6.3.2 Supertyping and Type Bounds

The need for supertyping clauses arises from our definition of type-bounds in parametrized types. The parameters can only be instantiated by subtypes of their type bounds.

You may, however, wish to create a parametrized type which is instantiated with classes from an existing library which are not under the typebound you require. For instance, suppose you want to create a class PRINTABLE_SET, whose parameters must support both hash and the standard string printing routine str. The library contains the following abstract classes.

abstract class $HASH < $IS_EQ is
   hash:INT;
end;

abstract class $STR is
   str:STR;
end;

However, our PRINTABLE_SET{T} must take all kinds of objects that support both $HASH and $STR, such as integers, floating point numbers etc. How do we support this, without modifying the distributed library?

abstract class $HASH_AND_STR > INT, FLT, STR is
   hash:INT;
   str:STR;
end;

class PRINTABLE_SET{T < $HASH_AND_STR} is
   -- Set whose elements can be printed

   str:STR is
      res:STR := "";
      loop
         res := res + ",".separate!(elt!.str);
      end;
      return res;
   end;
end;

The PRINTABLE_SET class can now be instantiated using integers, floating point numbers and strings. Thus, supertyping provides a way of creating supertypes without modifying the original classes (which is not possible if the original types are in a different library).

Note that this is only useful if the original classes cannot be modified. In general, it is usually far simpler and easier to understand if standard subtyping is used.

A more complicated example arises if we want to create a sorted set, whose elements must be hashable and comparable. From the library we have:

abstract class $HASH < $IS_EQ is
   hash:INT;
end;

abstract class $IS_LT{T} < $IS_EQ is  -- comparable values
   is_lt(elt:T):BOOL;
end;

However, our ORDERED_SET{T} must only take objects that support both $HASH and $IS_LT{T}

abstract class $ORDERED_HASH{T} < $HASH, $IS_LT{T} is
end;

class ORDERED_SET{T < $ORDERED_HASH{T}} is
   -- Set whose elements can be sorted

   sort is
      -- ... uses the < routine on elements which are of type T
   end;
end;

The above definition works in a straightforward way for user classes. For instance, a POINT class as defined below can be used in an ORDERED_SET{POINT}

class POINT < $ORDERED_HASH{POINT} is ...
   -- define hash:INT and is_lt(POINT):BOOL

But how can you create an ordered set of integers, for instance? The solution is somewhat laborious. You have to create dummy classes that specify the subtyping link for each different parametrization of $ORDERED_HASH

abstract class $DUMMY_INT > INT < $ORDERED_HASH{INT} is end;
abstract class $DUMMY_STR > STR < $ORDERED_HASH{STR} is end;
abstract class $DUMMY_FLT > FLT < $ORDERED_HASH{FLT} is end;

Note that the above classes are only needed because we are not directly modifying INT and FLT to subtype from $ORDERED_HASH{T}. In the following diagram, recall that since there is no relationship between different class parametrizations, it is necessary to think of them as separate types.


6.4 Parametrized Abstract Classes

Abstract class definitions may be parameterized by one or more type parameters within enclosing braces; in the example, the type parameter is 'T'. There is no implicit type relationship between different parametrizations of an abstract class. Parameter names are local to the abstract class definition and they shadow non-parameterized types with the same name. Parameter names must be all uppercase, and they may be used within the abstract type definition as type specifiers. Whenever a parameterized type is referred to, its parameters are specified by type specifiers. The abstract class definition behaves like a non-parameterized version whose body is a textual copy of the original definition in which each parameter occurrence is replaced by its specified type. Parameterization may be thought of as a structured macro facility.

Sather abstract classes may be similarly parametrized by any number of type parameters. Each type parameter may have an optional type bound; this forces any actual parameter to be a subtype of the corresponding type bound. Given the following definitions,

abstract class $A{T < $BAR} is
   foo(b:T):T;
end; -- abstract class $A{T}

abstract class $BAR is
end;

class BAR < $BAR is
end;

we may then declare an abstract variable a:$A{BAR}. BAR instantiates the parameter T and hence must be under the type bound for T, namely $BAR. If a type-bound is not specified then a type bound of $OB is assumed.

How are different parametrizations related?

It is sometimes natural to want a $LIST{MY_FOO} < $LIST{$MY_FOO}. Sather, however, specifies no subtyping relationship between various parametrizations. Permitting such implicit subtyping relationships between different parametrizations of a class can lead to type safety violations.
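
The kind of violation meant here can be sketched with arrays. The fragment below is hypothetical: the commented-out lines show what would go wrong if, contrary to the rule above, ARRAY{INT} were treated as a subtype of ARRAY{$OB}. The variable names are chosen for illustration.

```sather
ints:ARRAY{INT} := |1,2,3|;

-- obs:ARRAY{$OB} := ints;      -- ILLEGAL: no subtyping between parametrizations
-- obs[0] := "not an integer";  -- would type-check, since STR is a subtype of $OB
-- i:INT := ints[0];            -- but 'ints' would now hold a STR in slot 0

i:INT := ints[0];               -- safe as written: only INTs can be stored in 'ints'
```

Because the first assignment is rejected, no alias with a weaker element type can ever be used to store a non-INT into 'ints'.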


6.5 Overloading

There are two aspects to the use of overloading in a parametrized class. One aspect is the behavior of the interface of the parametrized class itself. The other aspect is calls within the parametrized class where one or more arguments have the type of one of the type parameters, or are related to the type parameters through static type inference (see .


6.5.1 Overloading In the Parametrized Class Interface

An argument whose type is a class parameter cannot be used to resolve overloading (such an argument is similar to an 'out' argument or a return type in this respect).

class FOO{T1<$STR ,T2<$ELT} is
   bar(a:T1);  -- (1)
   bar(a:T2);  -- (2)

Even though the type bounds for T1 and T2 are distinct and one is more specific than the other, this is not a sufficient constraint on the actual instantiation of the parameter. In a class such as

FOO{ARRAY{INT}, ARRAY{INT}}

for instance, the two versions of 'bar' will essentially be identical.


6.5.2 Overloading Resolution within the Parametrized Class

Note: The current ICSI compiler does not yet have this behaviour implemented. In the current compiler, overloading resolution is based on the actual instantiated class.

For all calls within the parametrized class, the resolution of overloading is done with respect to the type bounds of the parameters. Consider a class that makes use of output streams

abstract class $OSTREAM is
   plus(s:$STR);
end;

A parametrized class can then write to any output stream

class FOO{S < $OSTREAM} is
   attr x,y:INT;

   describe(s:S) is
      s + "Self is:";
      s + x;
      s + ",";
      s + y;
   end;
end;

Now, suppose we instantiate the class FOO with a FILE

class FILE < $OSTREAM is
   plus(s:$STR) is ... -- (1)
   plus(s:INT) is ...  -- (2)

a:FOO{FILE} := ..
f:FILE := FILE::open_for_write("myfile");  -- describe writes to f
a.describe(f)

Only '(1) plus($STR)' will be called in FOO{FILE}, even though the more specific '(2) plus(INT)' is available in FILE.

The reason for this behavior is to preserve the ability to analyze a stand-alone class, which is needed for separate compilation of parametrized classes. This requires that the behavior of the parametrized class be completely determined by the typebounds and not based on the existence of specialized overloaded routines in particular instantiations.


Sather - A Language Manual
12 October 1999
B. Gomes, D. Stoutamire, B. Vaysman and H. Klawitter
Norbert Nemec nobbi@gnu.org