Frequently Asked Questions about Sather
Sather has garbage collection, statically-checked strong typing, multiple inheritance, separate implementation and type inheritance, parameterized classes, dynamic dispatch, iteration abstraction, higher-order routines and iters, exception handling, assertions, preconditions, postconditions, and class invariants. Sather code can be compiled into C code and can efficiently link with C object files.
Sather has a very unrestrictive license aimed at encouraging contribution to the public library without precluding the use of Sather for proprietary projects.

    class MAIN is
       main is
          #OUT + "Hello World!\n";
       end; -- main
    end; -- class MAIN

http://www.icsi.berkeley.edu/~sather.
There is a newsgroup "comp.lang.sather" that is devoted to discussion of Sather issues.
There is a Sather mailing list maintained at the International Computer Science Institute (ICSI). Since the formation of the newsgroup, this list is primarily used for announcements. To be added to or deleted from the Sather list, send a message to
sather-request@icsi.berkeley.edu
If you have problems with Sather or related questions that are not of general interest, mail to
sather-bugs@icsi.berkeley.edu
This is also where you want to send bug reports and suggestions for improvements.
It can also be loaded from these ftp servers:

    ftp.icsi.berkeley.edu: /pub/sather

These sites mirror the Sather distribution:

    ftp.sterling.com: /programming/languages/sather
    ftp.uni-muenster.de: /pub/languages/sather
    maekong.ohm.york.ac.uk: /pub/csp/sather

(no longer maintained, but still running)

    ftp.th-darmstadt.de: /pub/programming/languages/sather
    ftp.sra.co.jp: /pub/lang/sather

I am looking for reliable sites on other continents to mirror the Sather distribution and be included in this FAQ. If you can help with this, please send me mail.
The ICSI implementation includes a browser, class libraries including basic types, math structures such as matrices, vectors, rational numbers, associative data structures, file manipulation (etc.), the compiler source (in Sather), and a large library of contributed but unofficial code. Contributed binaries for some systems are also available. ICSI maintains but does not support a library of donated code, some of it of a tutorial nature, at
http://www.icsi.berkeley.edu/Sather/Contrib/contrib.html
The ICSI compiler generates C. `gdb' or other C debuggers can be used for debugging in combination with a compiler flag. However, there is not yet a debugger that uses the Sather syntax and namespace.
There is another dialect of Sather called Sather-K that is being developed at the University of Karlsruhe, where it has been used in undergraduate instruction. The library of Sather-K is Karla, the KARlsruhe Library of Algorithms, and it has been used in graduate courses on algorithms and object-oriented design.
The Sather-K compiler and library are available at
i44ftp.info.uni-karlsruhe.de: /pub/sather and /pub/Karla
as well as at the ICSI ftp site in pub/sather/Sather-K.
Parallel Sather (pSather) is a parallel version of the language, developed and in use at ICSI. pSather addresses non-uniform-memory-access multiprocessor architectures but presents a shared memory model to the programmer. It extends serial Sather with threads, synchronization and data distribution. Unlike actor languages, multiple threads can execute in one object. A distinguished class GATE combines various dependent low-level synchronization mechanisms efficiently: locks, futures, and conditions. The new version of the pSather compiler is being integrated into the serial Sather 1.1 compiler. More information on pSather is available at:
http://www.icsi.berkeley.edu/~sather/psather.html
Ultimately there will be a better development environment; we envision an interpreter/on-the-fly compiler. This won't be too hard to do because the compiler already emits an abstract machine representation that is appropriate for interpretation. There are presently students working on extensions to the compiler as class projects.
The compiler itself is quite slow, although small programs compile quite quickly. The bottleneck is in generating and compiling the C code. The compiler is an example of a very large Sather program: it takes about 40 seconds for the compiler to generate optimized C on a modern workstation (with a *lot* of memory, though!), and then another couple of minutes to compile the C code using parallel make on our local network. We ship the generated C and executables for common platforms.
Most importantly, have enough physical RAM. 32 MB is much superior to 16 MB, and 32 MB should be enough for most things, except perhaps recompiling the compiler. Netscape is a memory hog. Quit it. Besides, you have to do real work sometime.
Secondly, the default compiler flags are not tuned to make the fastest compilation, but to generate the fastest or safest executables. Do not use "-O_fast" until the final production stages: the loop and subexpression optimizations are surprisingly slow and memory-consuming.

    -only_check           -- Just report errors, do not generate any code

Using this option means that the compiler does not have to generate code, so it will use far less memory and get to your errors much faster. I strongly recommend doing this until all your compile time errors have disappeared.

    -chk                  -- Turn on all checking in all classes
                          -- - slows down resulting executable,
                          -- but it should never crash
    -debug_source         -- Debugging with source line numbers
    -debug_no_backtrace   -- Stack information is expensive.
                          -- You can usually make do by using gdb and printing the stack
                          -- using the command 'bt'

Note that all the -debug options reduce incrementality considerably, since whenever source line numbers change slightly (even though the generated code remains the same), a lot of recompilation must take place.
To get extra debugging information, such as a full backtrace of the stack whenever a crash occurs, use the option

    -debug                -- More debugging information

This may explode the executable size to an extent that you might find unacceptable.

    -output_C               -- This is VERY important! If you don't
                            -- do this you will never get incremental
                            -- C compilation.
    -only_reachable         -- Alternatively, since unreachable code
                            -- is only checked after the executable is
                            -- generated, you can kill the compile when
                            -- it says ``Checking unreachable...''
    -O_inline_routines 30   -- These strongly help executable performance
    -O_inline_iters 30      -- but don't add much to compile time.
    -replace_iters
    -chk_no_void all        -- Turn off expensive checking options
    -chk_no_bounds all
    -chk_no_pre all
    -chk_no_assert all

If you are strapped for memory or VM space, consider using "-only_C" and then doing a manual "make" of the generated C code. This might save some extra swapping compared with "-output_C".
A heuristic approach is used to determine which routines and iterators are simple enough to be inlined. Complexity is computed by traversing the abstract machine representation of the function body. Weights are assigned to all expressions and statements. Calls are replaced if the computed function weights are less than a specified threshold value. Some statements or expressions are never inlined, including "raise" and "loop".
The "-O_fast" and "-O" options provide default inlining for functions and iterators. The "-inline" option just turns on the inlining without affecting anything else. By default, inlining uses a threshold of 16 (statements plus expressions), which was experimentally found to be optimal for a few applications, including the compiler, running on a Sparcstation 10.
The optimal inlining threshold depends on the application and the underlying architecture, and for exotic machines it may be somewhat different from the default level. To specify a threshold different from the default, use one of the following options:

    -O_inline_routines <threshold>
    -O_inline_iters <threshold>

This also allows one to experiment with selectively inlining either routines or iterators. Using too high a threshold leads to bloated code.
The reported performance improvements are between 10% and 40%. I/O intensive applications are less affected by inlining. The default threshold inlines 40% of all calls in the compiler itself.
Surprisingly, moderate levels of inlining have not appeared to have negative consequences on the executable size. In fact, default inlining might even slightly reduce the size of the generated executable. The benefits are dependent on the underlying architecture and parameter passing conventions.
Iterators are as efficient as standard loops when they are built-in or inlined (they are converted into standard C loops). In other cases they are probably significantly better than the "iterator objects" (i.e. cursors) that might otherwise be required. Simple iterators are inlined. In order to be inlined, an iterator must have at most one yield, and the yield must not be in a conditional statement; that is, the iterator definition must be of the form:

    iter_def!( ) is
       iter_init;
       loop
          ... code before the yield
          yield (only one, and not in an if statement)
          ... code after the yield
       end;
       other_code;
    end;

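As a rough illustration of an iterator written in this form, the sketch below defines a hypothetical `range!` iter (the name and the surrounding MAIN class are assumptions made for this example, not library code). It has a single yield that is not inside a conditional statement. Whether a given compiler version actually inlines it has not been checked here.

```sather
class MAIN is

   -- Hypothetical iter: yields lo, lo+1, ..., hi.
   -- The arguments are marked `once' so they are evaluated only
   -- on the first call within a loop.
   range!(once lo: INT, once hi: INT): INT is
      i ::= lo;                 -- initialization before the loop
      loop
         until!(i > hi);        -- leaves the loop once i passes hi
         yield i;               -- the only yield, not in an if statement
         i := i + 1;            -- code after the yield
      end;
   end;

   main is
      sum ::= 0;
      loop
         -- The enclosing loop ends when range! quits.
         sum := sum + range!(1, 10);
      end;
      #OUT + "sum = " + sum + "\n";
   end;

end;
```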
Routines are inlined if they contain only a single return statement at the end of the routine, and do not contain any "raise" statement.
See the next section for some other options to produce fast code.
Some preliminary experiments on inlining can be found at
http://www.icsi.berkeley.edu/~fleiner/benchmarks/ has performance measurements of inlining and some
Consult http://www.icsi.berkeley.edu/~fleiner/benchmarks for information on the performance impact of these optimizations.
Many people have contributed code, which can be found in the Contrib/ directory of the distribution.

    gomes@icsi.berkeley.edu: General Library, Browser, graph classes, neural nets
    mbk@inls1.ucsd.edu: Matrix/vector, numerical, fortran
    lewikk@aud.alcatel.com: Emacs support
    cbitmead@versant.com: New file classes
    haps@inf.tu-dresden.de: INTI, FLTI based on GNU MP
    sather-bugs@icsi.berkeley.edu: General questions

There is a more comprehensive list of people at
http://www.icsi.berkeley.edu/~sather/whoswho.html and on http://www.icsi.berkeley.edu/~sather/contrib.html
Sather is contravariant. That means that it isn't possible to get type errors at runtime. It also means that some ways of doing object-oriented programming will require that you, the Sather programmer, insert explicit type checks (using a typecase) in places where a covariant compiler would have inserted an implicit check for you. We chose contravariance because it eliminates a potential source of bugs that can't be discovered at compile time; other language designers have chosen the opposite to allow more expressiveness. Eiffel says toMAHto, we say toMAYto.
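To make the explicit check concrete, here is a minimal sketch. All of the class names ($FOOD, GRASS, $ANIMAL, COW) are hypothetical and chosen only for illustration. COW may not narrow the argument type of `eat` to GRASS, so it keeps the declared type $FOOD and uses a typecase to decide what to do with the argument it actually receives.

```sather
abstract class $FOOD is end;

class GRASS < $FOOD is
   create: SAME is return new; end;
end;

abstract class $ANIMAL is
   eat(f: $FOOD);
end;

class COW < $ANIMAL is
   create: SAME is return new; end;

   -- The argument type must stay $FOOD (or a supertype of it);
   -- the restriction to GRASS is written out as an explicit check.
   eat(f: $FOOD) is
      typecase f
      when GRASS then
         #OUT + "The cow eats the grass.\n";
      else
         raise "COW::eat: this food is not grass";
      end;
   end;
end;

class MAIN is
   main is
      a: $ANIMAL := #COW;
      a.eat(#GRASS);
   end;
end;
```

The failure case is visible in the source as an ordinary `raise`, rather than being a check the compiler inserts silently.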
A frequently cited reason for not specifying an order of evaluation is to allow the compiler to choose an order of evaluation which leads to the most efficient code; for example, simple arguments can be evaluated after complicated ones to relieve register pressure. This can also be done for ordered Sather arguments in the absence of side effects.
While the order is unspecified in C, the evaluations of arguments must appear to occur in _some_ order, not interleaved in execution. (In the extreme this would allow C compilers to fork threads when evaluating arguments, a practice which would break most existing code.) Since a compiler capable of taking advantage of the parallelism made available by unordered arguments must do dependency analysis to make sure the generated instructions appear to evaluate the arguments in some order, such a compiler would of course be able to do the same dependency analysis and instruction reordering on arguments required to be observed evaluating left to right. The generated code would only be different if there were side effects in an argument evaluation which would make the order of evaluation important; and such code would clearly be in error if the argument order was unspecified.
It's really a question about what the language does with erroneous code that depends on the order of evaluation. It would be nice to detect such situations, but this is very hard. By leaving the order unspecified one allows bugs (which usually appear only when changing compilers). Sather chooses to just eliminate the possibility.
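The following sketch shows the kind of code whose behavior depends on argument order. The routine names `tick` and `show` are hypothetical. Because arguments are evaluated left to right, the first argument of `show` is computed by the earlier call to `tick`, so the result does not change from one compiler to another.

```sather
class MAIN is

   shared counter: INT;        -- INT variables start out as 0

   -- Hypothetical routine with a side effect: each call
   -- increments the counter and returns the new value.
   tick: INT is
      counter := counter + 1;
      return counter;
   end;

   show(a, b: INT) is
      #OUT + a + " " + b + "\n";
   end;

   main is
      -- Both arguments have side effects; the left one is
      -- evaluated first.
      show(tick, tick);
   end;

end;
```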

    class MAIN is
       main is
          br ::= bind(_:INT.plus(_)); -- Notice the :INT
          #OUT + "1 + 2 = " + br.call(1,2) + '\n';
       end;
    end;

In the case of overloaded routines, the type must be inferred from the declared type of the variable.

    class MAIN is
       foo is ... end;
       foo: INT is ... end;
       main is
          a: ROUT := bind(foo);     -- Selects the first "foo"
          b: ROUT:INT := bind(foo); -- Selects the second "foo"
       end;
    end;


    class A is
       foo is ... end;
       bar is ... end;
    end;
    class B is
       include A foo->old_foo; -- Wants uses in bar to be renamed as well
       foo is ... end;
    end;

this way instead:

    class A is
       foo is ... end;
       foo2 is foo; end;
       bar is ... end;
    end;
    class B is
       include A foo->old_foo;
       foo is ... end;
       foo2 is old_foo; end;
    end;

The indirection is not a performance problem with inlining.
Languages that operate only over immutable objects are called functional languages; operations defined over immutable types are side-effect free and therefore referentially transparent (any given expression always evaluates to the same result). There are many ways to implement immutable objects. Immutable objects may be implemented as actual values (primitive or composite) or as references to actual values or even as applied closures yielding actual values, but in all cases the value of the immutable object is the same and never changes for as long as it exists. In contrast, languages like Sather also provide reference objects, which are best used to model entities that have an identity plus a current state. The idea of an object identity bound to a modifiable state introduces side effects into the language, which can make expressions referentially opaque (an expression involving a reference object may evaluate to a different result each time that it is invoked).
Sather distinguishes between reference and immutable objects at the level of types. Instances of immutable types have value semantics: once created they never change, and there is no such thing as a "reference" to an immutable object. Reference objects have an identity and the state of a reference object can be modified by writing to its attributes.
Logically, when immutable objects are passed as arguments, their value is first copied and then the operations are invoked on the copy. The special properties of immutable objects make them especially amenable to compiler optimizations; an immutable object may be copied freely without the possibility of aliasing conflicts, allowing it to be kept in registers or efficiently stored on the stack without requiring heap allocation.
A variable of abstract type can be used to store either immutable or reference objects. Because it is desirable to make it possible to replace any concrete type by an abstraction, it is necessary for immutable types and reference types to have the same semantics with respect to assignment, passing arguments to functions, and applying the dot "." operator. It is possible to make reference types behave with value semantics by coding them with this in mind; for example, the Sather INTI class (infinite-precision integer) returns a new INTI on modification. This makes it possible to substitute either an INTI or an INT into code using only standard integer operations; more generally, it would be possible to make an abstract class $INTEGER defining an interface that such code had to conform to regardless of the implementation.
The immutable nature of immutable types means that the implicit routines that set attributes return a new object rather than modifying the old one in place. For this reason, the syntax "a.b:=c" is not legal, because it is really syntactic sugar for "a.b(c)". For an immutable class, this routine "b" has a return value which must be used in the calling context. Therefore this example should have been written "a:=a.b(c)". Notice that if you are setting multiple fields, you can conveniently string them together: "a:=a.b(c).d(e).f(g)".
If this seems unnatural, consider how operations work on integers: when subtracting five from seven to get two, one isn't modifying "the" seven, turning all sevens into twos; logically one makes a new integer instead of modifying an existing one in place. Similarly, bitfield operations like AND and OR conceptually create a new integer rather than modifying one in place. It just isn't reasonable to have reference semantics on basic types. Sather allows arbitrary classes to have compiler enforced value semantics. Here's what a complex number class might look like (this is a simplified fragment of the complex number class found in the library)

    immutable class CPX is
       attr re: FLT;
       attr im: FLT;

       create(r,i: FLT): SAME is
          res: SAME;          -- res is a CPX immutable object
          res := res.re(r);   -- Create a new value with re = r
                              -- and reassign it to the result
          res := res.im(i);
          return res;
          -- Or, more concisely: return re(r).im(i);
       end;

       main is
          a: CPX;
          a := a.re(5.0);     -- Set the real part to 5.0
       end;
    end;

When immutable objects are assigned to a variable of abstract type, "boxing" occurs. This means that some heap is allocated to hold the value along with a tag that can be used for dispatching. Reference objects always have a tag so they are not boxed.
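A small sketch of where boxing would take place, using a hypothetical abstract type $SHAPE and a hypothetical immutable class SQUARE (neither is taken from the library). The comments describe what the paragraph above says happens conceptually; they are not a statement about the code a particular compiler emits.

```sather
abstract class $SHAPE is
   area: FLT;
end;

immutable class SQUARE < $SHAPE is
   attr side: FLT;

   create(s: FLT): SAME is
      return side(s);          -- setter returns a new SQUARE value
   end;

   area: FLT is
      return side * side;
   end;
end;

class MAIN is
   main is
      sq ::= #SQUARE(2.0);     -- variable of the immutable type: no box needed
      s: $SHAPE := sq;         -- abstract type: the value is boxed with a tag
      #OUT + s.area.str + "\n"; -- the call on s is dispatched using that tag
   end;
end;
```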
An `out' argument is passed from the called method to the caller when the called method returns. It is a fatal error for the called method to examine the value of an `out' argument before assigning to it.
An `inout' argument is passed to the called method and then back to the caller when the method returns. Modifications to `inout' arguments are not observed by the caller until the method returns (value-result semantics).
Remember that argument modes are specified both at the method definition and at the method call. Some examples of usage are given below. Test/test-out.sa contains a variety of other examples. For more information on argument modes, refer to pp. 43-44 of the Sather 1.1 manual.
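The following is a minimal sketch of both modes. It is written for this FAQ entry and is not taken from Test/test-out.sa. The routines `divmod` and `swap` are hypothetical names. Note that the mode keyword appears both in the definition and at each call site.

```sather
class MAIN is

   -- Hypothetical routine: returns quotient and remainder
   -- through two `out' arguments. Each is assigned before use.
   divmod(a, b: INT, out q: INT, out r: INT) is
      q := a / b;
      r := a - q * b;
   end;

   -- Hypothetical routine: exchanges two values through
   -- `inout' arguments. The caller sees the new values only
   -- after the routine returns (value-result semantics).
   swap(inout x: INT, inout y: INT) is
      tmp ::= x;
      x := y;
      y := tmp;
   end;

   main is
      q, r: INT;
      divmod(17, 5, out q, out r);
      #OUT + "quotient = " + q + ", remainder = " + r + "\n";
      swap(inout q, inout r);
      #OUT + "after swap: q = " + q + ", r = " + r + "\n";
   end;

end;
```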
Last change: 7/16/96 The Sather Team (sather@icsi.berkeley.edu)