The Smalltalk Language

Questions about	       Page
Literals                 47
Variables and Names      53
Global Variables         59
Classes                  65
What Variables Hold      68
Blocks                   71
Methods                  78
Inheritance, Self, Super 80
Other Questions          84

Literals

Literals are objects written literally, that is, with actual characters on the page or screen. Since the value of a literal is always the same, literals are sometimes called constants or literal constants.

Smalltalk has literal constants for integers, floating point numbers, a limited set of fractions, characters, booleans, strings, symbols, and arrays with literal contents.

What are the forms of integer literals?

Integers have a rich set of literal constant forms which allow the description of values with bases ranging from 2 to 36 and in arbitrary lengths.

The simplest integer literal consists of one or more decimal digits with an optional leading minus sign.

1   123   -3
There is no practical limit to the length of an integer constant.

124876435876348763498763498764359876459873659876134876234876132
A number base can be specified by a prefix with a decimal number in the range 2 to 36 followed by a lower case 'r', and followed by one or more of the digits allowed for that base. The values in each line below are equal:

2r11111111   4r3333   8r377   10r256   16rFF   32r7V

36rSMALLTALK   80738163270632
The characters allowed for base n are the n leading characters in this string:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
Integers may also have an exponent, a lowercase 'e' followed by one or more decimal digits. The values in each line are equal:

1     1e0

300   2r11e2   3e2   4r3e2
See 4.3 'What are the forms of fraction literals?' for information on negative exponents.
Note the following cases which do not produce quite the results that might be initially expected:

16r2E1    Produces the value 737; 
          the 'E' is uppercase and isn't an exponent.

16RFF     Tries to send the RFF message to 16 since the 'R' is uppercase, 
          or get a message about digit too big, depending on the vendor.

16r2abff  Tries to send the abff message to 2; 'abff' is lowercase.

What are the forms of floating point literals?

Floating point numbers have a rich set of literal constant forms which allow the description of values with bases ranging from 2 to 36 and, in some implementations, in several precisions.

The simplest floating literal consists of one or more decimal digits with an optional leading minus sign and an imbedded decimal point.

1.0   12.3   -3.0
A leading or trailing decimal point is taken as a statement separator. These are not floating point constants:

123.   .123
There is no practical limit to the length of an floating point constant, but only as much of the constant as can be represented in the implementation will be retained at run time. It is sometimes useful to code constants at a greater precision to allow for portability to platforms with a greater precision at a later date.

3.14159265358979323846264338327950288419716939937510582097494459
A number base can be specified by a prefix with a decimal number in the range 2 to 36 followed by a lower case 'r', and followed by one or more of the digits allowed for that base. The values in each line are equal:

2r11111111.0  4r3333.0  8r377.0  10r256.0  16rFF.0   32r7V.0

36rSMALL.TALK   48069417.81373362
The characters allowed for base n are the n leading characters in this string:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
Floating point values may also have an exponent, which is a lowercase 'e' (and 'd' in ParcPlace implementations) followed by one or more decimal digits. The values in each line are equal:

1.0     1.0e0

300.0   2r11.0e2   3.0e2   4r3.0e2
In ParcPlace implementations, the e exponent indicates a short precision floating point number and the d indicates a double-precision floating-point number.

In other implementations, the precision is platform and implementation dependent.

See ``Floating Point'' on page 121 for more information on floating point numbers.

What are the forms of fraction literals?

The only fraction literals are almost accidental and come from allowing signed exponents on integers.

123e-4				Equivalent to the expression: (123/1000)

What is the form of a character literal?

Character literals consist of a dollar sign followed by a character. A dollar sign followed by a blank is the blank character, and followed by a dollar sign is a dollar sign.

$a   $A   $z   $9   $.   $(   $)   $ (a blank)   $$
The character following the dollar sign can be any character in the implementation character set, including linefeed or return, but only the ASCII characters in the range 32 to 127 are guaranteed portable between implementations and platforms.

What is the form of a string literal?

A string literal is a pair of single quote marks, possibly enclosing other characters.

''   'a'   'A String of Characters'
A single quote mark is represented by two successive single quote marks.

'It isn''t too hard to do.'
Note that a string with one character is not a character. These are not equal:

'a'   $a

What are the boolean literals?

The boolean literals are true and false.

The value true is not the same as the class True; it is an instance of class True. The value false is not the same as the class False; it is an instance of class False. Thus, this code will fail with a message about the receiver not being a Boolean:

| bool |

bool := False.

bool ifFalse: [ Transcript cr; show: 'False']
The correct code is:

| bool |

bool := false.

bool ifFalse: [ Transcript cr; show: 'false']
While true is an instance of True, it is a special one and other instances of True cannot be substituted. The same also holds true for false.

" A new instance of True "

True basicNew ifTrue: [ 1 ]  ifFalse: [0]

" Gets message about not being a Boolean "
" A new instance of True "

True basicNew = true

" Answers false "

What are the forms of symbol literals?

Symbols are similar to strings but have additional properties.

See (some question about what symbols really are...)
Symbol literals are made up of a pound or number sign ('#') followed by a message selector.

#==   #abc   #negated   #to:by:   #in:the:course:of:
Since valid binary selectors are made up of one or two special characters, the effect of writing three or more characters is implementation defined. For example, IBM Smalltalk takes #=== as the two successive symbols #== and #= while Digitalk Smalltalk/V-Mac considers #=== as error.

IBM Smalltalk
IBM Smalltalk allows an alternate form of symbol literal consisting of a pound (or number) sign followed by a string. The examples above would look like this:

#'=='   #'abc'  #'negated' #'to:by:'  #'in:the:course:of:'
However, any character that is valid in a string can be in the extended symbol literal.

#' '   #'-----'   #'#'   #'A big step for mankind.'
An extended literal of the form:

#aSymbolString
produces a symbol with exactly the same value as the expression:

aSymbolString asSymbol

What is the form of an array literal?

An array literal is formed from a pound (or number) sign and a pair of parentheses enclosing numeric literals, valid message selectors, character literals, string literals, and other array literals. (Not included are boolean literals and extended symbol literals.) Nested arrays need not have the leading pound sign.

#( 2 3 4)                  An array of integers

#('abc' 'def')             An array of strings

#(2 $c 7.3 'def' 2r1111)   An array holding various objects

#( #(1 2) #(3 4) #(5 6))   An array of arrays

#((1 2) (3 4) (5 6))       The same array of arrays
IBM Smalltalk
IBM Smalltalk supports byte array literals which are formed from a pound (or number) sign and a pair of square brackets enclosing integer literals in the range 0-255.

#[0 1 2 3 255 7]           A byte array of 6 values

Variables and Names

What characters can be used in names?

The following characters can be used in names.

abcdefghijklmnopqrstuvwxyz

ABCDEFGHIJKLMNOPQRSTUVWXYZ

0123456789 _
Most implementations support an underscore character as if it were an alphabetic character. However, some support underscore as an alternate way of writing assignment; such usage is obsolete though.

See ``Can I use an underscore character?'' on page 59 for more information.
Names must start with an uppercase or lowercase letter (or an underscore where allowed).

Are there conventions for naming classes?

Yes. Classes are virtually always named as one or more words in some natural language. The variable must start with an uppercase letter. By convention, each additional word starts with a capital letter and underscores are not used.

SortedCollection   LargeLandMammal   RetiredEmployee
Related sets of classes are often named with some common prefix or suffix.

Collection   SortedCollection   OrderedCollection

WriteFileStream   ReadFileStream   ReadWriteFileStream

Are there conventions for naming methods?

Yes. Methods are virtually always named as one or more words in some natural language, or a standard term or abbreviation. By convention, each word but the first starts with a capital letter and underscores are not used.

sqrt   negated   min:   asFloat   employeeSpouseName   isEmpty
Methods with keyword selectors are named so that the part of the keyword selector that precedes a parameter names or implies that value.

on:from:to:   when:do:   translateBy:   truncateTo:   x:y:

indexOfSubcollection:startingAt:ifAbsent:

What different kinds of variables are there?

There are ten kinds of variables in Smalltalk. Four kinds have some amount of global scope:

Smalltalk Global Variables
Global to all classes (in dictionary Smalltalk). See question [4.22].
Class variables
Global to a class. See question [4.25].
Pool variables
Global to classes that use the pool dictionary. See question [4.29].
Special variables: nil, true, false, self, super, and Smalltalk
Global to all classes
The other six kinds are local variables. The scope of block parameters and locals depend on the implementation, with some older implementations not supporting local block variables, and making block parameters local to the method.

Instance variables
Specific to each instance
Class instance variables
Specific to class methods and subclasses. See question [4.26].
Method parameters
Local to the method
Method local variables
Local to the method
Block parameters
Local to the block, except that some older implementations make block parameters local to the method.
Block local variables
Local to the block, except that some older implementations make block parameters local to the method.
See [4.48] 'Do blocks differ between implementations?' on page 76.
In a method, instance variable names, method parameters and locals must be unique, and must not be the same as block parameters and locals. None can be the same as a special variable. Different blocks in the same method may have the same parameter and local variable names as other blocks in the method.

method: parm1 with: parm2

	| temp1 temp2 |

	temp1 := instVar1.

	self flareWith: [ :bparm |

		| btemp |

		btemp := pbarm + instVar1 ].

	self flareWith: [ :bparm2 |

		| btemp2 |

		btemp2 := pbarm2 + instVar1 ].
The variables parm1, parm2, temp1, temp2, and instvar1 must all be unique, and must be different from bparm, bparm2, btemp, and btemp2. However, it would be valid to have bparm2 be the same as bparm, and btemp the same as btemp2.

What is the lifetime of these variables?

The list below shows the lifetime of each of the kinds of variables.

Smalltalk Globals
Forever; the variables can be deleted from the dictionary, thus shortening their 'lifetime'.
Class variables
The life of the class; the variables can be deleted from the class, thus shortening their 'lifetime'.
Pool variables
The life of the class or pool; the variables can be deleted from the pool, thus shortening their 'lifetime'.
Special variables
Forever.
Instance variables
The life of the instance.
Class instance variables
The life of the class; the variables can be deleted from the class, thus shortening their 'lifetime'.
Method parameters
The life of the method context.
Method local variables
The life of the method context.
Block parameters
The life of the block invocation, except for implementations not supporting block local variables, in which case it is the life of the method context.
Block local variables
The life of the block invocation, except for implementations not supporting block local variables, in which case it is the life of the method context.

What is the search order for global variables?

Global variables are found by searching first for a class variable, then searching pool dictionary list in the order specified in the class definition, and then looking in the dictionary Smalltalk.

Are there conventions for naming variables?

Instance variables are usually named either by the class of the data, or by the expected use or actual contents, or sometimes both. Skublics calls these typed and semantic names.

Examples of typed names:

aString	An instance of String

anInteger	An instance of Integer or one of its subclasses

aNumber	An instance of Number or one of its subclasses
Examples of semantic names:

numberOfBurners	The number of burners (of a stove or hot air balloon)

nameOfEmployee	Some object holding an employee name
Examples of combined typed and semantic names:

employeeNameString	An instance of String holding an employee name

burnerCountInteger	An instance of Integer (or of a subclass)

Are there conventions for naming instance variables?

Instance variables hold the data for classes and are defined in the class itself. Since there is little context from which to imply a meaning, the names should be descriptive of the purpose and use of the variable. Thus, instance variables should be named using semantic names.

employeeName        An employe name

numberOfCylinders   A count

lotSizeInAcres      A number of some kind, giving the lot size in acres

Are there conventions for naming class variables?

Class variables hold various bits of data, and are defined in the class itself. Since there is little context from which to imply a meaning, the names should be descriptive of the purpose and use of the variable. Thus, class variables should be named using semantic names.

MaximumCylinderCount      The most!

NextEmployeeNumber        What's to come next
See [4.25] 'What is a class variable?' for more information.

Are there conventions for naming method parameters?

Method parameters can be named using either typed or semantic naming, depending on context. Typed names are useful when a value is of some expected class and context provides semantic clues:

cylinders: anInteger

	anInteger > MaximumCylinderCount ifTrue: [ "error" ].

	numberOfCylinders := anInteger
Semantic names are useful when context either does not provide semantic clues or the code is clearer anyway. Compare:

taxOnBuildingLot: lotSizeInAcres

	^ lotSizeInAcres * self buildingLotTaxRate
with:

taxOnBuildingLot: aNumber

	^ aNumber * self buildingLotTaxRate

Are there conventions for naming temporary variables?

Local variables are best if named with a meaningful semantic name that indicates how the variable is used.

Some authors recommend using a new variable and a new name for each different use, rather than naming variables with simple names like n or temp and reusing them.

(need example)

When can variables have leading capital letters?

Leading upper-case letters are required on global variables (as defined in the dictionary Smalltalk), class variables, and pool variables. They are optional on other variables.

Some implementations make exceptions:
Implementation     Must be lowercase
Digitalk           Local variables, method & block parameters
IBM Smalltalk     (none)
VisualWorks	     ?

Can I use an underscore character?

Some implementations of Smalltalk allow an underscore character to be used in variable names. Some implementations still treat an underscore as an alternative for the assignment operation.
Implementation     Is Underscore Allowed in Variables
IBM Smalltalk      Yes; may be leading character; acts as lowercase
VisualWorks	     Yes.
Digitalk           No

Global Variables

What is a global variable?

A global variable is a variable that is available to all classes, or, sometimes just to more than one class.

See [4.12] 'What different kinds of variables are there?'.

When should global variables be used?

Most experts say to never use a global variable! In Smalltalk, it is particularly bad to use variables defined in the dictionary Smalltalk because they are global to everything. Only classes and pool dictionaries should be global.

Some software development tools provide for namespaces in Smalltalk which limit the visibility of global variables to sections of a class library.

What alternatives are there to global variables?

Instead of a global variable, use a class method. Have it return the same value as the global variable would have held. Values returned by methods are hidden. The method can be overridden by subclasses and a different value might be answered.

Some authorities recommend using class methods to answer constant values rather than using class variables:

maximumCylinderCount

	^ 12
Some authorities recommend using a new class to hold values like the next employee number. The class, say EmployeeNumber, would have one instance per sequence of employee numbers. The next employee number would be thus be an instance variable.

Also see [4.27] 'What good is an class instance variable?' on page 61 for another alternative.

What is a class variable?

A class variable is a special variable which is global to the instance and class methods of a class and all of its subclasses. Class variables must start with an uppercase letter and are a part of the definition of the class itself.

Class variables are used as alternatives to global variables when the needed scope is a class, its subclasses, and their instances.

What is a class instance variable?

A class instance variable is an instance variable of the class itself. It belongs to the class and its subclasses. It cannot be seen by instances of the class. It follows the normal rules for instance variables and can have a leading uppercase letter only if generally allowed in the implementation.

What good is an class instance variable?

Class instance variables are rarely used by applications. One possible use is to hold a value which is returned by a class method, rather than use a class variable to hold the same value. Instead of writing:

MySpecialValue
to access a class variable, you might write:

self class mySpecialValue
which invokes the class method mySpecialValue which answers the value of a class instance variable holding the same value. To set the value, you might write:

self class mySpecialValue: aValue
There are several advantages:

Information hiding
Instances of the class or of its subclasses cannot access the variable directly.
Overriding in subclasses
Subclasses can override the method, providing modified access or hiding access.

Pool Dictionaries

What are pool dictionaries?

Pool dictionaries are dictionaries which are examined by the compiler to resolve names found in methods. Each class can define its own pool dictionaries. Pool dictionaries are not inherited.

Pool dictionaries are usually used to hold the names of constants, often achieving the same purpose as header files of #DEFINE statements as used in C and C++.

Pool dictionary names are, however, themselves global variables and are stored in the Smalltalk dictionary.

Example 7 shows a possible pool dictionary definition.

" Definition of DiddleZork pool dictionary "

| pool |

pool := Dictionary new.

pool at: #FlagBits put: 2r10001000.

pool at: #ZonkFlag put: 16rFFFF0000.

pool at: #DiddleIt put: 2r00000001.

Smalltalk at: DiddleZork put: dict
If a class uses the pool dictionary DiddleZork, it might hold code like that in Example 8.

" Use of DiddleZork pool dictionary "

FlagBits bitOr: DiddleIt

" Answers: 89 "
Pool dictionaries can be set as in Example 9, but it is bad practice. Pool dictionaries should only hold constant values.

" Setting values in DiddleZork pool dictionary "

FlagBits := 2r10001111

What pool dictionaries come with Smalltalk?

Each vendor provides a different set of pool dictionaries.

IBM Smalltalk
Pool Dictionary    Dictionary Contents
CfsConstants       Constants used by file system calls
CgConstants        Constants used by graphics calls
CldtConstants      Characters often used in printable strings
CwConstants        Constants used by widgets calls
SystemExceptions   Values used in exception handling
Digitalk V/Mac 2.0
Pool Dictionary    Dictionary Contents
CharacterConstants Values of ASCII characters: Tab, Space, Lf, Cr, ...
EventConstants     Values like: Button1, Button2, ShiftKey, ...
ParcPlace VisualWorks 2.0
Pool Dictionary    Dictionary Contents
IOConstants        Values like: CR and LF
TextConstants      Values like: CR, LF, Space, Tab, Ctrln

Should pool dictionary names used prefixes?

Yes, although most vendors do not do this for their own pools. If you define a pool dictionary, use a meaningful prefix to assist in identifying the values when others see them in the code.

For example, if your are defining a pool dictionary which holds various limits on financial transactions, you might name the pool dictionary FinancialLimits and prefix each entry with FinLim:
Pool Constant              Description
FinLimWireTransfer         The largest allowed wire transfer
FinLimATMDailyWithdrawal   The largest ATM cash withdrawal for a single day
FinLimATMMinimumIncrement  The smallest bill carried in an ATM machine

Are there alternatives to pool dictionaries?

Yes.

Class methods
If there are not a huge number of values, then class methods can substitute for pool dictionaries. If the values apply to just one class (and its subclasses) and there are few of them, then the methods might go directly in the class itself. If there are more than a few, consider making a new class just to hold the constants.
Instance variable with a dictionary value
Rather than making a pool dictionary, make the same dictionary but put it into an instance variable. Each instance can share the same dictionary, or might have slightly different dictionaries, depending on the needs of the application. Access to the dictionary is simply:
dict
at: key 
Class instance variable with a dictionary value
Rather than making a pool dictionary, make the same dictionary but put it into an class instance variable, and make a class method that answers the dictionary. There is just one dictionary for the class and it takes up no space in the instance. Access to the dictionary is simply:
self class dict at: key 

When should pool dictionaries be used?

Pool dictionaries should be only when these four criteria hold true:

When there are a lot of named values
If there are just a few values it is better to have a class method return the value.
When the values go together in a coherent way
If the values are not connected they have no business in a pool dictionary together. They should be either in multiple pools or not in pools at all.
When the values are constant
Do not use pool dictionaries for values that change. Use them only for constant values, and write some code somewhere that initializes the pool dictionary when the application or class is loaded. Pools that are changed are giant traps for the unwary and will rise up and bite at inopportune times.
When the dictionary can be used by more than one class
If the values in a dictionary belong to just one class they are better off in an instance variable or class variable, even if it means having to do an explicit at: to get the value.

Classes

How can I initialize classes?

Classes are initialized by code in the class itself. By convention, a class method named initialize is used, and is run once, by manual invocation, when the class is defined. When making changes to the dictionary, it is easy to forget to run the method. Many programmers put a comment at the top of the method:

	" Don't forget to re-initialize me when you make changes:

          ThisClassName initialize     "
and then select the expression and evaluate it after making a change.

IBM Smalltalk and ENVY
Some implementations of Smalltalk automatically invoke a class initialize method when the class is loaded from a fileout. IBM Smalltalk and systems with Envy do this; others may. The fileout is generated with a line for each class initialize method like this:

Clunker initialize !
When classes are loaded from an unload file (using Envy or IBM Smalltalk), the class is sent the message loaded. It can then initialize itself.
"ENVY/Developer has two ways of automatically initializing stuff on load: loaded and the SubApplication>>toBeLoadedCode. The toBeLoadedCode message is run before your app is loaded. It is primarily used for things like pool dictionaries which are required for code to be properly compiled or linked on load. Basically it is just an arbitrary string that you define to do whatever you want. On load, the string is fetched, compiled and run."
"One thing that you should watch out for is multiple initializations. It is quite common to call, from MyApp class>>loaded, the initialization method of several classes MyApp defines. For initialization methods that initialize a class variable, you only want to initialize it once but if you blindly send #initialize (or whatever) to subclasses which inherit it from the class with the classVar, you could be initializing the classVar many times. Worse, you could reinitialize it when you load some app which subclasses your class."
"I generally define two load initialize paths: initializeOnLoad and initializeOnLoadOneTime. The first I automatically send to all classes defined by the app. This is necessary to initialize class instance variables (i.e., you must initialize each class individually). The second I send only to defined classes which directly implement initializeOnLoadOneTime. So, my loaded method looks something like:"
loaded
self defined do: [:aClass |
aClass initializeOnLoad.
(aClass class includesSelector: #initializeOnLoadOneTime)
ifTrue: [aClass initializeOnLoadOneTime]]
Jeff McAffer, jeff@is.s.u-tokyo.ac.jp

How do I create new classes from within a method?

The details of this differ; peek inside your favorite browser for details. But, note that this operation can only be done in the development environment; most vendors don't allow it in a packaged application.

How do I add a method to a class from within a method?

The details of this differ; peek inside your favorite browser for details. But, note that this operation can only be done in the development environment; most vendors don't allow it in a packaged application.

What is a Metaclass?

A metaclass is an instance of class Metaclass (or MetaClass in some systems). Each class is an instance of an instance of Metaclass.

See 12.9 'Where is the 'new' method (or: Why is 'new' THERE!)?' for details.

How should an object be checked for class membership?

It's a bad idea to check an instance to see what its class is. It often indicates some real problem with the design of a class when it seems necessary to check an instance for membership in some class.

If it is necessary, check for membership in a hierarchy by asking if an instance belongs to a class or one of its subclasses.

thisThing isKindOf: Integer
Why? Because new subclasses can appear and disappear. The instance you have today may turn into an instance of a subclass tomorrow. Integers are a prime example. There never are any instances of Integer around. There may be instances of SmallInteger, LargePositiveInteger, and LargeNegativeInteger, or instances of SmallInteger and LargeInteger, or some other set of subclasses of Integer. Further, just what values map to just what classes varies by implementation.

This is true of most well designed hierarchies; classes come and go. Basic structure is less apt to change.

What Variables Hold

What do Smalltalk variables hold?

It is commonly said that Smalltalk has solved the 'pointer problem' since Smalltalk appears to not have any pointers at all. Smalltalk has solved the 'pointer problem', but it did it by making everything a pointer rather than eliminating them. Since everything is a pointer (and there are no pointer manipulation operations), Smalltalk does not have the exposures of, say, C or C++ to pointer abuse.

Every value in every variable in Smalltalk is a pointer to the value it represents. However, rarely does this fact become visible in programs. Since all variables always hold pointers to objects, it is common to speak as if variables held the objects themselves. Rather than saying the string referenced by name it is common to say the string in name.

See [7.4] 'How big are SmallIntegers?' on page 120.
See [5.6] 'When do pointers become visible?' on page 88.

What is a type?

Popular languages that most programmers are familiar with, such as C, C++, COBOL, FORTRAN, PL/I, and Pascal, have types. A variable, say zot, cannot be declared without specifying, explicitly, or implicitly by some default rules, what it 'holds'. This is done using language keywords as in C:

    long int zot;

    float flot;
The variable zot holds (or describes or references or whatever) some memory. On many machines this will be 32 bits of memory aligned in some way. Since the type is long int, the compiler will generate code that treats these bits as an integer.

The variable flot holds (or describes or references or whatever) some memory. This will be 32 bits of memory aligned in some way. Since the 'type' is float, the compiler will generate code that treats these bits as a floating point number.

The bits are otherwise the same. There is no distinction between memory locations that hold integers of 32 bits length and floating point numbers of 32 bits length. (Using some well-defined constructions, C programmers can even access the same memory 'cell' with variables of various types; FORTRAN programmers can, too, even more easily, with equivalence declarations.)

The language has to know at compile time what type a variable is, and it generates code using that knowledge. There are no tags on the data saying that 'this is an integer' or 'this is a float'.

Types are a characteristic of variables which aid the compiler in producing proper code.

Does Smalltalk have types?

See [4.39] 'What is a type?' for an earlier question on the same topic and [4.41] 'Is there any disagreement about types in Smalltalk?' on page 71 for other views.
Microsoft Press Computer Dictionary. 2nd Ed. Cook, Hill & Canning, ``Inheritance is not Subtyping''
in Theoretical Aspects of Object-Oriented Programming, MIT Press, page 516.
Smalltalk has variables but no language keywords to specify what the variable holds. Variables are declared by listing their names:

| zot flot |
The variables zot and flot can 'hold' objects.

See [4.38] 'What do Smalltalk variables hold?'
So what are the types of zot and flot? There aren't any. (One could just as well say that they have the type 'object', but does a language with one type have types?)

Now, consider this code:

| zot flot |

zot  := 23.

flot := 23.45.
What are the types of the variables zot and flot? Does zot now have the type 'Integer' and flot the type 'Float'? No. They didn't have a type before and they don't after assignment.

Assigning a value to a variable in Smalltalk does not change the type of the variable, at least without doing great damage to the popularly held view of what 'type' means. A variable may 'hold' an integer without being of type Integer.

Now, consider:

| zot flot |

zot  := 23.

flot := 23.45.

flot := flot asInteger
Has flot changed its type yet again? No!

Consider this code:

holdsNil: aCollection 

	aCollection 

		do: [ :element |  

			element = nil  

				ifTrue: [ ^ true ] ]. 

	^ false
What is the type of aCollection? Does it change its 'type' every time that the method is called? In this method aCollection can be any object that responds to do: by invoking a block and passing one parameter to it. It doesn't have to be a collection of any kind.

Saying that all types derive from type Object, as some proponents of other languages do, misses the point entirely. There are no types to derive from. Object is no more a type than Float.

Smalltalk inheritance is not a type inheritance but an implementation sharing inheritance. A subclass is not a subtype but an implementation of a new object that shares some or all of the implementation of the parent class.

Smalltalk values are tagged. The variable aCollection is self describing. There is no need to specify that a variable that holds it holds a collection. The compiler can generate code without having any type information -- the only operations in Smalltalk are message sends and assigns.

Smalltalk compilers optimize generated code, sometimes making non-binding assumptions about what a variable might hold. See [12.1] 'How can Smalltalk be optimized?'

Is there any disagreement about types in Smalltalk?

There is some disagreement on this topic, to put it mildly; saying that it causes flame wars again puts it mildly.

Some claim that, while it is literally true that Smalltalk variables have no type declaration, classes are really the types in Smalltalk programs. This seems to redefine what the term type means from a characteristic of a variable to a characteristic of data. While this kind of redefinition happens across time with all languages, it is an especially dangerous thing to happen to technical terms which must have a precise definition in order to be useful at all.

Others argue that Smalltalk should have a type system, and that programmers should declare the types of variables. OK, but then the result is not Smalltalk, it is some new language similar to Smalltalk. It may be better, or it may not, but it's not Smalltalk.

Yet others argue that some new thing should be introduced to replace types. In one proposal, the type of a variable becomes a list of the classes whose instances the variable might at some time hold. Again, this really defines a new language similar to Smalltalk.

(Need to get pointers into the literature.)

Blocks

What are blocks?

Blocks are expressions or groups of expressions enclosed in square brackets:

[ 2 + 3 ]
[	a := 1.

	b := 2.

	c := a + b ]
A block is itself an expression; its value is the block itself, not the result of evaluating the contents.

Blocks resemble small subroutines:
The block in the third line is effectively ignored; its value is the block itself but nothing is done with the result. The block must be explicitly invoked, or evaluated, as in Example 11:
|
a |
a := 1.
[ a := a + 1 ] value.
^ a
" Answers: 2 "
Example 11 answers 2.
Parameterized blocks are invoked with one of the value messages: value:, value:value:, value:value:value:, or valueWithArguments:.
Example 12 answers 3:
|
a |
a := 1.
[ :increment |
a := a + increment ] value: 2.
^ a
" Answers: 3 "
Example 13 answers 3:
|
a |
a := 1.
[ :increment |
| temp |
temp := a + increment.
a := temp ] value: 2.
^ a
" Answers: 3 "

Are blocks objects?

Yes. Blocks are full fledged objects. They can be assigned to variables, placed into objects, and passed as parameters. The sort block used by sorted collections is a good example of passing a block as a parameter.

SortedCollection sortBlock: [ :first :second | first < second ]
Sort blocks can be assigned to variables and used later:

descending := [ :first :second | first < second ].

ascending  := [ :first :second | first > second ].

...

SortedCollection sortBlock: ascending

When are block objects created?

Block objects are created at the point in a program when they are invoked, passed as a parameter, or assigned to a variable. In some cases, the compiler can optimize away the need to actually create a block object. In other cases, the block object must be created.

In Example 14, the block refers to a method local variable, a. Changes to the variable are reflected in the block no matter when it is evaluated.

| a block |

a := 1.

block := [ a ].

a := 3.

block value

" Answers: 3 "
In Example 15, the outer block is passed the block local variable, a, and it answers another block which answers the value of the parameter. This binds the value of x to that of a at the time of the assignment.

| a block |

a := 1.

block := [ :x | [x] ] value: a.

a := 3.

block value

" Answers: 1 "
See also Example 18 in 4.48 Do blocks differ between implementations?

When is it useful to put blocks in variables?

Many authorities think that blocks should rarely be put into variables. But there are times when it is a useful practice, especially in just those circumstances where a large switch statement in C is useful: given some integer value with a large number of possible values, choose some action that depends on the value.

A common case occurs when processing a stream of characters or processing an error return code from an operating system.

Are blocks optimized?

Blocks can be optimized and most implementations do some degree of optimization of blocks.

In the simplest of cases, ifTrue: and ifFalse: are not sent, and the block's code is expanded inline. While this may require compiling code for two cases (when the receiver is and is not boolean), it is much faster in the common case.

How about optimizations of blocks not simply executed inline?

Optimizations can be performed on blocks that will be passed as parameters or stored into variables. These optimizations can make a considerable difference in both execution time and memory use.

In the general case, a block has to know about the method context in which it was assigned or from which it was passed. The block might refer to parameters of the method, to local variables of the method, or to instance variables of the object, or it might contain a return statement that would cause a return from the method.

If a block contains no references to variables outside of itself and has no return statement, then the block object can be smaller and simpler. A block that references only instance variables might also have a block object that is simpler than the general case.

In order to allow an implementation to perform those optimizations it can perform, it is best to code blocks according to these guidelines:

Make all temporary variables local to the block
Do not define outside the block those temporary variables that will be used only inside the block. That is, write:
|
x |
x := [ :parm |
| local |
... local ... ]
instead of:
| x local |
x := [ :parm |
... local ... ]
Pass instance variables or block local variables instead of referring to them directly
Instead of referencing instance variables directly, pass their values as parameters. That is, write:
| x |
x := [ :parm |
... parm ... ].
x value: instVar
instead of:
| x |
x := [ ... instVar ... ].
x value
Avoid returns from blocks
It's generally a bad idea to have a return from a block that is stored or passed since there is no guarantee that the context in which the block object was created still exists. Such blocks cannot be optimized since they must carry along the whole context of the method invocation in the method where they are written.

Do blocks differ between implementations?

Yes. In particular, most Digitalk versions of Smalltalk, and earlier versions of VisualWorks do not support local variables within blocks; all locals variables must be declared at the method level.

Worse, they consider block parameters as method-wide local variables.

For example, Example 16 may have two method-wide local variables (counter and index) or just one (counter), depending on which implementation is used:

| counter |

counter := 0.

(1 to: 20) 

	do: [ :index | 

		counter := counter + index ].
Example 17 may answer some integer (in this case probably 2) when run on a Smalltalk with method-level local variables. The problem is that the parameters have the same names as the two declared local variables, and the values passed to the block when it is sorted by the addAll: replace the values explicitly assigned. Other systems should flag such uses as errors.

| a b |

a := #(9 1 6 4 8).

b := SortedCollection sortBlock: [ :a :b | a > b ].

b addAll: a.

^ a
Example 18 assigns blocks to blockArray.

| blockArray dataArray |

blockArray := Array new: 5.

dataArray := #( 'Apple' 'Orange' 'Grape' 'Lemon' 'Kiwi').

1 to: blockArray size do: [ :index |

	blockArray 

		at: index

		put: [ dataArray at: index ] ].

^ (blockArray at: 2) value
The block:

[ dataArray at: index ]
has a reference to the variable index. If index is a method-level local variable, then it is the same variable for each block stored and the desired effect will not occur.

If index is a block-local variable, then it is a different variable each time the block is invoked, and a different value will be stored with the block; this is the desired situation.

The example ends with the return of the value of the second saved block.

When run on Digitalk Smalltalk/V-Mac 2.0, this code gets a walkback in the stored block because of an index out of bounds. The value of index is 6, indicating that it is one more than the last value it had in the loop.

When run on ParcPlace VisualWorks 2.0, this code answers the string 'Orange'.

Adapted from Ralph Johnson, Classic Smalltalk Bugs

What is a closure?

A closure is a closed expression, or an expression which carries along the meanings of its free variables. Closure is not a Smalltalk term, but comes from language theory. In Smalltalk, blocks in implementations which have local state (local variables private to the block) are called closures. Blocks in implementations where there are no block local variables and/or where block parameters are local variables in the containing method are not closures.

Methods

What are private methods?

Private methods are methods whose author marked them as private using some convention or another. The idea is that the author of the code will not himself use them from outside the class tree in which they are defined. The author of the code has hopes that his friends won't use them either.

Private methods are indicated different ways in different implementations:

Digitalk Smalltalk
Indicated by a comment at the front of the method.
(What happens with Team/Tools?)
IBM Smalltalk and systems with ENVY
A separate method type is maintained by the system. Browsers can show plublic, private, or both.
ParcPlace VisualWorks
Separate categories are created for private methods.

Can the privacy of private methods be enforced?

No. Marking a method private is like telling your dog not to eat that steak you're leaving out on the floor. Maybe your dog is well trained...

There are ways to see if a private method is being used from outside its intended scope. The techniques vary slightly by implementation.

VisualWorks

thisContext sender receiver == self 

	ifFalse: [^self error: 'Hey, I''m private!'].

self privateStuff
``Due to the run time cost of 'realizing' a context object that is normally cached inside the VM, this technique is too expensive for most use, but using conditional compilation techniques described in our Jan 96 TSR [The Smalltalk Report] column, you could arrange to do such privacy checks during testing, and re-compile them out for delivery.''
Jan Steinman, Barbara Yates <barbara.bytesmiths@acm.org> IBM Smalltalk

The heart of the technique is a one-liner which answers the caller of the current method.:

Processor activeProcess stackAtFrame: 1 offset: -3
If used from within a block the results are undefined; it usually will answer the method that invoked the block but that's not the same as the method that invoked the method that contains the block.

It's handy to package up such expressions, say by making a new method. The first bit of code below is a new method for Object which answers its callers caller. (Thus the '1' becomes a '2' indicating one level further up the stack.)

! Object methods !

sender

	^ Processor activeProcess stackAtFrame: 2 offset: -3
This method can be used to check for privacy:

myMethod

	self sender = self ifFalse: [ self error: 'blah' ].

	" ... do your stuff ... "
A more comprehensive method, again for Object, is a bit slower but is simpler to use. It performs the full check for privacy and issues a message if violated:

! Object methods !

isPrivateMethod

	| activeProcess sender sendersSender |

	activeProcess := Processor activeProcess.

	sender        := activeProcess stackAtFrame: 1 offset: -3.

	sendersSender := activeProcess stackAtFrame: 2 offset: -3.

	sender = sendersSender 

		ifFalse: [ self error: 'My sender is not myself.' ]
Use it this way:

myMethod

	self isPrivateMethod.

	" ... do your stuff ... "

Inheritance, Self and Super

What is self?

The special variable self represents the object on behalf of which a method is being run. It may be an instance of the class in which the method is located or it may be an instance of a subclass.

The value of self is considered a hidden parameter in all message sends. For example, in the expression:

employee name: aString
two parameters are passed: first, the value in employee which will become self, and then the value in aString.

Messages sent to self are sent to the object that self represents.

What is super?

The special variable super is another name for self. It has one special property: messages sent to super are bound to a method found by starting the search in the parent of the class in which the method is written.

In its simplest and most common case, super is used in user written class methods named new.

" The standard new method idiom "

new

	^ super new

		initialize
In Example 21, an instance needs to be initialized. The normal idiom overrides the new method, gets a new instance, sends the initialize message to it, and returns the new instance (assuming that initialize answers self, which it usually does).

The problem comes when getting that new instance. The expression self new cannot be used since it would recursively reinvoke the method. Sending new to super solves the problem by asking the parent to do the work. Since the parent presumably did it right before we overrode new, it will still do it right now.

See [4.56] 'When should super not be used?'

Why can't I pass super as a parameter?

The value of super is identical to the value of self. If super is passed as a parameter or assigned to a variable, the value passed or assigned is that of self. its special properties are lost.

Messages intended for the parent of the class must be sent directly to super, literally:

super aSelector
Why is the implementation this way? The alternative is to have two values, one for self and one for super, which refer to the same object. They would have to compare equal using = but not using ==, which would be a bizarre result. Besides, there is no practical need to have two such values.

[4.55] 'Why doesn't cascading with super work?'

Why doesn't cascading with super work?

The following two bits of code are equivalent:

super m1; m2
and:

super m1.

self m2
That is, it's defined in the Blue Book this way and that's the way it is. Anyone who knows a good reason why it must be this way is invited to communicate with the editor.

But see [4.56] 'When should super not be used?'.

When should super not be used?

The variable super should only be used when the message name is the same as the current method name. That is, it should always be true that methodName is the same as messageName in:

methodName

	super messageName
Messages sent to any other name should use self and not super. Using super bypasses method(s) that belong to self (or lower parents of self). If that is your intent, you need to carefully rethink what you are doing. Such code is ugly, tricky, hard to read, hard to maintain, and will rise up and zap you or a teammate later.

Note to implementers: Compilers really should warn about this case.

Example 22 is similar to code found by the author in a commercial product. The fromFile: method uses super new to bypass the redundant buildTitle in new.

new

	super new buildTitle

fromFile: aString

	super new 

		fileName: aString;

		buildTitleFrom: aString
The problem comes when subclassing this code and overriding new. The method fromFile: will then bypass the subclasses new method. Such bugs are no fun to find. The original code might have been written like this. (Bold in a method shows changes.)

new

	super new initialize

fromFile: aString

	self new 

		fileName: aString

initialize

	fileName isNil 

		ifFalse: [ self buildTitle ]

		ifTrue:  [ self buildTitleFrom: fileName ]

What is the difference between self and yourself?

The word self is a reserved word in Smalltalk that refers to the object which is the receiver of a message.

The word yourself is a message name which can be sent to any object. In response, the object answers self. That is, yourself has the implementation:

yourself

	^self
yourself is used in cascaded messages to assure that the value answered by the cascade is the receiver of the cascade. For example, the code:

oc := OrderedCollection new

	add: 'hello';

	add: 'there
assigns 'there' to oc since the last add: returns 'there'. However, the code:

oc := OrderedCollection new

	add: 'hello';

	add: 'there;

	yourself
assigns the new instance of OrderedCollection to oc since yourself always answers the receiver.

(Need to ref questions on add:.)

David Buck, The Object People

Are there any uses of yourself except in cascades?

``Cascaded messages account for virtually all of the cases where yourself is used. There are some other possible uses, but they are contrived. Suppose I maintained a dictionary of messages to send to different objects to get back a string to display on the screen. I could write:

   stringFor: anObject

      ^anObject perform: (dictionary at: anObject class)
In the dictionary, I could have:

    Integer->#printString

    Client->#formattedPrintString

    String->#yourself

    ...
I would think twice (maybe three times) about using such a technique. It does, however, demonstrate that it's possible to have another meaningful use of yourself.''

David Buck, dbuck@ccs.carleton.ca

Other Questions

Ones not having found a proper home yet...

What does subclassResponsibility mean, and what is it used for?

The message subclassResponsibility causes an error message to be issued that says something like:
'My subclass should have implemented this message.'
It is used when a superclass needs to define some protocol but cannot provide an implementation. It documents an interface indicating that the method exists and it serves as documentation to implementers of subclasses that the method must be implemented. It is never intended that it be executed.

An example is < (less-than) in Magnitude. This message must be provided by all subclasses; a meaningful less-than comparison is what makes a magnitude be a magnitude. However, the actual implementation is very dependent on the data formats of the subclasses; thus Magnitude cannot do anything better than issue an error message.




Last Modified: 01:37pm PST, January 07, 1996