Classic Smalltalk Bugs

compiled by Ralph Johnson -- University of Illinois at Urbana-Champaign

Every programming system is prone to certain kinds of bugs.  A good
programmer learns these bugs, and how to avoid them.  Smalltalk is
no different.  Smalltalk eliminates lots of bugs that are common in
other languages, such as bugs in linear search algorithms.  (Just use do:)
However, it has its own set of classic bugs, which every new Smalltalk 
programmer runs into.  

There are several reasons to collect classic bugs.  The first is that
it will help experienced programmers test and debug programs, and can
help us design better programs in the first place.  Second, if we
teach these bugs up front then people should learn to be good
programmers faster.  Third, perhaps we can redesign the system to
eliminate some of these bugs, or we can write checking tools to
spot them automatically.

Bug 1: Variable-sized classes

Set, Dictionary, and OrderedCollection are variable-sized classes
that grow.  They grow by making a copy of themselves and "becoming"
the copy.  If you add new instance variables to a subclass then
you have to make sure that these instance variables get copied, too,
or else you will mysteriously lose the values of the instance
variables at random points in time.

Smalltalk-80 R4.0 (and probably some earlier versions) has a
copyEmpty: method in Collection that you are supposed to override
if you make a subclass of Collection that adds instance varaibles.
The solution to this bug is to write a version of copyEmpty: for
your class.

It has been suggested that it would be easy to write a tool that
checked that every new subclass of Collection that added instance
variables also defined a method for copyEmpty:.

Collection bugs

Bug 2: add: returns its argument

For every collection that grows, add: returns its argument,
not the receiver, and people usually assume that it returns
its receiver.  Thus, they write "(c add: x) add: y" when they should
really write "c add: x; add: y" or else "c add: x. c add: y".
Note that this is one of the good uses for "yourself", you can write
(Set new 
	add: x; 
	add: y;
	...;
	yourself) 
to make sure that you have the new Set.

Note that there are good reasons why add: returns its arguments,
and even if there weren't, it is a very, very bad mistake to
implement add: so that it returns the receiver, and so confuse
every other Smalltalk programmer on the planet.
Making add: return its argument often keeps you from resorting
to temps, because you can create the argument to add: on the
fly, and then do other things with it after the add:.  If you
want to access the collection, you can do it with yourself or
cascaded messages, as described above.

Bug 3: changing collection while iterating over it

You should never, never, never iterate over a collection which
the iteration loop somehow modifies. Elements of the collection
will be moved during the iteration, and elements might be missed
or handled twice.  Instead, make a copy of the collection you
are iterating over, i.e.
   aCollection copy do: [:each | aCollection remove: each]
is a good program, but if you leave out the copy then it isn't.

Mario Wolczko suggested a solution that catches this problem the
instant it occurs (at some performance penalty of course).  The
solution is to change the collection classes.  Each iteration method
enters that collection into a set of collections being iterated over
(IteratedCollections), executes the block, then removes the collection
from the set.  Collections are usually (only?) modified using at:put:
or basicAt:put:, so these are overriden to check that the collection
is not in IteratedCollections.  If it is, an error is signalled.  You
can either use this technique all the time, or you can just install
these classes when you are testing and debugging your program.  These
changes are packaged in a file Iterator-check.st that is available on
the Manchester and Illinois servers.  On the Illinois server, it is
in /pub/MANCHESTER/manchester/4.0/Iterator-check.st.

Bug 4: modifying copies of collections

It is common for an object to have an accessing method that returns a
collection of objects that you can modify.  However, sometimes
an object will return a copy of this collection to keep you from
modifying it.  Instead, you are probably supposed to use messages
that will change the collection for you.  The problem is that this
is often poorly documented, and a person who likes to modify collections
directly will run into problems.  See  "ScheduledControllers
scheduledControllers" for an example.

The solution is to either provide better documentation, to claim
that nobody is allowed to modify copies of collections returned
from other objects, or to have objects that don't want their
collections modified to return immutable versions of the collections
that will give an error if you try to modify them.

Bug 5: Missing ^

It is very easy to leave off a return caret on an expression.
If there is no return at the end of a method, Smalltalk returns
the receiver of the method.  It only takes one missing return
to mess up a long chain of method invocations.

Bug 6: Class instance creation methods

Writing a correct instance create method is apparently non-trivial.
The correct way to do it is to have something like
new
  ^super new init
where you redefine init in each class to initialize that class's
instance variables.  In turn, init is defined as an instance method
init
  super init "to initialize inherited instance variables"
  "initialize variables that I define"

There are lots of ways to do this wrong.  Perhaps the most common
is to forget the return, i.e. to write
   super new init
The result is that you have the class where you want the instance of
the class.  This is a special case of bug #5.

Another error is to make an infinite loop by writing
  ^self new init

If Smalltalk doesn't respond when you think it should, press ^C to
get the debugger.  If the debugger shows a stack of "new" messages
then you know you made this mistake.

Finally, you should only define "new" once for each class hierarchy
and let subclasses inherit the method.  If you redefine it in each
class then you will reinitialize the new object many times, wasting
time and perhaps memory.

One way to keep this from happening is to make the "new" method in
Object send init, and have the "init" method in Object do nothing.
Of course, sometimes the version of init that you define has arguments,
and this wouldn't help those cases.  It is probably better to rely
on education to eliminate this kind of error.

Bug 7: Assigning to classes

OrderedCollection := 2
is perfectly legal Smalltalk, but does dreadful things to your image.
This bug could be eliminated if the compiler gave a warning when you
assigned to a global variable that contained a class.

Bug 8:  become:

become: is a very powerful operation.  It is easy to destroy your image
with it.  Its main use is in growing collections (see bug #1), since
it can make every reference to the old version of a collection become a
reference to the new, larger version.  It has slightly different semantics
in Smalltalk/V and Smalltalk-80, since "x becomes: y" causes every 
reference to x and y to be interchanged in Smalltalk-80, but does not
change any of the references to y in Smalltalk/V.

Suppose that you want to eliminate all references to an object x.
Saying "x becomes: nil" works fine in Smalltalk/V, but will cause
every reference to nil to become a reference to x in Smalltalk-80.
This is a sure calamity.  You want x to become a new object with no
references, such as in "x becomes: String new".

Bug 9: Recompiling bugs in Smalltalk/V

It is easy to have references to obsolete objects in Smalltalk/V
if you change code without cleaning things up carefully. For example,
the associations whose keys are the referenced names in the Pool 
Dictionary are stored in the CompiledMethods at compile time. If you 
create a new version of the Pool Dictionary and install it by simple
assignment then the compiled methods still refer to the old associations.

If you substitute a new instance of Dictionary or replace, rather than
update an association in a pool dictionary, you have to recompile all
methods using variables scoped to that Pool.

This is is also annoying when using ENVY, where the methods are under
strict control. Perhaps Pool Dictionaries should be be first-class
versioned pre-requisites of Classes, just like the class definition.

BTW we are using/VPM 1.4 with ENVY 1.3

1. If you prune & graft a subtree of your class structure you have to
make sure that all referencing methods are recompiled. Otherwise you
will run (or your customer, because this is only detected at run time)
into an Deleted class error message.
Thomas Muhr posted a "Bite" a while ago to handle this problem for
Smalltalk/V 286.

Bug 10: Opening windows

Smalltalk-V and the older versions of Smalltalk-80 do not return to
the sender when a new window is opened. Thus, any code after
a message to open a window will never be executed.  This is the
cause of much frustration.  For example, if you try to open two
windows at once, i.e.

TextPane new open.
TextPane new open

in Smalltalk-V and

aScheduledWindow1 open.
aScheduledWindow2 open

in Smalltalk-80, then you will get only one open window,
and one forgotten piece of code.  This problem has been fixed
in Objectworks/Smalltalk R 4.1, so the above code will create 
two windows as you would expect. 

The fix for earlier versions of Smalltalk-80 is to use the
openNoTerminate method to open the window, which does not
transfer control to it.  A useful trick is to store the new
window in a global variable so you can test it.  

Aad Nales says that the fix for Smalltalk/V is to fork the 
creation of the new window.

[Textpane open] fork.

If this is not what the programmer wants then it is probably necessary
to hack the dispatcher code and remove the dropSenderChain message,
which is the ultimate cause of the problem.

Bug 11: blocks

Blocks are very powerful, and it isn't hard for programmers to get
into trouble trying to be too tricky.  To compound problems, the
two versions of Smalltalk have slightly different semantics for
blocks, and one of them often leads to problems.

Originally blocks did not have truly local variables.  The block
parameters were really local variables in the enclosing method.
Thus,

| x y |
  x := 0.
  (1 to: 100) do: [:z | x := x + z]

actually had three temporaries, x, y, and z.  This leads to bugs
like the following

someMethod

 | a b  |
 a := #(4 3 2 1).
 b := SortedCollection sortBlock: [:a :b | a someOperation: b].
 b addAll: a.
 Transcript show: a.

When elements are added to b, the sortBlock is used to tell where
to put them, but this will change a and b.  addAll: is OK, but
the "a" that gets displayed on the transcript will be an integer,
not an array.

Early versions of Smalltalk-80 (2.4 and before?) implemented blocks
like this, and Smalltalk/V still does.  However, in current PPS
implementations, blocks are close to being closures. You can declare
variables local to a block, and the names of the block parameters are
local to the block. Most people agree that this is a much better
definition of blocks than the original one.  Nevertheless, people
planning to use Smalltalk/V should realise that it has a different
semantics for blocks.

This difference can lead to some amusing problems.  For example, here
is some code written by someone who had obviously learned Scheme.

| anotherArray aBlockArray |

aBlockArray := Array new: 4.
anotherArray := #(1 2 4 8).

1 to: 4 do: [ :anIndex |
 aBlockArray at: anIndex put: [ (anotherArray at: anIndex) * 2 ]].


The programmer expected that each block would be stored in the array
along with its own value of anIndex.  If anIndex were just a local
variable of the method then this will not work.  It assumes that 
each execution of the block gets its own version of anIndex, and
Smalltalk/V and old Smalltalk-80 actually make each execution share
the same version.

So, if you are using Smalltalk/V then be careful not to reuse the
names of arguments of blocks unless you know that the blocks are
not going to have their lives overlap.  Thus,

  aCollect do: [:i | ...].
  bCollect do: [:i | ...].

is probably OK because do: does not store its argument, so the
blocks will be garbage by the time the method is finished.
However, if the first block were stored in a variable somewhere
and evaluated during the execution of the second block then
problems would probably occur.

Bug #11:  Cached Menus

Menus are often defined in a class method, where they are created
and stored in a class variable or a class instance variable.  The
method will look something like this

initializeMenu
 ...

Note that accepting the method does *not* change the menu.  You
have to execute the method to change the class variable or class
instance variable.  Often the initializeMenu method is invoked by
the class method initialize.  This can lead to the strange effect
that you can initialize the menu by deleting the class and filing
it in again, but otherwise you don't seem to be able to change
the menu (because you haven't figured out that you should really
be executing the initializeMenu method).

To make matters worse, it is possible that each instance of the
controller, or model, or whatever has the menu, stores its own
copy of the menu in an instance variable.  If that is the case then
it is not enough to execute initializeMenu, you also must cause each
object to reinitialize its own copy of the menu.  It is often easier
to just delete the objects and recreate them.

Often a class will have a #flushMenus method to clear out all menus.
Typically the method that fetches the menu will check to see if it
is nil and invoke initializeMenu if it is.  So, flushMenus will just
nil out the variable holding the menu.  The best way to figure out
what is happening is to look at all uses of the variable.  Smalltalk
experts rarely have problems with this bug, but it often confuses
novices.

Caching is a very common technique in Smalltalk for making programs
more efficient in both time and space.  Caching of menus is one of
the simplest uses of caches, and other uses can create more subtle bugs.

Bug 12: Smalltalk/V class library
Thomas Muhr makes these comments about bugs in the Smalltalk/V
class library that you should know if you want to keep your
programs fast and correct.

2. Never use symbols to label objects if you are dealing with many
objects. This will slow down your system to an almost dead halt. Use
strings instead.
3. Never use Sets when you can otherwise assure the uniqueness. Look
at the implementation of "add:" for Sets and you'll know what I mean:
on every "add:" the new element is compared to all others resulting
into a nonlinear time for adding to Sets.
4. Do not think that if you "collect:" something from a
SortedCollection, that your result will be sorted as the origin,
unless you use the default sortBlock. This is one of the bugs provided
by the language vendor

Bug 13: Using cascaded messages improperly
From: jaeck@alc.com (William A. Jaeck)

I had created a subclass of OrderedCollection with an instance
method called with:. This is supposed to do the same thing as
add:.

Then, I implemented a class method called with:with: as

   with: arg1 with: arg2

	 ^ self with: arg1; with: arg2

This ended up producing a result as if I had implemented it as

	^ self with: arg2

The correct implementation of with:with: is, of course

	^ (self with: arg1) with: arg2


Bug 14: Using class methods to implement a singleton object.
From: riks@ogicse.cse.ogi.edu (Rik Fischer SmOOdy)

A common "dirty trick"  is to use a class with class methods as an
    implementation for a "globally known" object which is expected
    to be unique.  This class is never expected to be instantiated:
    instances would have no capabilities except those inherited by
    default from #Object.
The trick usually works, but it qualifies as design by accident.
A tyro might study the object as an example and be misled.
    (It is important to encourage re-use to keep the visible structure
   as comprehensible as we can)
Future engineers might decide that there needs to be another "instance"
    of this globally known object.  I have seen an example (identity
    of perpetrators withheld to protect the unconvicted) where the
    second instance was implmented as a sub-class, with numerous methods
    over-written to access a different class variable.

On Smalltalk V/MAC, a quick scan found (Color MTrap Compiler and LCompiler )
Smalltalk80 V4 comes with DefineOpcodePool but that may have other
    redeeming social value.
The most egregious examples have come with other installed packages.

Mr. Manners instead approves of the way Smalltalk80 uses the globally
published objects named Transcript, Undeclared, and even Smalltalk itself.

Bug #15: Redefining = but NOT hash
From: Mario.Winter@FernUni-Hagen.de (Mario Winter)

Kent Beck makes these comments about redefining = most often demands 
redefining hash, because all subclasses of Set only use hash to compare 
their Elements. So don`t wonder if your set contains elements which you 
consider to be equal, if you have not redefined hash to compute the same 
hash-value for ,equal" elements. Always remember that Object defines = as 
==, that is equality is defined as identity.


Many thanks to the many people who contributed bugs or solutions to bugs
to the list.  These include

 muhr@opal.cs.tu-berlin.de (Thomas Muhr)
 steinman@is.morgan.com (Jan Steinman)
 knight@mrco.carleton.ca (Alan Knight)
 mario@cs.man.ac.uk (Mario Wolczko)
 peterg@netcom.com (Peter Goodall)
 Aad Nales <nales@cs.few.eur.nl>
 scrl@otter.hpl.hp.com (Simon Lewis)
 msmith@volcano.ma30.bull.com (Mike Smith)
 dai@mrco.carleton.ca (Naci Dai)
 dcr0@speedy.enet.dec.com (Dave Robbins)
 randy@tigercat.den.mmc.com (Randy Stafford)
 Hubert.Baumeister@mpi-sb.mpg.de (Hubert Baumeister)
 eliot@dcs.qmw.ac.uk (Eliot Miranda)
 dmm@aristotle.ils.nwu.edu (donald)
 amir@is.morgan.com (Amir Bakhtiar)
 Kurt Piersol <Piersol@Apple.com>
 sullivan@ticipa.ti.com (Michael Sullivan)
 terry@zoe.network23.com (Terry)
 brent@uwovax.uwo.ca (Brent Sterner)
 frerk@informatik.uni-kl.de
 nicted@toz.buffalo.ny.us (Nicole Tedesco)
 riks@ogicse.ogi.edu (Rik Fischer Smoody)
 marten@feki.toppoint.de (Marten Feldtmann) 
 mst@vexpert.dbai.tuwien.ac.at (Markus Stumptner)
 Mario.Winter@FernUni-Hagen.de (Mario Winter)
 jaeck@alc.com (William A. Jaeck)

**