Classic Smalltalk Bugs compiled by Ralph Johnson -- University of Illinois at Urbana-Champaign Every programming system is prone to certain kinds of bugs. A good programmer learns these bugs, and how to avoid them. Smalltalk is no different. Smalltalk eliminates lots of bugs that are common in other languages, such as bugs in linear search algorithms. (Just use do:) However, it has its own set of classic bugs, which every new Smalltalk programmer runs into. There are several reasons to collect classic bugs. The first is that it will help experienced programmers test and debug programs, and can help us design better programs in the first place. Second, if we teach these bugs up front then people should learn to be good programmers faster. Third, perhaps we can redesign the system to eliminate some of these bugs, or we can write checking tools to spot them automatically. Bug 1: Variable-sized classes Set, Dictionary, and OrderedCollection are variable-sized classes that grow. They grow by making a copy of themselves and "becoming" the copy. If you add new instance variables to a subclass then you have to make sure that these instance variables get copied, too, or else you will mysteriously lose the values of the instance variables at random points in time. Smalltalk-80 R4.0 (and probably some earlier versions) has a copyEmpty: method in Collection that you are supposed to override if you make a subclass of Collection that adds instance varaibles. The solution to this bug is to write a version of copyEmpty: for your class. It has been suggested that it would be easy to write a tool that checked that every new subclass of Collection that added instance variables also defined a method for copyEmpty:. Collection bugs Bug 2: add: returns its argument For every collection that grows, add: returns its argument, not the receiver, and people usually assume that it returns its receiver. Thus, they write "(c add: x) add: y" when they should really write "c add: x; add: y" or else "c add: x. c add: y". Note that this is one of the good uses for "yourself", you can write (Set new add: x; add: y; ...; yourself) to make sure that you have the new Set. Note that there are good reasons why add: returns its arguments, and even if there weren't, it is a very, very bad mistake to implement add: so that it returns the receiver, and so confuse every other Smalltalk programmer on the planet. Making add: return its argument often keeps you from resorting to temps, because you can create the argument to add: on the fly, and then do other things with it after the add:. If you want to access the collection, you can do it with yourself or cascaded messages, as described above. Bug 3: changing collection while iterating over it You should never, never, never iterate over a collection which the iteration loop somehow modifies. Elements of the collection will be moved during the iteration, and elements might be missed or handled twice. Instead, make a copy of the collection you are iterating over, i.e. aCollection copy do: [:each | aCollection remove: each] is a good program, but if you leave out the copy then it isn't. Mario Wolczko suggested a solution that catches this problem the instant it occurs (at some performance penalty of course). The solution is to change the collection classes. Each iteration method enters that collection into a set of collections being iterated over (IteratedCollections), executes the block, then removes the collection from the set. Collections are usually (only?) modified using at:put: or basicAt:put:, so these are overriden to check that the collection is not in IteratedCollections. If it is, an error is signalled. You can either use this technique all the time, or you can just install these classes when you are testing and debugging your program. These changes are packaged in a file Iterator-check.st that is available on the Manchester and Illinois servers. On the Illinois server, it is in /pub/MANCHESTER/manchester/4.0/Iterator-check.st. Bug 4: modifying copies of collections It is common for an object to have an accessing method that returns a collection of objects that you can modify. However, sometimes an object will return a copy of this collection to keep you from modifying it. Instead, you are probably supposed to use messages that will change the collection for you. The problem is that this is often poorly documented, and a person who likes to modify collections directly will run into problems. See "ScheduledControllers scheduledControllers" for an example. The solution is to either provide better documentation, to claim that nobody is allowed to modify copies of collections returned from other objects, or to have objects that don't want their collections modified to return immutable versions of the collections that will give an error if you try to modify them. Bug 5: Missing ^ It is very easy to leave off a return caret on an expression. If there is no return at the end of a method, Smalltalk returns the receiver of the method. It only takes one missing return to mess up a long chain of method invocations. Bug 6: Class instance creation methods Writing a correct instance create method is apparently non-trivial. The correct way to do it is to have something like new ^super new init where you redefine init in each class to initialize that class's instance variables. In turn, init is defined as an instance method init super init "to initialize inherited instance variables" "initialize variables that I define" There are lots of ways to do this wrong. Perhaps the most common is to forget the return, i.e. to write super new init The result is that you have the class where you want the instance of the class. This is a special case of bug #5. Another error is to make an infinite loop by writing ^self new init If Smalltalk doesn't respond when you think it should, press ^C to get the debugger. If the debugger shows a stack of "new" messages then you know you made this mistake. Finally, you should only define "new" once for each class hierarchy and let subclasses inherit the method. If you redefine it in each class then you will reinitialize the new object many times, wasting time and perhaps memory. One way to keep this from happening is to make the "new" method in Object send init, and have the "init" method in Object do nothing. Of course, sometimes the version of init that you define has arguments, and this wouldn't help those cases. It is probably better to rely on education to eliminate this kind of error. Bug 7: Assigning to classes OrderedCollection := 2 is perfectly legal Smalltalk, but does dreadful things to your image. This bug could be eliminated if the compiler gave a warning when you assigned to a global variable that contained a class. Bug 8: become: become: is a very powerful operation. It is easy to destroy your image with it. Its main use is in growing collections (see bug #1), since it can make every reference to the old version of a collection become a reference to the new, larger version. It has slightly different semantics in Smalltalk/V and Smalltalk-80, since "x becomes: y" causes every reference to x and y to be interchanged in Smalltalk-80, but does not change any of the references to y in Smalltalk/V. Suppose that you want to eliminate all references to an object x. Saying "x becomes: nil" works fine in Smalltalk/V, but will cause every reference to nil to become a reference to x in Smalltalk-80. This is a sure calamity. You want x to become a new object with no references, such as in "x becomes: String new". Bug 9: Recompiling bugs in Smalltalk/V It is easy to have references to obsolete objects in Smalltalk/V if you change code without cleaning things up carefully. For example, the associations whose keys are the referenced names in the Pool Dictionary are stored in the CompiledMethods at compile time. If you create a new version of the Pool Dictionary and install it by simple assignment then the compiled methods still refer to the old associations. If you substitute a new instance of Dictionary or replace, rather than update an association in a pool dictionary, you have to recompile all methods using variables scoped to that Pool. This is is also annoying when using ENVY, where the methods are under strict control. Perhaps Pool Dictionaries should be be first-class versioned pre-requisites of Classes, just like the class definition. BTW we are using/VPM 1.4 with ENVY 1.3 1. If you prune & graft a subtree of your class structure you have to make sure that all referencing methods are recompiled. Otherwise you will run (or your customer, because this is only detected at run time) into an Deleted class error message. Thomas Muhr posted a "Bite" a while ago to handle this problem for Smalltalk/V 286. Bug 10: Opening windows Smalltalk-V and the older versions of Smalltalk-80 do not return to the sender when a new window is opened. Thus, any code after a message to open a window will never be executed. This is the cause of much frustration. For example, if you try to open two windows at once, i.e. TextPane new open. TextPane new open in Smalltalk-V and aScheduledWindow1 open. aScheduledWindow2 open in Smalltalk-80, then you will get only one open window, and one forgotten piece of code. This problem has been fixed in Objectworks/Smalltalk R 4.1, so the above code will create two windows as you would expect. The fix for earlier versions of Smalltalk-80 is to use the openNoTerminate method to open the window, which does not transfer control to it. A useful trick is to store the new window in a global variable so you can test it. Aad Nales says that the fix for Smalltalk/V is to fork the creation of the new window. [Textpane open] fork. If this is not what the programmer wants then it is probably necessary to hack the dispatcher code and remove the dropSenderChain message, which is the ultimate cause of the problem. Bug 11: blocks Blocks are very powerful, and it isn't hard for programmers to get into trouble trying to be too tricky. To compound problems, the two versions of Smalltalk have slightly different semantics for blocks, and one of them often leads to problems. Originally blocks did not have truly local variables. The block parameters were really local variables in the enclosing method. Thus, | x y | x := 0. (1 to: 100) do: [:z | x := x + z] actually had three temporaries, x, y, and z. This leads to bugs like the following someMethod | a b | a := #(4 3 2 1). b := SortedCollection sortBlock: [:a :b | a someOperation: b]. b addAll: a. Transcript show: a. When elements are added to b, the sortBlock is used to tell where to put them, but this will change a and b. addAll: is OK, but the "a" that gets displayed on the transcript will be an integer, not an array. Early versions of Smalltalk-80 (2.4 and before?) implemented blocks like this, and Smalltalk/V still does. However, in current PPS implementations, blocks are close to being closures. You can declare variables local to a block, and the names of the block parameters are local to the block. Most people agree that this is a much better definition of blocks than the original one. Nevertheless, people planning to use Smalltalk/V should realise that it has a different semantics for blocks. This difference can lead to some amusing problems. For example, here is some code written by someone who had obviously learned Scheme. | anotherArray aBlockArray | aBlockArray := Array new: 4. anotherArray := #(1 2 4 8). 1 to: 4 do: [ :anIndex | aBlockArray at: anIndex put: [ (anotherArray at: anIndex) * 2 ]]. The programmer expected that each block would be stored in the array along with its own value of anIndex. If anIndex were just a local variable of the method then this will not work. It assumes that each execution of the block gets its own version of anIndex, and Smalltalk/V and old Smalltalk-80 actually make each execution share the same version. So, if you are using Smalltalk/V then be careful not to reuse the names of arguments of blocks unless you know that the blocks are not going to have their lives overlap. Thus, aCollect do: [:i | ...]. bCollect do: [:i | ...]. is probably OK because do: does not store its argument, so the blocks will be garbage by the time the method is finished. However, if the first block were stored in a variable somewhere and evaluated during the execution of the second block then problems would probably occur. Bug #11: Cached Menus Menus are often defined in a class method, where they are created and stored in a class variable or a class instance variable. The method will look something like this initializeMenu ... Note that accepting the method does *not* change the menu. You have to execute the method to change the class variable or class instance variable. Often the initializeMenu method is invoked by the class method initialize. This can lead to the strange effect that you can initialize the menu by deleting the class and filing it in again, but otherwise you don't seem to be able to change the menu (because you haven't figured out that you should really be executing the initializeMenu method). To make matters worse, it is possible that each instance of the controller, or model, or whatever has the menu, stores its own copy of the menu in an instance variable. If that is the case then it is not enough to execute initializeMenu, you also must cause each object to reinitialize its own copy of the menu. It is often easier to just delete the objects and recreate them. Often a class will have a #flushMenus method to clear out all menus. Typically the method that fetches the menu will check to see if it is nil and invoke initializeMenu if it is. So, flushMenus will just nil out the variable holding the menu. The best way to figure out what is happening is to look at all uses of the variable. Smalltalk experts rarely have problems with this bug, but it often confuses novices. Caching is a very common technique in Smalltalk for making programs more efficient in both time and space. Caching of menus is one of the simplest uses of caches, and other uses can create more subtle bugs. Bug 12: Smalltalk/V class library Thomas Muhr makes these comments about bugs in the Smalltalk/V class library that you should know if you want to keep your programs fast and correct. 2. Never use symbols to label objects if you are dealing with many objects. This will slow down your system to an almost dead halt. Use strings instead. 3. Never use Sets when you can otherwise assure the uniqueness. Look at the implementation of "add:" for Sets and you'll know what I mean: on every "add:" the new element is compared to all others resulting into a nonlinear time for adding to Sets. 4. Do not think that if you "collect:" something from a SortedCollection, that your result will be sorted as the origin, unless you use the default sortBlock. This is one of the bugs provided by the language vendor Bug 13: Using cascaded messages improperly From: jaeck@alc.com (William A. Jaeck) I had created a subclass of OrderedCollection with an instance method called with:. This is supposed to do the same thing as add:. Then, I implemented a class method called with:with: as with: arg1 with: arg2 ^ self with: arg1; with: arg2 This ended up producing a result as if I had implemented it as ^ self with: arg2 The correct implementation of with:with: is, of course ^ (self with: arg1) with: arg2 Bug 14: Using class methods to implement a singleton object. From: riks@ogicse.cse.ogi.edu (Rik Fischer SmOOdy) A common "dirty trick" is to use a class with class methods as an implementation for a "globally known" object which is expected to be unique. This class is never expected to be instantiated: instances would have no capabilities except those inherited by default from #Object. The trick usually works, but it qualifies as design by accident. A tyro might study the object as an example and be misled. (It is important to encourage re-use to keep the visible structure as comprehensible as we can) Future engineers might decide that there needs to be another "instance" of this globally known object. I have seen an example (identity of perpetrators withheld to protect the unconvicted) where the second instance was implmented as a sub-class, with numerous methods over-written to access a different class variable. On Smalltalk V/MAC, a quick scan found (Color MTrap Compiler and LCompiler ) Smalltalk80 V4 comes with DefineOpcodePool but that may have other redeeming social value. The most egregious examples have come with other installed packages. Mr. Manners instead approves of the way Smalltalk80 uses the globally published objects named Transcript, Undeclared, and even Smalltalk itself. Bug #15: Redefining = but NOT hash From: Mario.Winter@FernUni-Hagen.de (Mario Winter) Kent Beck makes these comments about redefining = most often demands redefining hash, because all subclasses of Set only use hash to compare their Elements. So don`t wonder if your set contains elements which you consider to be equal, if you have not redefined hash to compute the same hash-value for ,equal" elements. Always remember that Object defines = as ==, that is equality is defined as identity. Many thanks to the many people who contributed bugs or solutions to bugs to the list. These include muhr@opal.cs.tu-berlin.de (Thomas Muhr) steinman@is.morgan.com (Jan Steinman) knight@mrco.carleton.ca (Alan Knight) mario@cs.man.ac.uk (Mario Wolczko) peterg@netcom.com (Peter Goodall) Aad Nales scrl@otter.hpl.hp.com (Simon Lewis) msmith@volcano.ma30.bull.com (Mike Smith) dai@mrco.carleton.ca (Naci Dai) dcr0@speedy.enet.dec.com (Dave Robbins) randy@tigercat.den.mmc.com (Randy Stafford) Hubert.Baumeister@mpi-sb.mpg.de (Hubert Baumeister) eliot@dcs.qmw.ac.uk (Eliot Miranda) dmm@aristotle.ils.nwu.edu (donald) amir@is.morgan.com (Amir Bakhtiar) Kurt Piersol sullivan@ticipa.ti.com (Michael Sullivan) terry@zoe.network23.com (Terry) brent@uwovax.uwo.ca (Brent Sterner) frerk@informatik.uni-kl.de nicted@toz.buffalo.ny.us (Nicole Tedesco) riks@ogicse.ogi.edu (Rik Fischer Smoody) marten@feki.toppoint.de (Marten Feldtmann) mst@vexpert.dbai.tuwien.ac.at (Markus Stumptner) Mario.Winter@FernUni-Hagen.de (Mario Winter) jaeck@alc.com (William A. Jaeck) **