About this page
This page is for developers and people learning about Guile who want
ideas to pursue. If you've done any of these things, or want advice
about them, then please feel free to say so on the Guile mailing list!
What this page is about
People occasionally ask if there's anything they can work on in
Guile. In the past, I've just been dumb and written up a separate
list of ideas in response to each message; so I forget things, I waste
a lot of time retyping, and ideas that occur to me when nobody's
asking me disappear into the ether. So I've decided to write them down.
Also, these are just the things that have come to my head over the
past few months. Each idea kind of argues for its own worldview, but
I don't actually claim that they're well-developed, or that the
functional boundaries are right, or anything. As a friend of mine
used to say, in a similar spirit: ``Why not take two --- they're
This is only meant as a source of ideas, not as a statement
of what I think is important, where I think Guile should go, or what
I'm willing to incorporate into some ``official'' distribution of
something. People should hack on what pleases them. These are
basically the things I wish I had time to do.
I hope this is helpful!
--- Jim Blandy
I've added a little here, mostly pointing to some additional
information on various bits, and removing some of the bits that've
been finished and added to Guile, like guardians.
--- Greg Harvey
If you've got your own list of Guile modules or features you'd
like to see, we'll put a link to it here.
Here are some things I'd like to see people write and distribute as
separate Guile modules.
Items are in most-recently-added-or-modified order.
- Clean SLIB integration
This may need to wait for the new module system, but it would be
nice to have a solid integration of SLIB into Guile, so that SLIB
files appear as first-class Guile modules.
- Server-side scripting
I'm thinking of something like a translator that reads a mix
of HTML-like tags and Scheme code --- basically, something like
quasiquote, except that it builds HTML structure, instead of list
structure. The translator would compile HTML-like things into
expressions that construct a tree which we can finally spit out as
HTML.
PHP has become really popular, but there's really no such thing as
a little language. I bet we'll see the same pattern with it
that we've seen with Perl and Tcl --- the interpreter will start
out small and fast, but then as people use it for more serious
work, it'll start acquiring real control structures, real
functions, more datatypes, ... until it'll just be another
poorly-implemented lisp with studly libraries. That's basically
what Tcl 8.0 is.
I don't know for sure, but I suspect that MHTML invented their
own language too. If so, the same rules apply.
We should do it right, by taking the features of PHP and MHTML
and giving them to Guile. Someone who does web application
development should do this, so it'll get done right. I don't do
that much web hacking myself, so I'd get the details all wrong.
Note that it's really easy to hack something up that works
here and there. This project will only have a significant impact
if it's an industrial-strength job.
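As a rough illustration of the quasiquote idea in this entry, here is
a minimal sketch that turns a nested list into HTML text. The names
tree->html and item-list, and the list layout (a tag symbol followed
by its children), are assumptions made for this example, not an
existing Guile interface. It handles no attributes and does no
escaping, both of which an industrial-strength version would need.

```scheme
;; Hypothetical sketch: render a tree such as (p "hi " (b "there"))
;; as HTML text.  No attributes, no escaping of < > &.
(define (tree->html tree)
  (cond ((string? tree) tree)
        ((number? tree) (number->string tree))
        ((pair? tree)
         (let ((tag (symbol->string (car tree))))
           (string-append "<" tag ">"
                          (apply string-append (map tree->html (cdr tree)))
                          "</" tag ">")))
        (else "")))

;; Quasiquote mixes fixed structure with computed pieces.
(define (item-list items)
  `(ul ,@(map (lambda (item) `(li ,item)) items)))

(display (tree->html (item-list '("SLIB" "GMP"))))
(newline)
```

The point of building a tree first, rather than printing strings
directly, is that the page stays ordinary list structure until the
last moment, so any Scheme code can inspect or rewrite it.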
See Olin Shivers'
- IEEE 754 recommended functions and predicates
Guile should provide the functions described in the appendix to
IEEE 754. They're handy.
- Resurrect the Mesa interface
Long ago, Guile had an interface to the Mesa 3-D graphics library,
which is compatible with OpenGL. (See freshmeat for details.)
This code could probably be resurrected pretty easily.
Ideally, of course, this would be combined with a Mesa widget
for GTK+. :)
This is kind of stupid, but one thing you could use this for
is an origami program. I'd like to be able to see the
relationship between the folded and unfolded sheets. This is one
of those things that's not worth coding in C, but would be fun to
whip up in an interpreted language.
- Simple Network Management Protocol interface
There's a library called UCD-SNMP (search freshmeat for it) which
might be fun to hook up to Guile. I'd love to be able to combine
UCD-SNMP with the GTK+ canvas widget and get network utilization
graphs from my router.
- Bug-tracking system
I'd like something akin to Gnats, but written in Guile. I'm sure
we could do better than Gnats, and this would give us a chance to
focus on the needs of open source development --- for example, the
bug list should be open, and viewable from the web.
- Value history
When you're using Guile interactively, it's really common to want
to operate on a value returned by one of the previous
instructions. Mathematica and GDB automatically stash the result
of each expression in a variable (or something like a variable),
so you can refer to it later.
I have in mind something like this:
guile> (list 1 2 3)
$1 => (1 2 3)
guile> (cons 0 $1)
$2 => (0 1 2 3)
guile> (set-car! (cdr $2) 'yowza)
$3 => yowza
guile> $1
$4 => (yowza 2 3)
(Bengt Kleberg recommended the ``$1 => mumble'' syntax, on the
grounds that it resembles the notation RnRS uses to show what an
expression evaluates to. And indeed, $1 will evaluate to mumble.)
The current repl doesn't print the value of expressions returning
#<unspecified>, and it shouldn't assign such values to value
history variables either.
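Here is one way the bookkeeping might look; treat it as a sketch, not
a design. The names remember! and show-result are made up for this
example. It assumes Guile's unspecified?, module-define!, and
current-module procedures, and it simply defines $1, $2, ... as
ordinary variables in the current module.

```scheme
;; Hypothetical sketch of value history.
(define history-count 0)

;; Bind VALUE to the next $N variable and return the symbol $N.
;; Unspecified values are skipped; #f is returned for them.
(define (remember! value)
  (if (unspecified? value)
      #f
      (begin
        (set! history-count (+ history-count 1))
        (let ((name (string->symbol
                     (string-append "$" (number->string history-count)))))
          (module-define! (current-module) name value)
          name))))

;; What a repl could call after evaluating each expression.
(define (show-result value)
  (let ((name (remember! value)))
    (if name
        (begin
          (display name)
          (display " => ")
          (write value)
          (newline)))))
```

A repl built this way would call show-result on the value of each
expression it evaluates, so values that are unspecified neither print
nor consume a history number.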
- Directory walker with list of predicates
People have posted a bunch of functions for walking directory
trees, modeled more or less after the Unix `find' command.
They're essentially of the form (find DIRECTORY PROC), and apply
PROC to each file they find in the directory tree DIRECTORY.
It occurs to me that it might be convenient to actually have
`find' accept a list of procedures: (find DIRECTORY PROC ...), and
for each file found, apply each PROC successively, until one
returns a false value. That way, you could do things like this:
(find "." (glob ".deps") directory?
(lambda (d)
... now do something with the .deps directory d ...))
I'm imagining `glob' to be a function that accepts a filename
wildcard pattern and returns a predicate that likes filenames that
match the wildcard.
Maciej Stachowiak and others have suggested, as a convention,
that any function which accepts a predicate as an argument, and
applies that predicate to filenames, should also accept a string
as that argument, and treat that string as a wildcard pattern.
Thus, the code above would become:
(find "." ".deps" directory?
(lambda (d) ...))
Now we're starting to get friendly.
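A minimal sketch of this interface follows, mostly to show that the
pieces are small. Everything here is hypothetical: wildcard-match?
understands only `*' and `?', glob matches against the last component
of the path, and each PROC is handed the full path. It assumes Guile's
POSIX procedures opendir, readdir, closedir, stat, lstat, stat:type,
and basename.

```scheme
;; Match the wildcard PATTERN against NAME.  Only `*' (any run of
;; characters) and `?' (any single character) are special.
(define (wildcard-match? pattern name)
  (let loop ((p (string->list pattern)) (n (string->list name)))
    (cond ((null? p) (null? n))
          ((char=? (car p) #\*)
           (or (loop (cdr p) n)
               (and (pair? n) (loop p (cdr n)))))
          ((null? n) #f)
          ((or (char=? (car p) #\?) (char=? (car p) (car n)))
           (loop (cdr p) (cdr n)))
          (else #f))))

;; Return a predicate that likes files whose name matches PATTERN.
(define (glob pattern)
  (lambda (file) (wildcard-match? pattern (basename file))))

(define (directory? file)
  (eq? (stat:type (stat file)) 'directory))

;; A string argument stands for a wildcard predicate.
(define (->predicate x)
  (if (string? x) (glob x) x))

;; Apply each of PROCS to FILE in turn, stopping at the first one
;; that returns a false value.
(define (apply-chain procs file)
  (or (null? procs)
      (and ((car procs) file)
           (apply-chain (cdr procs) file))))

;; Walk DIRECTORY, running the chain of PROCS on every entry found.
(define (find directory . procs)
  (let ((procs (map ->predicate procs)))
    (let walk ((dir directory))
      (let ((stream (opendir dir)))
        (let next ((entry (readdir stream)))
          (cond ((eof-object? entry) (closedir stream))
                ((member entry '("." "..")) (next (readdir stream)))
                (else
                 (let ((path (string-append dir "/" entry)))
                   (apply-chain procs path)
                   ;; lstat, so symlinks to directories are not followed.
                   (if (eq? (stat:type (lstat path)) 'directory)
                       (walk path))
                   (next (readdir stream))))))))))
```

With these definitions, (find "." ".deps" directory? (lambda (d) ...))
reaches the lambda only for directories named .deps. Whether the
walker should descend into a directory that the chain rejected is one
of the design questions this sketch does not try to settle.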
- smart dumping
This is kind of a dumb idea; I have no idea whether it's even
worth thinking about.
Would it be possible to write a function which walked data
structures in the Scheme heap, and wrote out, directly from
Scheme, a valid ELF shared library? Then you could load a module
and spit it back out as a shared library. While walking the heap,
we have all the information we need to emit the right relocs.
The code is certainly machine- and ABI-dependent.
Does this have any advantage over the freezer? The freezer
writes out initialized C data structures, which you can then
compile into shared libraries; what's wrong with that? I guess
the C compiler isn't involved, which is one more variable
eliminated. But freezing is much more portable.
- Sound/sample processing library
I'd love to use Guile to munge sound samples, try out various
effects, etc. The perfect complement to a sound-processing
library would be a nice interface to the audio hardware. Look
into the Enlightened Sound Daemon (esd), and audiofile, a library
that understands how to load a variety of different audio file
formats. (Tom Tromey pointed me at esd and audiofile.)
Bill Schottstaedt writes:
I've written a C library of basic audio functions (support for
hardware, headers, data types, etc) called sndlib
which is already tied into Guile in my sound
editor. And a ton of sound-processing functions can be found
(a Common Lisp and C based Music V implementation).
If there's interest, I could package all this up in some
pretty way; or perhaps help anyone else who wants to do something
along these lines. As I understand the copyright issue (I'm no
lawyer), Stanford owns the copyright since I'm doing this work as
their employee; they, however, have said the software author can
decide distribution policy (or whatever the word is), so I placed
the sound editor under the GPL, and have always made all the code
freeware available via anonymous ftp.
- Database engine interfaces
This is pretty important for web scripting, but also just a
generally handy thing.
Perl has DBI, which I think we should use as a model; here's the
description from CPAN:
The Database Interface. The Perl DBI initiative has
standardized the interface to a number of commercial database
engines, so that you can move from, say, Oracle to Sybase with a
minimum of effort. You'll find DBD::DB2, DBD::Informix,
DBD::Oracle, DBD::QBase, DBD::Sybase, DBD::MySQL, and DBD::mSQL
inside the DBD module set.
The job here is to discern the important ideas in DBI's
design, figure out the nicest way to transpose those into Scheme,
write that up, and provide at least one implementation, so that
people can write database drivers that meet the spec, and use the
implementation as a reference.
- Database manager interfaces
These would be interfaces to Berkeley DB, GDBM, NDBM, DBM, etc.
I think the difference between a ``database engine'' and a
``database manager'' is that the former usually supports SQL, and
handles multiple readers and writers, whereas the latter just
implements the data structure on disk, and leaves questions of
synchronization and how to actually arrange data usefully in the
hands of the caller.
Just as discussed in the ``Database engine interface'' entry
elsewhere in this list, we want a common interface to these
libraries, so people can write code that will operate with any
- Query languages
Logic programming languages make great database query languages
--- much nicer and more consistent than SQL. It would be really
cool if someone could look into the work Richard Salter and Chris
Haynes did on embedding backtracking and unification in Scheme, or
Per Bothner's similar work, and then use that to produce the best
database front end ever.
Of course, this should ride on the
- GIMP interface
Peter Mattis did do a Guile/GIMP interface at one point, but we're
not the standard there; they're still using SIOD. Someone who
uses both Guile and the GIMP needs to pick this up, turn it into a
package people can use, and do whatever's needed to get the GIMP
people to accept it as their standard. If there are critical
Guile changes necessary (Make it smaller? Can do!), I want to
help with that.
Basically, I think all we need here is a module that exports
the GIMP's functions. Should be real easy.
- C parser
It would be cool to hook up a full parser for ANSI C to Guile.
Then we could use it to parse header files and generate Guile
interfaces, scan Guile code for missing argument typechecks, cases
where we need to call scm_return_first or scm_remember, unbalanced
calls to SCM_DEFER_INTS. We could even switch Guile to an
explicit-marking GC, and then use the checker to catch errors in
the explicit marking.
Tom Lord did this for Systas; perhaps
his work could be ported back to Guile.
For way extra points, implement a C++ parser. Ouch.
- A FastCGI interface
It would be nice to have Guile implement the FastCGI interface,
turning each HTTP request into a function call, with arguments and
environment broken out.
- ABI-conformant packed data structure interface
It would be nice if Guile could manipulate arbitrary C data
structures. Basically, you'd take the C declaration for a
structure, figure out its layout at the bit/byte offset level
(given a particular ABI), and generate a bunch of accessors, and a
new opaque type for pointers to that structure.
We're not going for type safety here --- there should be an
operation to turn these opaque values into integers and back, or
to treat the data in a string as one of them. But it would be
nice to provide some error checking.
If the ABI were something you could choose at run-time, that
could make Guile a powerful system for doing cross-platform
munging. "Sure, I know exactly how a 6-bit field would be laid
out on the i960!". Well, okay --- maybe that's not thrilling.
But I still think it would be cool.
- Henry Spencer's regexp matcher
I'd like to see someone package up Henry Spencer's latest regexp
engine (which supports Unicode's UTF-8 encoding!) as a dynamically
linked module for Guile. Actually, I'd like to incorporate this
into the Guile core. Our present code just uses whatever regexp
engine is in the system's C library, which is sometimes pretty
- HTTP routines
It would be cool if people could use Guile to implement web robots
and the like. Tim Pierce started to work on this, but it's not
- General URL functions
This would be a set of functions for just retrieving any URL. I
think the WWW Consortium has a library which implements
- SGML and DSSSL
As long as we're supporting multiple languages, why not DSSSL?
Craig Brozefsky <firstname.lastname@example.org> has already attached an
SGML language parser to Guile.
- Mailbox support
Guile should have routines for doing full RFC 822 parsing, MIME
encoding, etc., to help people writing mailbots.
- Emacs-like buffers, for file handling.
Everyone is used to the sed/awk/perl model of file processing ---
you munge a line at a time, and maintain state to handle
multi-line things. That's just the way it's done.
But actually, there's a totally different system which works a
lot better for some applications, exemplified by Emacs. Emacs
lisp gives you buffers with very fast search, insert, and delete
operations. You don't have to process the data in any order;
there are no line boundaries to obscure the semantic structure of
the content, if the file isn't really line-structured; and so on.
So basically, this idea is, "implement Emacs buffers for Guile,
with all the searching, editing, and I/O facilities, but none of
the redisplay support."
(Greg) Initial code implementing these is available here
It's not complete, but could be used to build something bigger and
better (I intend to go back and reimplement these with goops
classes, to allow for more flexible usage).
- Cool I/O ports
Guile should be able to talk to compression libraries. You should
be able to hand an ordinary output port to a function, and have it
give you a new port, where data written to the new port gets
written compressed to the original port. And the reverse for
uncompression and input ports.
The same principle applies to any kind of stream transformation:
- uu/base64 encoding, or error correcting codes
- Unicode/JIS/ISO-8859 conversion
- CR/LF vs. LF conversion
- telnet (for implementing FTP)
- line and column number counting (the port just passes data
through unchanged, but counts the number of characters and
newlines, and has extra functions that let you read and set the
counts)
Guile's port implementation already has the infrastructure
needed to implement ports that do arbitrary things with their
streams (see the scm_ptobfuns structure). It's just
waiting to be used.
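To show the layering idea without committing to any port internals,
here is a sketch that models an output port as a "sink": a procedure
that accepts one character. This is not Guile's port implementation,
and every name in it (port->sink, crlf-sink, counting-sink,
write-string-to) is invented for the example. A real version would be
written against the port machinery mentioned above.

```scheme
;; Turn an ordinary output port into a sink.
(define (port->sink port)
  (lambda (c) (write-char c port)))

;; Wrap SINK so that each newline is written as CR LF.
(define (crlf-sink sink)
  (lambda (c)
    (if (char=? c #\newline)
        (begin (sink #\return) (sink #\newline))
        (sink c))))

;; Wrap SINK so that data passes through unchanged but is counted.
;; Returns a pair: the new sink, and a thunk that reports
;; (LINES . CHARS) seen so far.
(define (counting-sink sink)
  (let ((lines 0) (chars 0))
    (cons (lambda (c)
            (set! chars (+ chars 1))
            (if (char=? c #\newline) (set! lines (+ lines 1)))
            (sink c))
          (lambda () (cons lines chars)))))

(define (write-string-to sink str)
  (string-for-each sink str))
```

Because each wrapper takes a sink and returns a sink, they stack in
any order: counting can sit on top of CR/LF conversion, which sits on
top of a real port.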
The quintessential example of the above. SSLeay is a library that
implements the Secure Socket Layer protocol, the foundation of
secure http. It's basically a generic authentication and privacy
layer for network connections. I think PRMS uses it too.
- Adobe Document Structuring Conventions parser
Make it trivial to write psnup, and such. Make it trivial to
produce output that conforms to the DSC.
- Functional PostScript
Imagine taking the primitives of PostScript and providing them in
a more functional-language kind of way. That's what Olin
Shivers' group has done.
Functional PostScript is a portable system for doing device-independent,
resolution-independent graphics from Scheme programs. It is PostScript, with
the Forth computational engine replaced with Scheme. At present,
it runs on SCSH.
- Occam-like thread control structures
The Occam language, designed for the INMOS transputer, made
parallelism as concise to use as `let'. It was a much nicer way
of thinking about threads, I think. It would be cool if someone
implemented the interesting Occam features in Guile:
- Channels: These are one-deep message queues, with the right blocking
behavior; Guile's channels would carry objects.
- SEQ: Well, this is just begin.
- PAR: Like begin, but execute all the subexpressions in
parallel.
And so on.
- RenderMan interface
It would be nice to be able to generate RenderMan scene
description files using Guile code.
Here are some languages I'd love to see translated into Guile.
Items are in most-recently-added-or-modified order.
- Tcl 8.0
- Tcl's syntax is remarkably simple for a language its age; I
quite like it. Especially with the semantic cleanups made for
Tcl 8.0, we should be able to do a good integration here.
Ian Bicking <email@example.com> has
done some work on a translator.
The first cut was an interpreter, partially evaluated using Similix
to produce a compiler; the latest version has been rewritten by hand.
- The simple server-side scripting language.
- A more complex server-side scripting system, but a new
language, so still probably clean enough to tackle.
- Quite a pretty language, clean in both syntax and semantics.
("Pure in body and mind.") Datatypes seem very friendly with
Scheme's, so it should be possible to do a very satisfying
- Emacs Lisp
- An interesting challenge. I've got a decent solution for
reconciling the nil/()/#f issue, which this translator should
use. Mikael Djurfeldt has some ideas about this too.
- This is a herculean task, because Perl's syntax and semantics
are so complicated. Hats off to whoever even tries this.
Torbjörn Gannholm has kindly offered to work on this.
Here are some improvements I'd like to see made to the core Guile
I just got back from a conference in Japan about multilingual
information processing in free software, organized by the MULE
folks. While there I put together a pretty clear idea of how
I want Guile to work:
- Guile should have separate "byte array" and "string"
types. Probably the byvect stuff in unif.c is a decent
"byte array" type already, but we may need to beef up
support for it.
- All characters should be Unicode characters, and all strings
are strings of Unicode characters. (We'll need a read
syntax for Unicode characters; I think Marc Feeley has a
proposal for this.)
- I/O ports should know what encoding they're reading or
writing (ISO Latin-1 by default), and do the appropriate
translations. Some structure that allows us to layer
translators on top of raw ports might be nice, to decouple
the character set support from the I/O source support.
- Internally, strings should be represented as UTF-8 encoded strings;
this is the representation that C code linked with Guile
will see, and operate on. Guile should provide convenient
functions to ease the complexity of handling these.
- However, Scheme code will still index strings by character,
not byte. The expression
(string-ref s 1)
will always return the second character of s, even
if the first character is several bytes long. The fact that
the elements of the string are of varying width will be
concealed from Scheme code.
This means that string-ref and
string-set! will no longer be constant time
operations. Oh well; people usually manipulate strings using
searching, substring extraction, and concatenation anyway;
the complexity of those operations is unaffected.
- Each string object should record the byte position of the
first non-single-byte character, so we can still index
strings containing only fixed-width characters (ASCII) in
constant time.
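To make the indexing rule in the list above concrete, here is a small
sketch of how a character index could be turned into a byte offset.
The procedure names, and the choice to model a string's bytes as a
vector of integers holding valid UTF-8, are assumptions for
illustration only; the real thing would be C code inside Guile.

```scheme
;; Number of bytes in the UTF-8 sequence that starts with byte LEAD.
;; Assumes LEAD really is the first byte of a valid sequence.
(define (utf8-sequence-length lead)
  (cond ((< lead #x80) 1)
        ((< lead #xE0) 2)
        ((< lead #xF0) 3)
        (else 4)))

;; Byte offset of character number K (counting from zero) in BYTES.
;; FIRST-MULTIBYTE is the recorded byte position of the first
;; non-single-byte character, or the byte length if there is none.
(define (utf8-index bytes first-multibyte k)
  (if (< k first-multibyte)
      k                                 ; ASCII prefix: constant time
      (let loop ((offset first-multibyte)
                 (remaining (- k first-multibyte)))
        (if (= remaining 0)
            offset
            (loop (+ offset
                     (utf8-sequence-length (vector-ref bytes offset)))
                  (- remaining 1))))))
```

Below the recorded position, character indices and byte offsets are
the same number, which is why pure-ASCII strings keep constant-time
string-ref. Past it, the loop steps over one whole character at a
time by looking only at lead bytes.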
This isn't perfect, but here's my rationale:
My driving concern is what I'll call the "pass-through"
problem. In the process of carrying out a user's request, each
piece of data will pass through many different modules. For
example, data might be read from an I/O port, stored in a
database, and then retrieved from that database and displayed by a
GUI toolkit. If any module fails to handle multilingual data
correctly, the user will experience the overall system as
This means that it's not enough to merely have most modules
handle multilingual text correctly. They must all do it, if we
are to earn the user's trust. We will need to police Guile
modules carefully, put pressure on the authors of non-multilingual
modules, support them with plenty of helpful routines and
documentation, mark entries in the public module archive as
"multilingual-safe", and so on.
But if we're to impose this burden on developers, it must be a
reasonable burden. We don't actually have any authority; we rely
on their good will. If we require them to become experts on every
character set and encoding on the planet, that's too much; they
simply won't bother. If we require developers to do too much,
they will do very little.
The proposal above presents the developer with a single
character set, whose semantics are clearly documented. Developers
working in C must also cope with a variable-width encoding, which
does complicate code, but it has some nice properties that
ameliorate the complexity somewhat.
It has been suggested that Guile simply use Unicode encoded in
sixteen-bit characters throughout, as Java does. However, this
isn't viable; the 16-bit space for Unicode is almost full, and the
Taiwanese have a ton of characters they want code points for. If
you represent them using the Unicode ``surrogate characters'',
then you've got a variable-width encoding again; you might as well
use UTF-8 and save memory, since you're not saving any complexity.
It has also been suggested that Guile use two or three
different string representations, with eight, sixteen, or
thirty-two bits per character. Guile could automatically select
the most dense representation capable of holding the data at hand.
However, this would require everyone working in C to write out
three copies of their string-processing code. Each copy would be
simpler than the code for handling UTF-8, because it would be
working with a fixed-width encoding, but it's my sense that a
single UTF-8 loop is less hair than three fixed-width loops.
So, I'd love to have routines to convert text between Unicode
and all the various local encodings --- the JIS standards, BIG5,
ISO 8859, and so on. Guile should try to use the gconv
interface in the GNU C library, then iconv, and then
whatever else is available.
Henry Spencer's latest regexp engine handles UTF-8, but as of
this writing, it hadn't been optimized yet.
Guile also needs some kind of gettext interface. We could add
a new syntax for translatable strings like
#"This is a translatable string."
- Use GMP for bignum arithmetic
At the moment, Guile uses a homebrew bignum implementation. GMP,
Torbjorn Granlund's multi-precision arithmetic package, is faster.
Guile should be changed to use GMP if it is installed, and omit
bignum support if GMP is absent.
It would be nice to implement rational numbers as part of
- R5RS support
Right now Guile only supports R4RS. We should be able to use the
module system to let the user choose which dialect they want to
use on a module-by-module basis. I think this should be tied in
with Chris Hanson's work on a byte compiler.
- Custom buffered I/O
Guile has several different kinds of I/O ports. Those that
talk to the outside world are implemented on top of the ubiquitous
C standard I/O FILE buffered streams. This leads to a
number of problems:
- We have to use fgets for speed, but it's difficult
to handle lines containing null characters, given
fgets's interface. So we use ftell to find
out how much we've read with fgets. But that doesn't
work on sockets. So on sockets we fall back to our old, slow
routine based on getc.
- We have to use unbuffered input sockets, because standard
I/O streams only promise you one buffer, so you can't mix read
and write operations, to implement a network protocol, say. You
have to do an fflush between a write and a read, and an
fseek or something equivalent between a read and a
write. This is stupid.
- There's no way to tell whether there's input immediately
available on a buffered stream. You can get the underlying file
descriptor and do fcntl magic on that, but that won't
tell you whether there are characters waiting in the buffer.
You get the idea. The thing is, every Unix system has
read, write, and seek, and I don't know
of any system that doesn't have select. So we could
actually implement our own buffering port implementation, and
address all these problems.
We'd actually do better, because people have had some good
ideas since standard I/O was implemented. For example, we could
follow the lead of the libio library, and expose the buffer
directly to the consumer, thus avoiding some copies. We could run
regexp matches directly in the buffer. We could implement our own
definition of line boundaries with little penalty. Each port
could have a magic writable shared substring object that gave
Scheme code direct access to the "current line", with no copies.
(Well, maybe that's not such a hot idea. But that's what Perl
does.)
Having our own buffered stream implementation would also allow
us to start acquiring cool optimizations strictly below the
interface. The presence of the interface would protect Guile from
whatever weird system-specific stuff we wanted to do for speed.
(Greg) Gary Houston has been working on this; patches against
cvs guile (for the brave) are available at Gary's web
- Fast string implementations
Guile should implement the substring operation by sharing memory
with the original string, and using copy-on-write (COW) to
preserve the right semantics in case something gets side-effected.
The garbage collector would need to have some policy for throwing
away large substrings referenced only by small substrings, perhaps
by copying strings if they are small enough.
Once we had the copy-on-write system in place, people wouldn't
need to use explicitly shared substrings for efficiency any more,
and we'd be able to make explicitly shared substrings writable.
Which is a useful thing if you're doing a lot of string munging.
This could be easily combined with a hack for a fast
string-append. Have strings carry a bit that says, "I was
constructed by string-append." If string-append notices that its
first argument has this bit set, then the user is probably in a
loop building up a long string by appending a bunch of strings, in
a pattern like (string-append (string-append (string-append ...)
...) ...). In this case, it could allocate extra space beyond the
end of the string in anticipation of the next string-append. The
next append would notice this extra space and copy the tail into it.
The original string becomes a COW-shared substring of the result.
Combined with buffer-doubling, this makes building up strings
linear in time and consumed storage, instead of quadratic.
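The buffer-doubling half of this can be sketched on its own, leaving
out the flag bit and the copy-on-write sharing. The "builder" below is
a made-up stand-in for a string that carries spare room past its end;
make-builder, builder-append!, and builder->string are hypothetical
names. It assumes string-copy! with the argument order
(string-copy! target start source ...), as in Guile.

```scheme
;; A builder is a pair (BUFFER . USED): a string with spare room,
;; and the number of characters actually in use.
(define (make-builder)
  (cons (make-string 16 #\space) 0))

;; Append STR.  When the spare room runs out, allocate a buffer at
;; least twice as large, so the total copying stays linear.
(define (builder-append! builder str)
  (let* ((buffer (car builder))
         (used (cdr builder))
         (needed (+ used (string-length str))))
    (if (> needed (string-length buffer))
        (let grow ((size (* 2 (string-length buffer))))
          (if (< size needed)
              (grow (* 2 size))
              (let ((bigger (make-string size #\space)))
                (string-copy! bigger 0 buffer 0 used)
                (set-car! builder bigger)))))
    ;; The common case: copy only the new tail into the spare room.
    (string-copy! (car builder) used str)
    (set-cdr! builder needed)
    builder))

(define (builder->string builder)
  (substring (car builder) 0 (cdr builder)))
```

Most appends touch only the characters being added. The occasional
reallocation copies everything, but because the buffer at least
doubles each time, those copies add up to a constant factor rather
than a quadratic cost.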
The same trick should be applied in reverse: if string-append
notices that its last argument has this bit set, then the user is
doing (string-append ... (string-append ... (string-append ...))),
and it should allocate extra space before the beginning of the
string.
- Generational garbage collection
(Greg Harvey has
started taking a shot at this; ((Greg) check out my personal
for news, notes, and code).
At the moment, anyone who profiles the Guile interpreter
notices that it's spending a lot of its time in gc_mark.
This is not too surprising. I'd like Guile to have a conservative
generational collector. The hard parts here are the write
barrier, and managing conservatism. I've put some of my ideas for
dealing with conservatism here.
The usual way to keep track of an object's generation is to
keep each generation in a separate region of memory, and then
check the object's address to see which generation it's in. To
age an object, you copy it from a younger generation to an
older generation, and update all pointers to that object. There's
nothing magic about this; you could just as well have a field in
each object saying what generation it's in. However, using
address ranges saves space; you don't need that extra field per
object.
Unfortunately, when you're using a conservative collector like
Guile's, you can't move an object that's pointed to by the stack.
You have no idea whether any given word on the stack is actually a
pointer, or just some integer, or a piece of a string, so you
can't fix up the pointer after you've moved the object. Which
complicates the collector a bit, if you want to copy objects.
One approach would be to assume that the stack doesn't contain
pointers to too many objects, so you could just leave those there.
After all, aging just affects performance; it's not necessary for
correctness. I think this is a variant of Joel Bartlett's
``Mostly-copying Collector'' idea. Guile uses a free list to
manage its storage, so having a few old objects sticking around
(is there some nice concise derogatory term for people who have
failed a grade in school and are sticking around for another
year?) doesn't affect the allocation strategy at all.
(Greg) This assumption is a pretty safe one; the number of
cells traced conservatively is generally a very small fraction of
the actual number of cells traced. However, it's worth mentioning
that Bartlett's method is patented
(don't get me started), so we have to be a little careful about
the gc we end up with.
Anyway, anyone considering this project should check out Paul
Wilson's survey papers on garbage collection.
Here are projects that don't fall into the above categories.
- GDB support for debugging Guile-using C code
GDB actually has some Scheme support in there; we should teach it
how to print Scheme values, how to print interpreter frames, and
This has been done in the past with a mixed GDB/Guile
solution, but I think it would be more robust to actually put
everything in GDB.
Negotiate design with the GDB group, so it can be merged into
- New Guile logo
I'd like to have a Guile logo which is the word ``Guile''
outfitted with some eyebrows and pupils looking schemeing (ahem)
2 Aug 2000 spacey
Copyright (C) 2000,2001,2002,2005 Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111, USA
Verbatim copying and distribution of this entire web page is
permitted in any medium, provided this notice is preserved.