tclvsdb-1.0 - A very simple Tcl database facility for Tcl/Tk applications ========================================================================= Copyright (c) 1995 Cirque Labs, Inc. See the file "license.terms" for information on usage and redistribution of this file, and for a DISCLAIMER OF ALL WARRANTIES. Introduction: ============= I imagine that I am not alone having a need for a database package that works well with Tcl/Tk and is somewhere in sophistication between gets/puts and Oracle or Sybase...and of course, free. Qddb is a very nice package and it is quite flexible. Still, while not dis'ing Qddb, I found it required a bit more discipline than either I had or that my applications required. I suppose I could have written something on top of ndbm, and I may still do that someday. But, ndbm is still too close to gets/puts for my need and my liking. So I wrote tclvsdb. Tclvsdb presents a database consisting of one or more tables arranged in rows and fields with data elements consisting of native Tcl lists. Data elements may be a list of row sequence numbers for other tables allowing one to create hierarchical tries of data. Tclvsdb creates one global array per database which stores all state information for the database and the values of all fields for the current rows in the tables of the database. Opening, accessing and updating a data element in a table, and closing a database is as simple as: global db dbopen $mydbpath db dbgetrow db mytable $seqno set value $db(mytable,somefield) set db(mytable,somefield) $newvalue dbputrow db mytable dbclose db Installation: ============= 1) Uncompress the distribution file: $ gunzip tclvsdb-1.0.tar.gz 2) Extract the distribution from the tar archive file $ tar xvf tclvsdb-1.0.tar 3) Move into the distribution directory $ cd tclvsdb-1.0 3) Copy the tclvsdb Tcl script file to a common location, for example: $ cp tclvsdb.tcl /usr/local/lib/tcl Integrating with Tcl / Tk scripts ================================= In your Tcl/Tk application script files, source in the tclvsdb procedures: # -- load tclvsdb facility source $env(TCL_LIBRARY)/tclvsdb.tcl Goals and Features: =================== o Consistently moderate performance. Tclvsdb provides consistent performance for small-to-medium databases (two seeks and two reads to get to any row in a table). You probably won't want to implement a multinational financial database application set with it for a Fortune 500 company, but the performance is suitable for many small applications and for prototyping larger applications. All database files are ASCII. o High portability. Uses standard vanilla Tcl only without reliance on any Tcl/Tk extensions (known to work with 7.3 and 7.4 releases). The UNIX shell commands executed are fully qualified (e.g. /bin/rm) and are restricted to only a few basic favorites: cp, rm, sleep, echo, date). o Multi-user database with row-level locking. This requires that the users have common write access to the same directory and the database files (read this as a database that is "security-lite"). True locking requires some sort of atomic test-lock-then-set-lock operation or a race condition may occur where someone else may obtain a lock between the time your lock is tested and when it is set. Well, tclvsdb has this race condition so it should not be used for mission critical applications or applications with lots of concurrent users. Database locks are issued and cleared only during Database I/O operations. No locks are held while processing the application code outside of the tclvsdb procedures. o Multi-platform database (hey, NFS + Tcl!). o Simple to use. I have little patience with paradigms and technologies that require a Phd in some buzzword-de-jour. Tclvsdb does the job with a few procedures and flows within the context of your current Tcl work. It adds little nonsense to or overhead to the process; and that's all it does. You write the application. Error checking and error processing is set at an adult level; if you write an application intended to break tclvsdb, you'll probably succeed. o Data elements are native Tcl lists and strings. No translation is required between what your Tcl applications works with and how you store, retrieve, or manipulate the data with Tclvsdb. o Tables employ standard 2-D rectangular row and field addressing to navigate to any data element. o Multiple, hierarchical tables in a database. Row retrieval is based on a row sequence number. Reference other rows in other tables by setting a data element as a list of sequence numbers for the other table. o Fairly powerful but simple string, date, and numeric value search engines. o All database I/O operations are atomic and consistent. No "commit" or "checkpoint" is required to force an update to disk, so the database files are always in a consistent state (the flip side of this coin is that there is no "undo" or transaction processing facility). o Same licensing terms as Tcl. You basically can do whatever you want with Tclvsdb and don't have to involve lawyers over intellectual property rights, and don't have to restrict applications utilizing tclvsdb to some non-profit altruistic distribution to end-users. (Excuse my politics for showing). o minimal namespace polution. Twenty-eight Tcl procedures and only one global array per database (you may concurrently work with multiple databases). TODO: ===== o Reimplementing tclvsdb in C should increase performance quite nicely. o Implement a mechanism to sort or retrieve rows in a table in various sorted orders. o Implement the search engines to allow multiple search clauses linked by logical operations. o Finish and release (now 95% done) the Tk forms package underway which is built to work either by itself or with tclvsdb. o Provide a nice set of examples and documentation (Sure, someday). Databases: ========= A database is a set of ASCII files in a specified directory. Files ending with the extension ".idx" are index files. Files ending with the extension ".tbl" are data files. All ".idx" and ".tbl" files in a directory are part of the same database. Within Tcl, there is one global array associated with each database. Concurrently working with multiple databases simply requires having an uniquely named global array defined for each database. This global array contains all internal state information and the data values for the current rows of each table in the database. The "dbopen" procedure creates and populates the database global array variable. The "dbclose" procedure flushes, reclaims storage, and deletes the global array variable. The "dbcreate" procedure defines a new table in a database. Example: Create a database of two tables dbcreate $custdbpath db customers \ "Name Address City State Zipcode PhoneList" dbcreate $custdbpath db phones "PhoneType PhoneNumber" Example: Open and close a database dbopen $custdbpath db dbclose db Tables: ======= A tclvsdb database consists of one or more tables. A table is a 2-D data abstraction consisting of rows of data and fields (columns). Each table has two files: an index file (tablename.idx) and a data file (tablename.tbl). The index file allows tclvsdb to get to a row quickly. A database may have any number of tables, but is realistically limited by the number of file descriptors available (2 used per table). When a row is added to a table, it receives an unique retrival key which is its sequence number. Sequence numbering starts with the value of "1" for the first row in the table. Some database table operations require specification of the sequence number (such as dbgetrow) or return a list of sequence numbers (such as dbstringsearch). Retrieving random rows from a table is accomplished by "dbgetrow". Sequentially retrieving the rows of a table is accomplished by "dbfirst", "dbprev", "dbnext", and "dblast". The sequence number for the current row in a table may be retrieved from the global database array with: $dbvar($tablename,curseqno) The last defined sequence number for a table may be obtained by: $dbvar($tablename,lastseqno) Fields: ======= A field represents a data element found in every row of a table. A field name is a string without any embedded whitespace, and should be mnemonic of the data values that it represents. Retrieving and setting the data value for a field in the current row is a straight forward Tcl variable setting or reference: set value $dbvar($tablename,$fieldname) set dbvar($tablename,$fieldname) $newvalue Tclvsdb automatically creates two reserved fields per table: "seqno" and "DUMMY". "seqno" contains the sequence number of the row and should NEVER be modified by the user's application. "DUMMY" contains some blank characters for padding the row in the table data file. Minor expansion of the row is allowed to take place without relocating the row to the end of the file by having tclvsdb eliminate an equivalent number of padding characters as those added to the row by other fields. When a row shrinks in size, DUMMY grows proportionately. The user should never touch the contents of the "DUMMY" field. Data Values: ============ A data value for a field may be any valid Tcl scalar string or list (anything you can assign to a variable via the "set" command). If the field's data value contains newlines, these are mapped to the character string "" when stored in the actual table data file, but are restored to newlines upon retrieval from the table data file. Trailing newlines for a data value are truncated. Empty strings and empty lists are valid values for a field. Example of Hierachical Table Usage: =================================== Using the earlier example, this short and ugly little bit of code will walk through the rows of the customer table and print out all of the phone numbers that the customer has: source $env(TCL_LIBRARY)/tclvsdb.tcl dbopen $custdbpath db for {dbfirst db customers} \ {$db(customers,curseqno) <= $db(customers,lastseqno)} \ {dbnext db customers} { puts " " puts "Customer: $db(customers,Name)" foreach phoneseqno $db(customers,PhoneList) { dbgetrow db phones $phoneseqno puts " $db(phones,PhoneType): $db(phones,PhoneNumber)" } } dbclose db The results: Customer: Bill Smith Home: 303-555-0123 Work: 303-555-2543 Fax: 303-555-8876 Customer: Jim Doe Car Phone: 303-555-6666 Customer: Mary Jones Work: 303-555-3345 ext. 4436 After Hours: 303-555-9876 Reference for Procedures in tclvsdb: ==================================== tstodate ts convert numeric value in "ts" to a date in the form of "Mmm dd, yyyy" datetots date convert a date string (form = "Mmm dd, yyyy") into a sortable numeric value. todaydate return today's date in the form of "Mmm dd, yyyy" dbcreate path var tablename fieldnames define, create, and open a new database table. path = directory path to the database var = global array associated with the database (does not need to exist prior to first table definition). tablename = name of the table (no whitespace allowed) fieldnames = list of names for the fields in the table (no field name may have embedded whitespace) dbopen path var open an existing database at the specified directory, associate the database state and rows to the named global array variable. "var" does not need to exist prior to "dbopen" call. dbcleanup var tpath resequence and compress out voids in a database (database must be open, but have no other users prior to issuing this command). dbclose var flush and close all database tables, remove "var" and free up the resources it consumed. dbgetrow var tablename seqno get the row associated with the provided sequence number for the table; make it the current row of the table. dbputrow var tablename store the current row for the table to file. dbclearrow var tablename clear the data elements stored in the fields in the current row in the table (does not update table data file). dbnewrow var tablename creates a new row in the table with all fields empty, makes it the current row. dbdelrow var tablename delete the current row in the table dbsearchstring var tablename fieldname searchstring return a list of sequence numbers of rows who have a string in the given field that contains the searchstring (case-insensitive) as a substring. dbsearchdate var tablename fieldname date1 date2 return a list of sequence numbers of rows who have a date string (form: "Mmm dd, yyyy") in the indicated field that matches or falls between the two date arguments (date1 <= date2). dbsearchnum var tablename fieldname num1 num2 return a list of sequence numbers of rows who have a numeric value in the indicated field that matches or falls between the two numeric value arguemnts (num1 <= num2). dbfirst var tablename set the first row for a table as the current row. dblast var tablename set the last row for a table as the current row. dbnext var tablename set next row in the table (from the current row) as the next. row. dbprev var tablename set the previous row in the table (relative to the current row) as the current row. dbreglockhitproc var proc register the name of a user-defined procedure to call if a persistant lock on a row is encountered. dbregmodnowriteproc var proc register the name of a user-defined procedure to call if the current row has been modified and not written to the database and a request has been made move on to a different row. dbmarkrowdirty var element operation marks the current row as modified (set via trace on the fields of the current row). (INTERNAL USE ONLY) dbtracerowon var tablename activate modification trace on the fields of the current row. (INTERNAL USE ONLY) dbtracerowoff var tablename deactivate modification trace on the fields of the current row. (INTERNAL USE ONLY) dbsetrowlock var tablename seqno lock the row associated with the sequence number in the indicated table. (INTERNAL USE ONLY) dbfreerowlock var tablename seqno rowpos free the lock on the row associated with the sequence number in the indicated table. ("rowpos" is the offset in the index file, provided for a little efficiency). (INTERNAL USE ONLY) dbsettablelock var tablename lock the entire table. (INTERNAL USE ONLY) dbfreetablelock var tablename free the lock for the entire table. (INTERNAL USE ONLY) Database Global Array Element Definitions: ========================================== $var global array providing state information and current rows for all tables in the database. ${var}(lockhitproc) user defined procedure name to call when a persistant locked row is encountered. ${var}(modnowriteproc) user defined procedure name to call when the current row has been modified but the row was not written prior to moving on to a different row. ${var}(path) directory path to the database files. ${var}($tablename,DUMMY) special reserved field for row padding allowing in-place minor row modification (has no user significance). ${var}($tablename,currowlen) number of characters in the current row image after formatting, as read from the table file (has no user significance). ${var}($tablename,currowpos) table file offset to the current row's storage location (has no user significance). ${var}($tablename,curseqno) sequence number of the current row in the table. ${var}($tablename,dascount) count of number of date search full table scans made on the table. ${var}($tablename,dasmax) maximum number of rows matching date range search criteria for table. ${var}($tablename,delcount) count of the number of rows deleted in the table. ${var}($tablename,$fieldname) data for this field in this row in the table. ${var}($tablename,fieldnames) list of field names for this table. ${var}($tablename,getcount) count of the number of rows read in this table. ${var}($tablename,indexfilepath) path to this table's index file ${var}($tablename,indexhandle) file handle for this table's index file ${var}($tablename,lastseqno) sequence number of the last row in this table. ${var}($tablename,lckcount) count of the number of times a lock was encountered for I/O operations against this table. ${var}($tablename,modcount) number of rows written to this table. ${var}($tablename,newcount) number of new rows created in this table. ${var}($tablename,nextidxpos) file offset to the next available index record for the index file for this table. (has no user significance). ${var}($tablename,nexttblpos) file offset to the next available table record for the table file for this table. (has no user significance). ${var}($tablename,nuscount) count of the number of numeric search full table scans made against this table. ${var}($tablename,nusmax) maximum number of rows falling within the numeric search range for numeric searches made against this table. ${var}($tablename,rowdirty) flag to indicate that the current row has been modified (has no user significance). ${var}($tablename,seqno) reserved field for each row in the table containing the row's sequence number. ${var}($tablename,stscount) count of the number of string pattern search full table scans made against this table. ${var}($tablename,stsmax) maximum number of rows meeting glob-style pattern string search criteria for this table. ${var}($tablename,tablefilepath) path to this table's data file. ${var}($tablename,tablehandle) file handle for this table's data file. ${var}($tablename,tablelock) flag to indicate that this table has a full table lock. ${var}(tablenames) list of table names in this database. Author and Bug Reporting: ======================== Steve Wahl Cirque Labs, Inc. P. O. Box 1774 268 W. Third Street, Suite 1 Nederland, CO 80466 (303) 258-0100 Send bug reports (and code snipets to replicate problems) to: steven@cirque.com