Last updated September 27, 2004
    use DBI;
    my $dbh = DBI->connect("dbi:Teradata:some.host.com", "user", "passwd")
        or die "Cannot connect\n";
    # more DBI calls...
'Lite' version of the Perl DBI driver for Teradata
(See DBI for details).
BE ADVISED: This is a 'Lite' version of the driver, provided with
limited functionality and maintenance, and subject to change at the whim of
the author. While every effort has been made to assure conformance to
DBI-1.13, some peculiarities of Teradata may not be 100% compatible.
In addition, some niceties of Teradata not found in DBI have been included
to make this library useful for common Teradata tasks (MACRO and multistatement
request support, support for summary rows, etc.). Look
here
for information regarding the fully functional and maintained version.
This driver is 100% pure Perl, and requires no CLI, ODBC, or other interface libraries, other than
NOTE: Due to numerous functional issues, PM/API, FASTLOAD, and EXPORT connection support has been removed from this "lite" version. The commercial version available at www.presicient.com retains that functionality, as well as supporting:
Release 1.20
Release 1.13:
The dsn string passed to DBI->connect()
must be of the following form:
dbi:Teradata:host[:port]
where
Note that this driver will NOT perform the random selection algorithm for resolving the hostname (via the "hostnameCOPN" convention) when multiple paths are available. You should either use the full explicit hostname (e.g., "DBCCOP1"), the numeric IP address (e.g., "1.2.3.4"), create an alias for the explicit hostname, or, better still, use bind(3) the way God intended.
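As an illustration of the dsn form described above, the following sketch splits a dsn string into its host and optional port. The helper name parse_tdat_dsn is hypothetical and is not part of DBD::Teradata; the driver's own parsing may differ, and the port number shown is only an example value.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBD::Teradata): split a dsn of the
# form dbi:Teradata:host[:port] into its host and optional port.
sub parse_tdat_dsn {
    my ($dsn) = @_;
    my ($host, $port) = $dsn =~ /\Adbi:Teradata:([^:]+)(?::(\d+))?\z/i
        or return;                              # not a Teradata dsn
    return { host => $host, port => $port };    # port is undef when omitted
}

my $parsed = parse_tdat_dsn('dbi:Teradata:some.host.com:1025');
print "host=$parsed->{host} port=$parsed->{port}\n";
```

Because no hostname resolution is attempted here, the host part is returned exactly as written, whether it is an explicit name such as "DBCCOP1" or a numeric IP address.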
Multiple connections (aka sessions) to a Teradata database are supported.
This Lite version of the driver supports only Teradata mode (i.e., *not* ANSI mode). That means that DDL statements and multistatement requests implicitly finish a transaction, and AUTOCOMMIT is the default. We don't flag any non-ANSI SQL, either. See the Teradata SQL documents for all the differences between ANSI and Teradata behavior. The commercial version does support ANSI mode.
RunStartup execution is not supported.
Cursor SQL syntax (i.e., ...WHERE CURRENT OF...) is not supported in this Lite version, but is supported in the commercial version.
Session reconnection is not supported.
Teradata account strings can be provided by simply appending a
single comma, followed by the single-quoted account string,
to the password string, e.g.,
    use DBI;
    my $dbh = DBI->connect("dbi:Teradata:some.host.com", "user", "passwd,'\$H&Lmyaccount'")
        or die "Cannot connect\n";
    # more DBI calls...
HELP statements are not supported, due to an apparent bug in the data returned by the DBMS in RECORD mode.
The following list maps DBI defined data types to their Teradata equivalent (if applicable):
DBI Data Type | Teradata Data Type |
---|---|
SQL_CHAR | CHAR |
SQL_NUMERIC | DECIMAL |
SQL_DECIMAL | DECIMAL |
SQL_INTEGER | INTEGER |
SQL_SMALLINT | SMALLINT |
SQL_FLOAT | FLOAT |
SQL_REAL | FLOAT |
SQL_DOUBLE | FLOAT |
SQL_VARCHAR | VARCHAR |
SQL_DATE | DATE |
SQL_TIME | TIME |
SQL_TIMESTAMP | TIMESTAMP |
SQL_LONGVARCHAR | LONG VARCHAR |
SQL_BINARY | BYTE |
SQL_VARBINARY | VARBYTE |
SQL_LONGVARBINARY | LONG VARBYTE |
SQL_BIGINT | N/A |
SQL_TINYINT | BYTEINT |
SQL_WCHAR | N/A |
SQL_WVARCHAR | N/A |
SQL_WLONGVARCHAR | N/A |
SQL_BIT | N/A |
NOTE: INTERVAL types are not yet supported.
CAUTION: Floating point and DECIMAL support within the Teradata DBMS requires specific knowledge of the client platform. Platforms this driver currently knows about are:
This driver attempts to auto-detect the platform on which it is running as follows:
Character Set | Integer format | Assumed Platform |
---|---|---|
ASCII | not network-order (MSB last) | Intel |
ASCII | network-order (MSB first) | SPARC/MOTOROLA |
Autodetection can be overridden by setting the environment variable TDAT_PLATFORM_CODE to the appropriate platform code:
Platform | TDAT_PLATFORM_CODE Value |
---|---|
Any Intel X86/Pentium | 8 |
Sun SPARC, Motorola 68XXX, ATT 3b2 | 7 |
VAX | 9 |
IBM 370/390 | 3 |
AMDAHL UTS | 10 |
Honeywell | 4 |
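The byte-order half of the auto-detection described above can be sketched as follows. This is an illustration only: the platform codes come from the table above, but the detection logic and the function name guess_platform_code are assumptions, not the driver's actual source, and the character set check is omitted.

```perl
use strict;
use warnings;

# Illustrative only: detection logic is an assumption, not driver code.
sub guess_platform_code {
    # An explicit TDAT_PLATFORM_CODE override always wins.
    return $ENV{TDAT_PLATFORM_CODE}
        if defined($ENV{TDAT_PLATFORM_CODE}) && length($ENV{TDAT_PLATFORM_CODE});

    # Pack the 16-bit value 1 in native byte order and inspect the first
    # byte: 1 means the least significant byte comes first (MSB last).
    my $msb_last = unpack('C', pack('S', 1)) == 1;
    return $msb_last ? 8 : 7;    # 8 = Intel, 7 = SPARC/Motorola
}

print "platform code: ", guess_platform_code(), "\n";
```

A sketch like this cannot distinguish VAX, mainframe, or other platforms in the table, which is one reason an explicit override is useful.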
If your platform's floating point (specifically, double-precision) format does not match that of one of those platforms, AVOID FLOATING POINT VALUES! Substituting DECIMAL or CHARACTER values will often work well, though loss of precision may be a problem.
For some platforms (notably VAXen and mainframes), Teradata returns DECIMALs as binary-coded decimal (BCD) values. For most workstation platforms (Intel and most RISC), DECIMAL is returned as a simple fixed-point integer. Currently, DBD::Teradata only converts the latter format. BCD encoded values may eventually be supported if sufficient hue and cry is raised...but it's currently not a priority.
Finally, note that double-byte character sets (i.e., UNICODE, Kanji, etc.) are not supported in this Lite version, but are supported (via UTF8) in the commercial version.
This driver supports both USING clauses and placeholders to implement parameterized SQL; however, they cannot be mixed in the same request. Also, when using placeholders, all parameter datatypes are assumed to be VARCHAR(16) unless explicit datatypes are specified via a bind_param() call, or the environment variable TDAT_PH_SIZE has been defined to another value. However, the driver will adjust the parameter type declaration provided to the DBMS upon execute() so that parameters without explicit type specifications which exceed the VARCHAR(16) size will be presented to the DBMS as VARCHAR(<actual-param-length>).
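The sizing rule for untyped placeholders can be sketched as below. The helper placeholder_type is hypothetical, and treating TDAT_PH_SIZE as a replacement for the default size of 16 is an assumption based on the description above, not a statement about the driver's internals.

```perl
use strict;
use warnings;

# Hypothetical helper: the VARCHAR declaration for a placeholder that has
# no explicit type from bind_param().
sub placeholder_type {
    my ($value) = @_;
    my $default = $ENV{TDAT_PH_SIZE} || 16;           # assumed meaning of TDAT_PH_SIZE
    my $len     = defined($value) ? length($value) : 0;
    my $size    = $len > $default ? $len : $default;  # grow to fit long values
    return "VARCHAR($size)";
}

print placeholder_type('short'), "\n";
print placeholder_type('x' x 40), "\n";
```

Values that need a non-character type still require an explicit bind_param() call; a rule like this only decides how wide the assumed VARCHAR is.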
Multi-statement and MACRO execution requests are supported. Stored procedure support is available in the commercial version of this driver.
Reporting the results of multi-statement and MACRO requests presents some special problems. Refer to the DRIVER SPECIFIC ATTRIBUTES section below for detailed descriptions of relevant statement handle attributes. The driver behavior is augmented as follows:
tdat_stmt_info statement handle attribute. (Note that DBI's notion of a statement is equivalent to a Teradata request, which may contain more than 1 SQL statement. For the purposes of this discussion, the Teradata definitions of request and statement will be used; when referring to DBI's definition of a statement, the term "DBI statement" will be used).
tdat_stmt_info returns an arrayref of hashrefs. Each array entry is indexed by its associated statement number within a Teradata request. Please note that the DBMS starts statement numbering with 1, not zero; thus, loop constructs used to scan the statement info array should start their index values at 1. The hashref has several keys, described in the following sections and in the DRIVER-SPECIFIC ATTRIBUTES section.
fetchrow_XXX() statement handle method will always return an empty string result. The activity type, activity count, and warning message of an individual statement can be queried via the ActivityType, ActivityCount and Warning keys in the statement's hashref in the array returned by the tdat_stmt_info attribute.
tdat_stmt_info attributes still apply as for single-SELECT multi-statement or MACRO requests. However, the column (and summary) information for all SELECT statements is included in the NAME, TYPE, PRECISION, SCALE, and NULLABLE DBI statement handle attributes, and each fetched row will include fields for all SELECT statements, but only the fields for the current SELECT statement will be valid. All fields for non-current SELECT statements will be set to undef (not unlike the results of an OUTER JOIN).
In order to identify the SELECT statement that a fetchrow_XXX() call is processing:
tdat_stmt_num attribute can be queried to get the current statement number
StartsAt in the hashref located at the current statement's index in the array returned by the tdat_stmt_info attribute.
EndsAt in the hashref located at the current statement's index in the array returned by the tdat_stmt_info attribute.
An example of processing multi-SELECT requests:
    $sth = $dbh->prepare('SELECT user; SELECT date; SELECT time;');
    $names = $sth->{NAME};
    $types = $sth->{TYPE};
    $precisions = $sth->{PRECISION};
    $scales = $sth->{SCALE};
    $stmt_info = $sth->{'tdat_stmt_info'};
    $sth->execute();
    $currstmt = -1;
    while (@row = $sth->fetchrow_array()) {
        if ($currstmt != $sth->{'tdat_stmt_num'}) {
            #
            # new stmt, get its info and print its column names
            #
            print "\n\n";
            $currstmt = $sth->{'tdat_stmt_num'};
            $stmthash = $$stmt_info[$currstmt];
            $starts_at = $$stmthash{'StartsAt'};
            $ends_at = $$stmthash{'EndsAt'};
            for ($i = $starts_at; $i <= $ends_at; $i++) {
                print "$$names[$i] ";
            }
            print "\n";
        }
        #
        # print only the fields belonging to the current statement
        #
        for ($i = $starts_at; $i <= $ends_at; $i++) {
            print "$row[$i] ";
        }
    }
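The StartsAt/EndsAt slicing used in the example above can be tried without a database connection. The sketch below uses mock data in place of $sth->{NAME} and $sth->{tdat_stmt_info}; the column names, index values, and the helper current_fields are invented for illustration and say nothing about what the DBMS actually returns.

```perl
use strict;
use warnings;

# Mock stand-ins for $sth->{NAME} and $sth->{tdat_stmt_info}
# (values invented for illustration).
my @names = ('User', 'Date', 'Time');
my @stmt_info = (
    undef,                             # index 0 unused: numbering starts at 1
    { StartsAt => 0, EndsAt => 0 },
    { StartsAt => 1, EndsAt => 1 },
    { StartsAt => 2, EndsAt => 2 },
);

# Return [name, value] pairs for the fields of statement $stmt_num in $row.
sub current_fields {
    my ($stmt_num, $row) = @_;
    my $info = $stmt_info[$stmt_num];
    return map { [ $names[$_], $row->[$_] ] } $info->{StartsAt} .. $info->{EndsAt};
}

# A row fetched while statement 2 is current: the other fields are undef.
my @mock_row = (undef, '2004-09-27', undef);
for my $pair (current_fields(2, \@mock_row)) {
    print "$pair->[0] = $pair->[1]\n";
}
```

Slicing by the current statement's range avoids touching the undef fields that belong to the other SELECT statements in the request.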
Like multi-statement and MACRO requests, reporting the results of summarized SELECT requests requires some special processing. Refer to the DRIVER SPECIFIC ATTRIBUTES section below for detailed descriptions of relevant statement handle attributes. The driver behavior is augmented as follows:
fetchrow_XXX() will be set to undef until a summary row is returned by the DBMS.
IsSummary attribute of the current statement hashref (stored at the current statement number index within the arrayref returned by the tdat_stmt_info statement handle attribute) returns the summary row number of the current statement; otherwise, it will be set to undef.
SummaryStarts and SummaryEnds attributes, which return arrays (indexed by summary row number) of starting and ending indexes, respectively, within the DBI attribute and row data arrays for each summary row (You're probably confused at this point, so review the example below).
SummaryPosition attribute, which returns an arrayref of the column numbers associated with each summary field within the current statement. NOTE: SummaryPosition information is not available until after the execute() method has been called and a summary row has been fetched.
SummaryPosStart attribute, which returns an arrayref, indexed by summary row number, of the starting index within the SummaryPosition array for the current summary row. NOTE: SummaryPosStart information is not available until after the execute() method has been called and a summary row has been fetched.
An example of processing summarized SELECT:
    $sth = $dbh->prepare('SELECT Name FROM Employees WITH AVG(Salary), SUM(Salary)');
    $names = $sth->{NAME};
    $types = $sth->{TYPE};
    $precisions = $sth->{PRECISION};
    $scales = $sth->{SCALE};
    $stmt_info = $sth->{'tdat_stmt_info'};
    $sth->execute();
    $currstmt = -1;
    while (@row = $sth->fetchrow_array()) {
        if ($currstmt != $sth->{'tdat_stmt_num'}) {
            #
            # new stmt, get its info
            #
            print "\n\n";
            $currstmt = $sth->{'tdat_stmt_num'};
            $stmthash = $$stmt_info[$currstmt];
            $starts_at = $$stmthash{'StartsAt'};
            $ends_at = $$stmthash{'EndsAt'};
            $sumstarts = $$stmthash{'SummaryStarts'};
            $sumends = $$stmthash{'SummaryEnds'};
            for ($i = $starts_at; $i <= $ends_at; $i++) {
                print "$$names[$i] ";    # print the column names
            }
            print "\n";
        }
        #
        # IsSummary changes from row to row, so check it for every row
        #
        $sumrow = $$stmthash{'IsSummary'};
        if (defined($sumrow)) {
            #
            # got a summary row, space it
            # NOTE: this example uses simple tabs to space summary fields;
            # in practice, a more rigorous method to precisely align summaries with their
            # respective columns would be used
            #
            $sumpos = $$stmthash{'SummaryPosition'};
            $sumposst = $$stmthash{'SummaryPosStart'};
            print "\n-----------------------------------------------------\n";
            for ($i = $$sumstarts[$sumrow], $j = $$sumposst[$sumrow];
                 $i <= $$sumends[$sumrow]; $i++, $j++) {
                print "\t" x $$sumpos[$j];    # tab each column for alignment
                print "$$names[$i]: $row[$i]\n";
            }
        }
        else {
            #
            # regular row, just print the values
            #
            for ($i = $starts_at; $i <= $ends_at; $i++) {
                print "$row[$i] ";
            }
        }
    }
Support for non-SQL connections has been removed from this version due to numerous bugs and functional limitations. The commercial version provides much improved support for these features, as well as adding multiload and remote console support.
Double buffering (i.e., issuing a CONTINUE to the DBMS while the application is still fetching data from the last received set of rowdata) is supported, but not yet thoroughly tested. Use at your own risk by defining an environment variable TDAT_NO2BUFS=0.
DBI does not support the notion of warnings; therefore, the hashref provided by the driver specific statement handle attribute tdat_stmt_info provides a Warning attribute that can be queried to retrieve warning messages.
DBI provides the trace() function to enable various levels of trace information. DBD::Teradata uses this trace level to report its internal operation, as well.
There are some additional attributes that the user can either supply to various DBI calls, or query on database or statement handles:
Write-only on statement handle creation; read-only on statement handle thereafter.
When set to either RecordMode or IndicatorMode in the attributes hash provided to a $dbh->prepare() call, causes the resulting DBI statement handle to either provide the output rowdata, or accept the input parameter data, in Teradata binary import/export format (i.e., <2 byte length><(optional) N bytes of indicators><N bytes of data><newline>). Specifying RecordMode indicates data is provided without the NULL indicator bits; IndicatorMode indicates data is provided with indicator bits.
In raw input format, each row of parameter data should be bound as SQL_VARBINARY type; in raw output format, the row data will be returned as a single SQL_VARBINARY result column. This attribute is intended to provide a faster path for import/export pipelines by avoiding the translation to/from internal Perl datatypes. E.g.,
    use DBI qw(:sql_types);    # for SQL_VARBINARY

    open(FLIMPORT, '<', 'fload.data') || die "Can't open import data file: $!\n";
    binmode FLIMPORT;
    $sth = $dbh->prepare(
        'USING (col1 integer, col2 char(20), col3 float, col4 varchar(100)) ' .
        'INSERT INTO MyTable VALUES(:col1, :col2, :col3, :col4);',
        { 'tdat_raw' => 'IndicatorMode' });
    while (sysread(FLIMPORT, $lenbytes, 2)) {
        $len = unpack('S', $lenbytes);           # 2-byte record length
        sysread(FLIMPORT, $buffer, $len + 1);    # remember the newline!
        $buffer = pack('Sa*', $len, $buffer);    # re-attach the length prefix
        $sth->bind_param(1, $buffer, { TYPE => SQL_VARBINARY, PRECISION => length($buffer) });
        $sth->execute();
    }
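To make the record layout concrete, the sketch below frames and unframes records in the <2 byte length><indicators and data><newline> layout described above. The helpers frame_record and unframe_records are hypothetical. Two assumptions are made, following the import example: the length is stored in native byte order (pack template 'S'), and it does not count the trailing newline. The payload bytes are arbitrary test data, not real Teradata rows.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap an already-encoded payload (indicator bytes
# plus field data) as <2 byte length><payload><newline>.
sub frame_record {
    my ($payload) = @_;
    return pack('S', length($payload)) . $payload . "\n";
}

# Hypothetical helper: split a buffer of framed records into payloads.
sub unframe_records {
    my ($buf) = @_;
    my @payloads;
    my $pos = 0;
    while ($pos + 2 <= length($buf)) {
        my $len = unpack('S', substr($buf, $pos, 2));
        push @payloads, substr($buf, $pos + 2, $len);
        $pos += 2 + $len + 1;    # skip length prefix, payload, and newline
    }
    return @payloads;
}

my $framed = frame_record("\x00abc") . frame_record("\x80");
printf "%d records\n", scalar(my @records = unframe_records($framed));
```

Because the length prefix, not the newline, delimits each record, payloads that happen to contain a newline byte are still split correctly.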
Read-only on statement handle.
Returns the number of the current statement within the request associated with the statement handle. Applies only for the fetchrow_XXX() statement handle method; for requests which do not include SELECT statements, the returned value is the total number of statements executed by the request.
Read-only on statement handle.
Returns an arrayref of hashrefs of Teradata statement information for each Teradata statement
within the request associated with the DBI statement handle. Not valid on EXPORT or PM/PC sessions.
Please note
that the DBMS starts statement numbering with 1, not zero; thus, loop constructs
used to scan the statement info array should start their index values at 1.
The following attributes are included in each statement's hashref:
ActivityType - indicates the type of activity ('Select', 'Insert', 'Update', etc.) of the statement.
ActivityCount - indicates the number of rows affected by the statement.
Warning - indicates any warning message associated with the statement. Returns undef if none.
StartsAt - returns the starting index of a statement's returned column info or data within the DBI column info and data arrays (NAME, PRECISION, etc., as well as the results of fetchrow_XXX()). Each attribute and rowdata array includes entries for all columns of all SELECT statements within a request. In order to isolate the array entries which apply to the statement currently being fetched from, use the result of $sth->{'tdat_stmt_num'} to index into the information and data arrayrefs. See the MULTI-STATEMENT AND MACRO REQUESTS section above for details. For non-SELECT statements, undef is returned.
EndsAt - returns the (inclusive) ending index of a statement's returned column attribute and data within the DBI attribute and row data arrays. This does NOT include any summary column information generated by the statement. For non-SELECT statements, undef is returned.
IsSummary - returns the current summary row number for the statement, if any, or undef if not a summarized SELECT statement, or if the current row is not a summary row. The returned value is used to index into the arrays returned by SummaryStarts and SummaryEnds to locate the field values and attributes for the specified summary row.
SummaryPosition - returns an arrayref of the column numbers associated with the summary fields in each summary row. Set to undef for non-SELECT or non-summarized statements. SummaryPosition information is not available until after the execute() method has been called and a summary row has been fetched.
SummaryPosStart - returns an arrayref, indexed by summary row number, of the starting index within the SummaryPosition array for each summary row. Set to undef for non-SELECT or non-summarized statements. SummaryPosStart information is not available until after the execute() method has been called and a summary row has been fetched.
SummaryStarts - returns an array of starting indexes within the DBI attribute and row data arrays for a statement's summary column info and data. Set to undef for non-SELECT or non-summarized statements. When processing a summarized statement,
an application
tdat_stmt_info
statement handle attribute
IsSummary
attribute of the current statement hashref
SummaryStarts
and SummaryEnds
arrays from the
current statement hashref
IsSummary
attribute) to
get the starting and ending indexes (inclusive) of column attribute and row data
from the SummaryStarts
and SummaryEnds
arrays
SummaryEnds - returns an array of ending indexes within the DBI attribute and row data arrays for a statement's summary column info and data. Set to undef for non-SELECT or non-summarized statements.
    $sth = $dbh->prepare("INSERT INTO table VALUES(?,?,?,?); " .
        "UPDATE table2 SET col1 = 'another value' WHERE col1 = 'some value';");
    $rows = $sth->execute(1, 2, 3, 4);
    $stmtcnt = $sth->{'tdat_stmt_num'};    # no SELECT, so returns number of last stmt
    $stmt_info = $sth->{'tdat_stmt_info'};
    #
    # statement numbering starts at 1, not zero
    #
    for ($i = 1; $i <= $stmtcnt; $i++) {
        $stmthash = $$stmt_info[$i];
        $activity = $$stmthash{'ActivityType'};
        $stmtrows = $$stmthash{'ActivityCount'};
        $warn = $$stmthash{'Warning'};
        if ($warn) {
            print "Statement $i: $warn\n";
        }
        print "$activity at statement $i affected $stmtrows rows.\n";
    }
Read-only on statement handle.
Returns arrayrefs of field title and format information (e.g., from (TITLE...) and (FORMAT ...) clauses on SELECT or table/view specifications), a la the NAME, TYPE, PRECISION, etc. attributes.
$i = $sth->func(@param_list, BindParamArray);
This function has been removed, since it was only usable for Fastload connections. For SQL connections, use the DBI standard execute_array() and bind_param_array() functions.
$i = $sth->func(@param_list, BindColArray);
$param_list[0] is the number of the column to be bound, $param_list[1] is an arrayref
that will receive the column values, and $param_list[2] is the maximum number of rows the
application expects to be returned per fetch().
This function allows a single fetch() operation to return multiple rows of data.
In order to make this driver useful for high-performance ETL applications, support for multiple concurrent sessions is needed. Unfortunately, native DBI doesn't currently support the type of asynchronous session interaction needed to efficiently move data to/from an MPP database system. (Note that the commercial version of this driver supports Perl threads, which reduces the need for this type of asynchronous operation.) To address this need, the following functions have been provided:
$i = $drh->func(@param_list, FirstAvailable);
$param_list[0] is an arrayref of database handles, and $param_list[1] is a timeout specification (in seconds; -1 or undef indicates an infinite wait). Returns the index of the first session within the supplied database handle array that is ready to be serviced. If none of the sessions is ready for service, it waits up to the timeout number of seconds (or forever if timeout is -1 or undef) for a session to become ready. Returns undef if no sessions are ready in the specified timeout.
@ary = $drh->func(@param_list, FirstAvailList);
$param_list[0] is an arrayref of database handles, and $param_list[1] is a timeout specification (in seconds; -1 or undef indicates an infinite wait). Returns an array of indexes of sessions within the supplied database handle array that are ready to be serviced. If none of the sessions is ready for service, it waits up to the timeout number of seconds (or forever if timeout is -1 or undef) for a session to become ready. Returns undef if no sessions are ready in the specified timeout.
NOTE: This function is useful for more evenly distributing the workload across multiple sessions when all sessions respond at nearly the same time. Using FirstAvailable() in that situation tends to favor the first 1 or 2 sessions in the list, thus underusing the remaining sessions.
$i = $sth->func(undef, Realize);
Realizes the results of a non-blocking statement execution. FirstAvailable and FirstAvailList only wait for and report that a session is ready; they do not process the results on the session. Realize performs the actual processing of the database response, including returning the success or failure of the operation, and any returned rows.
NOTE: some platforms (notably Windows 95 and 98) may limit the total number of TCP/IP connections which you can initiate.
An example use of these functions to bulkload a table:
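    my $drh;
    my @dbhlist;
    my @sthlist;
    my $sesscount = 10;
    my ($i, $rowcnt, $record);

    open(IMPORT, '<', $infile) || die "Can't open import file: $!\n";
    binmode IMPORT;

    # read the next raw record: <2 byte length><indicators and data><newline>
    sub next_record {
        my ($lenbytes, $data);
        return undef unless sysread(IMPORT, $lenbytes, 2);
        sysread(IMPORT, $data, unpack('S', $lenbytes) + 1);
        return $lenbytes . $data;
    }

    for ($i = 0; $i < $sesscount; $i++) {
        $dbhlist[$i] = DBI->connect("dbi:Teradata:dbc", "dbc", "dbc")
            or die "Cannot connect\n";
        $drh = $dbhlist[$i]->{Driver} unless defined($drh);
    }
    my @fa_parms = (\@dbhlist, -1);
    #
    # prime each session with one non-blocking INSERT
    # (USING clauses and ? placeholders cannot be mixed, so use named parameters)
    #
    my $active = 0;
    for ($i = 0; $i < $sesscount; $i++) {
        $sthlist[$i] = $dbhlist[$i]->prepare(
            'USING (col1 INTEGER, col2 CHAR(30), col3 DECIMAL(9,2), col4 DATE) ' .
            'INSERT INTO mytable VALUES(:col1, :col2, :col3, :col4)',
            { tdat_nowait => 1, tdat_raw => 'IndicatorMode' });
        if (defined($record = next_record())) {
            $sthlist[$i]->bind_param(1, $record);
            $sthlist[$i]->execute();
            $active++;
        }
    }
    #
    # as each session completes, hand it the next record
    #
    while (defined($record = next_record())) {
        $i = $drh->func(@fa_parms, 'FirstAvailable');
        $rowcnt = $sthlist[$i]->func(undef, 'Realize');
        if (!defined($rowcnt)) {
            print STDERR " ** INSERT failed: " . $sthlist[$i]->errstr() . "\n";
        }
        $sthlist[$i]->bind_param(1, $record);
        $sthlist[$i]->execute();
    }
    #
    # no more input; collect the results of the statements still active
    #
    while ($active > 0) {
        $i = $drh->func(@fa_parms, 'FirstAvailable');
        $rowcnt = $sthlist[$i]->func(undef, 'Realize');
        if (!defined($rowcnt)) {
            print STDERR " ** INSERT failed: " . $sthlist[$i]->errstr() . "\n";
        }
        $sthlist[$i]->finish();
        $active--;
    }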
DBD::Teradata 1.20 requires a minimum Perl version of 5.6.0, and a minimum DBI version of 1.13.
The following DBI functions are not yet supported:
    DBI->data_sources()
    $dbh->prepare_cached()
    $dbh->table_info()
    $dbh->tables()
    $dbh->type_info_all()
    $dbh->type_info()
Also be advised that using either selectall_arrayref() or fetchall_arrayref() is probably a bad idea unless you know the number of rows returned is reasonably small.
This driver has been successfully tested against both Teradata V2R5.0 and V2R5.1, running on Windows 2000. Prior versions have also been tested against V2R3.0 through V2R5.0 on MPRAS.
The following table lists the client platforms (hardware and O/S) for which successful use of this driver has been either tested or reported.
Hardware | OS | Perl Version | DBD::Teradata Version |
---|---|---|---|
Intel/AMD PC | Win98 | 5.005 (ActivePerl) | 1.12 |
Intel PC | WinNT 4.0 SP6 | 5.6 (ActivePerl) | 1.12 |
Intel PC | Linux | 5.?? | 1.00 |
NCR 4XXX (Intel) | MPRAS | 5.005 | 1.10 |
iMac DV (PowerPC) | LinuxPPC | 5.005 | 1.12 |
Sun SPARC | Solaris 4.3 | 5.005 | 1.10 |
IBM RS/6000 | AIX | 5.005 | 1.10 |
HP 9000 | HP-UX 11.0 | 5.6.0 | 1.10 |
Intel PC | FreeBSD 3.4-4.2 | 5.6.0 | 1.10 |
sth->finish() first. The O'Reilly DBI book indicates it isn't essential, but DBD::Teradata currently needs it to clean up and cancel the current request.
undef and zero are interpreted as false, so statements which affect no rows may be interpreted as errors if you use the simple $sth->execute || die $sth->errstr check. Try defined($sth->execute) || die $sth->errstr instead.
PrintError and RaiseError to zero during DBI->connect(), and explicitly checking for errors yourself; otherwise, you may exit unexpectedly or get spurious error messages in the output when you're just doing a "precautionary" DROP on a non-existent database object.
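The difference between the two execute() checks mentioned above can be seen without a database. In this sketch, naive_ok and defined_ok are hypothetical helpers that classify a return value the way each idiom would; the values 5, 0, and undef stand in for "rows affected", "no rows affected", and "error".

```perl
use strict;
use warnings;

# Hypothetical helpers: how each idiom treats a value returned by execute().
sub naive_ok   { return $_[0] ? 1 : 0 }             # execute || die
sub defined_ok { return defined($_[0]) ? 1 : 0 }    # defined(execute) || die

for my $rc (5, 0, undef) {
    my $shown = defined($rc) ? $rc : 'undef';
    printf "%-5s naive=%d defined=%d\n", $shown, naive_ok($rc), defined_ok($rc);
}
```

Only the defined() form separates a successful statement that touched zero rows from a genuine failure.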
Copyright (c) 2000, Dean Arnold, USA
Copyright (c) 2001-2004, Presicient Corp., USA
Permission is granted to use this software according to the terms of the Artistic License, as specified in the Perl README file, with the exception that commercial redistribution, either electronic or via physical media, as either a standalone package, or incorporated into a third party product, requires prior written approval of the author.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Presicient Corp. reserves the right to provide support for this software to individual sites under a separate (possibly fee-based) agreement.
Teradata® is a registered trademark of NCR Corporation.