Use SQL*Loader rather than an external table in the following situations:
• You have to load data over a network, i.e. when the input file is not on the database server itself. Because the access driver is part of the Oracle software, only files accessible to the database server can be accessed as external tables.
• Multiple users must concurrently work with the same external table while processing different input files.
• You have to work with LOB types; external tables do not support LOBs.
Oracle external tables help you import external flat-file data such as comma-separated values (CSV format). We briefly discuss the concept of Oracle external tables and then define a sample external table of type ORACLE_LOADER connected to a flat CSV file.
Concept of External Tables
Since Oracle 9i, external tables provide a comfortable means of directly accessing external source data with SQL, as if it were a table in your database. With external tables you can load data into your database, e.g. as part of an ETL process.
External tables are an alternative to SQL*Loader, a CLI-based utility provided by Oracle that is more powerful, but also more complicated. Typically you will use SQL*Loader to migrate a full-scale database into your Oracle database server. By contrast, external tables are the means to connect specific external data channels to your data model. Often this is done periodically as part of a staging process.
See the Oracle documentation for a comparative list of differences between external tables and SQL*Loader.
The concept of external tables lets you provide an external file in the file system of the database server, define the flat file's row structure, and then access this data directly with SQL like any other table. Compared to querying a relational database table, such access is slow: every access to an external table performs a 'full table scan' through the entire file. Thus you should normally access an external table only once, to insert the external data into a relational table inside your database tablespace.
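The typical pattern is therefore a single load statement from the external table into a regular table. A minimal sketch, assuming a staging table emp_staging with a matching column list (the table name is a hypothetical example; ext_csv_emp is defined later in this article):

```sql
-- Copy the external data once into a regular table inside the database,
-- then work only with that table. emp_staging is an assumed target.
INSERT INTO emp_staging
SELECT * FROM ext_csv_emp;
COMMIT;
```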
The external table is defined with a specific DDL command of the form CREATE TABLE … ORGANIZATION EXTERNAL. In the remainder of this article we go through the standard case of importing a flat file with comma-separated values, a so-called CSV file.
External table access is bound to a file access driver, of which there are two types. The driver ORACLE_LOADER only supports loading flat-file data into the relational database. If you want to write data out through an external table, you need the ORACLE_DATAPUMP driver. Note that even though external tables and SQL*Loader are different instruments, both may use the same driver underneath.
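The write direction can be sketched as follows, assuming an existing directory object TMP_LPNO_DIR and a regular source table emp (both names are assumptions for illustration):

```sql
-- Unload data into a binary dump file using the ORACLE_DATAPUMP driver.
-- The resulting file can later be attached as an external table elsewhere.
CREATE TABLE ext_dp_emp
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_DATAPUMP
  DEFAULT DIRECTORY TMP_LPNO_DIR
  LOCATION ('ext_dp_emp.dmp')
)
AS SELECT * FROM emp;
```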
Creating an External Table
This DDL creates an external table ext_csv_emp that accesses the flat-file data with driver type ORACLE_LOADER.
CREATE TABLE ext_csv_emp
(
employee_number NUMBER(8,0),
employee_job VARCHAR2(64 BYTE),
employee_last_name VARCHAR2 (40),
employee_first_name VARCHAR2 (40),
employee_middle_name VARCHAR2 (40),
employee_hire_date DATE )
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY "TMP_LPNO_DIR"
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
badfile TMP_LPNO_DIR: 'ext_csv_emp.bad'
logfile TMP_LPNO_DIR: 'ext_csv_emp.log'
skip 1
FIELDS TERMINATED BY ';' MISSING FIELD VALUES ARE NULL
( employee_number CHAR (8),
employee_job CHAR (20),
employee_last_name CHAR (18),
employee_first_name CHAR (11),
employee_middle_name CHAR (11),
employee_hire_date CHAR (10) date_format DATE mask "dd.mm.yyyy"
)
)
LOCATION ( 'ext_csv_emp.csv' )
);
Such an external table appears in SQL Developer like an ordinary table.
Syntactically speaking, an external table is a relational table with specific physical properties defined by the keyword ORGANIZATION EXTERNAL followed by an external table clause.
We choose the file access driver with the type parameter ORACLE_LOADER, then we define the external data properties. First we specify the server path by referring to an Oracle directory object that we defined beforehand. Next we define the access parameters. CSV data rows are presented line by line, so records are delimited by the NEWLINE character.
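The directory object must exist before the DDL above is executed. A sketch of that preparation step (the server path and the grantee scott are assumptions; adapt them to your installation):

```sql
-- Create the directory object pointing at a server-side path and
-- allow the loading schema to read and write in it.
CREATE OR REPLACE DIRECTORY TMP_LPNO_DIR AS '/u01/app/oracle/ext_data';
GRANT READ, WRITE ON DIRECTORY TMP_LPNO_DIR TO scott;
```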
The badfile contains all data rows that do not conform to the interface specification given in these access parameters. Bad rows are appended to the badfile each time the external table is accessed with a SQL SELECT. Watch the size of this file and your free disk space on the database server's file system when working with large CSV data sources during setup of a new data feed process.
The logfile logs each access to the external table, i.e. every SELECT on the database. Watch the size of this file as well. As the file may be locked by the database, you may be obliged to delete it while the database is shut down.
In Switzerland the default field separator is a semicolon; enter a comma or whatever separator you want to use. Often it is wise to use the local country defaults. In any case, watch out for the escape-character problem: the field separator should not be part of the field data. If this cannot always be avoided, specify that fields may optionally be enclosed in double quotes, which allows the field separator to be part of the field data.
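The quoting variant looks like this inside the access parameters (a fragment replacing the FIELDS line of the DDL above):

```sql
-- A field such as "Miller; Consulting" may now contain the separator;
-- the semicolon inside the double quotes is treated as data.
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"' MISSING FIELD VALUES ARE NULL
```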
The last part of the access parameters is the field list. These type definitions are opaque to SQL; they are not evaluated when SQL processes this DDL. Type checking of the FIELDS list is deferred to runtime, when it is done by the driver specified as the file access type. As we have chosen the file access driver type ORACLE_LOADER, we have to use SQL*Loader field types here. Watch out: SQL*Loader field types have names similar to SQL data types, but they are not the same. Don't confuse and mix them in your external table definitions.
Sample Input File
The sample input file ext_csv_emp.csv in the path referenced by TMP_LPNO_DIR may look like this:
EMPLOYEE_NUMBER;EMPLOYEE_JOB;LAST_NAME;FIRST_NAME;MIDDLE_NAME;HIRE_DATE
7412;Boss;Bedretti;June;Peter;01.09.2001
7413;Clerk;Moser;Mike;;01.09.1992
7415;DB Administrator;Miller;John;Jon;03.09.2008
The first line acts as a header line holding the column names. This is useful when checking the file with text editors, Excel, and the like. As we defined it with skip 1, this first line is skipped. Of course it is your responsibility to assert that this header line is present; otherwise the first data line would be skipped instead.
The external table can now be queried like this:
SELECT
EMPLOYEE_NUMBER, EMPLOYEE_JOB,
EMPLOYEE_LAST_NAME, EMPLOYEE_FIRST_NAME,
EMPLOYEE_MIDDLE_NAME, EMPLOYEE_HIRE_DATE
FROM EXT_CSV_EMP ;
resulting in the three employee rows from the sample file above.
Choosing Types in the Access Parameter Definitions
If you use types other than CHAR in the FIELDS clause, such as DATE or INTEGER, you may spend a lot of time on strange error messages in the log file that are difficult to interpret and impossible to debug or trace. Therefore I prefer to use CHAR in the access parameter FIELDS. Define it wide enough to hold any possible row value, and try to expect the unexpected. In the sample above I rely on the implicit type conversion of SQL.
If an erroneous DDL for an external table is accepted by SQL, you might get confused: because field type checking is deferred from compile time to runtime, you may get unexpected and possibly hard-to-interpret messages only when you try to access the external table.
If your input is more unstable, or if you want to make your data pipeline more bulletproof, define the external table like this and defer any type checking and conversion to a later step controlled by your own procedures:
CREATE TABLE ext_csv_emp2
(
employee_number VARCHAR2 (8),
employee_job VARCHAR2(64 BYTE),
employee_last_name VARCHAR2 (40),
employee_first_name VARCHAR2 (40),
employee_middle_name VARCHAR2 (40),
employee_hire_date VARCHAR2 (12))
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY "TMP_LPNO_DIR"
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
badfile TMP_LPNO_DIR: 'ext_csv_emp.bad'
logfile TMP_LPNO_DIR: 'ext_csv_emp.log'
skip 1
FIELDS TERMINATED BY ';' MISSING FIELD VALUES ARE NULL
( employee_number CHAR (8),
employee_job CHAR (64),
employee_last_name CHAR (40),
employee_first_name CHAR (40),
employee_middle_name CHAR (40),
employee_hire_date CHAR (12)
)
)
LOCATION ( 'ext_csv_emp.csv' )
);
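The deferred conversion step can then be sketched as an INSERT with explicit conversions. The target table emp and its column list are hypothetical; the date mask matches the sample file above:

```sql
-- Load step with explicit, controllable type conversion.
-- Rows that fail here can be handled by your own error logic
-- instead of being silently rejected by the driver.
INSERT INTO emp (employee_number, employee_job, last_name,
                 first_name, middle_name, hire_date)
SELECT TO_NUMBER(employee_number),
       employee_job,
       employee_last_name,
       employee_first_name,
       employee_middle_name,
       TO_DATE(employee_hire_date, 'DD.MM.YYYY')
FROM   ext_csv_emp2;
```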
Possible Errors
When setting up an external table, the possibilities for errors are manifold; most of them appear at runtime and thus are painful.
Runtime Errors While the Driver Parses the Access Parameter Clause
As said, the access parameter clause is only parsed at runtime; no checks are done at compile time.
When you get something like this:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-00554: error encountered while parsing access parameters
KUP-01005: syntax error: found "optionally": expecting one of: "column, (, reject"
KUP-01007: at line 6 column 8
29913. 00000 - "error in executing %s callout"
*Cause: The execution of the specified callout caused an error.
*Action: Examine the error messages take appropriate action.
recheck your DDL. This type of error message comes from parsing a syntactically wrong access parameter clause. Note that this message generator's line numbering starts at the beginning of the access parameter clause, not at the absolute line number of your DDL statement in the editor.
Keeping these implicit rules in mind may help:
• Order and naming of the SQL table columns and the access parameter fields must match.
• The syntax of all access parameters passes unchecked at compile time. Study the syntax diagrams carefully.
• Comment lines are accepted at compile time but rejected by the driver at runtime.
Reject Limit Reached
When you get this error, your DDL and access parameters seem to be well defined. Parsing of the CSV input data has started, but rows that do not match the field definitions are rejected. After the reject limit is reached, the external data access is aborted.
With ORA-30653 it is time to check your log file. You will find further information like:
LOG file opened at 09/03/12 13:53:50
Field Definitions for table EXT_CSV_EMP
Record format DELIMITED BY NEWLINE
Data in file has same endianness as the platform
Rows with all null fields are accepted
Fields in Data Source:
EMPLOYEE_NUMBER CHAR (2)
Terminated by ";"
Trim whitespace same as SQL Loader
EMPLOYEE_JOB CHAR (20)
Terminated by ";"
Trim whitespace same as SQL Loader
EMPLOYEE_LAST_NAME CHAR (18)
Terminated by ";"
Trim whitespace same as SQL Loader
EMPLOYEE_FIRST_NAME CHAR (11)
Terminated by ";"
Trim whitespace same as SQL Loader
EMPLOYEE_MIDDLE_NAME CHAR (11)
Terminated by ";"
Trim whitespace same as SQL Loader
EMPLOYEE_HIRE_DATE CHAR (12)
Terminated by ";"
Trim whitespace same as SQL Loader
KUP-04021: field formatting error for field EMPLOYEE_NUMBER
KUP-04026: field too long for datatype
KUP-04101: record 1 rejected in file C:\oraclexe\app\oracle\product\11.2.0\server\TMP_LPNO_DIR\ext_csv_emp.csv
KUP-04021: field formatting error for field EMPLOYEE_NUMBER
KUP-04026: field too long for datatype
KUP-04101: record 2 rejected in file C:\oraclexe\app\oracle\product\11.2.0\server\TMP_LPNO_DIR\ext_csv_emp.csv
KUP-04021: field formatting error for field EMPLOYEE_NUMBER
KUP-04026: field too long for datatype
KUP-04101: record 3 rejected in file C:\oraclexe\app\oracle\product\11.2.0\server\TMP_LPNO_DIR\ext_csv_emp.csv
Increase your fields' character widths, or make sure the input columns in the CSV are truncated to their maximum width before being delivered.
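Alternatively, if a few bad rows are acceptable, you can raise the reject limit so that they end up in the badfile without aborting the whole load. A sketch using the table from this article:

```sql
-- Do not abort the access on rejected rows;
-- rejected rows are still written to the badfile for inspection.
ALTER TABLE ext_csv_emp REJECT LIMIT UNLIMITED;
```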
Prerequisite Directory Missing
The referenced directory object must be defined and must point to a server path accessible to the database server. Otherwise you get:
SQL-ERROR: ORA-06564: object TMP_LPNO_DIR2 does not exist
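You can verify that the directory object exists, and where it points, with a data dictionary query:

```sql
-- List the directory objects visible to the current user
SELECT directory_name, directory_path
FROM   all_directories
WHERE  directory_name = 'TMP_LPNO_DIR';
```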