LevSelector.com
Oracle Bulk Operations
Intro
The topics below are the basics you should know about Oracle utilities. Don't read the articles in their entirety; just skim them to pick up the keywords and look at the examples.
Regarding Perl and Oracle: from Perl I can call a Windows batch file which executes a stored procedure via SQL*Plus. After the procedure executes, I send an email through the MIME::Lite module.
SQL*Loader - To load data into the database:
Export Utility
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96652/ch01.htm#1004671 - Export Utility
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96652/ch01.htm#1006108 - Export Sessions
--------------------------------------------
Import Utility
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96652/ch02.htm#1004685
Using Export and Import to Partition a Database Migration: http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96652/ch02.htm#1006883
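Both are command-line utilities. A minimal sketch of exporting one table to a dump file and importing it back (file and table names are placeholders; IGNORE=Y skips the "table already exists" error):
exp scott/tiger FILE=emp.dmp TABLES=emp
imp scott/tiger FILE=emp.dmp TABLES=emp IGNORE=Y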
===========================================
Partitioning
http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96524/c12parti.htm#429785 - Intro
http://download-uk.oracle.com/docs/cd/B10501_01/server.920/a96520/parpart.htm#98124 - types of partitioning
Range Partitioning Example
CREATE TABLE sales_range (salesman_id NUMBER(5), salesman_name VARCHAR2(30), sales_amount NUMBER(10), sales_date DATE)
PARTITION BY RANGE(sales_date) (
PARTITION sales_jan2000 VALUES LESS THAN(TO_DATE('02/01/2000','MM/DD/YYYY')),
PARTITION sales_feb2000 VALUES LESS THAN(TO_DATE('03/01/2000','MM/DD/YYYY')),
PARTITION sales_mar2000 VALUES LESS THAN(TO_DATE('04/01/2000','MM/DD/YYYY')),
PARTITION sales_apr2000 VALUES LESS THAN(TO_DATE('05/01/2000','MM/DD/YYYY'))
);
Partitions
1. You can't alter an existing (non-partitioned) table to add partitions. The only option is to create a new partitioned table and later transfer the data from the old table to the new one. Partitions come in two kinds, range and hash; you can choose one of them or a composite of both. A range key splits data into partitions by value (often by date); hash distributes data into roughly even buckets. It makes sense to have a separate tablespace for each partition.
Because each partition can logically be treated as a separate table, queries speed up thanks to the pruning of unnecessary partitions.
For instance, a composite-partitioned table:
CREATE TABLE PAYMENT ( Payment_Id NUMBER(10) NOT NULL, Execution_Date DATE NULL, State VARCHAR2(20) NULL )
PARTITION BY RANGE(execution_date)
SUBPARTITION BY HASH(payment_id) SUBPARTITIONS 4
STORE IN (USERS1,USERS2,USERS3,USERS4)
( PARTITION r1 VALUES LESS THAN (TO_DATE('2001/06/20', 'YYYY/MM/DD'))
( SUBPARTITION r1_h1 TABLESPACE USERS1, SUBPARTITION r1_h2 TABLESPACE USERS2),
PARTITION r2 VALUES LESS THAN (TO_DATE('2001/06/21', 'YYYY/MM/DD'))
( SUBPARTITION r2_h3 TABLESPACE USERS3, SUBPARTITION r2_h4 TABLESPACE USERS4) )
2. If you create a partitioned table, Oracle handles internally which record goes to which partition - you don't need to care about it, so simply feed the usual file to SQL*Loader. Later on, you can check how the data were distributed over the partitions with SELECT * FROM payment SUBPARTITION (r2_h4).
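For example, with the PAYMENT table above, a query that filters on the range key only scans the matching partition (partition pruning):
SELECT state, COUNT(*)
FROM payment
WHERE execution_date < TO_DATE('2001/06/20','YYYY/MM/DD')
GROUP BY state;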
More on Bulk Operations in Oracle
Examples and discussions on how to do bulk operations in Oracle can be found here:
http://asktom.oracle.com/pls/asktom/f?p=100:1:564207913312600::NO:RP::
1) Unloading - http://asktom.oracle.com/tkyte/flat/index.html
1.1) Fastest - array_flat, Tom Kyte's (VP at Oracle) Pro*C utility.
Minuses: requires a C compiler and the Pro*C libraries installed for
compilation (Windows/Unix)
$./array_flat userid=scott/tiger 'sqlstmt=select * from emp' arraysize=100
(that's all on one line); it will produce:
Connected to ORACLE as user: scott/tiger
Unloading 'select * from emp'
Array size = 100
EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO
7369,SMITH,CLERK,7902,17-DEC-1980 00:00:00,800,(null),20
7499,ALLEN,SALESMAN,7698,20-FEB-1981 00:00:00,1600,300,30
7521,WARD,SALESMAN,7698,22-FEB-1981 00:00:00,1250,500,30
7566,JONES,MANAGER,7839,02-APR-1981 00:00:00,2975,(null),20
7654,MARTIN,SALESMAN,7698,28-SEP-1981 00:00:00,1250,1400,30
7698,BLAKE,MANAGER,7839,01-MAY-1981 00:00:00,2850,(null),30
7782,CLARK,MANAGER,7839,09-JUN-1981 00:00:00,2450,(null),10
7788,SCOTT,ANALYST,7566,09-DEC-1982 00:00:00,3000,(null),20
7839,KING,PRESIDENT,(null),17-NOV-1981 00:00:00,5000,(null),10
7844,TURNER,SALESMAN,7698,08-SEP-1981 00:00:00,1500,0,30
7876,ADAMS,CLERK,7788,12-JAN-1983 00:00:00,1100,(null),20
7900,JAMES,CLERK,7698,03-DEC-1981 00:00:00,950,(null),30
7902,FORD,ANALYST,7566,03-DEC-1981 00:00:00,3000,(null),20
7934,MILLER,CLERK,7782,23-JAN-1982 00:00:00,1300,(null),10
14 rows extracted
Details - http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:459020243348
1.2) The PL/SQL UTL_FILE package can write a file to a directory on the Oracle
server (or to a directory mapped/mounted on the Oracle server)
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:235814350980
Average speed; requires the directory to be pre-created with proper rights
for the oracle user, and the directory must also be "registered" with the
Oracle database by the DBAs (CREATE DIRECTORY):
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14200/statements_5007.htm
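A minimal UTL_FILE sketch, assuming a directory object DATA_DIR has already been created and granted by the DBAs (directory, file, and table names are placeholders):
DECLARE
  f UTL_FILE.FILE_TYPE;
BEGIN
  f := UTL_FILE.FOPEN('DATA_DIR', 'emp.csv', 'w');
  FOR r IN (SELECT empno, ename FROM emp) LOOP
    UTL_FILE.PUT_LINE(f, r.empno || ',' || r.ename);  -- one CSV line per row
  END LOOP;
  UTL_FILE.FCLOSE(f);
END;
/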
================================================================
1.0) Super-duper unloading in Oracle 10g - unloading into external tables:
http://64.233.169.104/search?q=cache:aYvI4TgAulIJ:www.quest-pipelines.com/newsletter-v6/0705_D.htm+sqlplus+unloader&hl=en&ct=clnk&cd=2&gl=us
Minus: can't write files to the client's directory; they have to be written to
a directory accessible by the Oracle server. Need to have "CREATE
DIRECTORY" executed by the DBAs.
Pluses: unloading can run in parallel, but instead of one file you will get
several segmented, self-contained files:
http://64.233.169.104/search?q=cache:aYvI4TgAulIJ:www.quest-pipelines.com/newsletter-v6/0705_D.htm+sqlplus+unloader+PARALLELISM+FOR+PERFORMANCE&hl=en&ct=clnk&cd=1&gl=us
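A minimal sketch of such an unload, assuming a directory object EXT_DIR already exists (table and file names are placeholders):
CREATE TABLE emp_unload
  ORGANIZATION EXTERNAL
  ( TYPE ORACLE_DATAPUMP
    DEFAULT DIRECTORY ext_dir
    LOCATION ('emp_unload.dmp') )
AS SELECT * FROM emp;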
1.3) sqlplus script (all versions of Oracle)
Not as fast as the other ways, but still fast; the script is readable, easy
to modify, and can be executed on any client computer.
http://asktom.oracle.com/tkyte/flat/unloader.zip
Usage:
$ sqlldr_exp scott/tiger dept
^^^^^^^^^^^^^^ user/password table_name
script looks like this:
--sqlldr_exp1.sql
--settings (e.g. set echo off verify off etc.)
set linesize 9999
-- ^^^^^ potential max size of the row.
-- double quoted CSV
spool c:\temp\dataCSV.txt
-- spool /home/user/data.txt
select '"'column1||'","'||column2||'"' from table where ...
/
spool off
-- ^^^^^ stops spooling into the file (output continues on the screen)
-- fixed length format
spool c:\temp\dataFIXED.txt
set linesize 128
select rpad(text_column1,100)||' '||lpad(text2,50)||
       rpad(to_char(number_field),10)||to_char(sysdate,'YYYYMMDDHH24MISS')
from table
where ...
/
spool off
You can run it from SQL*Plus (you have to log in to Oracle first)
or from the command line:
sqlplus -s user/password @sqlldr_exp1.sql
-s = silent mode; @file runs the script
1.0.1) DBA utilities:
Oracle Data Pump (expdp/impdp). In Oracle 10g it replaces the "old" exp/imp utilities.
Raw table and/or schema and/or user export/import:
http://64.233.169.104/search?q=cache:UWYkRB_78I0J:www.orafaq.com/wiki/Oracle_Datapump+wiki+ORACLE_DATAPUMP&hl=en&ct=clnk&cd=1&gl=us
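A minimal sketch (directory object, dump file, and schema names are placeholders):
expdp scott/tiger DIRECTORY=dpump_dir DUMPFILE=scott.dmp SCHEMAS=scott
impdp scott/tiger DIRECTORY=dpump_dir DUMPFILE=scott.dmp SCHEMAS=scott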
2) Loading for new tables
2.1) Fastest way - external tables (works for Oracle 9i and above).
You create a special table (called external) that holds the path and
instructions on how to parse the flat file.
SQL> connect user_bulkload
Enter password:
Connected.
SQL> create table xtern_empl_rpt
2 ( empl_id varchar2(3),
3 last_name varchar2(50),
4 first_name varchar2(50),
5 ssn varchar2(9),
6 email_addr varchar2(100),
7 years_of_service number(2,0)
8 )
9 organization external
10 ( default directory xtern_data_dir
11 access parameters
12 ( records delimited by newline
13 fields terminated by ','
14 )
15 location ('employee_report.csv')
16 );
Table created.
Then you just select from this table as if it were a regular table.
http://www.orafaq.com/node/848
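A minimal sketch of loading from the external table above into a regular table (the target table name is a placeholder):
CREATE TABLE empl_rpt AS SELECT * FROM xtern_empl_rpt;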
2.2) Loading into existing tables.
SQL*Loader (sqlldr) has two modes: direct=TRUE (direct path; triggers get disabled)
and conventional (regular inserts).
You have to create a *.ctl control file that describes the file format, where to load, etc.
Example 9-1 Setting the Date Format in the SQL*Loader Control File
LOAD DATA
INFILE 'data.dat'
INSERT INTO TABLE emp
FIELDS TERMINATED BY "|"
(
EMPNO NUMBER(4) NOT NULL,
ENAME CHAR(10),
JOB CHAR(9),
MGR NUMBER(4),
HIREDATE DATE 'YYYYMMDD',
SAL NUMBER(7,2),
COMM NUMBER(7,2),
DEPTNO NUMBER(2)
)
The control file can be generated during the data unload by your unload script.
http://64.233.169.104/search?q=cache:YmtuCJqSRhMJ:www.oracleutilities.com/OSUtil/sqlldr.html+sqlldr+direct%3DTRUE+position&hl=en&ct=clnk&cd=4&gl=us
The format can also be fixed-width; then loading is faster.
E.g.
LOAD DATA
INFILE 'table_with_one_million_rows.dat'
INTO TABLE TABLE_WITH_ONE_MILLION_ROWS
( COL1 POSITION(1:4) INTEGER EXTERNAL
, COL2 POSITION(6:9) INTEGER EXTERNAL
, COL3 POSITION(11:46) CHAR
, col4 position(48:83) CHAR
, col5 position(85:120) CHAR
, COL6 POSITION(122:130) DATE "MMDDYYYY"
)
http://64.233.169.104/search?q=cache:od_STSP412EJ:download-uk.oracle.com/docs/cd/B10501_01/server.920/a96652/ch09.htm+sqlldr+direct%3DTRUE+triggers&hl=en&ct=clnk&cd=3&gl=us
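To actually run a load with either control file, sqlldr is invoked from the command line; a minimal sketch (file names are placeholders):
sqlldr scott/tiger control=emp.ctl log=emp.log direct=true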
xxxxxxxxxxxxxxxxxxxxxxxxxx
Subject: Loading data into a partially loaded table.
3) Merge
3.1) Oracle 9i and above. Merging a regular table with an external one (it
doesn't really matter that it is "external"; it is just that you first create an
external table from the flat file).
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1615330671789
Suppose tables e1 and e2 start out with the same data.
Let's assume we accidentally deleted some rows from table e1. Here is
how to recover:
merge into EMP e1
using EXTERNAL_TABLE e2
on ( e2.empno = e1.empno )
when matched then
update set e1.sal = e2.sal
when not matched then
insert ( empno, ename, job, mgr, hiredate, sal, comm, deptno )
values ( e2.empno, e2.ename, e2.job, e2.mgr, e2.hiredate, e2.sal,
e2.comm, e2.deptno
)
/
commit;
1.5) Loading (a fifth way)
Inserting a million rows is faster with the /*+ append */ hint (direct-path insert):
insert /*+ append */ into emp select * from external_table;
3.2) Sqlldr can reject duplicate rows and create a *.bad file for records
not loaded.
It can also merge (as above) or load data from one file into 3
different tables (each row can be split across 3 tables, or different
rows can go to different tables).
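A minimal sketch of a control file that routes rows to different tables based on a record-type column (file name, table names, and column layout are hypothetical):
LOAD DATA
INFILE 'mixed.dat'
INTO TABLE orders WHEN (1:1) = 'O'
  ( rec_type  POSITION(1:1)   CHAR
  , order_id  POSITION(2:10)  INTEGER EXTERNAL )
INTO TABLE order_items WHEN (1:1) = 'I'
  ( rec_type  POSITION(1:1)   CHAR
  , item_id   POSITION(2:10)  INTEGER EXTERNAL )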
4) Loading/unloading from database1 to database2. You can create a database link in database1 to database2 (it's like a client connection to database2; the usual user/password is used; you have to set up the grants/privileges).
Then, if you are connected to db1, you can say: select * from
table@dblink2 (<- name of the link), or insert into table@dblink2
select * from table_db1.
Fast, but not the fastest. Requires DBA setup.
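A minimal sketch of creating and using such a link from database1 (link name, TNS alias, and credentials are placeholders):
CREATE DATABASE LINK dblink2
  CONNECT TO scott IDENTIFIED BY tiger
  USING 'db2_tns_alias';
INSERT INTO emp@dblink2 SELECT * FROM emp;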
==========================================================
Latest in Oracle 11g - Oracle Warehouse Builder (OWB)
http://www.oracle.com/technology/products/warehouse/index.html
I don't know anyone who has used it yet.
1.6) Create a materialized view in database2. It must have a dblink to database1. Then you can schedule an Oracle job to control how often you want to synchronize the table between the databases.
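A minimal sketch, assuming a link dblink1 from database2 back to database1 (names and refresh interval are placeholders):
CREATE MATERIALIZED VIEW emp_mv
  REFRESH COMPLETE
  START WITH SYSDATE NEXT SYSDATE + 1/24   -- refresh every hour
AS SELECT * FROM emp@dblink1;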
2) If the table you are loading has a lot of records, sometimes an update will not be as productive as unloading/deleting the data and loading it again from a file.
3) Partition Exchange
http://64.233.169.104/search?q=cache:XRVqIWSpc7AJ:www.dbazine.com/oracle/or-articles/jlewis17+alter+table+%22exchange+partition%22&hl=en&ct=clnk&cd=3&gl=us
You can load data into a separate staging table and then exchange it with a
partition (the partitioned table and the staging table must have the same
column structure; the staging table itself is not partitioned; and the Oracle
server must have the partitioning feature (option)).
What is Partition Exchange? One of the great features about
partitioning, and most specifically range-based partitioning, is the
ability to load in new data quickly and easily with minimal impact on
the current users. The manuals indicate that the critical command to
achieve this is as follows:
alter table pt_tab1 exchange partition p_9999 with table new_data
    including indexes      -- optional
    without validation     -- optional
;
This command "hacks" into the data dictionary and swaps over the definitions of the named partition and the new_data table, so that the data suddenly exists in the right place in the partitioned table. Moreover, with the inclusion of the two optional extra clauses, index definitions will be swapped and Oracle will not check whether the data actually belongs in the partition so the exchange is very quick. Or so the story goes.
Typically an article to demonstrate this point about partition
exchange will create a table with a handful of partitions and a couple
of hundred rows per partition, quite possibly without indexes, and
almost certainly without any related tables.
Typically, if you've paid for the partitioning option, you will have
very large tables, a few indexes, and things like related tables with
foreign keys defined. So what happens in real life when you do a
partition exchange?
=============================================