LevSelector.com
DB2
intro, links
DB2 traditionally ran on
mainframe computers (since 1983). In the mid-90s it was ported to Unix
and other operating systems, and is now called UDB
- Universal DataBase
- http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/document.d2w/report?fn=db2v7s0frm3toc.htm
- SQL Reference - a very good reference to keep handy
- ftp://ftp.software.ibm.com/ps/products/db2/
- download FTP site
- www-4.ibm.com/software/data/db2/linux/
- download version for linux
- directory.google.com/Top/Computers/Software/Databases/IBM_DB2/
- on Google
- www-4.ibm.com/software/data/db2/udb/
- Official DB2 home page on IBM web site - including downloads, and technical
libraries.
- www-4.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v7pubs.d2w/en_main
- manuals (html & pdf)
- www-3.ibm.com/software/data/db2/os390/v7books.html
- HTML & PDFs books for download
- http://www-4.ibm.com/software/data/db2/library/
- Manuals
- http://www.redbooks.ibm.com/
- Redbooks
- http://www7b.boulder.ibm.com/dmdd/
- db2 developer domain
- http://www-3.ibm.com/services/learning/spotlight/db2/db2fastpath.html?c=IBM_ChannelNewswatch&n=various&t=aff
- Fastpath - self-study, e-learning course on DB2 UDB (download).
- www.webcom.com/raberd/db2info.html - Ron Rabe's DB2 Page.
- www.db2mag.com
- DB2 magazine
- www.esj.com
- Enterprise Systems Journal (see also www.research.ibm.com/journal/)
- www.idug.org
- IDUG: the International DB2 Users Group. IDUG provides information and support to DB2 users and
vendors about DB2 database products, DB2 conferences, other DB2 user groups,
and other DB2 vendors.
- members.aol.com/rexxauthor/sqlbk.htm
- SQL Books - Descriptions and ordering information on books
about SQL for DB2.
- ourworld.compuserve.com/homepages/Graeme_Birchall/HTM_COOK.HTM
- DB2 UDB Cookbook Downloads - Graeme Birchall.
- jupiter.ryci.com/archives/db2-l.html
- Search Archives of DB2-L discussion list.
- www.multimania.com/db2usa/eliendb2.htm
- DB2usa: DB2 for OS/390 - links.
- www.dblinks.net/DB2/
- Links and resources.
- www.gseukdb2.org.uk/
- GSE UK DB2 Working Group
OLAP:
- OLAP
functions - OLAP Functions - from SQL reference
- http://www7b.software.ibm.com/dmdd/library/techarticle/lyle/0110lyle.html
- good review of OLAP functions
- http://www.almaden.ibm.com/cs/people/peterh/b15.pdf
- Peter J. Haas DB2 UDB Advanced Analytics for Business Intelligence
- http://www.almaden.ibm.com/cs/people/peterh/b16.pdf
- Peter J. Haas Speeding up DB2 UDB Using Sampling
- http://www.ibm.com/search?en=utf&v=11&lang=en&cc=us&lv=c&q=db2+olap+function+reference - search IBM site
- http://www.ibm-siebel.com/us/ - IBM - Siebel
books
DB2 books:
- DB2 Fundamentals Certification for Dummies - by Paul C. Zikopoulos, Jennifer Gibbs, Roman B. Melnyk (2001)
- DB2: The Complete Reference - by Roman B. Melnyk, Paul C. Zikopoulos (2001)
- DB2 Developer's Guide, Fourth Edition (2000) - by Craig S. Mullins
- A Complete Guide to DB2 Universal Database - by D. D. Chamberlin (1998)
- DB2 Universal Database v7.1 for UNIX, Linux, Windows and OS/2 Database Administration Certification Guide (4th ed.) - by George Baklarz, Bill Wong, Jonathan Cook (2000)
- DB2 Universal Database in the Solaris Operating Environment - by Tetsuya Shirai et al (1999) |
DB2 Magazine - www.db2mag.com/subscribe/
DB2 resources - www.idug.org/db2/index.html
- www.db2mag.com/resources/
tools
Tools:
You can work from the prompt (Unix or DOS). This is good, because you
can use many convenient shell features - and run commands from a file -
like this: db2 -tvf myfile (a short sketch follows below).
Or you can use DB2's own set of graphical tools.
You can also use DBArtisan
and RapidSQL
from Embarcadero.
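For example, a minimal sketch of a command file run with db2 -tvf (the file name and the query are made up; STAFF is a table in the sample database created by db2sampl):
  -- myfile: each statement ends with a semicolon
  connect to sample;
  select count(*) from staff;
  terminate;
Run it as: db2 -tvf myfile   (-t: statements end with ';', -v: echo each command, -f: read commands from the file)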
Porting from Sybase to DB2
As DB2 (starting with version 7) now has Transact SQL support, you can easily port from
Sybase to DB2.
Read this book: "DB2 UDB V7.1 Porting Guide" - order it from FatBrain.com
or read / download the PDF:
* http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246128.html?Open
Sybase | DB2 UDB |
Sybase server | DB2 UDB Instance |
database | database |
device | table space and containers |
segment | |
login to server | attach to instance (a) |
use database | connect to database |
master database | system catalogs |
tempdb | temporary table spaces |
(a) You can execute the ATTACH command to attach to the DB2 instance on the same workstation or on a remote workstation. To perform instance administrative tasks such as creating a database, updating the database manager configuration, or killing connected database users, you have to attach to the DB2 instance. If you have not executed the ATTACH command, all instance-level commands are executed against the current instance, specified by the DB2INSTANCE environment variable.
The next table shows common admin tasks (most of them can also be invoked
from Sybase Central or, for DB2, from the Control Center, the Command
Center or the Stored Procedure Builder (SPB)).
Task | Sybase | DB2 |
View the server options | sp_configure | GET DBM CFG |
View the database options | | GET DB CFG FOR dbname |
Update server options | sp_configure 'option_name', new_value | UPDATE DBM CFG USING 'config_parameter', new_value |
Update database options | | UPDATE DB CFG FOR dbname USING 'config_parameter', new_value |
Display active servers or instances | showserver (executed from UNIX shell) | db2ilist (executed from UNIX shell) |
Access server or instance | isql -Uuser -Ppswd -Sserver | ATTACH TO instance_name user user_name using pswd |
Access database | use dbname | CONNECT TO dbname user user_name using pswd |
List databases in server or instance | sp_helpdb | LIST DB DIRECTORY |
List devices or files used by the databases | sp_helpdevice | LIST TABLESPACES or LIST TABLESPACE CONTAINERS |
Find space used or available space | sp_helpdb dbname or sp_helpsegment segname or sp_spaceused | LIST TABLESPACE CONTAINERS FOR (tsid) SHOW DETAIL or LIST TABLESPACES SHOW DETAIL |
List database tables | sp_help | LIST TABLES |
List table characteristics | sp_help tablename | DESCRIBE TABLE tablename |
List source for stored procedures | sp_helptext procname | Use 'Get Source' function of the DB2 Stored Procedure Builder |
Administer security | grant revoke sp_helpuser sp_addlogin sp_adduser sp_addalias sp_dropalias sp_dropuser sp_droplogin sp_addgroup sp_helpgroup sp_changegroup sp_password | GRANT REVOKE UPDATE DBM CFG USING SYSADM_GROUP group_name UPDATE DBM CFG USING SYSCTRL_GROUP group_name UPDATE DBM CFG USING SYSMAINT_GROUP group_name All authentication setup is done by an external security mechanism such as the operating system (mkuser, chuser, mkgroup, chgroup, passwd) |
Start a server or instance | startserver -fsrvr_name (from UNIX shell) | db2start (from UNIX shell) |
Backup database | dump database db_name to '/path/file' | BACKUP DATABASE db_name TO /path/file |
Restore database | load database db_name from '/path/file' | RESTORE DATABASE db_name FROM /path/file |
Export a text file from a table | bcp table_name out filename
for example: bcp dbase..tab1 out tab1.dat -S DBSERV1 -U someuser -P somepasswd -c > /dev/null |
EXPORT TO filename OF type SELECT
for example: export to names.del of del select * from names |
Load a text file into a table | bcp table_name in filename
for example: bcp dbase..tab1 in tab1.dat -S DBSERV2 -U someuser -P somepasswd -c > /dev/null |
LOAD FROM filename OF type INSERT INTO tablename
or IMPORT FROM filename OF type INSERT INTO tablename for example: import from names.del of del replace into names |
List connected users | sp_who | LIST APPLICATIONS |
Kill connected users | kill spid_number | FORCE APPLICATION |
Generate DDLs | defncopy | db2look -e |
db2 Getting Help, Simple Commands
add your username to the /etc/group file:
db2fadm1:x:104:
db2iadm1:x:103:db2as,user1,user2,user3
db2asgrp:x:105:db2inst1,user1,user2,user3
The command-line processor consists of 2 processes (the db2 front-end and the db2bp back-end).
Getting help:
db2 list command options
db2 create database test
db2 connect to acct
db2 connect to ACCT user db2inst1 using ibmdb2
db2 "create table test ( col1 varchar(60), col2 int, col3 timestamp)"
db2 "create table test1 ( fname varchar(60), lname varchar(60), age int)"
db2 "insert into test1 values ('Jayson','Couper',44)"
|
update and delete with joins - correlated query:
When you need to update one table based on
another table,
it is called a "correlated update". You have to repeat the same "where" logic twice (see the sketch after this block): update maintab m ...
or, as a variation, use an "in" expression: update maintab m ...
or a delete expression: delete from maintab m ...
Note: if you don't include the 2nd "where" clause, then ALL rows
in the main table will be updated (or deleted).
|
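A minimal sketch of the correlated update and delete, assuming two made-up tables maintab(id, val) and othertab(id, val) - these are not tables from this page:
  -- correlated update: copy val from othertab into the matching rows of maintab
  update maintab m
     set val = (select o.val from othertab o where o.id = m.id)
   where exists (select 1 from othertab o where o.id = m.id);
  -- variation with "in"
  update maintab m
     set val = (select o.val from othertab o where o.id = m.id)
   where m.id in (select id from othertab);
  -- correlated delete
  delete from maintab m
   where exists (select 1 from othertab o where o.id = m.id);
Without the second "where" (the exists / in), rows of maintab that have no match in othertab would be set to NULL.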
How to insert a row - but only if a row with that value doesn't already exist:
insert into mytable (mycolumn)
select '12345' from table (values 1) as dummy where not exists (select 1 from mytable where mycolumn='12345') |
Many examples:
insert into test1 (LNAME,FNAME,AGE) values ('LL','FF',7);
insert into test1 (AGE,LNAME,FNAME) values (8,'LL','FF');
insert into test1 (age,lname,fname) values (9,'LL','FF');
select * from test1
db2 -r file1.txt "select * from test3 order by fname"
db2 "select * from test3 order by fname" > file2.txt
db2 'export to test1.txt of del select * from test1'
Saving DDL for the database: db2look -d <dbname> -e -o ddl.sql
db2 list database directory
db2 "select * from test1"
db2 describe table A_ACCOUNT
vi test.sql
db2 -tvf test.sql   -- run many commands from a file
db2 "create table session.ss ( fname varchar(60),lname varchar(60),
age int)"
db2 list tables for user
-- calling shell commands from inside the interactive db2 prompt using '!', for example:  !ls
|
Some commands for Windows:
db2ic  - Information Center
db2cca - Client Configuration Assistant
db2cmd - DB2 command window |
Import / Export
There are three utilities for bulk load/unload: Export, Import, Load
Supported file types:
DEL - delimited ASCII format - this is what
you mostly use.
ASC - for import (1 line => 1 row, column
selection based on byte position in the line).
IXF - Integrated Exchange Format (contains
table def. and data) - you will probably never use it.
WSF - Work Sheet Format - used with some
old spreadsheets.
EXPORT:
export to names.del of del
select * from names
IMPORT:
insert - add (append) rows of data
insert_update - add rows (or update existing data)
replace - deletes all data from the table - then inserts
new data
replace_create - (only with IXF files) - deletes / re-creates
the table and fill it with data
import from names.del of del insert into names
import from names.del of del replace into names
Here is how to truncate a table:
import from /dev/null of del
replace into mytab
LOAD:
The load utility is faster than the import utility, but the import utility
is logged and checks constraints.
load from names.del of del replace into names
Modifying a delimiter (a hedged example follows below):
export ...... modified by chardel *
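For example, a hedged sketch of switching the column delimiter to '|' with the coldel file-type modifier (table and file names are made up):
  db2 "export to names.txt of del modified by coldel| select * from names"
  db2 "import from names.txt of del modified by coldel| insert into names"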
Backup / Recovery
Recovery is a serious issue - consult with administration guides.
Two simple commands:
BACKUP DATABASE db_name TO /path/file
RESTORE DATABASE db_name FROM /path/file
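A hedged sketch (the database name, path and timestamp are made up); the backup history shows the timestamp needed for a restore:
  db2 backup database test to /backups
  db2 list history backup all for test
  db2 restore database test from /backups taken at 20020101120000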
get_help
db2 ?
db2 ? some_command
db2 ? options
db2 ? help
db2 ? error_code
for example:
> db2 crocodile
DB21034E The command was processed as an SQL statement because it was not a valid Command Line Processor command. During SQL processing it returned:
SQL1024N A database connection does not exist. SQLSTATE=08003
> db2 ? SQL1024N
SQL1024N A database connection does not exist.
Explanation: There is no connection to a database. The command cannot be processed. ...
User Response: ...
sqlcode: -1024 sqlstate: 08003 |
Another way to get help fast is to copy/paste the error string into Google. Google Groups is especially helpful - you usually get the answer on the very first screen.
Download documentation:
- www-4.ibm.com/software/data/db2/udb/
- Official DB2 home page on IBM web site - including downloads, and technical
libraries.
- www-4.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v7pubs.d2w/en_main
- manuals (html & pdf)
- www-3.ibm.com/software/data/db2/os390/v7books.html
- HTML & PDFs books for download
- http://publib.boulder.ibm.com/cgi-bin/bookmgr/download/dsnmch11.pdf
- Messages and Codes (~ 7 MB in size)
- http://publib.boulder.ibm.com/cgi-bin/bookmgr/download/dsnsqh11.pdf
- SQL reference (~ 5 MB in size)
- http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246128.html?Open - "DB2 UDB V7.1 Porting Guide"
SQL procedure
-- This is a CLP script that creates an SQL procedure.
-- To create the SQL procedure using this script, perform the following steps:
--   1. connect to the database
--   2. issue the command "db2 -td@ -vf <script-name>"
--      where <script-name> represents the name of this script
--
-- To call this SQL procedure from the command line, perform the following steps:
--   1. connect to the database
--   2. issue the following command:
--      db2 "CALL create_dept_table ('D11', ?)"
--
-- The sample "dynamic.sqc" demonstrates how to call this SQL procedure using
-- an embedded C client application.
CREATE PROCEDURE create_dept_table
-- continue if sqlstate 42704 ('undefined object name')
SET new_name = 'DEPT_'||deptNumber||'_T';
SET table_name = new_name;
|
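The fragments above come from IBM's create_dept_table sample; here is a hedged reconstruction of what the full procedure roughly looks like (the column list and the exact statements are my assumptions, not the verbatim IBM sample). Run the script with db2 -td@ -vf <script-name>:
  CREATE PROCEDURE create_dept_table
    (IN deptNumber VARCHAR(3), OUT table_name VARCHAR(30))
  LANGUAGE SQL
  BEGIN
    DECLARE stmt     VARCHAR(1000);
    DECLARE new_name VARCHAR(30);
    -- continue if sqlstate 42704 ('undefined object name') - the DROP below may fail
    DECLARE CONTINUE HANDLER FOR SQLSTATE '42704'
      SET stmt = '';
    SET new_name = 'DEPT_'||deptNumber||'_T';
    SET stmt = 'DROP TABLE '||new_name;
    PREPARE s1 FROM stmt;
    EXECUTE s1;
    SET stmt = 'CREATE TABLE '||new_name||
               ' (empno CHAR(6), firstnme VARCHAR(12), lastname VARCHAR(15))';
    PREPARE s2 FROM stmt;
    EXECUTE s2;
    SET table_name = new_name;
  END @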
db2 SQL: value(), values(), with, recursive SQL
comment:
-- comment should be on a separate line outside an SQL statement
select 1 from (values 1) as aa -- this is not a valid db2 comment
|
A way to create simple tests against a dynamic (inline) table:
select 'a' as col1,'b' as col2 from table (values 1) as dummy |
or shorter:
select 'a','b' from (values 1) qq |
In fact, for simple tests you don't even need select and a dummy table.
values ( 'mama', 1) |
values (cast (2.5 as decimal(10,3)) * cast (2.5 as decimal(10,3)) ) |
using cast( ) and values( ), building null values on the fly:
select
'abc' as col1, cast(null as varchar(80)) as col2 from table (values (1,2),(3,4)) as dummy |
A hint on how to make running tests from the prompt easy:
I create 2 aliases:
alias vv='vi test.sql'
alias qq='db2 -tvf test.sql'   (the second alias runs the file; pick any name you like)
Type 'vv' to edit the test.sql file (remember - you should end each SQL
statement with a semicolon ';'), then type 'qq' to run it.
|
simple insert:
insert into inst1.test (A) values ('a'), ('b'), ('c') |
using case:
select
case when 1<2 then 'mama' when 1>2 then 'papa' end as person from table (values 1) as qq |
concatenating:
select 'mama' concat ' papa' from table (values 1) as qq;
select 'mama' || ' papa' from table (values 1) as qq; |
union, intersect & except:
select .. from T1 union select .. from T2          -- removes duplicates
select .. from T1 union all select .. from T2      -- preserves duplicates
select .. from T1 intersect select .. from T2      -- removes duplicates
select .. from T1 intersect all select .. from T2  -- keeps the minimum number of repetitions
select .. from T1 except select .. from T2         -- removes duplicates, then does the except
select .. from T1 except all select .. from T2     -- does the except without removing duplicates |
value( ) function:
-- value() function - accepts a variable number of parameters and returns the
-- first non-null value. Parameters should be of compatible types.
-- The real name for this function is coalesce() - "to arise from a combination of distinct elements".
select value(1,2) from (values 1) as aa                                    -- returns 1
select value(cast(null as int),2) from (values 1) as aa                    -- returns 2
select value(cast(null as int),cast(null as int),3) from (values 1) as aa  -- returns 3 |
------------------------------------------------------------
2 ways (common table expression and inline table expression) to define a table dynamically
in the SQL statement:
with tree (id,pid) as (
values (1,2), (2,3), (3,4), (4, cast(null as int)), (5,3), (6,5)
)
select * from tree   -- returns two columns: ID, PID
|
Getting root of the tree using recursive SQL:
with
tree (id,pid) as ( ... ),
rr (id,pid) as ( ... )
select id from rr where rr.pid is null   (the full statement is sketched just below) |
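A hedged reconstruction of the full statement, reusing the small tree defined above (starting from node 6 is an arbitrary choice):
  with
    tree (id, pid) as (
      values (1,2), (2,3), (3,4), (4, cast(null as int)), (5,3), (6,5)
    ),
    rr (id, pid) as (
      select id, pid from tree where id = 6      -- start from the given node
      union all
      select t.id, t.pid from tree t, rr         -- then walk up, parent by parent
       where t.id = rr.pid
    )
  select id from rr where rr.pid is null         -- the root (returns 4 here)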
More recursive SQL:
We have a table storing a tree (each row has id and parent_id).
To get the root id for a given id (12345) we fill the temporary result rr first with the given id and its parent_id, then union it with recursively calculated id and parent_id as we go up the tree: with rr (parent_id, id) as ...
Here is an example going in the opposite direction (down the tree), showing the whole tree under a given id (a sketch follows below): with rr (id, level) as ...
Result:
|
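A hedged sketch of the downward query, assuming a table mytree(id, parent_id) and an arbitrary starting id of 12345 (level counts the depth below the starting node):
  with rr (id, level) as (
    select id, 0 from mytree where id = 12345      -- the starting node
    union all
    select t.id, rr.level + 1                      -- then its children, grandchildren, ...
      from mytree t, rr
     where t.parent_id = rr.id
  )
  select id, level from rr order by level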
Here is how to combine 2 above queries together:
----------------------------------------------------------------------------
-- Given any node in any tree, this query will drill up to the top
-- level of the tree, and then query down to give all nodes that exist
-- within the tree (i.e. given any node_id, show the whole tree that contains it).
--
-- Note: This query will never go deeper than 'stop_level' levels (10).
----------------------------------------------------------------------------
with rec_root (parent_id, child_id, sub_query, level, stop_level) as
|
Example comparing DB2 and Sybase:
DB2...
with tt (is_asst) as (
  SELECT 1 is_asst FROM TABLE (VALUES 1) AS temp_table_1
   where exists ( select 1 from coverage.desk_sales
                   where user_id = 'myuser' and rel_type in ('D', 'C', 'L', 'A', 'N') )
      or exists ( select 1 from coverage.regional_sales
                   where user_id = 'myuser' and rel_type in ('D', 'C', 'L') )
  union
  SELECT 0 is_asst FROM TABLE (VALUES 0) AS temp_table_0
)
select * from tt order by is_asst desc fetch first 1 rows only;
Sybase...
select "is_asst" = @is_asst |
starting with db2
Starting to work with DB2 on Unix:
You have to install the db2 client software on your Unix workstation and
reboot it.
You have to source the db2profile file with all the necessary environment settings.
You have to catalog the node and the database properly, for example:
db2 catalog tcpip node xxx remote some_server server some_port
db2 catalog db sample at node xxx
db2 -tvf myfile.sql - to run many commands from a file. Don't forget to separate SQL statements with ";" |
db2 "select char(reviewed_date, ISO) from oe.OE_contact fetch first
10 rows only"
-------------------------------------------------
db2 instance
Instance:
Creating / updating / dropping the instance (be careful)
db2icrt <instance_owner_id>
db2icrt -u <fenced_user_id> <instance_owner_id>
db2iupdt <instance_owner_id>
db2idrop <instance_owner_id>
db2ilist - list all instances on the machine
creating an instance of DAS (db admin server)
on Windows:
configuring the instance:
db2 update dbm cfg using diaglevel 4
restart the instance:
db2stop
db2start
instance config commands:
db2 get dbm cfg
db2 update dbm cfg using <parameter> <value>
|
db2 database
Database:
create database:
db2 create database test
db2 create database test on /mydir
db2sampl <location>   -- create a sample database
db2 create database test using codeset IBM-1252 territory US
database directories:
The database directory has the following structure:
Configuring the database:
db2 update db cfg for <dbname> using locktimeout 10
deleting a database:
db2 drop database test
|
db2 tablespaces
Tablespaces:
Tablespace - a logical grouping of containers to hold database objects.
Tablespace containers (physical) - directories, files, raw devices.
SMS - System Managed Space - directory containers managed by the operating system. A file is automatically created for each new table or index to hold the data - as long as there is enough disk space.
DMS - Database Managed Space - managed by the database (use the 'add container' command to change their size).
Three (3) default tablespaces (SYSCATSPACE, TEMPSPACE1, USERSPACE1), created with one container each by default:
db2 create tablespace myDMSspace managed by database
using (FILE 'file1' 1000, FILE 'file2' 2000)
|
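For comparison, a minimal sketch of an SMS tablespace (the name and path are made up):
  db2 "create tablespace mySMSspace managed by system using ('/db2/data/mysms')"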
db2 tables, datatypes, indexes, views
Tables:
system tables (SYSCATSPACE) may have one of four (4) schema names:
SYSIBM  - read-only system tables (e.g. SYSIBM.SYSTABLESPACES, SYSIBM.SYSTABLES, etc.)
SYSCAT  - read-only views on the system tables (e.g. SYSCAT.TABLESPACES, SYSCAT.TABLES, etc.)
SYSFUN  - UDFs (User Defined Functions)
SYSSTAT - holds statistics (e.g. SYSSTAT.TABLES, SYSSTAT.COLUMNS, SYSSTAT.INDEXES, etc.)
db2 list tables for system
Compare:
User tables - simply tables created by users (permanent tables).
db2 'select typename from syscat.datatypes'   -- lists the built-in data types (17 record(s) selected)
Identity columns (a hedged sketch follows below):
create table test_identity ( ... )
select * from test_identity
Note: when defining an identity column, you can also provide attributes (for example: start with, increment by).
You can also provide an identity "generation-expression", not null, with default.
Table Constraints: insert rule:
create table tab1 ( id, ...) |
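A minimal sketch of an identity column (the column names are made up):
  create table test_identity (
    id   integer generated always as identity (start with 1, increment by 1),
    name varchar(30)
  );
  insert into test_identity (name) values ('first'), ('second');
  select * from test_identity;     -- id is generated automatically: 1, 2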
Indexes:
db2 create index i1 on tab1 ( col1 desc)
db2 create index i2 on tab1 (col1, col2)
db2 create unique index i1 on tab1(col1)
db2 create index i1 on tab1(col1) cluster
db2 create index i1 on tab1(col1) allow reverse scans
db2 alter index ...
db2 drop index ... |
Views:
db2 "create view as select col1 from tab1 where col1 like 'A%' " |
Bufferpools - cache for database pages (see in SYSCAT.BUFFERPOOLS table)
Log Files - logging can be turned off to speed up operations
Authentication
attach - get access to the instance
connect - get access to the database
db2 attach to db2inst1 user myuser using mypassword
db2 connect to mydatabase user myuser using mypassword
db2 connect to mydatabase user myuser using mypassword new newpass confirm newpass
changing the auth. type of the database server:
db2 update dbm cfg using authentication server
Auth. types: SERVER, CLIENT, SERVER_ENCRYPT, DCS, ...
|
TRUST_ALLCLNTS parameter - YES, NO, DRDAONLY
YES:
On the server:
On the client:
NO: On the server:
On the client:
|
TRUST_CLNTAUTH parameter - CLIENT, SERVER
CLIENT:
On the server:
On the client:
SERVER: On the server:
On the client:
|
Gaining authorization = gaining group membership in one of 5 groups:
SYSADM   - instance king,
SYSCTRL  - instance control - almost as powerful as SYSADM, but needs explicitly granted privileges,
SYSMAINT - instance maintenance, cannot access user data,
DBADM    - database-level authority (database, not instance),
LOAD     - database-level authority to insert large amounts of data.
SYS* authorities can be assigned only to groups and are instance-level.
db2 update dbm cfg using SYSCTRL_GROUP db2grp
db2 grant dbadm on database to user <username>
db2 revoke dbadm on database from user <username>
db2 grant load on database to user <username>
db2 revoke load on database from user <username>
|
Privileges - are stored in the database system catalog views:
SYSCAT.DBAUTH - database authorities (db2 get authorizations
- will report the database authorities of the current user)
(DBADM, CREATETAB, BINDADD, CONNECT, CREATE_NOT_FENCED, IMPLICIT_SCHEMA, LOAD)
SYSCAT.TABAUTH - table and view privileges (CONTROL, ALTER, DELETE, INDEX, INSERT, SELECT, REFERENCES, UPDATE)
SYSCAT.INDEXAUTH - index privileges (CONTROL)
SYSCAT.SCHEMAAUTH - schema privileges (ALTERIN, CREATEIN, DROPIN)
SYSCAT.PACKAGEAUTH - package privileges (CONTROL, BIND, EXECUTE) |
Here is how to grant/revoke those privileges:
db2 grant select on mytable to public
db2 grant <db_privilege> on database to user <username>
db2 revoke <db_privilege> on database from user <username>
More examples (see the sketch below):
|
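A few more hedged grant/revoke examples at the table level (the table, user and group names are made up):
  db2 "grant select, insert, update on table mytable to user someuser"
  db2 "grant all on table mytable to group somegroup"
  db2 "revoke update on table mytable from user someuser"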
Privileges to an object are also granted implicitly when somebody creates an object. (database, table, view, etc.).
concurrency
package - executable form of one or more queries - SYSCAT.PACKAGES
a package is created using PREP and BIND commands.
You can set level of isolation (or concurrency) of the package
There are 4 levels: UR (Uncommitted Reads), CS
(Cursor Stability), RS (Read Stability), RR
(Repeatable Read)
The lowest isolation (highest concurrency) is UR, the highest isolation
- RR. Default - CS.
db2 prep something.sqc isolation RS
db2 bind something.bnd isolation RR
db2 prep something.sqc bindfile using something.bnd isolation RS
db2 bind something.bnd |
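From the CLP you can also change the isolation level for your own session before connecting - a hedged sketch:
  db2 change isolation to ur
  db2 connect to sample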
Locks
- mode (state) - IN (Intent None), IS (Intent Share), NS (Next
Key Share), S (Share), IX (Intent Exclusive), SIX (Share with Intent Exclusive),
U (Update), NX (Next Key Exclusive), NW (Next Key Weak Exclusive), X (Exclusive),
W (Weak exclusive), Z (Super exclusive)
- Object,
- Duration.
db2 lock table test in share mode
db2 alter table test locksize table
db2 alter table test locksize row |
More SQL hints
More SQL: temp.tables, where, order by, aliases, subqueries, column
& scalar functions, group by, having, union, except, intersect
create table tab1 ( ... ) in userspace1;
alter table tab1 add primary key (...)
drop table tab1 |
declare global temporary table temp1
like tab1 on commit preserve rows not logged in mytempspace |
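A hedged usage note: the declared temporary table lives in the SESSION schema, so refer to it as session.temp1:
  insert into session.temp1 select * from tab1;
  select count(*) from session.temp1;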
where clause:
x = y, x <> y, x < y, x > y, x <= y, x >= y,
IS NULL, IS NOT NULL
IN, NOT IN
BETWEEN, NOT BETWEEN
LIKE, NOT LIKE
EXISTS, NOT EXISTS
select name, salary from stuff where salary > 20000
... where years in (1,2,3)
|
Aliases can be used in the "order by":
Bad news is that the name aliases for fields (simple or calculated)
can not be used in the "where" clause, because the "where" clause is processed
before the alias name is applied to the result. So you may need to
repeat the whole expression in the where clause.
But good news is that the name aliases can be used in the "order by"
clause.
select value(x.param1, x.param2) myparam
from test1 x where value(x.param1, x.param2) >= 40 order by myparam; |
Subquery:
select lname
from employee where lname in (select sperson from sales where sales_date < '01/01/1996') |
Column Functions:
SUM, AVG, MIN, MAX, COUNT
note: don't try to specify column functions in a where clause - they will NOT work, because the where clause is evaluated before the aggregation (use a having clause instead). select name, salary
|
Scalar Functions:
ABS, HEX, LENGTH, YEAR, MONTH, DAY, LCASE or LOWER, UCASE or UPPER
select min(length(deptname)) as min, max(length(deptname)) as max from dept |
group by & having:
select sales_date, max(sales) as max_sales
from sales group by sales_date having max(sales) > 25
union, except, intersect:
select sales_person from sales where region = 'reg1'
union select sales_person from sales where sales > 3
note: union processes both queries, eliminates duplicates between them, and returns the final combined result set
select sales_person from sales where region = 'reg1'
except select sales_person from sales where sales > 3
select sales_person from sales where region = 'reg1'
intersect select sales_person from sales where sales > 3
|
insert, update, delete:
insert into tab1 values (123,'something', ...)
insert into tab1 select ... from ... where ...
update tab1 set (a,b) = (123,'something') where ...
delete from tab1 where ... |
SQL_2 - recursive queries, combined outer joins
select col1, cast(round(avg(col2),2) as decimal(9,2))
as aver_col2
from tab1
group by col1
order by col1
recursive query:
with exp1 (col1, col2, col3) as
(
  -------- select top nodes
  select ... from tab1 where ...
  union
  -------- select other nodes recursively
  ...
)
Here is an example: with exp1 (row_id, par_postn_id, X_POSITION_NAME, PARENT_NAME) as
(
  select ...
  union all
  select ...
)
select ...
|
combined outer joins:
select empno, deptno, projname
from ( employee left outer join project on respemp=empno
)
left outer join department on mgrno =
empno
SQL OLAP
- OLAP
functions - OLAP Functions - from SQL reference
- http://www7b.software.ibm.com/dmdd/library/techarticle/lyle/0110lyle.html
- good review of OLAP functions
- http://www.almaden.ibm.com/cs/people/peterh/b15.pdf
- Peter J. Haas DB2 UDB Advanced Analytics for Business Intelligence
- http://www.almaden.ibm.com/cs/people/peterh/b16.pdf
- Peter J. Haas Speeding up DB2 UDB Using Sampling
- http://www.ibm.com/search?en=utf&v=11&lang=en&cc=us&lv=c&q=db2+olap+function+reference - search IBM site
star schema grouping:
select empno, firstname, lastname, sum(sales)
as tot_sales, (salary + bonus + comm) as compensation
from employee, sales
where sex='M' and year(sales_date) = 1996 and
lastname = sales_person
group by empno, firstname, lastname, (salary
+ bonus + comm)
rollup grouping:
select year(sales_date) as year, count(*) as
tot_sales
from sales
group by rollup (year(sales_date))
year  tot_sales
----- ---------
-     41
1995  5
1996  36
select year(sales_date) as year, region, count(*)
as tot_sales
from sales
group by rollup (year(sales_date), region)
select year(sales_date) as year, region, sales_person,
count(*) as tot_sales
from sales
group by rollup (year(sales_date), region, sales_person)
cube grouping:
select year(sales_date) as year, region, count(*)
as tot_sales
from sales
group by cube (year(sales_date),
region)
select year(sales_date) as year, region, sales_person,
count(*) as tot_sales
from sales
group by cube (year(sales_date), region, sales_person)
other OLAP features:
some useful functions: sum, count, avg, stddev, corr, regr_*, rownumber,
rank, window aggregates
simple statistics:
select country, year, count(*) as count, sum(amount)
as sum, avg(amount) as avg, max(amount) as max, stddev(amount) as stddev
from mytable
group by country, year
Detect transactions which are larger than usual:
create view profile(cust_id, avg_amt, sd_amt)
as
select cust_id, avg(charge_amt), stddev(charge_amt) from trans where date between '2002-01-01' and '2002-03-31' group by cust_id
Create trigger big_chrg ... (a hedged sketch follows below)
|
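A hedged sketch of the big_chrg trigger, assuming trans has cust_id and charge_amt columns and that flagged rows go into a made-up table big_charges:
  create trigger big_chrg
    after insert on trans
    referencing new as n
    for each row mode db2sql
    insert into big_charges (cust_id, charge_amt)
      select n.cust_id, n.charge_amt
        from profile p
       where p.cust_id = n.cust_id
         and n.charge_amt > p.avg_amt + 2.0 * p.sd_amt;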
Equi-width histogram:
with dt as (
select
t.transid,
sum(amount) as trans_amt,
case
when
sum(amount)/3000 < 0 then 0
when
sum(amount)/3000 >19 then 19
else
int(sum(amount)/3000)
end as bucket
from trans t, transitem ti
where t.transid=ti.transid
group by t.transid
)
select bucket,count(bucket) as height, (bucket+1)*3000
as max_amt
from dt
group by bucket;
Equi-Height histogram:
with dt as (
select
t.transid,
sum(amount) as trans_amt,
rownumber( ) over(order
by sum(amount))*10/(select count(distinct transid)+1 from stars.transitem)
as bucket
from stars.trans t, stars.transitem ti
where t.transid=ti.transid
group by t.transid
)
select bucket,count(bucket) as b_count, max(trans_amt)
as part_value
from dt
group by bucket;
Note:
There are 3 ranking functions: rank(), denserank(), rownumber().
- row_number() & rownumber are synonyms. Also
dense_rank
& denserank are synonyms.
The 3 ranking functions are ranking rows, that is, they assign numbers
(1,2,3, etc). If all results are different - all 3 functions produce the
same result. If we have duplicates - then:
- rank() - will give them the same number - and skip the next number(s)
- denserank() - will give them the same number - and not skip the next
number(s)
- rownumber() - will not give same number and will not skip. Will simply
give different numbers to every row. This is the only ranking function
that does not require an ordering.
select empnum, dept, salary,
  rank() over(partition by dept order by salary desc nulls last) as rank,
  dense_rank() over(partition by dept order by salary desc nulls last) as denserank,
  row_number() over(partition by dept order by salary desc nulls last) as rownumber
from emptab;
EMPNUM  DEPT  SALARY  RANK  DENSERANK  ROWNUMBER
|
over() - to specify sets. This includes partitioning
and ordering inside sets (see the many examples below).
partition by
order by ( asc
, desc , nulls last )
rows
between ... and ...
N preceding & N following
unbounded preceding &
unbounded following (to include
the entire preceding/following partition(s))
current row
range between unbounded preceding
and unbounded following
range
between current row and unbounded following
Example: Smoothed Time Series (over 1 week : 3 preceding days
and 3 following days):
select
date, symbol, close_price,
avg(close_price) over(order
by date rows between 3 preceding and 3
following)
as smooth_cp
from stocktab
where symbol = 'IBM' and date between '1999-08-01'
and '1999-09-01';
Attention: adding ordering to the set changes the behaviour of calculations to become "cumulative". That means, calculating on the current row - and all rows in the set that precede it with respect to the ordering.
scalar-aggregate functions - (ver.7 and up) - perform
scalar calculations on values from multiple rows within a set to compute
the result
select dept, salary,
  sum(salary) over(partition by dept) as deptsum,
  avg(salary) over(partition by dept) as avgsal,
  count(*) over(partition by dept) as deptcount,
  max(salary) over(partition by dept) as maxsal
from emptab;
DEPT  SALARY  DEPTSUM  AVGSAL  DEPTCOUNT  MAXSAL
|
Note: the query above contains no GROUP BY clause. Instead, the OVER
clause is used to partition the data so that the sum function is
computed over rows in the same department, and the sum of all
the salaries in each department is returned for each row within the department.
Adding ordering into a set turns calculations into cumulative
over the set:
select date, sales,
  sum(sales) over(order by date) as cume_sum,
  count(*) over(order by date) as setcount
from sales where year(date) = 2000;
DATE  SALES  CUME_SUM  SETCOUNT
Example for several years:
|
Note: the order of evaluation of a query is as follows:
1.From Clause
2.Where Clause
3.Group By Clause
4.Having Clause
5.Select List (over...)
Example taking advantage of this order:
select year(date) as year,
  sum(sales) as sum,
  sum(sum(sales)) over(order by year(date)) as cume_sum
from sales where year(date) >= 1995 group by year(date); |
Correlation: ( 1 (or -1) - perfect positive (or negative) relationship
):
select country, state, correlation(annual_purchases,
income) as correlation
from mytab
group by country, state
having abs(correlation(annual_purchases,
income)) > 0.1;
Note: similar function - covariance()
select a.custid as custid1, b.custid as custid2,
corr(a.amount,
b.amount) as corr
from mytab a, mytab b
where a.prodid=b.prodid and a.custid < b.custid
group by a.custid, b.custid
having corr(a.amount, b.amount) >=0.5
and count(*) > 100
order by corr desc;
======
with
dt (prod, year, sales0, sales1,sales2)
as (
select
prod, year, total_sales,
max(total_sales)
  over(partition by prod order by year rows between 1 preceding and 1 preceding),
max(total_sales)
  over(partition by prod order by year rows between 2 preceding and 2 preceding)
from mytab
)
select
prod,
corr(sales0,sales1)*100 as "corr1(%)",
corr(sales0,sales2)*100 as "corr2(%)"
from dt
group by prod;
Least-Squares Fit (Linear Regression): y=ax+b
for a set of non-null (y,x) values:
select
regr_count(sales, ad_budget) as num_cities,
regr_slope(sales, ad_budget) as a,
regr_icpt(sales, ad_budget) as b,
regr_r2(sales, ad_budget) as r_squared
from ad_camp;
Here R-squared is a measure of the quality of the fit (roughly, the square of the corr of x and y)
If you need y = ax*x + b - use regr_slope(y,x*x)
etc.
Note: for non-linear curves you can not compute R-squared like this:
select regr_r2(log(hits), log(days)) as
r2 from traffic_data;
because it will give you 0.99, instead of 0.95.
A correct way is to use the with ... expression to:
1. calculate the regr_slope and regr_icpt for the model
2. calculate the residuals (differences between data and the
model for each point)
3. calculate the r2 like this:
select 1e0 - (sum(error*error)/regr_syy(hits,days))
as r2
To find whether one or two points strongly influence the model - we can use
the HAT matrix.
Here is how to do the HAT Diagonal Computation:
with stats(mx,mx2,sxx) as (
select
regr_avgx(sales,ad_budget),
regr_avgx(sales,ad_budget*ad_budget),
regr_sxx(sales,ad_budget)
from mytab
)
select d.label as city, (s.mx2 - 2*s.mx*d.x +
d.x*d.x) / s.sxx as HAT
from xy_data d, stats s
order by HAT desc;
Fit the line in the form y=ax :
We need to compute a as follows: a = (x1*y1 + x2*y2 + .. + xn*yn)/(x1^2
+ ... + xn^2)
Note that regr_sxx = (x1-mx)^2 + ... + (xn-mx)^2
(and similarly regr_sxy)
So:
(x1^2 + ... + xn^2) = regr_sxx + n*mx^2
(x1*y1 + x2*y2 +..+xn*yn) = regr_sxy + n*mx*my
Thus we get:
select
regr_count(kwh,hours_run) as num_machines,
(regr_sxy(kwh,hours_run) + regr_count(kwh,hours_run)
* regr_avgx(kwh,hours_run) * regr_avgy(kwh,hours_run))
/
(regr_sxx(kwh,hours_run) + regr_count(kwh,hours_run)
* regr_avgx(kwh,hours_run) * regr_avgx(kwh,hours_run))
as a
from power_data;
Example: we want to find cities where the ad campaign is particularly
effective.
We use the with ... construct and calculate a, b, and sigma as follows:
regr_slope() as a
regr_icpt() as b
sqrt( (regr_syy() - regr_sxy()*regr_sxy()/regr_sxx()) / (regr_count() - 2) ) as sigma
where all regr functions have 2 arguments: (sales, ad_budget)
Now we can select ... where sales > a*ad_budget
+ b + 2e0*sigma
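A hedged sketch of the whole statement, assuming the ad_camp table has city, sales and ad_budget columns (the city column is an assumption):
  with fit (a, b, sigma) as (
    select
      regr_slope(sales, ad_budget),
      regr_icpt(sales, ad_budget),
      sqrt( (regr_syy(sales, ad_budget)
             - regr_sxy(sales, ad_budget)*regr_sxy(sales, ad_budget)/regr_sxx(sales, ad_budget))
            / (regr_count(sales, ad_budget) - 2) )
    from ad_camp
  )
  select c.city, c.sales, c.ad_budget
    from ad_camp c, fit f
   where c.sales > f.a * c.ad_budget + f.b + 2e0 * f.sigma;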
The same regression functions can also be used to compute other statistical data, like F-statistics.
To compare 2 sets and verify a hypothesis, people have traditionally used the
t-test (which presumes a normal distribution),
or a more modern procedure called the Wilcoxon Rank Test, which avoids that
restriction.
It is calculated using the rank()
function ( or dense_rank() function
for dealing with duplicate totals)
Here are how ranks are calculated:
with
ranked_sales(city,ranks) as (
select city, rank( ) over(order
by sales) from feb_sales
)
select sum(ranks) as W from ranked_sales
where city = 'B'
Then one needs tables from a statistics book to compute the result.
----- Removing duplicates:
create view aaatemp(rn) as select rownumber()
over(partition by col1 || '_' || col2) from mytab;
delete from aaatemp where rn > 1;
drop view aaatemp;
Note: the view doesn't have any columns of the original table. Still, deleting from the view deletes the rows from the original table.
----- Finding duplicates without creating a view:
with
mytemp(name, desc_text, rn) as (
select name, desc_text, rownumber()
over(partition by desc_text) as rn
from s_org_ext where cust_stat_cd
!='Deleted'
)
select * from mytemp where rn > 1
----- Delete 100 rows at a time by hand:
create view aaatemp(rn) as select rownumber()
over() as rn from mytab where ....;
delete from aaatemp where rn < 100;
----- Very-very slow way to go 10 at a time:
select a.row_id from yourtable a
where 10 > (select count(*) from yourtable b where
a.row_id < b.row_id)
order by row_id desc
----- Paging through table:
select myname
from (select myname, rownumber() over(order by
myname) as rn from mytable) as tr
where rn between 10000 and 10020
----- rank:
select name, dept, salary, rank() over(partition
by dept order by salary desc) as salary_rank
from staff
order by dept, salary_rank
----- moving average:
select sales_date, avg(sales) over(order by sales_date
rows between 1 preceding and 1 following) as smoothed_sales
from sales
where sales_date > '03/28/1996' and sales_person
= 'LEE' and region =
'some-region'
----- grouping sets:
select year(sales_date) as year, region, sales_person,
count(*) as tot_sales
from sales
group by grouping sets (year(sales_date), region,
sales_person, ( ))
order by year(sales_date), region, sales_person
More SQL: examples, with, hexadecimal, optimizer
Some SQL examples:
select count(*) from tab1.s_employee a
inner join inst1.s_emp_postn b on a.row_id = b.emp_id where not login like 'NO-LOGIN%' |
select count(*) from inst1.s_employee a
inner join inst1.s_emp_postn b on a.row_id = b.emp_id inner join inst1.s_postn c on b.position_id = c.row_id where c.created > '2002-03-12-00.00.01.000000' and c.created < '2002-03-17-23.59.59.999999' and not a.login like 'NO-LOGIN%' |
select distinct a.login KERBEROS,
e.name POSITION , value(c.yes,'NO') WITH_COVERAGE from inst1.s_employee a inner join inst1.s_emp_postn b on a.row_id = b.emp_id inner join inst1.s_postn e on b.position_id = e.row_id left outer join (select 'YES' yes, position_id from inst1.s_accnt_postn) c on e.row_id = c.position_id where not a.login like 'NO-LOGIN%' |
select
value(z.KERBEROS,Y.KERBEROS) MY_KERBEROS, value(Z.FIRST_NAME,Y.FIRST_NAME) MY_FIRST_NAME, value(Z.LAST_NAME,Y.LAST_NAME) MY_LAST_NAME, value(Z.REP_ID,Y.REP_ID) MY_REP_ID, value(Z.DESK_NAME,Y.DESK_NAME) MY_DESK_NAME from ( select a.login KERBEROS, a.FST_NAME FIRST_NAME, a.Last_Name LAST_NAME, a.PAGER_PIN REP_ID, c.X_POSITION_NAME DESK_NAME from inst1.s_employee a, inst1.s_postn b, inst1.s_emp_postn d, inst1.s_postn c where a.row_id = d.emp_id and d.position_id = b.row_id and b.PAR_POSTN_ID = c.row_id and a.pager_pin is not null ) y left outer join (
|
select col1 from tab1 group by col1 having count(*)
> 1
select * from tab1 where lcase(col1) = 'o''brien' |
Some (non)-trivial examples:
In Sybase we frequently
use temp tables and stored procedures when we need to get a result from
many tables.
For example: create temp table #temp1 ...
In DB2 this can all be expressed as one SQL statement like this: with query1 ( col1, col2, col3, ..., colN) as
( select ... ),
---------------------------- one more query using the 'with' construct with
---------------------------- return 1 if something exists, otherwise return 0 with t1 (is_mgr) as (
---------------------------- select distinct ... , cast(NULL as char) as first_name,
cast(NULL as char) as last_name
---------------------------- select col1 from tab1 group by col1 having count(*) > 1 select * from tab1 where lcase(col1) = 'o''brien' ---------------------------- Get top ten: select name, salary from employee a
select name, salary from employee a order by salary
desc fetch first 10 rows only
---------------------------- Get list of pairs with equal values: select a.name, a.birthday, b.name
|
Working with hexadecimal:
values( hex(1024) )
values ( X'3233' )
values ( cast(X'3233' as integer) + 2 )
values ( cast(X'3233' as char(2)) || ' - twenty three' )
values ( X'4672616E6B' )   -- Frank
values ( X'30' )           -- 0
values ( X'30313233' )     -- 0123
-- now let's get names with a digit 8 in them:
|
Optimizing queries:
db2 -tvf test.sql
db2 runstats on table mytab with distribution
and detailed indexes all shrlevel change
visual explain | A graphical tool available in the db2 Command Center (or Control Center) - to examine single queries, whether static or dynamic. Shows a color-coded tree. Clicking on a node allows you to view the arguments, statistics, and cost estimate of the node. You can also jump directly to DB2's documentation on that node type. Visual Explain can also be run from the command line as db2vexp.exe, though it then lacks static SQL ability. |
db2expln | The "bare bones" tool, giving text output from static SQL packages only. This can optionally include a character mode graph. |
dynexpln | Gives a text-mode analysis of a dynamic SQL query. It actually packages
the dynamic query and calls db2expln to do the work. From unix command
line invoke it as:
dynexpln -d mydb -f test.sql
dynexpln -h |
db2exfmt | A formatter for previously stored explain data. |
- http://www.idug.org/member/journal/Nov01/articl05.cfm
-
- http://www.student.math.uwaterloo.ca/~cs448/db2_doc/html/db2n0/frame3.htm
-
- http://gethelp.devx.com/techtips/db2_pro/10Min1101/gn1101-2.asp
-
- http://www.cs.umb.edu/~db2/db2n0/db2n0.htm
- IBM DB2 Command Reference -
--------------------------------------------