How to Find the Size of a DB2 Database

发布时间 2023-10-29 23:15:38作者: 耀阳居士

How to Find the Size of a DB2 Database

How big is it? How much space do we need if we copy it? How big will it be in a year? These are common questions for a DBA. Sometimes we know the answer intimately, and sometimes we have to look it up. If the person asking the question is not a DBA, they expect a hard number that won’t change significantly or will change in a linear, predictable way over time. There are a few tricks you can use to help answer them.

What is Included in “Database Size”

The obvious part here is the data that resides in tables and indexes and the meta data that describes the database. The not so obvious areas include:

  • Transaction Logs: Both active logs, and archived ones if they are retained and stored on disk. Sometimes archived logs are stored on TSM or other Tape Library.
  • Backup Images: Again, may be on disk or to tape library or other location
  • Scripts and Script output that you use to administer the database – sizes of these can vary wildly depending on what you retain
  • DB2 Diagnostic log (technically, this is DB2 instance level, and not database level): If you have a problem that is writing a lot of information, this can be significant
  • Event Monitor output: This can be significant, but is frequently stored in the database directory
  • Auditing output: Like Event Monitor output, this can be very significant
The methods I’ll describe in this post may include different subsets of this information – so depending on the method you use, you may have to then add in or subtract out other information to get the number you really want.

DB2’s get_dbsize_info Function

This is one of my favorite little functions for quick answers to the simple questions on database size. I have the syntax actually memorized. It is executed like this:

$ db2 connect to d01

   Database Connection Information

 Database server        = DB2/LINUXX8664 9.7.4
 SQL authorization ID   = DB2INST1
 Local database alias   = D01

$ db2 "call get_dbsize_info(?,?,?,-1)"

  Value of output parameters
  --------------------------
  Parameter Name  : SNAPSHOTTIMESTAMP
  Parameter Value : 2012-06-18-10.34.39.907630

  Parameter Name  : DATABASESIZE
  Parameter Value : 1157869568

  Parameter Name  : DATABASECAPACITY
  Parameter Value : 1307672571

  Return Status = 0

The numbers returned are in bytes, so with a bit of math, we can see that this small database is a bit less than 1.1 GB. These numbers include only used pages  in the tablespaces. This means that it essentially covers table data and index data and not much else. It also means that if you have DMS tablespaces with a large amount of free space, you may have more space actually taken up on disk than the value of DATABASESIZE as reported here.

The DATABASECAPACITY is calculated by adding up pages that have been allocated but not used in DMS tablespaces and filesystem free space for filesystems that hold SMS tablespaces. There are problems inherent in that, of course – you may have a lot of filesystem space where DMS containers reside that would not be reflected in DATABASECAPACITY until you extend the tablespace(s) to use them, and you may also have SMS tablespaces in a filesystem where you really shouldn’t use the filesystem up to 100%.

Be Aware: if you use this function, and have not created the SYSTOOLS tablespace, DB2 will automatically create it for you. This can be handy so you can control its creation by using this, but on the other hand, if it creates this tablespace, your incremental backups will fail until the next full backup, because you must take a full backup after creating a new tablespace before incremental backups will succeed.

If you want details on what those parameters are that you’re passing in, see the info center entry:

http://www.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.sql.rtn.doc/doc/r0011863.html

Database Backup Size

This is a common cheat for finding approximate database size. Looking at the backup size using a simple ls or whatever tool your OS has for listing files and their size. BUT, if you’re taking compressed backups, it doesn’t work. If you compress backups after they’re taken, this can also be difficult.

It may also be difficult if you take backups directly to TSM or another tape library.

Personally, since I’m nearly always backing up to disk, I nearly always take compressed backups, so this is rarely an option for me.

This method is similar to the first one, of course, in what it includes – essentially all data in the database, though it includes things that are usually small like the history file, database configuration, etc. For online backups, this will also contain an unknown number of transaction log files.

Looking at Database Size by Looking at What is Used on the Filesystem(s)/Directories

This can be the most difficult, but is the most comprehensive. To gather information, you look in each area where the database stores things, and simply look at that area from a filesystem level. It is essential to have a more complete picture if you’re planning on say cloning the server or something like that. I have a standard set of filesystems that makes this easier. It’s not that it’s technically difficult – it just requires that high level of attention to detail that is critical for a DBA to have.

Table/Index (tablespace) data:

To look at this at the filesystem level, you first have to find all of the filesystems involved. One way to do that is:

$ db2pd -d d01 -tablespaces

Database Partition 0 -- Database D01 -- Active -- Up 0 days 00:41:41 -- Date 06/18/2012 11:15:00

Tablespace Configuration:
Address            Id    Type Content PageSz ExtentSz Auto Prefetch BufID BufIDDisk FSC NumCntrs MaxStripe  LastConsecPg Name
0x00002B4FD762FE20 0     DMS  Regular 4096   4        Yes  24       1     1         Off 1        0          3            SYSCATSPACE
0x00002B4FD7631580 1     SMS  SysTmp  4096   32       Yes  192      1     1         On  1        0          31           TEMPSPACE1
0x00002B4FD7634CA0 2     DMS  Large   4096   32       Yes  192      1     1         Off 1        0          31           USERSPACE1
0x00002B4FD7636400 3     DMS  Regular 8192   32       Yes  192      2     2         Off 1        0          31           TAB8K
0x00002B4FD7637B60 4     DMS  Regular 16384  32       Yes  192      3     3         Off 1        0          31           TAB16K
0x00002B4FD76392C0 5     SMS  SysTmp  8192   32       Yes  192      2     2         On  1        0          31           TEMPSYS8K
0x00002B4FD763C9E0 6     SMS  SysTmp  16384  32       Yes  192      3     3         On  1        0          31           TEMPSYS16K
0x00002B4FD7640100 7     SMS  SysTmp  32768  32       Yes  192      4     4         On  1        0          31           TEMPSYS32K
0x00002B4FD7643820 8     DMS  Large   4096   4        Yes  24       1     1         Off 1        0          3            SYSTOOLSPACE

Tablespace Statistics:
Address            Id    TotalPgs   UsablePgs  UsedPgs    PndFreePgs FreePgs    HWM        Max HWM    State      MinRecTime NQuiescers PathsDropped
0x00002B4FD762FE20 0     81920      81916      67772      5704       8440       77192      77192      0x00000000 0          0          No
0x00002B4FD7631580 1     1          1          1          0          0          0          0          0x00000000 0          0          No
0x00002B4FD7634CA0 2     172032     172000     168448     64         3488       168544     168544     0x00000000 0          0          No
0x00002B4FD7636400 3     16384      16352      14688      0          1664       14688      14688      0x00000000 0          0          No
0x00002B4FD7637B60 4     6144       6112       4192       0          1920       4192       4192       0x00000000 0          0          No
0x00002B4FD76392C0 5     1          1          1          0          0          0          0          0x00000000 0          0          No
0x00002B4FD763C9E0 6     1          1          1          0          0          0          0          0x00000000 0          0          No
0x00002B4FD7640100 7     1          1          1          0          0          0          0          0x00000000 0          0          No
0x00002B4FD7643820 8     8192       8188       304        0          7884       304        304        0x00000000 1340030079 0          No

Tablespace Autoresize Statistics:
Address            Id    AS  AR  InitSize    IncSize     IIP MaxSize     LastResize                 LRF
0x00002B4FD762FE20 0     Yes Yes 33554432    -1                   No  None        None                       No
0x00002B4FD7631580 1     Yes No  0           0                    No  0                    None                       No
0x00002B4FD7634CA0 2     Yes Yes 33554432    -1                   No  None        None                       No
0x00002B4FD7636400 3     Yes Yes 33554432    -1                   No  None        None                       No
0x00002B4FD7637B60 4     Yes Yes 33554432    -1                   No  None        None                       No
0x00002B4FD76392C0 5     Yes No  0           0                    No  0                    None                       No
0x00002B4FD763C9E0 6     Yes No  0           0                    No  0                    None                       No
0x00002B4FD7640100 7     Yes No  0           0                    No  0                    None                       No
0x00002B4FD7643820 8     Yes Yes 33554432    -1                   No  None        None                       No

Containers:
Address            TspId ContainNum Type    TotalPgs   UseablePgs PathID     StripeSet  Container
0x00002B4FD7631340 0     0          File    81920      81916      0          0          /db_data/db2inst1/NODE0000/D01/T0000000/C0000000.CAT
0x00002B4FD7632A40 1     0          Path    1          1          0          0          /db_data/db2inst1/NODE0000/D01/T0000001/C0000000.TMP
0x00002B4FD76361C0 2     0          File    172032     172000     0          0          /db_data/db2inst1/NODE0000/D01/T0000002/C0000000.LRG
0x00002B4FD7637920 3     0          File    16384      16352      0          0          /db_data/db2inst1/NODE0000/D01/T0000003/C0000000.USR
0x00002B4FD7639080 4     0          File    6144       6112       0          0          /db_data/db2inst1/NODE0000/D01/T0000004/C0000000.USR
0x00002B4FD763A780 5     0          Path    1          1          0          0          /db_data/db2inst1/NODE0000/D01/T0000005/C0000000.TMP
0x00002B4FD763DEA0 6     0          Path    1          1          0          0          /db_data/db2inst1/NODE0000/D01/T0000006/C0000000.TMP
0x00002B4FD76415C0 7     0          Path    1          1          0          0          /db_data/db2inst1/NODE0000/D01/T0000007/C0000000.TMP
0x00002B4FD7644D40 8     0          File    8192       8188       0          0          /db_data/db2inst1/NODE0000/D01/T0000008/C0000000.LRG

What you’re looking for in that copious output is actually the last section – “Containers”. In that section, it lists the path to all containers, and that’s the path you want to look for the sizes on. Alternately, if you don’t mind connecting to the database, you can do it this way:

$ db2 connect to d01

   Database Connection Information

 Database server        = DB2/LINUXX8664 9.7.4
 SQL authorization ID   = DB2INST1
 Local database alias   = D01

$ db2 "select substr(container_name,1,100) container_name from sysibmadm.container_utilization"

CONTAINER_NAME
----------------------------------------------------------------------------------------------------
/db_data/db2inst1/NODE0000/D01/T0000000/C0000000.CAT
/db_data/db2inst1/NODE0000/D01/T0000001/C0000000.TMP
/db_data/db2inst1/NODE0000/D01/T0000002/C0000000.LRG
/db_data/db2inst1/NODE0000/D01/T0000003/C0000000.USR
/db_data/db2inst1/NODE0000/D01/T0000004/C0000000.USR
/db_data/db2inst1/NODE0000/D01/T0000005/C0000000.TMP
/db_data/db2inst1/NODE0000/D01/T0000006/C0000000.TMP
/db_data/db2inst1/NODE0000/D01/T0000007/C0000000.TMP
/db_data/db2inst1/NODE0000/D01/T0000008/C0000000.LRG

  9 record(s) selected.

You have to add up the sizes of all of the files, or if using SMS, all of the directories here. In my case, I can do:

[db2inst1@luxec2 cron]\-DEV-$ du -sh /db_data/db2inst1/NODE0000/D01
1.3G    /db_data/db2inst1/NODE0000/D01

Database home directory:

This is usually pretty small, but you have to be careful, because your database data could be here too – so be sure not to double count. You first have to find the whole path for the database you’re looking at, and then you can determine the space:

$ db2 list db directory

 System Database Directory

 Number of entries in the directory = 1

Database 1 entry:

 Database alias                       = D01
 Database name                        = D01
 Local database directory             = /db_data
 Database release level               = d.00
 Comment                              =
 Directory entry type                 = Indirect
 Catalog database partition number    = 0
 Alternate server hostname            =
 Alternate server port number         =

$ db2 list db directory on /db_data

 Local Database Directory on /db_data

 Number of entries in the directory = 1

Database 1 entry:

 Database alias                       = D01
 Database name                        = D01
 Database directory                   = SQL00002
 Database release level               = d.00
 Comment                              =
 Directory entry type                 = Home
 Catalog database partition number    = 0
 Database partition number            = 0

$ du -sh /db_data/db2inst1/NODE0000/SQL00002
5.2M    /db_data/db2inst1/NODE0000/SQL00002

There can also be data like event monitor output or transaction logs included on this path

Active Transaction Logs:

To look at the filesystem level at how much space is being used here, you can do this to find the right path:

$ db2 get db cfg for D01 |grep "Path to log files"
 Path to log files                                       = /db_logs/D01/NODE0000/

The space taken up here can vary wildly depending on your settings for LOGFILSZ and LOGSECOND.

$ db2 get db cfg for D01 |grep -i log
...
Log file size (4KB)                         (LOGFILSIZ) = 10000
Number of primary log files                (LOGPRIMARY) = 12
Number of secondary log files               (LOGSECOND) = 50
...

In this case, my active log files will normally be about 470 MB. But could easily vary up to nearly 2 GB. If the database is unable to archive log files, it could be even more, of course, and I could easily change LOGSECOND to 200 or more, magnifying the amount of space that could be used.

$ du -sh /db_logs/D01/NODE0000/
431M    /db_logs/D01/NODE0000/

Archived Transaction Logs:

Assuming you’re archiving transaction logs to disk (if you are retaining them, you should always archive them somewhere else, even if it’s on the same filesystem), you can find where they’re going using this:

$ db2 get db cfg for D01 |grep ARCH
 First log archive method                 (LOGARCHMETH1) = DISK:/db_arch_logs/D01/
 Options for logarchmeth1                  (LOGARCHOPT1) =
 Second log archive method                (LOGARCHMETH2) = OFF
 Options for logarchmeth2                  (LOGARCHOPT2) =
 Failover log archive path                (FAILARCHPATH) =
 Number of log archive retries on error   (NUMARCHRETRY) = 5
 Log archive retry Delay (secs)         (ARCHRETRYDELAY) = 20

In this case, we have just one location for archived transaction log files:

$ du -sh /db_arch_logs/D01/
530M    /db_arch_logs/D01/

Backup Images:

You can take backups to any path on the server. In practice, you probably use just one or two locations. You can find the location for each individual backup using:

$ db2 list history backup all for d01

                    List History File for d01

Number of matching file entries = 3

 Op Obj Timestamp+Sequence Type Dev Earliest Log Current Log  Backup ID
 -- --- ------------------ ---- --- ------------ ------------ --------------
  B  D  20120614135343001   F    D  S0000000.LOG S0000000.LOG
 ----------------------------------------------------------------------------
  Contains 5 tablespace(s):
  00001 SYSCATSPACE
  00002 USERSPACE1
  00003 TAB8K
  00004 TAB16K
  00005 SYSTOOLSPACE
 ----------------------------------------------------------------------------
    Comment: DB2 BACKUP D01 OFFLINE
 Start Time: 20120614135343
   End Time: 20120614135404
     Status: A
 ----------------------------------------------------------------------------
  EID: 7 Location: /db_bkup

 Op Obj Timestamp+Sequence Type Dev Earliest Log Current Log  Backup ID
 -- --- ------------------ ---- --- ------------ ------------ --------------
  B  D  20120614135833001   F    D  S0000001.LOG S0000001.LOG
 ----------------------------------------------------------------------------
  Contains 5 tablespace(s):
  00001 SYSCATSPACE
  00002 USERSPACE1
  00003 TAB8K
  00004 TAB16K
  00005 SYSTOOLSPACE
 ----------------------------------------------------------------------------
    Comment: DB2 BACKUP D01 OFFLINE
 Start Time: 20120614135833
   End Time: 20120614135850
     Status: A
 ----------------------------------------------------------------------------
  EID: 9 Location: /db_bkup

 Op Obj Timestamp+Sequence Type Dev Earliest Log Current Log  Backup ID
 -- --- ------------------ ---- --- ------------ ------------ --------------
  B  D  20120618094410001   N    D  S0000002.LOG S0000002.LOG
 ----------------------------------------------------------------------------
  Contains 5 tablespace(s):
  00001 SYSCATSPACE
  00002 USERSPACE1
  00003 TAB8K
  00004 TAB16K
  00005 SYSTOOLSPACE
 ----------------------------------------------------------------------------
    Comment: DB2 BACKUP D01 ONLINE
 Start Time: 20120618094410
   End Time: 20120618094438
     Status: A
 ----------------------------------------------------------------------------
  EID: 12 Location: /db_bkup/full

In this small example, I’ve got two locations for backups – /db_bkup and /db_bkup/full. I happen to know these are both on the same filesystem, and that I don’t keep other things there, so I can do:

$ du -sh /db_bkup
241M    /db_bkup

Scripts and Script output that you use to administer the database:

There’s no way to determine this through database level commands -you just have to know where you put it. Assuming you keep at least a bit of snapshot data, which I really recommend (and espeically if you’ve left your package cache at automatic), this can be of significant size. In my case, I know that all of my data of this nature is in /db_adm:

$ du -sh /db_adm
142M    /db_adm

DB2 Diagnostic log (technically, this is DB2 instance level, and not database level):

It’s not technically part of the database size, but is at the instance level. But if you’ve got scrolling errors, it can blow up quickly, and also your archiving strategy here makes a difference.

$ db2 get dbm cfg | grep DIAG
 Diagnostic error capture level              (DIAGLEVEL) = 3
 Diagnostic data directory path               (DIAGPATH) = /db2diag/
 Alternate diagnostic data directory path (ALT_DIAGPATH) =
 Size of rotating db2diag & notify logs (MB)  (DIAGSIZE) = 0

In my case:

$ du -sh /db2diag
269M    /db2diag

Event Monitor output:

You only need to worry about this if you’ve specified a location that is outside of the database path. If you specified a relative path, or only have the default deadlock event monitor in place, then it’ll be included in the database path info above. Remember that event monitors can output very large amounts of data, especially activity or statement event monitors – you can easily generate more data than you have data in the database as it tracks every statement or activity in the database and metrics about those.

Auditing output:

If you’re using db2audit or the new audit facility, you can use db2audit to specify or check the path. This data includes some instance level information as well.

$ db2audit describe
DB2 AUDIT SETTINGS:

Audit active: "FALSE "
Log audit events: "FAILURE"
Log checking events: "FAILURE"
Log object maintenance events: "FAILURE"
Log security maintenance events: "FAILURE"
Log system administrator events: "FAILURE"
Log validate events: "FAILURE"
Log context events: "NONE"
Return SQLCA on audit error: "FALSE "
Audit Data Path: ""
Audit Archive Path: ""

AUD0000I  Operation succeeded.

If no path is set, like in the above, then the default path is sqllib/security/auditdata.

Data Compression

Data compression (both page level and row compression) is a pay-for-use feature. It’s effects are reflected in all of the methods shown above.

Tracking Database Growth

If you have space, it’s nice to track database growth over time. This can help you make somewhat accurate estimates of future growth and be ready for tables that may exceed the maximum size (less of an issue on v9 with large tablespaces). It can also help you answer questions like “When was all this data deleted?”. SYSIBMADM.ADMINTABINFO or ADMIN_GET_TAB_INFO_V97 can be useful for this. I generally write table sizes out to a table once a day – that granularity seems to work well for me.

 

So, readers, what methods and details have I missed here? Leave a comment, I want to hear from you!

 
Ember Crooks
Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and work from home

 
ARTICLES: 546

15 Comments

  1. Ember Crooks

    Santosh B

    Thank You. This is a useful post for DBAs.

  2. Ember Crooks

    leo

    Good article,

    Side question…
    Ember, Are you really a crook?

    Cheers

    • Ember Crooks

      Ember Crooks

      The really funny thing is that my husband’s family has at least one member in law enforcement in each generation. Think “Sheriff Crooks” and “Officer Crooks”.

  3. Ember Crooks

    nobantu

    Thank you so much ,

    I am kindly looking for history of the size of database for past months , and what are option that i can use.

    • Ember Crooks

      Ember Crooks

      You must keep/store that data yourself in order to have it. If you have not been doing so, but have regular backup images, the backup sizes can give you a rough idea.

  4. Ember Crooks

    Gerardo

    thanks, most useful ?

  5. Ember Crooks

    Jon Ostrowski

    Thanks for this information!

  6. Ember Crooks

    Hell Demons

    Good post for DBAs

  7. Ember Crooks

    Nitesh

    can we get details about growth of a SMS tablespace from any SQL ? My idea is if we will get information how we expanded our tablespaces we can do a reverse calculation from current size and can provide not accurate but a near about database size. I am able to fetch DMS tablespace extension detail from SYSIBMADM.DB_HISTORY[db2 -x “select CMD_TEXT from SYSIBMADM.DB_HISTORY where OPERATION=’T’ and SQLSTATE is NULL with ur “] but unable to find out details for SMS tablespace.

    • Ember Crooks

      Ember Crooks

      The data you’re getting is only about DMS tablespaces that are auto resized. SMS tablespaces do not record such data – they simply increase or decrease in size without noting it. Really, you should be recording data over time to get this kind of information. Remember that for DMS tablespaces, they do not auto-decrease, so if you delete a lot of data, that may not be reflected in your database size unless you also decrease the DMS tablespace sizes manually. I like to look at size information at the table level using the sysibmadm.admintabinfo system view.

  8. Ember Crooks

    vineet

    Is it necessary to have statistics up-to-date for accurate size determination using get_dbsize_info Function?

    • Ember Crooks

      Ember Crooks

      No. get_dbsize_info adds up the pages used by tablespaces. Runstats collects data at the table, index, and possibly the statistical view level. get_dbsize_info is actually pretty expensive when it runs because it calculates the sizes fresh (assuming it has not been run in the last 30 minutes, and you specify the -1 parameter in the last position).

  9. Ember Crooks

    David Colton

    Excellent post. Just what I needed as a part time DBA looking to get database size statistics.

  10. Ember Crooks

    Raj S

    good article. Thanks.
    Recently, I noticed that in our case over the period of time database is growing and now DBSIZE > DBCAPACITY which leads me to search other options to calculate DB size and capacity since get_dbsize_info is not giving correct information.

    Any ideas or admin views which report both DBSIZE and DBCAPACITY correctly rather than going so many hoops or scripts. Thanks in advance

    Value of output parameters
    ————————–
    Parameter Name : SNAPSHOTTIMESTAMP
    Parameter Value : 2015-01-19-10.44.57.600772

    Parameter Name : DATABASESIZE
    Parameter Value : 3331877879808

    Parameter Name : DATABASECAPACITY
    Parameter Value : 3959737793024

    Return Status = 0