'; zhtm += ''; zhtm += '

' + pPage + ''; zhtm += '
'; figDoc.write(zhtm); figDoc.close(); popUpWin.focus(); } //--> Using Oracle8 -- Chapter 7 - Adding Segments for Tables

Using Oracle8


Chapter 7

Adding Segments for Tables


Table Structure

Tables are the most common structures in almost all relational databases. They consist of rows (also known as tuples or instances in the worlds of relational theory and modeling, respectively) and columns (attributes). When queried by Oracle's SQL*Plus tool, they're displayed in a table format, with the column names becoming the headings. Such a display gives the illusion that the data is stored in the database just the way it appears onscreen:

A query against a table in SQL*Plus could result in the following output:

    CUST_NUMBER COMPANY_NAME          PHONE_NUMBER LAST_ORDER
----------- ----------------------- ------------ ----------
        100 All Occasion Gifts        321-099-8642 12-MAR-96
        103 Best of the Best          321-808-9753 05-MAY-98
        110 Magnificent Mark's        322-771-3524 11-DEC-97
        111 Halloween All Year        321-998-3623 25-FEB-98
...
Helping SQL*Plus format queries
SQL*Plus may not always produce query output just the way you want to see it. For example, a column that's only one character wide will have a one-character column heading by default. Most users won't find the first character of the column name sufficient for identifying the column's contents. Similarly, some columns may be defined to hold many more characters than you need to see when casually querying the table. These longer columns can cause each row to wrap over multiple output lines, making it difficult to read the results. SQL*Plus provides formatting commands to help you produce output that has meaningful column names and column widths. You can even use its advanced features to create subtotals, grand totals, page titles and footers, and other standard reporting features. These are all covered in the Oracle8 SQL*Plus User's Guide.

Although this formatting by SQL*Plus is very convenient and emphasizes the notion that relational databases store data in the form of two-dimensional tables, it doesn't really represent the true internal structure of the database tables. Inside Oracle's data files, the rows are stored very efficiently on Oracle database blocks, leaving very little free space (unless it's believed to be needed) and with little regard for how the data will look if displayed onscreen or in a report.

The blocks themselves are stored in a data segment consisting of one or more extents. A very simple table will have a single extent, the first block containing its header information and the other blocks storing the rows themselves. A larger table may contain many extents, and a very large table may have additional header blocks to support the more complex structure. The data dictionary maintains the table definition, along with the storage information and other related object definitions, such as views, indexes, privileges, and constraints, some of which can be created with the table itself. (Views are covered at the end of this chapter, indexes in Chapter 8 "Adding Segments for Different Types of Indexes," privileges in Chapter 10, "Controlling User Access with Privileges," and constraints in Chapter 17, "Using Constraints to Improve Your Application Performance.")

The simplest version of the CREATE TABLE command names the table and identifies a single column by name and type of data it will hold. Its syntax is as follows:

CREATE TABLE table_name (column_name datatype);

Another trick to create meaningful column names
It's becoming more and more common to prefix the field names with letters that indicate which datatype is stored in the field-for example, dtmStartDate indicates a datatype for the field.

You should, of course, choose a name for the table that's meaningful to you and your user community. Similarly, the column name should provide some useful information about the purpose of the data it will hold.

Choosing a Column Datatype and Length

Before building a table, you should know what type of information each column will need to hold. By knowing this, you can select an appropriate datatype and, possibly, a length. In some cases, you have a choice of datatypes that could be used for a particular column.

Character Data

You can store freeform character data in a number of formats. Table 7.1 shows the relateddatatypes and characteristics.

Table 7.1  Definitions of Oracle datatypes

Datatype:

Length:
Max
Preferred Uses:

Notes:
BFILE4GBBinary data stored outside the database, allowing fast byte stream reads and writes
1
BLOB4GBVariable-length binary objects
1
CHAR2,000Short fields or fields that need fixed-length character comparisons
2
CLOB4GBVariable-length, single-byte character fields exceeding 2GB
1
DATE7Dates and times
3
LONG2GBVariable-length character fields that exceed 4,000 bytes
1,4
LONG RAW2GBVariable-length, uninterpreted binary data
1,4
NCHAR2,000Multibyte characters in short fields or fields that need fixed-length character comparisons
2,5
NCLOB4GBVariable-length, multibyte character fields exceeding 2GB; support only one character width per field
1
NUMBER38Number, having precision of 1 to 38 digits and scale of -84 to 127
 
NVARCHAR24,000 Variable-length fields that store single or multibyte characters that don't need fixed-length comparisons
5
RAW2,000Variable-length, uninterpreted binary data
4,6
ROWID10Extended row Ids
 
VARCHAR4,000 Variable-length fields that don't need fixed-length comparisons
6,7
VARCHAR24,000 Variable-length fields that don't need fixed-length comparisons
6,8

  1. There's no default length and no mechanism to define a maximum length.
  2. Trailing blanks are stored in the database, possibly wasting space for variable-length data. Default length is 1 character; to provide a maximum field length, add the required length in parentheses following the datatype keyword.
  3. Dates are always stored with seven components: century, year, month, day, hour, minute, and second. They can range from January 1, 4712 BC to December 31, 4712 AD.
  4. Supported for Oracle7 compliance and may not continue to be supported; large objects (LOBs) are preferred.
  5. The maximum length is the maximum number of bytes. For multibyte characters, the total number of characters will be less, depending on the number of bytes per character.
  6. There's no default length. You must always supply your own maximum length value in parentheses following the datatype keyword.
  7. Will stay compliant with ANSII standard definition for variable-length character fields.
  8. Will stay compliant with the definition from Oracle7.

Internally, with the exception of the CHAR and DATE datatypes, Oracle stores only the characters provided by the application in the character fields. If you specify a maximum length (when allowed) or use a predefined type at its maximum length, you don't waste any storage space when your records have fewer characters than the field can hold. The CHAR datatype, however, always adds trailing blanks (if needed) when the supplied data is less than the defined field length. The DATE datatype always uses 7 bytes, one for each date/time component, applying a default specific to each missing component.

CHAR Versus VARCHAR2

Besides the storage differences-before being written to the database, CHAR fields are always blank-padded to the full defined length, whereas VARCHAR and VARCHAR2 fields are never padded automatically-the two types of fields sort differently and compare differently. CHAR fields are sorted and compared using their full (padded) length while the variable character fields are sorted and compared on just the characters included in the string.

Space management with CHAR and VARCHAR2
Some people prefer to use CHAR rather than VARCHAR2 datatypes to reduce the likeli-hood that rows will grow in length when updates increase the number of characters in a field. Such growth can cause the row to become too long to fit into the block. However, Oracle provides a PCTFREE parameter to allow for row growth. I prefer to use PCTFREE to manage space rather than force all character fields to be padded blank characters, which I consider to be wasted space. See the "Creating Tables for Updates" in this chapter for details on the PCTFREE parameter.

A simple test is shown in the following few statements:

CREATE TABLE test_padding (fixed_col CHAR(5), var_col VARCHAR2(5));
INSERT INTO test_padding VALUES ('A','A');
INSERT INTO test_padding VALUES ('ABCDE','ABCDE');
SELECT * FROM test_padding WHERE fixed_col = var_col;
FIXED_COL VAR_COL
--------- -------
ABCDE     ABCDE

Only the row where all five characters have been filled in by the VALUES clause is displayed. The row with the single letters doesn't show the two columns having equal values because the FIXED_COL column is comparing all five characters, including trailing blanks, to the single character from the VAR_COL column.

Numeric Data

Numbers are stored by using the NUMBER datatype. By default, a number field can contain up to 38 digits of precision along with, optionally, a decimal point and a sign. Positive and negative numbers have a magnitude of 1.0 × 10-130 to 9.9…9 × 10125. A number can also have a value of 0, of course. To restrict the magnitude of a number (the number of digits to the left of the decimal point) and its precision (the number of digits to the right of the decimal point), enclose the required value(s) inside parentheses following the NUMBER keyword.

If you include only the magnitude, you actually define an integer. Any numbers with decimal values are rounded to the nearest integer before being stored. For example, NUMBER(3) allows numbers in the range of -999 to +999, and an inserted value of 10.65 is stored as 11. If you provide a precision and scale, you can store a number with as many digits as provided by the precision, but only precision-scale digits before the decimal point. Table 7.2 shows some examples of numbers that you can and can't store in a column defined as NUMBER(5,2).

Using a negative precision value
If you use a negative value in the precision field of a number column's length definition, the numbers will be rounded up to that power of 10 before being stored. For example, a column defined as NUMBER(10,-2) will take your input and round up to the nearest 100 (10 to the power of 2), so a value of 123,456 would be stored as 123,500.

How Oracle stores numbers
Oracle stores all numbers, regardless of the definition, by using a mantissa and exponent component. The digits of the mantissa are compressed two digits per byte, so the actual space required to a store a number depends on the number of significant digits provided, regardless of the column definition.

Table 7.2  Valid and invalid numbers for a column defined as NUMBER(5,2)
Valid Numbers:
Stored As:
00
11
12.312.3
-12-12
-123.45-123.45
123.456123.46 (rounded to 2 decimal digits)
-12.345-12.35 (rounded to 2 decimal digits)
Valid Numbers:Stored As:
123.4567890123456789123.46 (rounded to 2 decimal digits)
12345Invalid; exceeds precision (5-2 digits before decimal)
-1234.1Invalid; exceeds precision (5-2 digits before decimal)

Handling the year 2000 problem
Oracle has always stored both the century and the year for any date value in the database. To help distinguish dates in the 20th and 21st centuries when you provide only the last two digits of the year for the TO_DATE function, Oracle provides the RR date format mask. In a statement that stores a date by using the function TO_DATE('12/12/03',' DD/MM/RR'), the stored date will have the current century if the current year's last two digits are less than 50, and will have the next century if the current year's last two digits are 50 or greater. Full details of the RR format are given in the Oracle8 Server SQL Reference manual.

Date Data

Use the DATE format to store date fields or time information. Oracle has a single 7-byte internal format for all dates, and a 1-byte interval for each century, year, month, day, hour, minute, and second. Depending on the format your applications use to store dates, some fields may be left to default. For the time fields, the defaults result in a time of midnight. For century, either the current year (taken from the operating system date setting) is used, or a choice of 1900 or 2000, depending on the year value. The RR format mask causes the latter behavior, with a year in the range 50 to 99 resulting in a value of 19 for the century, and a year in the range 00 to 49 resulting in a value of 20 for the century.

Default date formats
Oracle uses a format mask when dealing with dates so that each of the seven components of the combined date/time fields can be uniquely identified. The database runs with a default date mask dependent on the setting of the initialization parameters NLS_TERRITORY and NLS_DATE_FORMAT. Generally, these hide the time component so that it defaults to midnight, unless the application or individual statement decides to override the mask and provide its own values for one or more of the time fields. When you're using date arithmetic and date functions, the time component may not be obvious to you or your users and can cause apparent problems.

Entering just the time component causes the date portion to be derived from the current operating system date, which uses the RR format mask process as described for the century as well as the current year and month, and defaults the day to the first day of the month.

Oracle can perform extensive date field operations, including date comparisons and date arithmetic. If you need to manipulate dates, check the Oracle8 Server SQL manual for detailed descriptions of the available date operators and functions.

One common problem occurs when you use the SYSDATE function to supply the current date when inserting a new record into the database. This would seem to be straightforward, allowing a query such as to select all orders placed on October 10, 1997, assuming that the date is provided in the correct format.

SELECT * FROM orders WHERE order_date = '10-OCT-97'

However, because the SYSDATE function by default always inserts the current time as well as the current date (whereas the query provides only the date, meaning that midnight on October 10 is being selected), there will be no matching records in the ORDERS table. The solution would be to store the data as if the time were midnight by applying the TRUNC function to the SYSDATE on insert.

Binary Data

Binary data is stored without any interpretation of embedded characters. For compatibility with Oracle7, the RAW and LONG RAW datatypes are still usable. However, the LONG RAW datatype is being deprecated, meaning that it's gradually becoming unsupported. Oracle8 offers the BLOB and BFILE datatypes to store binary data; these can be used in place of the RAW and LONG RAW datatypes.

SEE ALSO
For details on large objects (LOBs),

The only internal difference between RAW and LONG RAW is the maximum number of bytes they can store. RAW has a maximum length of 2,000 bytes, and you must define the maximum length you need as part of the column definition, even if you need all 2,000 bytes. The LONG RAW datatype can hold a maximum of 2GB. You can't limit this size as you can with RAW column, but, as with variable-character fields, Oracle stores only the characters you supply in RAW and LONG RAW fields, regardless of the maximum possible length.

By using some of the datatypes discussed above, you could create a multicolumn table, SAMPLE1, as follows:

CREATE TABLE sample1 (
    sample_id          NUMBER(10),
    sample_name        VARCHAR2(35),
    owner_id           NUMBER(4),
    collection_date    DATE,
    donor_gender       CHAR(1),
    sample_image       BLOB);
The new syntax isn't very complicated
Compared with the CREATE TABLE command included at the beginning of this chapter, the only other new syntax introduced in the listing, in addition to the datatypes, is the comma that separates each column definition.

Tables defined with a variable-length datatype in one or more columns-that is, any datatype other than CHAR or DATE-may need some special consideration when they're created if these columns are likely to be updated during the lifetime of any given row. This is because Oracle packs the data into table blocks as tightly as possible. This tends to result in very little, if any, space being left on the block if a row grows in length due to an update that adds more bytes to an existing field. Before creating a table in which you anticipate updates being made to variable-length columns, read the later section "Setting Space Utilization Parameters" to see how to avoid some of the problems this can cause.

Sizing and Locating Tables

Appendix A of the Oracle8 Server Administrator's Guide provides a detailed description of calculations you can use to compute how big a particular table will be, based on the column definitions, the average record size, and the expected number of rows. Before the most recent releases of Oracle, a table could contain only a limited number of extents, so it was important to know how big the table would be before creating it in order to provide sufficient extents space. With Oracle8, you can have unlimited extents in a table. Consequently, the only reason to be concerned with table size is to determine if you have sufficient disk space to store it.

Although I won't try to dissuade you from using Oracle's provided sizing calculations, I want to suggest an alternative approach to predicting table size. One key to either approach is to know the average row size. This generally requires that you find or generate some valid sample data. If you have such data, I recommend that you simply load it into your table, measure how much space is used, and then extrapolate the final size based on the ratio of the number of rows in your sample data to the total number of rows you expect the table to contain.

Use sample data to predict table size

  1. Put your sample data into a flat file.
  2. Create a SQL*Loader control file and run SQL*Loader to load the data. (See Chapter 25, "Using SQL*Loader and Export/Import," for details on SQL*Loader.)
  3. Execute the following command to collect current storage information:
ANALYZE TABLE table_name COMPUTE STATISTICS

  1. Execute the following query to find the number blocks now storing data:
SELECT blocks FROM user_tables
WHERE table_name = 'table_name'

  1. Compute the total number of blocks required to store the full table by using this formula:
    BLOCKS
    X
    (total number of rows) / (number of rows in sample)

You can use the following SQL*Plus script to perform these steps after you load your sample rows:

SET VERIFY OFF
ANALYZE TABLE &table_name COMPUTE STATISTICS
/
SELECT blocks * &total_row_count / num_rows
  AS "Total blocks needed"
  FROM user_tables
  WHERE table_name = UPPER('&&table_name')
/

After you determine your table's maximum size, you can identify an appropriate tablespace in which to store it. Your choice should be based on the factors discussed in Chapter 5 "Managing Your Database Space," concerning tablespace usage. These include a recommendation to use a limited number of different extent sizes, such as small, medium, large, and huge, for all objects in a given tablespace. You should be able to determine from its maximum size which category of extent sizes would best suit it, assuming that you follow our recommendations. For a very large table, the largest extent size is usually preferable, although if the table is going to grow very slowly, you may want to use smaller extents so that you can conserve disk space in the interim.

Other factors in deciding on a tablespace include the frequency of backup, the likelihood of dropping or truncating the table in the future, and which other segments exist in the candidate tablespaces. The latter might influence your decision when you consider what else the application might need to have access to, besides the new table, if a data file in the tablespace should become unusable.

Permissions when creating a table
To create a table successfully, you must have the necessary privileges and permissions, including the CREATE TABLE privilege and the right to use space in the named tablespace. See Chapter 9 "Creating and Managing User Accounts," and Chapter 10 for more information on these topics.

When you've determined which tablespace to use, you should add the tablespace name to the CREATE TABLE statement. By using the SAMPLE1 table-creation script shown in the preceding command, let's put the table into the SAMPLE_DATA tablespace:

CREATE TABLE sample1 (
    sample_id          NUMBER(10),
    sample_name        VARCHAR2(35),
    owner_id           NUMBER(4),
    collection_date    DATE,
    donor_gender       CHAR(1),
    sample_image       BLOB)
TABLESPACE sample_data 
/
Moves sample1 table into sample.data tablespace

Of course, if the person creating the table has the SAMPLE_DATA tablespace as his or her default tablespace, the TABLESPACE clause isn't needed. If you include it, you guarantee that the table will be created in the desired tablespace no matter who runs the script.

Setting Storage Parameters

You need to concern yourself with a table's storage parameters only if you haven't designed your database's tablespaces according to the guidelines in Chapter 5 These guidelines, along with the correct tablespace selection discussed in the previous section, should allow the table to use the default storage information defined for the tablespace. However, we'll look at each storage option in turn for those of you who may need to consider overriding your defaults. You include these options with the required values in a STORAGE clause as part of the CREATE TABLE or ALTER TABLE commands.

Overriding the defaults
If you have a database that's not designed as rigorously as discussed earlier, you may need to override one or all of the storage parameters. You can find the current default settings for the storage options in a tablespace by querying the DBA_TABLESPACES data dictionary view.

INITIAL

This parameter sets the size, in bytes, of the first extent built for the table. Some possible criteria for choosing a size include the following:

NEXT

This parameter sets the size, in bytes, of the second extent. Some possible criteria for choosing a size include the following:

PCTINCREASE

This parameter defines a multiplier to compute the size of the next extent to be created. It's applied to the third, and every subsequent, extent. If you set it to zero (0), each extent will be the same size as defined by NEXT; if you set it to 100, each subsequent extent will double in size. A value of zero is generally to be preferred. You may want to use a non-zero value if you don't know how much your table will grow, so that each extent will be larger than the previous one. This should eventually result in a sufficiently large extent to hold the remainder of the table.

PCTINCREASE options
Oracle allows large values of PCTINCREASE to reduce the number of additional extents that might be needed if a table's size was seriously under-estimated when it was first created. In earlier releases, this feature was essential because the number of extents that could be added to an existing table was a finite, limited number. With the UNLIMITED option now available, the only drawback to having many extents is the overhead associated with adding each new one. In general, I recommend leaving the value at zero whenever the table must share a table-space with at least one other segment, to preserve uniform extent sizes. In other cases, you should set it to a reasonable value so that it doesn't begin requiring extents significantly larger than the available disk space.

The drawbacks to this include the following:

MINEXTENTS

This parameter sets the number of extents built by the CREATE TABLE command. The sizes of the extents are determined by the values for INITIAL, NEXT, and PCTINCREASE. There are some possible reasons for creating only one extent initially:

There are some possible reasons for creating multiple extents initially:

MAXEXTENTS

This parameter sets the maximum number of extents the table will be allowed to use. You don't usually need to worry about this value initially because it can be changed later. Of course, you should be prepared to monitor the use of the space as the number of extents in a table approaches this value, no matter how you've set it. In some situations, however, you may need to set a specific value of MAXEXTENTS, including

Additional storage options that don't affect extent sizes
You can use other keywords in the STORAGE clause of the CREATE TABLE command-FREELISTS, FREELIST GROUPS, and BUFFER POOL. However, these values can't be set at the tablespace level and don't affect the allocation of table extents. The impact of these storage options is discussed in other sections of this book.

SEE ALSO
To see how to use the BUFFER POOL keyword,
For more on free lists and free list groups,
The buffer pool and table options to use it effectively are covered on

We end this section by showing the additional lines added to the CREATE TABLE sample1 script to include a STORAGE clause:

CREATE TABLE sample1 (
     sample_id           NUMBER(10),
     sample_name         VARCHAR2(35),
     owner_id            NUMBER(4),
     collection_date     DATE,
     donor_gender        CHAR(1),
     sample_image        BLOB)
TABLESPACE sample_data

The STORAGE clause

Setting Space Utilization Parameters

Each block in an Oracle table can hold as many rows as will fit into the block. When the block is full, Oracle removes it from the list of blocks into which new rows can be inserted. At some point in the future, if enough rows are deleted from the block, it may be added back onto the list of available blocks so that more new rows can be added to it, using the space freed up by the dropped rows.

Some special parameters associated with the table's definition, known as space utilization parameters, influence these events. In particular, they control just how full a block becomes before it's moved off the list of available blocks (known as the free list), how much space must be made available before it's moved back onto the free list again, and how much space is reserved for multiple transactions to access the block concurrently. These settings affect how updates are managed, how much space might be going to waste in a table, and how much transaction concurrency can occur on a block.

Creating Tables for Updates

Unless your table contains all DATE and CHAR columns, any update to a record can cause that record to grow or shrink in overall length. This is because Oracle stores only the bytes that contain information in its variable length fields. A new value, if it has more or less data, changes the amount of storage required. Updating a record with smaller field values doesn't cause any problems, unless this happens repeatedly and the block becomes almost empty, thus wasting space. However, if you add to a field's length, you may run out of space on the block. Even if this doesn't happen the first time you perform such an update, it may occur if you continue to increase the lengths of different columns or rows.

If you run out of space, Oracle will move the row to another block in the table through a process known as migration. Although this may sound like a benign solution, it does have performance ramifications. A migrated row leaves behind a forwarding address (pointer) so that it can still be found by an index lookup or by an ongoing query. Subsequent access to that row results in a probe of the row's original home block, which finds the forwarding address, and then a probe of the block where it now resides, known as an "overflow block." Rarely will the original block and overflow block be contiguous blocks in the table, so the disk retrieval for such a row will be slow. Migrated rows, particularly if they are numerous, can affect the overall database performance.

To help you avoid massive row migration, Oracle lets you reserve space on a block into which the data can expand. By default, this space is 10 percent of the data area of the block. When a change to the block leaves less free space than is reserved for row expansion, the block is taken off the free list and no further inserts will occur. For some tables, the 10 percent default may be perfectly adequate. For other tables, 10 percent may be a completely inadequate amount of free space or far more than is needed.

The space is reserved with the PCTFREE parameter, and you should determine a good value for it as best you can before you build any production table. As soon as a table is built you can change this value, but only blocks that aren't being used will adopt the new value and reserve the desired amount of free space.

If you know two pieces of information about the data being stored in the table, you can use the following formula to compute a good value for PCTFREE. The information you need is the average length of the rows when they're first inserted and the average length of the rows when they're at their maximum length. In the following formula, the terms avg_insert_length and max_length refer to these values, respectively:

                 (max_length - avg_insert_length)
PCTFREE = 100 * ---------------------------------------
                          (max_length)

In determining the average row lengths, you need to consider only the number of bytes of data per row, not the internal overhead associated with stored rows and fields. If you use just data lengths, the result will be slightly higher and have a built-in margin of error. The size of this error varies depending on the block size, number of rows in the block, and number of columns per row. For a table with 10 columns and a database block size of 4KB, if 10 rows fit into the block, this margin of error will be just over 3 percent.

Example of computing the PCTFREE value
If a table has a row with an average length of 153 bytes when it's initially inserted, and it grows by an average of 27 bytes over the course of its time in the table, the average maximum length of a row is 180 bytes. By using these two values in the formula PCTFREE = 100 * (max_length -avg_ insert_length) / (max_length), we find that this table should be created with PCTFREE = 100x(180-153) / 180 = 100x27 / 180 = 100x3/20 = 15.

For rows that don't change over time, or change only fixed-length fields, the expression (max_length - avg_insert_length) reduces to zero, which in turn causes the entire formula to result in zero. If you're really certain that there will be no updates or just updates that change record lengths, you can set PCTFREE equal to zero without concern for row migration problems.

If you have a table in which the value of (max_length - avg_insert_length) is negative, you also shouldn't have to worry about migration if you set PCTFREE to zero. However, in such a table, there will be a tendency for the amount of data on each block to become less than is optimal. This will occur when the block gains sufficient empty space, due to record shrinkage, to hold a whole new row. With many blocks in this state, you'll suffer some inefficiency because of this wasted space; more blocks are being taken to store rows than are really needed. To overcome this, you should consider the table in the same category as tables that undergo record deletions over time, and follow the approach to deal with these in the next section.

Creating Tables with High Delete Activity

Over time, tables that have rows deleted or rows that shrink in size can become inefficient. The empty space on the blocks represents additional disk reading and writing that must be done because empty space, rather than data, is being transferred between disk and memory. Oracle provides the space utilization parameter PCTUSED to help you control the amount of empty space allowed to remain on a block.

As mentioned earlier, a block is taken off the free list when it's full so that no further attempts are made to insert more rows into it-that is, when it has less empty space than PCTFREE of the block. PCTUSED sets a threshold value at which the amount of free space becomes sufficient for the block to hold one or more new rows, so it can be placed back on the free list. By default, Oracle sets PCTUSED at 40 percent. In other words, a block that has less than 40 percent of its data area occupied by rows will be put back on the free list.

You can change the value of PCTUSED at table creation time or anytime thereafter. As with PCTFREE, the impact of a change to PCTUSED may be delayed. A block that already contains less than the new PCTUSED amount of data, unless it's already on the free list, won't be placed there until another change is made to it.

What makes a good value for PCTUSED? The first criterion is to set it so that a block goes back on the free list only when there's room to store at least one more new row. In a very volatile table, where rows are frequently added and dropped, it may be worth wasting space on a block until there is room to fit three or four rows. Moving blocks on and off the free list requires some overhead that may not be worth incurring unless more than one row is affected by the change. After you decide how many rows to leave room for before placing a block back on a free list, you can use the following formula to compute a good value for PCTUSED:

PCTUSED = 100 - PCTFREE - 100 * row_space / block_space
   where row_space = avg_insert_length * rows_needed
   and block_space = DB_BLOCK_SIZE - 90 - INITRANS * 24

Constants used in computing PCTUSED value
The constant 90 is an imprecise measure of the space used by Oracle's header information in a table block, but it has proven to be sufficiently accurate for this calculation. The constant 24 is the number of bytes used to store a transaction entry on a typical hardware platform, and should be adequate for this calculation.

Following from the example we used to demonstrate the computation for PCTFREE, let's see how this formula would work if we wanted to insert new rows when a block had room for three new rows. In the earlier example, the average length of a row when initially inserted was 153 bytes, and the value for PCTFREE was calculated at 15. Let's use a block size of 4KB and an INITRANS value of 4 to complete the PCTUSED calculation. So we need to compute

PCTUSED = 100 - 15 - 100 * (153 * 3) / (4096 - 90 - 4 * 24)

This simplifies to the following:

PCTUSED = 85 - 100 * 459 / (4006 - 96) = 85 - 100 * 459 / 3910

If we round the quotient 459/3910 (= 0.1173913) up to 0.12, the result becomes the following:

PCTUSED = 85 - 100 * 0.12 = 85 - 12 = 73

The second consideration is how much space you can afford to spare. The lower the PCTUSED value you use, the more empty space will accumulate on a block before it's recycled onto a free list for more data to be added. In very large tables, you may not be able to afford to store blocks with more than a minimal amount of free space. In such cases, even though you may cause additional overhead by moving blocks back onto the free list more often than you might think you need from the preceding formula, you may gain some benefits. Not only will you save disk space, but if the table is queried extensively-particularly when using full table scans-you'll need to read less blocks into memory to retrieve the same number of rows.

Creating Tables for Multiple Concurrent Transactions

For a transaction to add, alter, or drop a row from an Oracle table, it must first obtain a lock on that row. It does this by first registering itself on the block where the row will reside or now resides. The registration is made by updating a special area of the block called the transaction entry slot, also known as an intent to lock (itl) slot. If a block contains a lot of rows, it's conceivable that more than one transaction will want to work on the same block at the same time. To do this, each must obtain a transaction slot for its own use.

Oracle allows up to 255 transaction slots to be created on a single block, but by default it builds one only when a block is added to a table. When additional slots are needed, the Oracle server process needing the slot has to rearrange the contents of the block to make room for the new slot. If you want to avoid this behavior, you can create a table that will contain more transaction slots on each block as the blocks are added. You can also limit the upper number of slots that can be created, preserving the space for additional row data, albeit at the cost of possibly making users wait for a transaction slot on a very busy block.

You control the allocation of transaction slots with the INITRANS and MAXTRANS space utilization parameters. With INITRANS, you set the number of transaction slots that each block acquires by default. With MAXTRANS, you set an upper limit on the total number of such slots that can be assigned to the block. The difference, MAXTRANS minus INITRANS, is the number of slots that can be added dynamically if needed.

Usually, there's no real need to change the default value for MAXTRANS. Even if you have hundreds of concurrent transactions working against the same table, they're likely to be working on different blocks simply because most blocks don't have room for that many rows. In the rare situation where tens of concurrent transactions all need the same block, they'll probably have to wait for one of the other transactions to release the row-level lock before they can do any work. It's in this case that you might want to set MAXTRANS. Otherwise, each transaction will build itself a transaction slot that it will then occupy idly until it can get to the row it needs. These slots represent wasted space on the block.

You might want to change INITRANS, however, if your can predict that more than one transaction will likely need the same block at the same time. By preallocating the necessary number of transaction slots on each block, you'll help the second and subsequent user get to their resources sooner. Each slot requires about 24 bytes, so don't set the value of INITRANS too high. Otherwise, you'll be taking space that could be occupied by row data.

Adding space utilization parameters to the example SAMPLE1 table requires further modifications to our table-creation command:

CREATE TABLE sample1 (
    sample_id          NUMBER(10),
    sample_name        VARCHAR2(35),
    owner_id           NUMBER(4),
    collection_date    DATE,
    donor_gender       CHAR(1),
    sample_image       BLOB)
TABLESPACE sample_data
STORAGE (
    INITIAL            5M
    NEXT               5M
    PCTINCREASE        0
    MAXEXTENTS         50)

Space utilization parameters

Building Tables from Existing Tables

One option you may want to exercise is to build a table from the definition of, or the full or partial contents of, another table. Oracle allows you to do this through an AS SELECT clause to the CREATE TABLE command.

The table definition can look just the same as the examples you've seen to this point, except that the datatype isn't included in the column list because the type is inherited from the original table. In fact, if you also want to keep the column names in the new table the same as they are in the original table, you can omit the column list completely. As with the versions of the CREATE TABLE command you've seen listed, you can also include or omit the entire STORAGE clause, or just include it with the required parameters; you can include or exclude any or all space utilization parameters; and you need to include the TABLESPACE clause only if you want the new table to be created somewhere other than your default tablespace.

The AS SELECT clause can include any valid query that will retrieve columns and rows to match the new table's definition. The columns named in the SELECT clause must match the column list, if any, for the new table. If the new table doesn't have a column list, all the columns from the original table are used in the new table. The SELECT clause can optionally include a WHERE clause to identify which rows to store in the new table. If you don't want any rows stored, include a WHERE clause that never returns a valid condition, such as WHERE 1 = 2.

The following shows three different variations of the CREATE TABLE...AS SELECT statement, each one producing a different table definition from the same base table we have been using, SAMPLE1.

REM Create SAMPLE2, an exact copy of SAMPLE1, in tablespace
REM SPARE, using default storage and no free space in the
REM blocks.
REM


SAMPLE2 is based on entire SAMPLE1 table

REM Create SAMPLE3, containing just the ID and IMAGE
REM columns, renamed, from SAMPLE1, placing it in the IMAGE
REM tablespace with unlimited l00MB extents and default
REM space utilization parameters.
REM

SAMPLE3 is based on two renamed columns from SAMPLE1 table

REM Create SAMPLE4 containing all but the IMAGE column from
REM SAMPLE1, and only selecting records from the past year.
REM Use the DEMOGRAPHIC tablespace with default storage,
REM zero free space, a block reuse threshold of 60 percent,
REM and exactly 5 transaction slots per block.
REM

SAMPLE4 is based on all but one column from the SAMPLE1 table and includes only a subset of rows

Monitoring Table Growth

Although you try to determine a table's overall size when it's created to evaluate disk requirements, the actual size may well vary from the predicted size. You should monitor the database tables to ensure that they won't run of out space in the course of normal business. Similarly, you may want to check that the space taken by the table's extents is being used efficiently and that there aren't a lot of empty or near-empty blocks. You may also want to confirm that the PCTFREE value is set appropriately by looking for migrated rows.

The ANALYZE command collects statistics and stores them in the data dictionary for you. These statistics include the number of blocks used, the amount of unused space per block, the number of empty blocks, and the number of migrated or chained rows. A chained row is one that's simply too large to fit into a single block, and thus will always be spread across multiple blocks. The data dictionary, unfortunately, doesn't distinguish between chained rows and migrated rows, the latter being rows that get longer through the use of UPDATE commands and don't have sufficient space on their block for the growth. If the average row length, also shown in the dictionary, is less than the space available on a block (the block size minus the header and transaction slot space), the rows are most likely migrated, not chained.

Row chaining versus migration
If a row is too big to fit into a single block, the first part of the row is stored in one block and the rest of the row is stored in one or more overflow blocks. Each part of the row is known as a row piece, and the first row piece is counted as the row's location for any index entry or when it's examined by the ANALYZE command. When a row migrates because it no longer fits into its original block, a pointer is left behind in the original block to identify the row's new location. This pointer is treated as the row's location so that any index entries pointing to the row don't have to be updated. The ANALYZE command treats this pointer as an initial row piece and doesn't distinguish it from a row piece belonging to a chained row. This is why the results of the ANALYZE command don't distinguish between chained and migrated rows.

You can use ANALYZE TABLE table_name COMPUTE STATISTICS or ANALYZE TABLE table_name ESTIMATE STATISTICS to collect the statistics stored in the DBA_TABLES (and USER_TABLES) data dictionary views. The former command will always give accurate results; the latter will be a good, but not precise, estimate. The COMPUTE option takes longer as the table grows in size, so you may prefer to estimate statistics for your large tables. You can select the percentage of the table or the number of rows you want to include in the estimate with the SAMPLE clause, using SAMPLE x PERCENT or SAMPLE x ROWS (where x is the percentage or the row count, respectively).

When you collect statistics, you may affect how the database performs optimization to determine statement execution plans. If you want to ensure that rule-based optimization is used by default, you should execute the ANALYZE TABLE...DELETE STATISTICS command after you examine the statistics. If you want to spend more time reviewing the statistics, you can save the results by executing a CREATE TABLE...AS SELECT command against the data dictionary table. In fact, if you do this after you run the ANALYZE command to collect new statistics, using a different table to store the results each time, you will build a history of the table's growth and data distribution. Once you have saved the statistics into a table, you can go ahead and execute the DELETE STATISTICS option to remove them from the base table definition.

Managing Extent Allocation

There are two reasons to monitor how many extents exist in a table:

Reasons to add table extents manually
There are a number of reasons for manually adding extents to a table, although relying on automatic allocation generally makes your work a lot easier. Some of the more common reasons for manual allocation include over-coming a shortage of space in the tablespace that doesn't allow you to choose a good value for the NEXT extent to be allocated automatically; placing the extent in a data file of your choice, which allows you to spread the storage around different data files and, presumably, different disk drives; allocating the extent to a specific instance's free list groups if you're using the parallel server option; and ensuring that you fit the largest extent possible into the given space.

To see the number of extents in table, you can query DBA_EXTENTS for the given table (segment) name. If you use the COUNT(*) value in the SELECT clause, the result will show the exact number of extents owned by the table. If you want to see other information, such as the data files in which the extents are stored, you can query other columns in this table.

To allocate an additional extent to a table for which you're using manual allocation, issue an ALTER TABLE...STORAGE (MAXEXTENTS x) command, where x is one more than the current number of extents. You can then add an extent with the ALTER TABLE...ALLOCATE EXTENT command. The following script manually adds the 12th extent to the SAMPLE10 table, using the third data file in the USR_DATA tablespace:

ALTER TABLE sample10 STORAGE (MAXEXTENTS 12)
/
ALTER TABLE sample10
    ALLOCATE EXTENT (
    FILE 'c:\orant\database\samples\usr_data3.ora')
/

For tables to which extents are being added automatically, you simply need to ensure that MAXEXTENTS stays larger than the current number of extents. By monitoring the table over time, you should be able to predict how fast extents are being added and increase the MAXEXTENTS value before the current limit is reached. You use the same command that appears at the start of the preceding script to change the extent limit.

Removing Unused Space

There are two types of unused space in a table:

If you have blocks of the first type-that is, blocks that have never been used-and don't expect this space to be needed, you can remove it with the DEALLOCATE option of the ALTER TABLE command:

ALTER TABLE table_name DEALLOCATE UNUSED

If you expect some but not all allocated space to be needed, you can drop a portion of the extra space by using a further option:

ALTER TABLE table_name DEALLOCATE UNUSED KEEP integer [K|M]

This removes all but integer bytes (or K (kilobytes) or M (megabytes)).

To reclaim the other type of free space-space that has been released by DML activity-you can try increasing the PCTUSED value for the table, as discussed earlier. This will allow blocks to be returned to the free list and used for future new rows sooner than they have been. However, if the table is fairly static and not many more changes will be made, the blocks that are already partially empty can't be touched again and won't be returned to the free list. Even if they were, there might not be enough new rows added to fill all the reusable space. In this case, you may have to rebuild the table.

You can rebuild a table in a number of ways:

CREATE TABLE temp AS SELECT * FROM sample10
/
TRUNCATE TABLE sample10
/
INSERT INTO sample10 SELECT * FROM temp
/
DROP TABLE temp
/

Using Views to Prebuild Queries

Although you can create tables from the contents of other tables, as discussed in the previous section, in many cases there's no need to consume the space required by the copy just because you need to see a variation of the original table. A view is built in the same way that the AS SELECT clause is used to build a table from the contents of another table. With a view, rather than the data be copied to a new location, only the definition of required data is stored in the data dictionary. This amounts to storing the SELECT statement or, to put it another way, storing a query definition.

Views have many uses. The following sections show you how to build a view to meet a specific need you may have. We use a simple EMPLOYEE table as the basis for the examples in these sections. The following shows the CREATE statement for this table:

You can use views without degrading performance
Consider using a view whenever you think it could be useful, for any purpose whatsoever. There is very little overhead involved in storing the definition or in executing a statement against the view.

CREATE TABLE employee (
    id             NUMBER(8)
                   CONSTRAINT employee_id_pk PRIMARY KEY,
    last_name      VARCHAR2(35),
    first_name      VARCHAR2(30),
    middle_initial CHAR(1),
    department     NUMBER(5)
                   CONSTRAINT employee_department_fk
                      REFERENCES department_table,
    salary         NUMBER(10,2),
    title          VARCHAR2(20),
    phone          NUMBER(5) CONSTRAINT employee_phone_fk
                       REFERENCES phone_table,
    hire_date      DATE)
/

For information on the constraints contained in this table definition, see Chapter 17, "Using Constraints to Improve Your Application Performance."

Changing Column Names with Views

If your table has column names that follow a naming convention-such as a corporate standard or one based on the vocabulary of the primary users-the names may not be meaningful to other users of the table. For example, the Human Resource Department may talk about an employee ID number, the Payroll Department may refer to the same number as a "payroll number," and the Project Scheduling System may use the term "task assignee." By using views, the same employee table can be used to provide this number with the preferred name for each group.

The following script, based the EMPLOYEE table created in the preceding script, shows a view being created for the Payroll Department to give the ID column the name EMPLOYEE_ID, leaving all other columns with their original names:

CREATE VIEW  pay_employee (
    payroll_number,
    last_name,
    first_name,
    middle_initial,
    department,
    salary,
    title,
    phone,
    hire_date) AS
SELECT * FROM employee
/

Dropping Columns with Views

Once in a while, you may find that a column originally defined in a table is no longer needed. The current version of Oracle8 doesn't allow you to drop such a column. Instead, I recommend that you set the column value to NULL in the entire table, which will free up the storage consumed by the column. This doesn't help users who may not know about the column's existence when they try to use the table. An INSERT statement, for instance, would fail if they didn't include a value for the dropped column. If you create a view that excludes the missing column, the view can now be used in place of the table and the dropped column will no longer be a problem.

Updating views may not always be possible
If you have views that participate in table joins, your users may not be able to update them or perform other DML commands on them. For detailed information on the rules governing updatable views, see the later section "Updating Data Through Views."

To make the column's disappearance even more transparent to the users, you can first rename the table and then use the original table name to name the view that excludes the unwanted column:

  1. Issue the following command to prevent table access under the old name:
RENAME table_name TO new_table_name;

  1. Build the required view with this command:
CREATE VIEW table_name (...)
AS SELECT ... FROM new_table_name;
    Make sure that you name all but the unwanted column in the list of column names (shown as in the preceding statement) and use the new name for the table in the FROM clause.

  1. Grant the same permissions on the view that existed on the table.

The following CREATE VIEW statement will result in a view that apparently removes the PHONE column from the EMPLOYEE table:

CREATE VIEW emp AS
SELECT id, last_name, first_name, middle_initial,
    department, salary, title, hire_date
FROM
 employee
/

Hiding Data with Views

Some tables may contain sensitive data that shouldn't be seen by all users, or rows that aren't useful for some parts of the user community. So, although you may need certain users to see all the contents of a table, others should see only certain columns or rows, or even a subset of rows and columns. You can accomplish this by creating a view that contains only the elements that should be seen by the selected users and then granting them access to the view rather than to the table.

For a view that contains a subset of the columns, you can use the same approach as you would to create a view to hide a dropped column (the example in the preceding section shows the creation of such a view). A view that shows users only a subset of the rows is built by using an appropriate WHERE clause. You can restrict column and row access by building a view with a SELECT clause to identify just the required rows and a WHERE clause to choose the desired rows.

The following command shows the creation of such a view. It's based on the EMPLOYEE table from the earlier section "Using Views to Prebuild Queries," but includes only employees in Department 103. Therefore, it doesn't show the department column, nor does it include salary information:

CREATE VIEW dept_103 AS
SELECT id, last_name, first_name, middle_initial,
  title, phone, hire_date
FROM employee
WHERE department = 103
/

Hiding Complicated Queries

Your users and applications may need to execute fairly complicated queries that contain multiple table joins, or subqueries, and combinations of these. If such a query is needed on a regular basis, you might consider creating a view that embodies the query. The user or application can then simply query the view without being concerned with the complexity of the underlying query. This will reduce the possibility of error as well as save time.

The following code builds a view that could be used for an online phone directory service based on the EMPLOYEE table (for the name and phone number) and the DEPARTMENT table, which it references (for the department name). Although it's not very complicated in terms of the number of columns and tables involved, it does provide a standard format for the output, using SQL functions and operators:

CREATE VIEW phone_list (name, department, phone) AS SELECT
    UPPER(last_name) || ', ' ||
    INITCAP(first_name) || ' ' ||
    UPPER(middle_initial) || '.',
    department_name, e.phone
FROM employee e, department d
WHERE department =  d.id
/

Although the phone listing might be usefully ordered by the NAME field, a view can't contain an ORDER BY clause. A query against the view PHONE_LIST (created in the previous command) to show an alphabetical listing of all employees' phone information would have to include its own ORDER BY clause. The command would be

SELECT * FROM phone_list ORDER BY name;

The ordering column is the name from the view definition, not from the columns in the base table on which it's based.

Accessing Remote Databases Transparently with Views

To access a table on a remote database, a statement needs to identify the table name plus a database link name for Oracle to find the correct remote database and table. (For information on database links, see the Oracle8 Server Distributed Systems manual.) The link name is concatenated to the table name with a commercial "at" (@) symbol. To hide this structure from users and applications, you can create a view that embodies the table and link name. If you needed to reach the EMPLOYEE table in San Francisco from a different database on the network, you could create a database link named SF to point to the San Francisco database and then build a view to hide this link's use. The following shows one version of the command to build the link and then the command to build the view:

CREATE DATABASE LINK sf
    CONNECT TO emp_schema IDENTIFIED BY emp_password
    USING 'sfdb'
/
CREATE VIEW employee AS SELECT * FROM employee@sf
/

Obviously, you can create a view to meet any one of a number of requirements. In some cases, you may need a view to help with a number of issues. There's no reason that the view to access the remote EMPLOYEE, created in the preceding code, couldn't also restrict access to the salary column while renaming the ID column TASK_ASSIGNEE.

Creating and Handling Invalid Views

In some circumstances, you may need to create a view but find that you don't have the permissions on the base table, or that the table you know should exist isn't available. In such cases, you can create an invalid view that will remain unusable until the underlying table(s) is accessible by you. To do this, use the keyword FORCE following the CREATE keyword in your command to define the view.

A view can also become invalid at a later time due to changes in the underlying table. Whenever a view is invalid, Oracle will return an error message if you try to execute a statement that refers to the view. After the underlying problem is fixed, the view should work normally again.

Dropping and Modifying Views

Dropping a view simply requires issuing the DROP VIEW command to remove the view definition from the data dictionary. No space will be recovered because a view doesn't own any data.

Once in a while, you may need to change the definition of a view. Although this can be done by dropping the view and then re-creating it with the new definition, this might not be a desirable approach. If the view has been granted to users, dropping the view will lose the privileges. Dropping a view that's referenced in triggers or stored procedures causes these objects to be invalidated, requiring a recompile attempt next time they're used, even if you add the new view definition immediately.

To preserve the integrity of the users and objects dependent on a view, you can modify the view's definition without having to drop it first. This is done with the OR REPLACE option of the CREATE command. By issuing the CREATE OR REPLACE VIEW... command, you can use a different SELECT clause from that in the existing view without the view disappearing from the data dictionary at any time.

You can even modify a view such that a valid view becomes invalid or an invalid view becomes valid, or modify an invalid view to become a different invalid view. If the resulting view would be invalid, you must include the FORCE keyword after the CREATE OR REPLACE clause. Otherwise, the view won't be changed from its previous definition.

Don't expect users to be able to use invalid views
Just as when you create a view with the FORCE option, any modification that requires the FORCE keyword or that otherwise makes a view invalid renders the view unusable. Nobody can use the view name in a SQL statement successfully until the view is revalidated.

Updating Data Through Views

Although views are used primarily for queries, they can be used for other types of SQL statements. Rows can be inserted and deleted via a view on a single table when there are no set or DISTINCT operators in the view, nor any GROUP BY, CONNECT BY, or START WITH clauses. For row inserts, no columns excluded from the view can be defined as NOT NULL or be part of the PRIMARY KEY constraint (which implies NOT NULL).

Updates can also be performed via a view on a single table and, in some cases, through a view on joined tables. For a single table, updates are limited by the same restrictions as inserts and deletes. In the case of a view across a join, only one table can be updated in a single statement. Furthermore, there must be a unique index on at least one column in the joined view, and the columns from the table being updated must all be updatable. To see whether a column in a view is updatable, you can query the table DBA_UPDATABLE_COLUMNS (or USER_UPDATABLE_COLUMNS).

Understanding Oracle terminology for updatable join views
Oracle uses the term key-preserved tables when discussing the update options on views involving table joins. A table is key-preserved in a join view if every key of the table, whether or not it's included in the view's SELECT clause, would still be a valid key following a change to the columns seen in the view. Only key-preserved tables can be updated through the view.

View Consistency

As we've seen, you can create views to restrict the visible rows in the base table. Also, you've learned that you can update a view on a single table. One concern you might have is how these two characteristics work together. Suppose that I use the view DEPT_103, created earlier in the section "Hiding Data with Views," and update it. If I update the employee's title, there shouldn't be a problem. But what if I update one record to change the department number to 242? Now the row doesn't belong to the view and may not be a row I can officially see.

You can add a refinement to views that restrict access to certain rows within a table. This refinement prevents users from modifying a row that they can see through the view to contain a value that they aren't allowed to see. This is done by including the key phrase WITH READ ONLY or WITH CHECK OPTION to the view definition. WITH READ ONLY doesn't allow any changes to be made to the base table through the view, so you can't perform an insert or a delete, or complete any updates, on the underlying table. WITH CHECK OPTION, on the other hand, does allow any of these operations as long as the resulting rows are still visible under the view definition. If you want, you can give WITH CHECK OPTION a name by using a CONSTRAINT keyword, just as for other types of constraints (see Chapter 17). The following command shows how you can create a view with a named CHECK OPTION:

CREATE VIEW dept_103 AS
SELECT id, last_name, first_name, middle_initial,
  title, phone, hire_date
FROM employee
WHERE department = 103
WITH CHECK OPTION CONSTRAINT dept_103_dept_ck
/

The name given to the CHECK OPTION here follows a suggested naming standard developed for constraints.

SEE ALSO
For more information on naming constraints,

© Copyright, Macmillan Computer Publishing. All rights reserved.