Today you'll learn about processes, or the threads used by the Oracle instance on Windows NT. You'll see how to identify, monitor, and kill a process or thread.
New Term: The term process is used in this book to describe a thread of execution, or a mechanism that can execute a set of code. In many operating systems, processes have been replaced with threads or lightweight processes. To minimize confusion, the term process is used here to refer to the mechanism of execution, whether that is a traditional process or a thread.
First you'll review how the Oracle processes work and what they do. You'll examine a bit of the terminology and the function of the Oracle8 instance. Later you'll see how to monitor the processes. You'll spend the remainder of this lesson on one of my favorite topics: the Oracle Parallel Query option.
New Term: The Oracle RDBMS uses two types of processes: the user processes (also known as the shadow or server processes) and the Oracle processes (also known as background processes). In some operating systems, such as Windows NT, these processes are actually threads; for consistency, I will refer to them as processes.
User, or client, processes are the user's connections into the RDBMS system. The user process manipulates the user's input and communicates with the Oracle server process through the Oracle program interface. The user process is also used to display the information requested by the user and, if necessary, can process this information into a more useful form.
Background processes are the Oracle processes used to perform various tasks within the RDBMS system. These tasks vary from communicating with other Oracle instances and performing system maintenance and cleanup to writing dirty blocks to disk. The nine Oracle processes are DBWR, LGWR, CKPT, PMON, SMON, RECO, ARCH, LCKn, and Dnnn. Refer to Day 2, "Exploring the Oracle Architecture," for descriptions of these processes.
With Windows NT, these processes are implemented as threads. Because the Oracle threads are listed by number, you cannot readily distinguish which thread is which. So you can cross-reference them, the Oracle threads are listed in Table 11.1.
Thread number    Oracle process
-------------    --------------
0, 1             Oracle service
The server, or shadow, processes communicate with the user and interact with Oracle to carry out the user's requests. For example, if the user process requests a piece of data not already in the SGA, the shadow process is responsible for reading the data blocks from the datafiles into the SGA.
There can be a one-to-one correlation between the user processes and the shadow processes (as in a dedicated server configuration). Alternatively, one shadow process can serve multiple user processes (as in a multithreaded server configuration), which reduces the system resources consumed per user.
Each shadow process or thread uses a certain amount of Oracle and OS resources. Specifically, each shadow process uses approximately 1MB of memory. This memory and the overhead of managing large numbers of processes create quite a load on the system. Therefore, you should reduce the number of user connections if possible.
Because the load incurred by large numbers of user processes can be quite heavy on the system, measures should be taken to reduce this number. Several different methods that involve multiplexing the connections into the Oracle instance can be used to reduce the user load.
Multiplexing involves reducing the number of connections. On one side of the multiplexor, each process or user might have its own connection. On the other side, many processes or users might share the same connections. Let's look at a multiplexor as a gray box, as shown in Figure 11.1.
The logical view of a multiplexor.
In its simplest form, the multiplexor reduces a large number of connections to a smaller number of connections. The multiplexing can be done on the database server itself, but it is generally done on a separate server.
New Term: A system that has a middle server to handle connection processing is typically called a three-tier system. If you are directly connecting all the users from client systems to the database server, this is known as a two-tier system.
An example of a two-tier system is shown in Figure 11.2. As you can see, a number of network client systems are connected to the database server via a LAN.
The two-tier system.
The two-tier system directly links the user connections (via a LAN) into the Oracle instance via the server processes. If you are using the multithreaded server, the server processes might be reduced but the LAN connections into the server are still there for all of the users. This is shown as a logical view in Figure 11.3.
The two-tier system logical view.
In the logical view, you can see more clearly that there is a one-to-one relationship between the users and the connections into the Oracle instance. As I mentioned earlier, these connections can be quite expensive.
NOTE: In computer system performance language, a process is said to be expensive if it consumes a relatively large amount of system resources.
New Term: In a three-tier system, a middleware system is used to handle much of the overhead involved in handling user connections and large numbers of processes. This middleware can be in the form of a transaction monitor (TM) or an application. An example of a three-tier system is shown in Figure 11.4.
The three-tier system.
This looks very similar to the two-tier system when you look at the physical layer, but when you look at the logical picture of this system, as shown in Figure 11.5, you can see that the number of connections into the RDBMS has been greatly reduced. Because each connection is expensive in CPU and memory, this savings can improve the performance of your system.
The three-tier system logical view.
The three-tier system is becoming quite popular and will likely become more so. In fact, Microsoft released a middleware component in 1997 called the Microsoft Transaction Server. Other transaction monitors in use today include Tuxedo, Encina, and TopEnd. Many of these products have been around for years.
Transaction monitors have typically been used in mid-range to high-end systems, but with the popularity of PC servers, you'll see more and more multiplexing done in this area as well.
As mentioned previously, the user connection into the Oracle instance can occur via a dedicated server process or a multithreaded server process. In either case, they appear and act identically to the end user under most conditions.
There are a few situations when it is necessary to connect via the dedicated server process:
Under these conditions, the system must use a dedicated server process. Forcing a dedicated server process under other conditions is described a bit later.
When a SQL request from a user is sent to the RDBMS, the server process executes that command. This mechanism protects Oracle from being directly manipulated by a user process. This actually increases the stability and robustness of the Oracle8 RDBMS. A diagram of the dedicated server process is shown in Figure 11.6.
The dedicated server process.
With a dedicated server process, there is a one-to-one correlation between the user process and the server process. Each server process is dedicated to one user process. To request a dedicated server process when the system is running with the multithreaded server enabled, include the parameter SRVR=DEDICATED in the connect string. This will create a dedicated server process for that user process.
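For example, a dedicated server connection can be requested through the Net8 service definition. The following tnsnames.ora fragment is only a sketch; the alias name, host, and SID are hypothetical, and the SRVR=DEDICATED parameter in the connect data forces a dedicated server process even when the multithreaded server is enabled:

```
orcl_dedicated =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ntserver1)(PORT = 1521))
    (CONNECT_DATA = (SID = ORCL)(SRVR = DEDICATED))
  )
```

A user who connects with the alias orcl_dedicated then gets a dedicated server process, while connections through an ordinary alias continue to use the shared servers.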
The multithreaded server process allows many user processes to share a number of shared server processes. This is diagrammed in Figure 11.7.
The multithreaded server process.
As you can see, all requests to the shared server processes must go through the dispatcher process, which in turn queues the request in the shared pool in the SGA. Once the request has been processed, it is returned to the dispatcher through the shared pool in the SGA.
WARNING: Because the multithreaded server uses the shared pool for queuing requests and returning data, it is important that the shared pool be large enough.
The main advantage of using the multithreaded server is the reduction of server processes. This can greatly reduce CPU and memory usage. As you might guess, however, the multithreaded server does add overhead to the system. This is why a dedicated server process is recommended for long-running batch jobs.
To configure and tune the multithreaded server for use, you'll need to tune the following parameters in your parameter file. You should also monitor the shared pool very carefully to make sure you are not running out of space.
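As a sketch of what such a configuration might look like, here is a fragment of an initialization parameter file that enables the multithreaded server. The values shown are illustrative starting points, not recommendations, and the host name is hypothetical:

```
# Multithreaded server configuration (illustrative values)
MTS_SERVICE          = ORCL
MTS_DISPATCHERS      = "TCP, 3"
MTS_MAX_DISPATCHERS  = 10
MTS_SERVERS          = 5
MTS_MAX_SERVERS      = 20
MTS_LISTENER_ADDRESS = "(ADDRESS=(PROTOCOL=TCP)(HOST=ntserver1)(PORT=1521))"
```

MTS_SERVERS sets the number of shared servers started with the instance; MTS_MAX_SERVERS caps how many Oracle can start as the load grows.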
Try monitoring the shared session memory with a small number of users to determine how much memory they are using. You can then extrapolate how much memory all the sessions will require. This can be accomplished by using the following SQL statement:
SELECT SUM(value) || ' bytes' "Memory"
FROM v$sesstat, v$statname
WHERE name = 'session memory'
AND v$sesstat.statistic# = v$statname.statistic#;
This tells you how much memory you are using. Dividing this by the number of connections tells you the amount of memory per session. You can determine from this how much memory you'll need in the shared pool for all the sessions you'll be supporting.
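One way to get the per-session figure directly is to divide the total session memory by the current session count. The following query is a sketch along those lines; it assumes at least one connected session:

```sql
-- Average 'session memory' per connected session
SELECT SUM(ss.value) / COUNT(DISTINCT ss.sid) "Bytes per session"
FROM v$sesstat ss, v$statname sn
WHERE sn.name = 'session memory'
AND ss.statistic# = sn.statistic#;
```

Multiplying this per-session average by the number of sessions you expect to support gives a rough estimate of the shared-pool space those sessions will need.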
If you think the shared-pool size might be too small, you can increase it by tuning the parameter SHARED_POOL_SIZE. Remember, the shared pool is also used for the library cache and the data dictionary. Learn to tune these on Day 20, "Effectively Tuning and Optimizing the Database."
The number of dispatchers per protocol is determined by the initialization parameter MTS_DISPATCHERS. If you increase this number, each session will potentially see greater performance because it will not have to wait on a dispatcher to become available. Here is an example of how you would set five dispatcher processes for the TCP/IP protocol:
MTS_DISPATCHERS = "TCP, 5"
Each network protocol is set separately. For protocols with fewer users, set the number lower. The more concurrent sessions per dispatcher, the greater the possibility you'll have to wait for a dispatcher when you need one.
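For example, to configure five TCP/IP dispatchers but only two SPX dispatchers for a smaller SPX user community, you would include one MTS_DISPATCHERS line per protocol in the parameter file (the counts here are illustrative):

```
MTS_DISPATCHERS = "TCP, 5"
MTS_DISPATCHERS = "SPX, 2"
```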
Other parameters related to the multithreaded server are
The number of dispatcher processes and the minimum number of shared server processes can be changed dynamically with the ALTER SYSTEM command.
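For example, the following statements adjust the dispatcher and shared server counts on a running instance without a restart (the values are illustrative):

```sql
ALTER SYSTEM SET MTS_DISPATCHERS = 'TCP, 8';
ALTER SYSTEM SET MTS_SERVERS = 10;
```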
The Oracle Parallel Query option is by far my favorite enhancement that Oracle has made in the last few years. By using the Parallel Query option, you can greatly improve the performance of some database operations.
The Oracle Parallel Query option makes it possible for some Oracle functions to be processed by multiple streams of execution. These functions include some queries, index creation, data loading, and recovery. In each of these functions, the general principle is the same: Keep the CPUs busy while Oracle is waiting for I/O.
For most queries, the time spent waiting for the data to be retrieved from disk usually overshadows the amount of time spent processing the results. With the Parallel Query option, you can compensate for this by using several server processes to execute the query. While one process is waiting for I/Os to complete, other processes can be executing.
Many processes working together can simultaneously process a single SQL statement, a situation known as parallel query processing. The other functions are known as parallel index creation, parallel loading, and parallel recovery, each of which is discussed in the following sections.
Parallel query processing allows certain Oracle statements to be run in parallel. The Oracle server can process the following statements in parallel:
Parallel queries are effective on large operations such as table scans and sorts.
With traditional queries such as table scans, the server process reads the data sequentially, as shown in Figure 11.8. Much of the time spent in this query is spent waiting for I/Os to complete.
A table scan without parallel query.
New Term: A parallel query divides the query into several different pieces, each processed by a different server process. These processes are called query servers. The query servers are dispatched by a process known as the query coordinator. The query coordinator dispatches the query servers and coordinates the results from all the servers to send back to the user. The result of this arrangement is that many smaller table scans take place under the hood (transparent to the user). From the user's standpoint, it is simply a much faster table scan. Figure 11.9 shows a parallel query.
A parallel query table scan.
The query coordinator is given a SQL statement and a degree of parallelism, and is responsible for dividing the query among the query servers and integrating the individual results into one result. The degree of parallelism is the number of query servers assigned to the particular query.
The Oracle server can make parallel the following operations:
Each of these operations has requirements that determine how the query is parallelized. The performance achieved by the parallel query is determined by the amount of data to be accessed and the degree of parallelism achieved.
How the query is parallelized (if at all) is determined by the query coordinator. The decision is made in this order:
2. The query coordinator determines which operations can be performed in parallel.
3. The query coordinator determines how many query servers to enlist.
4. The query coordinator enlists query servers that perform the query.
5. The query coordinator reassembles the resulting data and passes it back to the user.
The degree of parallelism is determined using the following precedence:
1. Query hints--A degree of parallelism specified in a hint within the SQL statement has first precedence.
2. Table definition--The default degree of parallelism defined for the table has second precedence.
3. Initialization parameters--The Oracle initialization parameters are used.
Regardless of how these values are set, the number of query servers cannot exceed the number of query servers available in the query server pool. This number is specified by the Oracle initialization parameter PARALLEL_MAX_SERVERS.
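The table-level default mentioned above is set in the table definition. As a sketch, using a hypothetical table named emp:

```sql
-- Set a default degree of parallelism of 4 at table creation...
CREATE TABLE emp (
  empno NUMBER(4),
  ename VARCHAR2(10)
) PARALLEL (DEGREE 4);

-- ...or change the default on an existing table
ALTER TABLE emp PARALLEL (DEGREE 4);
```

With this in place, full-table scans of emp can be parallelized with a degree of 4 even when the SQL statement itself contains no hint.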
Hints for the degree of parallelism are set within a comment string in the SQL statement.
The syntax of this comment is as follows:
PARALLEL ( alias_or_tablename , [ integer/DEFAULT ] [ , integer/DEFAULT ] )
The parallel hint specifies the table or alias being scanned, followed by a value for the number of query servers to be used (or the default). The final optional value specifies how the table is to be split among different instances of a parallel server. Here is an example of using hints:
SELECT /*+ FULL(emp) PARALLEL(emp, 4) */ empno FROM emp;
If you add the FULL and PARALLEL hints to this statement, the Oracle optimizer will create an execution plan that uses a full-table scan. Furthermore, this table scan will be executed with a parallel degree of 4 if the query servers are available. This statement overrides both the degree of parallelism specified in the table definition and the default Oracle initialization parameters.
The hint NOPARALLEL disables parallel scanning of a table and overrides the specified degree of parallelism.
The NOPARALLEL hint has the following syntax:
NOPARALLEL ( alias_or_tablename )
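Continuing the earlier example, the following statement forces a serial scan of the emp table even if a default degree of parallelism is defined for it:

```sql
SELECT /*+ NOPARALLEL(emp) */ empno FROM emp;
```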
Parallel query operations can be very effective on multiprocessor or parallel-processing computers; they can also be effective on uniprocessor systems where much of the time is spent waiting for I/O operations to complete. Systems with sufficient I/O bandwidth--and especially systems with disk arrays--benefit from parallel query operations.
If your system is typically processing at 100% of its CPU utilization and you have a small number of disk drives, you probably won't benefit from parallel query operations. If your system is extremely memory limited, you also will probably not benefit from parallel query operations.
The two areas that can be tuned for parallel queries are I/O and parallel servers. By properly configuring your datafiles, you can help parallel queries be more effective.
The function of a parallel query is to split up query operations so they more effectively use the system. One of the ways a parallel query does this is by allowing the processing of the query to continue while pieces of the query operation are stalled waiting for I/Os to complete. Parallel queries are not effective if the entire table is limited to one disk drive.
By striping the table across many drives, I/Os can be distributed and a higher level of parallelism can occur. Striping can be done with OS striping, with Oracle striping, or (better yet) with a hardware disk array.
Large contiguous extents can also help performance in parallel query operations. During scan operations, the query coordinator splits contiguous ranges of blocks into large, medium, and small groups. Each query server is given a large group of blocks to start with, progressively working its way down to the small group of blocks until the scan is completed. This is done in an attempt to balance the load performed by each query server. If there are several large extents in a table, the query coordinator can find blocks to dispatch to the query servers much more easily.
TIP: Remember to compose your temporary tablespace of several large extents on a striped volume. This arrangement helps sorting performance.
Degree of Parallelism
Properly distributing I/Os and the degree of parallelism are the two most important things to tune in the Parallel Query option. Tuning the degree of parallelism is partially trial and error and partially analysis. It is very important to take notes when you are experimenting with the degree of parallelism. Your first iteration should be based on the following factors:
All these parameters should have some influence on the degree of parallelism you set up for your system. Remember that the preceding points are just guidelines to help with your best attempt at a starting point. Here are a few other suggestions:
Once you determine your starting point, you can monitor your system by querying the dynamic performance table, V$PQ_SYSSTAT. This can be done with the query shown in Listing 11.1.
SQL> select * from v$pq_sysstat;

STATISTIC                           VALUE
------------------------------ ----------
Servers Busy                            0
Servers Idle                           12
Servers Highwater                      16
Server Sessions                       380
Servers Started                         4
Servers Shutdown                        4
Servers Cleaned Up                      0
Queries Initiated                      21
DFO Trees                              77
Local Msgs Sent                   2459361
Distr Msgs Sent                         0
Local Msgs Recv'd                 2459318
Distr Msgs Recv'd                       0
13 rows selected.
When looking at this query, the following statistics are quite useful.
After you determine your degree of parallelism, begin testing; evaluate the information you get from V$PQ_SYSSTAT and from your operating-system-monitoring facilities. Keep an eye out for CPU usage and excessive waiting on I/O. If the CPU usage is too high, try reducing the degree of parallelism. If the CPU usage is too low and there is significant waiting for I/O, try increasing the degree of parallelism.
Remember that the degree of parallelism is determined by SQL hints, table definitions, and initialization parameters. The total number of query servers is determined by the initialization parameter PARALLEL_MAX_SERVERS; the number started up initially is determined by the initialization parameter PARALLEL_MIN_SERVERS.
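In the initialization parameter file, these might be set as follows (the values are illustrative):

```
PARALLEL_MIN_SERVERS = 4     # query servers started at instance startup
PARALLEL_MAX_SERVERS = 16    # upper limit on the query server pool
```

With these settings, four query servers are started with the instance and the pool can grow to sixteen; four concurrent queries each running at a degree of parallelism of 4 would consume the entire pool (4 x 4 = 16).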
The total number of query servers in use is the number of queries executed in parallel multiplied by their degree of parallelism. If a query requests more query servers than PARALLEL_MAX_SERVERS allows, it might run at a reduced degree of parallelism or not be parallelized at all.
You can use the Direct Write Sort option with the Parallel Query option and have the query servers each perform their own direct writes.
As you saw earlier, using direct writes causes the server processes to write the output of sort operations directly to disk, bypassing the buffer cache. The effect of direct writes is that for sort operations, large amounts of block buffers are not ejected from the buffer cache. This leaves the buffer cache available for normal queries and updates. When using direct write sorts with the Parallel Query option, each query server gets its own set of direct write buffers.
NOTE: Direct write sorts take more memory than normal sorts. The amount of memory they use with the Parallel Query option can be determined with the following formula:
Direct write sort memory =
(number of query servers) * (SORT_WRITE_BUFFERS) * (SORT_WRITE_BUFFER_SIZE)
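For example, under the following hypothetical settings, eight query servers would consume about 2MB of direct write buffers:

```
SORT_DIRECT_WRITES     = TRUE
SORT_WRITE_BUFFERS     = 4        # buffers per query server (illustrative)
SORT_WRITE_BUFFER_SIZE = 65536    # 64KB per buffer (illustrative)
# 8 query servers * 4 buffers * 65,536 bytes = 2,097,152 bytes (2MB)
```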
Only use direct write sorts if you have sufficient memory and temporary disk space. The temporary disk space should have sufficient capacity and I/O bandwidth to handle the load.
Another feature of the Parallel Query option is its capability to create indexes in parallel. With the parallel index creation feature, the time it takes to create an index can be greatly reduced.
As in parallel query processing, a coordinator process dispatches two sets of query servers. One set scans the table to be indexed to obtain the ROWIDs and column values needed for the index. Another set performs the sorting on those values and passes the results to the coordinator process. The coordinator process then puts together the B*-tree index from these sorted items.
When creating an index, the degree of parallelism follows the same precedence as it does in parallel query processing. The first value is an optional PARALLEL clause in the CREATE INDEX statement, followed by the table definition and finally the initialization parameters.
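For example, a parallel index build on the hypothetical emp table might look like this, with the PARALLEL clause requesting four query servers:

```sql
CREATE INDEX emp_empno_idx ON emp (empno)
PARALLEL (DEGREE 4);
```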
Creating an index in parallel can be several times faster than creating an index by normal means. The same conditions apply for index creation as were given for parallel query processing. A system that has been configured to take advantage of parallel query processing will also see good performance from parallel index creation.
Loading can be done in parallel by having multiple concurrent sessions perform a direct path load into the same table. Depending on the configuration of the system, you can see excellent load performance by loading in parallel. Because loading is both CPU and I/O intensive, you should see good results in an SMP or MPP environment with a high-bandwidth, I/O subsystem.
Parallel loads are performed by multiple, direct-loader processes, each using the PARALLEL=TRUE and DIRECT=TRUE options. When you specify PARALLEL=TRUE, the loader does not place an exclusive lock on the table being loaded as it would otherwise. During the parallel load, the loader creates temporary segments for each of the concurrent processes and merges them together on completion.
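A parallel direct path load is typically started as several concurrent SQL*Loader sessions, each with its own input data and control file. The control file names and credentials below are hypothetical, and the loader executable name varies by platform (sqlldr80 is typical for Oracle8 on Windows NT):

```
sqlldr80 userid=scott/tiger control=emp1.ctl direct=true parallel=true
sqlldr80 userid=scott/tiger control=emp2.ctl direct=true parallel=true
```

Each session loads into its own temporary segment, and the segments are merged into the target table when the loads complete.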
Although parallel loading performs best when each temporary file is located on a separate disk, the increased performance of the load does not usually justify the complexity of the manual striping needed to do this. I still recommend striping the tables on an OS level or preferably on a hardware disk array. Performance can be improved by putting each of the input files on a separate volume to take advantage of the sequential nature of the reads.
Parallel loading can be beneficial, especially if load time is critical in your environment. By putting each of the input files on separate disk volumes, you can increase performance. Overall, the general tuning principles used in parallel query processing are valid in parallel loading also.
Having multiple concurrent sessions perform a recovery is probably my favorite feature of the Parallel Query option. When benchmarking Oracle and testing hardware and software, it is often necessary to intentionally crash the system to prove recoverability. With the Parallel Recovery option, the time it takes to perform an instance recovery can be dramatically reduced.
Recovery time is significantly reduced when the system being recovered has many disks and supports asynchronous I/O. For a small system that has few drives or for an operating system that does not support asynchronous I/O, it might not be wise to enable parallel recovery.
In traditional recovery, one process reads from the redo log files and applies changes to the datafiles, as shown in Figure 11.10. This operation can take a significant amount of time because the recovery process must wait for disk I/Os to complete.
Recovery without parallel recovery.
With the Parallel Recovery option, one process is responsible for reading and dispatching redo entries from the redo log files and passing those entries on to the recovery processes that apply the changes to the datafiles, as shown in Figure 11.11.
Because the recovery process reads sequentially from the redo log files, the I/O performance is much higher than that of the recovery processes that are writing random data throughout the datafiles. Because writing the data is very seek intensive, it is a good idea to have one or two recovery processes for each data disk in the system.
By having more recovery processes, you can have more outstanding I/Os and thus simultaneously use all the data drives. Because recovery is done at instance startup, this arrangement reduces dead time when no other database processing can be done.
The number of concurrent recovery processes is set with the initialization parameter RECOVERY_PARALLELISM. The value of this parameter cannot exceed the value specified in the initialization parameter PARALLEL_MAX_SERVERS.
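In the initialization parameter file (the parameter is spelled RECOVERY_PARALLELISM in Oracle8), an illustrative setting for a system with several data disks might be:

```
RECOVERY_PARALLELISM = 8     # roughly one or two per data disk (illustrative)
PARALLEL_MAX_SERVERS = 16    # must be at least as large
```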
By specifying a sufficient number of recovery servers, you'll see an immediate improvement in instance recovery time. Do not use parallel recovery if your system does not support asynchronous I/O or if you are limited to a small number of disk drives. If your I/O subsystem is fast enough and your data is properly striped across these drives (either through software or hardware RAID), you should see very good improvement.
In summary, the Parallel Query option is useful in distributing processing loads so that CPUs are kept busy processing while other processes are waiting for I/Os to complete. With multiprocessor machines, the Parallel Query option can be quite beneficial; this is not to say that the option is not beneficial on uniprocessor machines as well.
NOTE: As larger and larger disks are produced at lower and lower prices, many installations end up with I/O problems caused by not having the required I/O bandwidth. Two disk drives are faster than one disk drive if your I/O is balanced.
The Parallel Query option can help only in systems where I/O is not a bottleneck. When I/O is not a problem, you'll see significant gains from parallel queries.
The processes or threads used by Oracle can be monitored via the OS facilities of your operating system or from within the Oracle instance itself. Each of these has its own advantages and disadvantages. By looking at the processes from within Oracle, you can gain more insight as to what these processes are actually doing, but you cannot determine exactly how many resources they are currently consuming. It takes a combination of external and internal monitoring to get the complete picture. Let's look at some of the ways you can monitor these processes and resources.
The NT Task Manager is an ideal tool to get a quick view of what is happening in your system. It is not a very complex tool, nor does it give you a great deal of information, but for a quick look it is very useful. To invoke the Task Manager, right-click the NT 4.0 toolbar and select Task Manager, as shown in Figure 11.12.
Select Task Manager from the context menu.
Once the Task Manager is running, you'll see the performance screen (see Figure 11.13). This screen shows the CPU and memory usage in the system. In the lower part of the screen, you'll see numerical information relating to thread and process statistics and memory usage.
By clicking the Processes tab, you can view the Processes screen. From here you can see CPU, physical memory, and virtual memory used by the various processes in the system. The Processes screen is shown in Figure 11.14.
This information is quite useful and easy to get to, but it is quite coarse. As you can see, Oracle shows up as one process named oracle80.exe. This monitor gives you no information on the individual threads that make up this process.
The NT Performance Monitor (perfmon) can also be used to monitor threads, but as with the Task Manager, not much useful information can be gleaned from this. Once the Performance Monitor is invoked, you can select thread information, as shown in Figure 11.15. This information can be in terms of CPU time, context switches per second, and so on. This information has some use, but without knowing what session it is, it is not worth much.
An example of the perfmon.
For viewing process and thread information, I think the NT Performance Monitor is of little use. However, as you'll see on Day 20, the NT Performance Monitor is great for viewing I/O and CPU information.
The Instance Manager has some fundamental information that you can view about sessions. This information essentially concerns connections and activities. Open the Instance Manager and click the Sessions icon. Here you'll see the Sessions screen, which is shown in Figure 11.16.
You can glean a lot of information from this screen. The right portion of the screen provides the following session information:
Viewing sessions with the Instance Manager.
This information can be quite useful if you are looking for a particular session. It is also possible to disconnect a session from the instance using the Instance Manager. Right-click the session and select Disconnect, as shown in Figure 11.17. This disconnects that session from the Oracle instance.
Disconnecting a session.
You can gather a huge amount of information about the Oracle processes by querying the internal performance tables using the V$ views. These views, predefined by Oracle, look into the dynamic performance tables. This information can be quite useful. Some V$ views that are of most interest when you are monitoring processes are as follows:
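For example, one common starting point is to join V$SESSION to V$PROCESS to match each Oracle session to its operating-system process or thread. The following query is a sketch of that technique:

```sql
-- Match each session to its OS process/thread ID (SPID)
SELECT s.sid, s.serial#, s.username, s.status, p.spid
FROM v$session s, v$process p
WHERE s.paddr = p.addr;
```

On Windows NT, the SPID column corresponds to the thread number, which lets you relate what you see in OS-level tools back to individual Oracle sessions.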
If you want more information about the dynamic performance tables or performance in general, I suggest you read my first book, Oracle Performance Tuning and Optimization (Sams Publishing), in which I go into this and more in great detail.
Today you were introduced to two types of processes: user processes (also known as the shadow or server processes) and Oracle processes (also known as background processes). In some operating systems, such as Windows NT, these processes are actually threads.
As you saw, it is only through the server or shadow processes that the user communicates with the database. The user process manipulates the user's input and communicates with the Oracle server process through the Oracle program interface. The user process is also used to display the information requested by the user and, if necessary, can process this information into a more useful form. As you saw, this can happen on a one-to-one basis with the dedicated server process, or on a shared basis with the multithreaded server.
Day 12, "Working with Tables, Views, and Synonyms," introduces the Oracle schema. The next three days cover the various schema objects such as tables, views, clusters, and indexes. The partitioned table, which is new in Oracle8, is covered tomorrow. Also new in Oracle8 is the index-only table, which is covered in the next three lessons.
The Oracle schema is a very important concept to both the Oracle DBA and to the application developer. The next three chapters finish the basics on building an Oracle database.
Q What are the Oracle background processes?
A The background processes are the Oracle processes or threads that perform tasks on behalf of the RDBMS. They consist of the following: DBWR, LGWR, PMON, SMON, RECO, ARCH, CKPT, LCKn, and Dnnn.
Q What is a dedicated server process?
A A dedicated server process has a one-to-one correlation between the user process and the server process. Each user process gets one server process.
Q What is a shared server process?
A The shared server process handles more than one user process. The dispatcher queues the job and the shared server process executes it.
Q What is the Parallel Query option?
A The Parallel Query option is an Oracle option that allows large database operations to be parallelized and run simultaneously.
The workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to provide you with experience in using what you've learned. Find the answers in Appendix A, "Answers."
2. What is the LGWR process used for?
3. What is the CKPT process used for?
4. Under what circumstances will the LCKn process be used?
5. What type of system includes a Transaction Monitor (TM)?
6. What component is usually the limiting factor in your system?
7. What is the order of precedence (that is, parameters, hints, table definitions, and so on) for Oracle choosing the degree of parallelism?
8. Does the Parallel Query option help you only if you have multiple processors?
9. With Oracle on NT, does the Oracle server use processes or threads?
10. Where does the multithreaded server get its memory from?
2. Using perfmon, look at the Oracle8 threads.
3. Try running a table scan both with and without the Parallel Query option.
4. Compare the disk activity with and without the Parallel Query option.
© Copyright, Macmillan Computer Publishing. All rights reserved.