ORACLE DATABASE PROBLEM AND SOLUTIONS

Saturday, January 13, 2018

Physical Database Storage Architecture

An Oracle database consists of files

Datafiles
Temp files
Control Files
Online Redo Log Files
Archived Redo Log Files
Parameter Files
Password Files
Alert and Trace Log Files
Backup Files

Datafiles:

Every Oracle database has one or more physical datafiles, which contain all the database data. The data of logical database structures, such as tables and indexes, is physically stored in the datafiles allocated for a database.
At the operating system level, Oracle Database stores database data in structures called data files. Every Oracle database must have data file.

Tempfiles:

Tempfiles are used with TEMPORARY tablespaces and are used for storing temporary data like sort ,spill-over or data for global temporary tables.

Control Files:

Every Oracle Database has a control file, which is a small binary file that records the physical structure of the database. The control file includes: The database name. Names and locations of associated datafiles and redo log files.Every time an instance of an Oracle database is started, its control file identifies the datafiles, tempfiles, and redo log files that must be opened for database operation to proceed.

Redo Log Files :

The most crucial structure for recovery operations is the redo log, which consists of two or more pre-allocated files that store all changes made to the database as they occur. Every instance of an Oracle Database has an associated redo log to protect the database in case of an instance failure.
Redo log files contain a chronological record of changes made to the database, and enable recovery when failures occur.To protect against a failure involving the redo log itself, Oracle Database lets you create a multiplexed redo log so that two or more copies of the redo log can be maintained on different disks.

Archived Redo Log Files:

Archived redo log files are database-generated offline copies of online redo log files. Oracle Database automatically archives redo log files when the database is in ARCHIVELOG mode. Oracle recommends that you enable automatic archiving of the online redo log.Archive redo log files contain history of data changes (redo) that is generated by instance .
When the database is running in ARCHIVELOG mode, the log writer process (LGWR) cannot reuse and hence overwrite a redo log group until it has been archived.An archived redo log file is a copy of one of the filled members of a redo log group.

Parameter Files:

Parameter files contain a list of configuration parameters for that instance and database.There are two types of parameter files.

PFILE : The init.ora file (also called the PFILE) is a static parameter file. It contains parameters that specify how the database instance is to start up. For example, some parameters will specify how to allocate memory to the various parts of the system global area.

SPFILE : The spfile.ora is a dynamic parameter file. It also stores parameters to specify how to startup a database; however, its parameters can be modified while the database is running.

Password file : specifies which *special* users are authenticated to startup/shut down an Oracle Instance. its use to authenticate users over network.

Trace Log Files : Each server and background process can write to an associated trace file. When an internal error is detected by a process, the process dumps information about the error to its trace file. Some of the information written to a trace file is intended for the database administrator, while other information is for Oracle Support Services. Trace file information is also used to tune applications and instances.

Alert Files : The alert file, or alert log, is a special trace file. The alert log of a database is a chronological log of messages and errors.

Backup files :

To restore a file is to replace it with a backup file. Typically, you restore a file when a media failure or user error has damaged or deleted the original file.
User-managed backup and recovery requires you to actually restore backup files before you can perform a trial recovery of the backups.
Server-managed backup and recovery manages the backup process, such as scheduling of backups, as well as the recovery process, such as applying the correct backup file when recovery is needed.

Connecting to Oracle Instance

Connecting to an Oracle Instance (User Process & Server Process)

Before users can submit SQL statements to an Oracle database, they must connect to an instance.

User Process:
User_process

A database user who needs to request information from the database must first make a connection with the Oracle server. The connection is requested using a database interface tool, such as SQL*Plus, and beginning the user process. The user process does not interact directly with the Oracle server. Rather it generates calls through the user program interface (UPI), which creates a session and starts a server process.

Server Process:

Once a user has established a connection, a server process is started to handle the user processes requests. A server process can be either a dedicated server process or a shared server process. In a dedicated server environment, the server process handles the request of a single user process. Once a user process disconnects, the server process is terminated. In a shared server environment, the server process handles the request of several user processes.The server process communicates with the Oracle server using the Oracle Program Interface.

Connecting to an Oracle Instance:

The user starts a tool such as SQL*Plus,sql developer,toad or runs an application developed using a tool such as Oracle Forms. This application or tool is executed as a user process.
In the most basic configuration, when a user logs on to the Oracle server, a process is created on the computer running the Oracle server. This process is called a server process. The server process communicates with the Oracle instance on behalf of the user process that runs on the client. The server process executes SQL statements on behalf of the user.

User Process is started at the time a database user request a connection to the oracle server .Server Process connect to the oracle instance and started when a user establish a session .
user process & server process are used to manage the execution of sql/plsql statements.
one-to-one correspondence between the User and Server Processes. This is called a Dedicated Server connection. An alternative configuration is to use a Shared Server where more than one User Process shares a Server Process.
Server Process is the go-between for a Client Process and the Oracle Instance.

Dedicated Server environment – there is a single Server Process to serve each Client Process.

Shared Server environment – a Server Process can serve several User Processes, although with some performance reduction

Connection

A connection is a communication pathway between a user process and an Oracle server. A database user can connect to an Oracle server using tool, sqlplus etc.

Sessions

A session is a specific connection of a user to an Oracle server. The session starts when the user is validated by the Oracle server, and it ends when the user logs out or when there is an abnormal termination. For a given database user, many concurrent sessions are possible if the user logs on from many tools, applications, or terminals at the same time.

PGA(Program Global Area)

Oracle instance consists of SGA and oracle processes. Oracle processes can be divided into server processes and background processes. Oracle database uses background processes to perform overall database operations, for example, database writer process writes dirty buffer from database buffer cache to datafiles. Oracle creates one or more Server Processes for users connected to database. Each oracle process is assigned a memory area to contain data and control information. This memory area is called Program Global Area (PGA).

The Program Global Area (PGA) is a private memory region that contains the data and control information for a server process. Only a server process can access the PGA. Oracle Database reads and writes information in the PGA on behalf of the server process. it will allocate depend upon connection type :
Dedicated Server environment – the Private SQL Area is located in the Program Global Area (PGA).
Shared Server environment – the Private SQL Area is located in the System Global Area (SGA).

A PGA is:

a non-shared memory region that contains data and control information exclusively for use by an Oracle process.
One PGA exists for each Server Process and each Background Process. It stores data and control information for a single Server Process or a single Background Process.
PGA is the memory reserved for each user process connecting to an Oracle Database and is allocated when a process is created and deallocated when a process is terminated.
This is NOT a shared part of memory – one PGA to each process only.
The collection of individual PGAs is the total instance PGA, or instance PGA.

Contents of PGA:-

Oracle database creates server processes to mange requests of user process. All user process issues SQL statements, it is the responsibility of Server process to execute these statements and return results to user process. Each server process has one PGA. PGA is memory where server process executes statements and stores information.

PGA consists of:

Private SQL Area
session memory
SQL work Areas

Private SQL Area: Contains data such as bind information and run-time memory structures. It contains Persistent Area which contains bind information and is freed only when the cursor is closed and Run time Area which is created as the first step of an execute request. This area is freed only when the statement has been executed. The number of Private SQL areas that can be allocated to a user process depends on the OPEN_CURSORS initialization parameter. The location of a private SQL area depends on the type of connection established for a session. If a session is connected through a dedicated server, private SQL areas are located in the server process's PGA. However, if a session is connected through a shared server, part of the private SQL area is kept in the SGA.

Session Memory: Consists of memory allocated to hold a session’s variable and other info related to the session information for example session variables and logon information

SQL Work Areas: Used for memory intensive operations such as: Sort-based operators (order by, group-by, rollup, window function) ,Hash-join, Bitmap merge, Bitmap Create

ARCN (Archiver Process)

The archiver process (ARCn) copies redo log files to a designated storage device after a log switch has occurred. ARCn processes are present only when the database is in ARCHIVELOG mode, and automatic archiving is enabled ARCn Responsible for copying online redo log to archival storage before being reused. Runs only when database is in Archivelog Mode.When a Redo Log File fills up, Oracle switches to the next Redo Log File.

If all Redo Log Files fill up, then Oracle switches back to the first one and uses them in a round-robin fashion by overwriting ones that have already been used.
Overwritten Redo Log Files have information that, once overwritten, is lost forever.

ARCHIVELOG Mode:

If ARCn is in what is termed ARCHIVELOG mode, then as the Redo Log Files fill up, they are individually written to Archived Redo Log Files.
LGWR does not overwrite a Redo Log File until archiving has completed.
Committed data is not lost forever and can be recovered in the event of a disk failure.
Only the contents of the SGA will be lost if an Instance fails.

NOARCHIVELOG Mode:

The Redo Log Files are overwritten and not archived.
Recovery can only be made to the last full backup of the database files.
All committed transactions after the last full backup are lost, and you can see that this could cost the firm a lot of $$$.

When running in ARCHIVELOG mode, the DBA is responsible to ensure that the Archived Redo Log Files do not consume all available disk space! Usually after two complete backups are made, any Archived Redo Log Files for prior backups are deleted.

RECO (Recoverer Process)

The Recoverer Process (RECO) is used to resolve failures of distributed transactions in a distributed database.
Consider a database that is distributed on two servers – one in India and one in Chicago.
Further, the database may be distributed on servers of two different operating systems, e.g. LINUX and Windows.
The RECO process of a node automatically connects to other databases involved in an in-doubt distributed transaction.
When RECO reestablishes a connection between the databases, it automatically resolves all in-doubt transactions, removing from each database's pending transaction table any rows that correspond to the resolved transactions.
Recoverer is responsible for recovering failed distributed transactions in a distributed database.
If the RECO process fails to connect with a remote server, RECO automatically tries to connect again after a timed interval. However, RECO waits an increasing amount of time (growing exponentially) before it attempts another connection. The RECO process is present only if the instance permits distributed transactions. The number of concurrent distributed transactions is not limited.

CKPT (checkpoint)

A checkpoint is a data structure that defines a system change number (SCN) in the redo thread of a database. Checkpoints are recorded in the control file and in each data file header. They are a crucial element of recovery.

When a checkpoint occurs, Oracle Database must update the headers of all data files to record the details of the checkpoint. This is done by the CKPT process. The CKPT process does not write blocks to disk; DBWn always performs that work. The SCNs recorded in the file headers guarantee that all changes made to database blocks prior to that SCN have been written to disk.

A checkpoint identifies a point in time with regard to the Redo Log Files where instance recovery is to begin should it be necessary.
A checkpoint is taken at a minimum, once every three seconds.
An event called a checkpoint occurs when the Oracle background process DBWn writes all the modified database buffers in the SGA, including both committed and uncommitted data, to the data files

How works CKPT:

Think of a checkpoint record as a starting point for recovery. DBWn will have completed writing all buffers from the Database Buffer Cache to disk prior to the checkpoint, thus those records will not require recovery.
This does the following:

Ensures modified data blocks in memory are regularly written to disk – CKPT can call the DBWn process in order to ensure this and does so when writing a checkpoint record.
Reduces Instance Recovery time by minimizing the amount of work needed for recovery since only Redo Log File entries processed since the last checkpoint require recovery.
Causes all committed data to be written to datafiles during database shutdown
When checkpointing occurs ,its generate SCN (System Change Number) number and its recorded every datafiles header and control files .

If a Redo Log File fills up and a switch is made to a new Redo Log File , the CKPT process also writes checkpoint information into the headers of the datafiles.
Checkpoint information written to control files includes the system change number (the SCN is a number stored in the control file and in the headers of the database files that are used to ensure that all files in the system are synchronized), location of which Redo Log File is to be used for recovery, and other information.
CKPT does not write data blocks or redo blocks to disk – it calls DBWn and LGWR as necessary.

LGWR (Log Writer)

LGWR writes information from redo log buffers to redo log files .

Oracle database keeps record of changes made to data. Every time user performs a DML, DDL or DCL operation, its redo entries are also created. These redo entries contain commands to rebuild or redo the changes. These entries are stored in Redo Log buffer.
Log writer process (LGWR) writes these redo entries to redo log files. Redo log buffer works in circular fashion. It means that it overwrites old entries. But before overwriting, old entries must be copies to redo log files. Usually Log writer process (LGWR) is fast enough to mange these issues. Log writer process (LGWR) writes redo entries after certain amount of time to ensure that free space is available for new redo entries.
The Log Writer (LGWR) writes contents from the Redo Log Buffer to the Redo Log File that is in use. These are sequential writes since the Redo Log Files record database modifications based on the actual time that the modification takes place.
LGWR actually writes before the DBWn writes and only confirms that a COMMIT operation has succeeded when the Redo Log Buffer contents are successfully written to disk.

LGWR can also call the DBWn to write contents of the Database Buffer Cache to disk.

How LGWR works:

The redo log buffer is a circular buffer. When LGWR writes redo entries from the redo log
buffer to a redo log file, server processes can then copy new entries over the entries in the redo
log buffer that have been written to disk. LGWR normally writes fast enough to ensure that
space is always available in the buffer for new entries, even when access to the redo log is
heavy. LGWR writes one contiguous portion of the buffer to disk.
LGWR writes:
• When a user process commits a transaction
• When the redo log buffer is one-third full
• Before a DBWn process writes modified buffers to disk (if necessary)
• Every three seconds

Before DBWn can write a modified buffer, all redo records that are associated with the changes to the buffer must be written to disk (the write-ahead protocol). If DBWn finds that some redo records have not been written, it signals LGWR to write the redo records to disk and waits for LGWR to complete writing the redo log buffer before it can write out the data buffers. LGWR writes to the current log group. If one of the files in the group is damaged or unavailable, LGWR continues writing to other files in the group and logs an error in the LGWR trace file and in the system alert log. If all files in a group are damaged, or if the group is unavailable because it has not been archived, LGWR cannot continue to function.
When a user issues a COMMIT statement, LGWR puts a commit record in the redo log buffer and writes it to disk immediately, along with the transaction’s redo entries. The corresponding changes to data blocks are deferred until it is more efficient to write them. This is called a fast commit mechanism. The atomic write of the redo entry containing the transaction’s commit record is the single event that determines whether the transaction has committed. Oracle Database returns a success code to the committing transaction, although the data buffers have not yet been written to disk. If more buffer space is needed, LGWR sometimes writes redo log entries before a transaction is committed. These entries become permanent only if the transaction is later committed. When a user commits a transaction, the transaction is assigned a system change number (SCN), which Oracle Database records along with the transaction’s redo entries in the redo log. SCNs are recorded in the redo log so that recovery operations can be synchronized in Real Application Clusters and distributed databases.

In times of high activity, LGWR can write to the redo log file by using group commits. For example, suppose that a user commits a transaction. LGWR must write the transaction’s redo entries to disk. As this happens, other users issue COMMIT statements. However, LGWR cannot write to the redo log file to commit these transactions until it has completed its previous write operation. After the first transaction’s entries are written to the redo log file, the entire list of redo entries of waiting transactions (not yet committed) can be written to disk in one operation, requiring less I/O than do transaction entries handled individually. Therefore, Oracle Database minimizes disk I/O and maximizes performance of LGWR. If requests to commit continue at a high rate, every write (by LGWR) from the redo log buffer can contain multiple commit records