2011年3月28日星期一

Expert Oracle Database Architecture读书笔记之Files

The files associated with an instance are simply
 - Parameter files: These files tell the Oracle instance where to find the control files,and they also specify certain initialization parameters that define how big certain memory structures are, and so on. We will investigate the two options available for storing database parameter files.
 - Trace files: These are diagnostic files created by a server process, generally in response to some exceptional error condition.
 - Alert files: These are similar to trace files, but they contain information about ¡Èexpected¡É events, and they also alert the DBA in a single, centralized file of many database events.

The files that make up the database are
 - Data files: These are for the database; they hold your tables, indexes, and all other data segment types.
 - Temp files: These are used for disk-based sorts and temporary storage.
 - Control files: These tell you where the data files, temp files, and redo log files are, as well as other relevant metadata about their state. They also contain backup information maintained by RMAN (Recovery Manager, the backup and recovery tool).
 - Redo log files: These are your transaction logs.
 - Password files: These are used to authenticate users performing administrative activities over the network. We will not discuss these files in any great detail as they are not a necessary component of any Oracle database.

 In Oracle 10gand above, there are a couple of new optional file types that are used by Oracle to facilitate faster backup and faster recovery operations. These two new files are
 - Change-tracking file: This file facilitates a true incremental backup of Oracle data. It does not have to be located in the Flash Recovery Area, but as it relates purely to database backup and recovery, we’ll discuss it in the context of that area.
 - Flashback log files: These files store “before images” of database blocks in order to facilitate the new FLASHBACK DATABASE command.

 We’ll also take a look at other types of files commonly associated with the database, such as
 - Dump (DMP) files: These files are generated by the Export database utility and consumed by the Import database utility.
 - Data Pump files: These files are generated by the Oracle Data Pump Export process and consumed by the Data Pump Import process. This file format may also be created and consumed by external tables.
 -Flat files: These are plain old files you can view in a text editor. You normally use these for loading data into the database.

SQL> select name from v$datafile;
SQL> select member from v$logfile;
SQL> select name from v$controlfile;

Parameter Files
What Are Parameters?
To see the current value of an instance parameter, you can query the V$ view V$PARAMETER. Alternatively, in SQL*Plus you can use the SHOW PARAMETER command, for example:
ops$tkyte%ORA11GR2> select value
  2    from v$parameter
  3   where name = 'db_block_size'
  4/
VALUE
-------------------------------------------------------------------------------
8192
ops$tkyte%ORA11GR2> show parameter db_block_s
NAME                                    TYPE         VALUE
------------------------------------ ----------- ------------------------------
db_block_size                           integer      8192
                                                                                                    SHOW
PARAMETER wins for me in ease of use and the fact that it “wildcards” automatically. Notice that I typed in only db_block_s; SHOW PARAMETER adds % to the front and back.

    “Normal” accounts are not granted access to the V$ performance views by default. Don’t let that get you down, however. There is a documented API typically available to all users that permits you to see the contents of V$PARAMETER; this little helper function helps you see what is set as a parameter. For example:
scott%ORA11GR2> create or replace
  2 function get_param( p_name in varchar2 )
  3 return varchar2
  4 as
  5       l_param_type number;
  6       l_intval        binary_integer;
  7       l_strval        varchar2(256);
  8       invalid_parameter exception;
  9       pragma exception_init( invalid_parameter, -20000 );
 10 begin
 11       begin
 12            l_param_type :=
 13            dbms_utility.get_parameter_value
 14            ( parnam => p_name,
 15                 intval => l_intval,
 16              strval => l_strval );
 17       exception
 18            when invalid_parameter
 19            then
 20                 return '*access denied*';
 21       end;
 22       if ( l_param_type = 0 )
 23       then
 24            l_strval := to_char(l_intval);
 25       end if;
 26       return l_strval;
 27 end get_param;
 28 /

If you execute this function in SQL*Plus, you’ll see:
scott%ORA11GR2> exec dbms_output.put_line( get_param( 'db_block_size' ) );
8192

Legacy init.ora Parameter Files
by default it will be found in
$ORACLE_HOME/dbs            (Unix)
%ORACLE_HOME%\DATABASE (Windows)




Server Parameter Files (SPFILEs) 

The naming convention for this file by default is
$ORACLE_HOME/dbs/spfile$ORACLE_SID.ora        (Unix environment variable)
%ORACLE_HOME/database/spfile%ORACLE_SID%.ora         (Windows environment variable)


The Storage Hierarchy in an Oracle Database

A database is made up of one or more tablespaces. A tablespace is a logical storage container in Oracle
that comes at the top of the storage hierarchy and is made up of one or more data files. These files might
be cooked files in a file system, raw partitions, ASM-managed database files, or files on a clustered file
system. A tablespace contains segments, as described next. 


Segments

Segments are the major organizational structure within a tablespace. Segments are simply your database  objects that consume storage—typically objects such as tables, indexes, undo segments, and so on. Most times, when you create a table, you create a table segment. When you create a partitioned table, you are not creating a table segment, rather you create a segment per partition. When you create an index, you  normally create an index segment, and so on. Every object that consumes storage is ultimately stored in a single segment. There are undo segments, temporary segments, cluster segments, index segments,  and so on.

Extents 
Segments consist of one or more extent. An extent is a logically contiguous allocation of space in a file. Traditionally, every segment starts with at least one extent, and some objects may require at least two (rollback segments require at least two extents). 11g Release 2 has introduced the concept of a “deferred” segment—a segment that will not immediately allocate an extent, so in that release and going forward, a segment might defer allocating its initial extent until data is inserted into it. When an object needs to grow beyond its initial extent, it will request another extent be allocated to it. This second extent will not necessarily be located right next to the first extent on disk—it may very well not even be allocated in the same file as the first extent. The second extent may be located very far away from the first extent, but the space within an extent is always logically contiguous in a file. Extents vary in size from one Oracle data block to 2GB.

Blocks

Extents, in turn, consist of blocks. A block is the smallest unit of space allocation in Oracle. Blocks are where your rows of data, or index entries, or temporary sort results are stored. A block is what Oracle typically reads from and writes to disk. Blocks in Oracle are generally one of four common sizes: 2KB, 4KB, 8KB, or 16KB (although 32KB is also permissible in some cases; there are restrictions in place as to the maximum size by operating system).















A segment is made up of one or more extents, and an extent is a logically contiguous allocation of
blocks.

  There will be the database default block size, which is the size specified in the initialization file during the CREATE DATABASE command. The SYSTEM tablespace will have this default block size always, but you can then create other tablespaces with nondefault block sizes of 2KB, 4KB, 8KB, 16KB and, depending on the operating system, 32KB.  Most blocks, regardless of their size, have the same general format, which looks something like
Figure 3-2.













The table directory, if present, contains information about the tables that store rows in this block 
(data from more than one table may be stored on the same block). The row directory contains 
information describing the rows that are to be found on the block. This is an array of pointers to where 
the rows are to be found in the data portion of the block. These three pieces of the block are collectively 
known as the block overhead, which is space used on the block that is not available for your data, but 
rather is used by Oracle to manage the block itself. 
  The remaining two pieces of the block are straightforward: there may be free space on a block, and 
then there will generally be used space that is currently storing data. 



Row Format
The row data part of the block contains the actual data, such as table rows or index key
entries. Just as every data block has an internal format, every row has a row format
that enables the database to track the data in the row.
Oracle Database stores rows as variable-length records. A row is contained in one or
more row pieces. Each row piece has a row header and column data



















Rowid Format
Oracle Database uses a rowid to uniquely identify a row. Internally, the
rowid is a structure that holds information that the database needs to access a row. A
rowid is not physically stored in the database, but is inferred from the file and block on
which the data is stored.
An extended rowid includes a data object number. This rowid type uses a base 64
encoding of the physical address for each row. The encoding characters are A-Z, a-z,
0-9, +, and /.


SQL> SELECT ROWID FROM employees WHERE employee_id = 100;

ROWID
------------------
AAAPecAAFAAAABSAAA











■ OOOOOO
The data object number identifies the segment (data object AAAPec in
Example 12–1). A data object number is assigned to every database segment.
Schema objects in the same segment, such as a table cluster, have the same data
object number.
■ FFF
The tablespace-relative data file number identifies the data file that contains the
row (file AAF in Example 12–1).
■ BBBBBB
The data block number identifies the block that contains the row (block AAAABS
in Example 12–1). Block numbers are relative to their data file, not their tablespace.
Thus, two rows with identical block numbers could reside in different data files of
the same tablespace.
■ RRR
The row number identifies the row in the block


Tablespaces 

As noted earlier, a tablespace is a container—it holds segments. Each segment belongs to exactly one
tablespace. A tablespace may have many segments within it. All of the extents for a given segment will be
found in the tablespace associated with that segment. Segments never cross tablespace boundaries. A
tablespace itself has one or more data files associated with it. An extent for any given segment in a
tablespace will be contained entirely within one data file. However, a segment may have extents from
many different data files. Graphically, a tablespace might look like Figure 3-3.








  Figure 3-3 shows a tablespace named USER_DATA. It consists of two data files, user_data01.dbf and
user_data02.dbf. It has three segments allocated it: T1, T2, and I1 (probably two tables and an index).
The tablespace has four extents allocated in it, and each extent is depicted as a logically contiguous set of
database blocks. Segment T1 consists of two extents, one extent in each file. Segments T2 and I1 each
have one extent depicted. If we need more space in this tablespace, we could either resize the data files
already allocated to the tablespace or we could add a third data file to it.
  A tablespaces is a logical storage container in Oracle. As developers, we will create segments in
tablespaces. We will never get down to the raw file level—we don’t specify that we want our extents to be
allocated in a specific file (we can, but in general we don’t). Rather, we create objects in tablespaces and
Oracle takes care of the rest. If at some point in the future, the DBA decides to move our data files
around on disk to more evenly distribute I/O, that is OK with us. It will not affect our processing at all.

Storage Hierarchy Summary 

1.  A database is made up of one or more tablespaces.
2.  A tablespace is made up of one or more data files. These files might be cooked
files in a file system, raw partitions, ASM managed database files, or a file on a
clustered file system. A tablespace contains segments.
3.  A segment (TABLE, INDEX, and so on) is made up of one or more extents. A
segment exists in a tablespace, but may have data in many data files within
that tablespace.
4.  An extent is a logically contiguous set of blocks on disk. An extent is in a single
tablespace and, furthermore, is always in a single file within that tablespace.
5.  A block is the smallest unit of allocation in the database. A block is the smallest
unit of I/O used by a database.

Dictionary-Managed and Locally-Managed Tablespaces

  if SYSTEM is locally managed, all other tablespaces in that database will be locally managed as well, and the legacy dictionary-managed method will not work. It’s not that dictionary-managed tablespaces are not supported in a database where SYSTEM is locally managed, it’s that they simply can’t be created:


ops$tkyte%ORA11GR2> create tablespace dmt
  2  datafile '/tmp/dmt.dbf' size 2m
  3  extent management dictionary;
create tablespace dmt
*
ERROR at line 1:
ORA-12913: Cannot create dictionary managed tablespace



Temp Files

Temporary data files (temp files) in Oracle are a special type of data file. Oracle will use temporary files to store the intermediate results of large sort operations and hash operations, as well as to store global temporary table data, or resultset data, when there is insufficient memory to hold it all in RAM. Permanent data objects, such as a table or an index, will never be stored in a temp file, but the contents of a temporary table and its indexes would be.


ops$tkyte%ORA11GR2> !df -h /tmp
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      132G   79G   47G  63% /

ops$tkyte%ORA11GR2> create temporary tablespace temp_huge
  2  tempfile '/tmp/temp_huge.dbf' size 2048m;

Tablespace created.

ops$tkyte%ORA11GR2> !df -h /tmp
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      132G   79G   47G  63% /

ops$tkyte%ORA11GR2> !ls -l /tmp/temp_huge.dbf
-rw-rw----  1 ora11gr2 ora11gr2 2147491840 Jan 20 15:45 /tmp/temp_huge.dbf


Apparently it didn’t take much storage to hold that file. If we look at the ls output, it appears to be a normal 2GB file, but it is, in fact, consuming only a few kilobytes of storage currently. So we could actually create undreds of these 2GB temporary files, even though we have roughly 47GB of disk space free. Sounds great—free storage for all! The problem is, as we start to use these temp files and they start expanding out, we would rapidly hit errors stating “no more space.” Since the space is allocated or physically assigned to the file as needed by the OS, we stand a definite chance of running out of room (especially if after we create the temp files, someone else fills up the file system with other stuff). How to solve this differs from OS to OS. On Linux, you can use dd to fill the file with data, causing the OS to physically assign disk storage to the file, or use cp to create a nonsparse file, for example:


ops$tkyte%ORA11GR2> !cp --sparse=never /tmp/temp_huge.dbf /tmp/temp_huge_not_sparse.dbf

ops$tkyte%ORA11GR2> !df -h /tmp
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                      132G   81G   45G  65% /

ops$tkyte%ORA11GR2> drop tablespace temp_huge including contents and datafiles;


ops$tkyte%ORA11GR2> create temporary tablespace temp_huge
  2  tempfile '/tmp/temp_huge_not_sparse.dbf' reuse;

Control Files

  Control files are fairly small files (they can grow up to 64MB or so in extreme cases) that contain a
directory of the other files Oracle needs. The parameter file tells the instance where the control files are, 
and the control files tell the instance where the database and online redo log files are. 


  The control files also tell Oracle other things, such as information about checkpoints that have taken
place, the name of the database (which should match the db_name parameter in the parameter file), the 
timestamp of the database as it was created, an archive redo log history (this can make a control file 
large in some cases), RMAN information, and so on. 





Redo Log Files


  If the power goes off on your database machine, causing
an instance failure, Oracle will use the online redo logs to restore the system to exactly the point it was at 
immediately prior to the power outage. If your disk drive containing your data file fails permanently, 
Oracle will use archived redo logs, as well as online redo logs, to recover a backup of that drive to the 
correct point in time. Additionally, if you “accidentally” drop a table or remove some critical information and commit that operation, you can restore a backup and have Oracle restore it to the point just before the accident using these online and archive redo log files.

Online Redo Log

  Every Oracle database has at least two online redo log file groups. Each redo log group consists of one or
more redo log members (redo is managed in groups of members). The individual redo log file members 
of these groups are true mirror images of each other. These online redo log files are fixed in size and are 
used in a circular fashion. Oracle will write to log file group 1, and when it gets to the end of this set of 
files, it will switch to log file group 2 and rewrite the contents of those files from start to end. When it has 
filled log file group 2, it will switch back to log file group 1 (assuming we have only two redo log file 
groups; if we have three, it would, of course, proceed to the third group). This is shown in Figure 3-4.













The act of switching from one log file group to another is called a log switch. It is important to note that a log switch may cause a temporary “pause” in a poorly configured database. Since the redo logs are used to recover transactions in the event of a failure, we must be certain we won’t need the contents of a redo log file before we are able to use it. If Oracle isn’t sure that it won’t need the contents of a log file, it will suspend operations in the database momentarily and make sure that the data in the cache that this redo “protects” is safely written (checkpointed) onto disk. Once Oracle is sure of that, processing will resume and the redo file will be reused.


  The database buffer cache is where database blocks are stored temporarily. This is a structure in Oracle’s SGA. As blocks are read, they are stored in this cache, hopefully so we won’t have to physically reread them later. The buffer cache is first and foremost a performance-tuning device. It exists solely to make the very low process of physical I/O appear to be much faster than it is. When we modify a block by updating a row on it, these modifications are done in memory to the blocks in the buffer cache. Enough information to redo this modification is stored in the redo log buffer, another SGA data structure. When we COMMIT our modifications, making them permanent, Oracle does not go to all of the blocks we modified in the SGA and write them to disk. Rather, it just writes the contents of the redo log buffer out to the online redo logs. As long as that modified block is in the buffer cache and not on disk, we need the contents of that online redo log in case the database fails. If, at the instant after we committed, the power was turned off, the database buffer cache would be wiped out.
  If this happens, the only record of our change is in that redo log file. Upon restart of the database, Oracle will actually replay our transaction, modifying the block again in the same way we did and committing it for us. So, as long as that modified block is cached and not written to disk, we can’t reuse that redo log file.
  This is where DBWn comes into play. This Oracle background process is responsible for making space in the buffer cache when it fills up and, more important, for performing checkpoints. A checkpoint is the writing of dirty (modified) blocks from the buffer cache to disk. Oracle does this in the background for us. Many things can cause a checkpoint to occur, the most common being a redo log switch.
  As we filled up log file 1 and switched to log file 2, Oracle initiated a checkpoint. At this point, DBWn started writing to disk all of the dirty blocks that are protected by log file group 1. Until DBWn flushes all of
these blocks protected by that log file, Oracle can’t reuse it. If we attempt to use it before DBWn has finished its checkpoint, we’ll get a message like this in our database’s ALERT log:
...

Thread 1 cannot allocate new log, sequence 66
Checkpoint not complete
  Current log# 2 seq# 65 mem# 0: /home/ora11gr2/app/ora11gr2/oradata/orcl/redo01.log
...


  This is a message you never want to see in a nicely tuned database instance. If you do see it, you know for a fact that you have introduced artificial, unnecessary waits for your end users. This can always be avoided. The goal (and this is for the DBA, not the developer necessarily) is to have enough online redo log files allocated so that you never attempt to reuse a log file before the checkpoint initiated by it completes. If you see this message frequently, it means a DBA has not allocated sufficient online redo logs for the application, or that DBWn needs to be tuned to work more efficiently.

Archived Redo Log


Password Files 


Change Tracking File 

  The change-tracking file is a new, optional file for use with Oracle 10g Enterprise Edition and above. The
sole purpose of this file is to track what blocks have modified since the last incremental backup. With 
this, the Recovery Manager (RMAN) tool can back up only the database blocks that have actually been 
modified without having to read the entire database. 





sys%ORA11GR2> alter database enable block change tracking
  2  using file
  3  '/home/ora11gr2/app/ora11gr2/oradata/orcl/ORCL/changed_blocks.bct';

sys%ORA11GR2> alter database disable block change tracking;

Flashback Database 


DMP Files (EXP/IMP Files) 


Data Pump Files 


Flat Files 

2011年3月26日星期六

Expert Oracle Database Architecture读书笔记之Architecture Overview

Defining Database and Instance

•  Database: A collection of physical operating system files or disks. When using
Oracle Automatic Storage Management (ASM) or RAW partitions, the database
may not appear as individual, separate files in the operating system, but the
definition remains the same.
•  Instance: A set of Oracle background processes or threads and a shared memory area, which is memory that is shared across those threads or processes running on a single computer. This is the place for volatile, nonpersistent stuff, some of which gets flushed to disk. A database instance can exist without any disk storage
whatsoever. It might not be the most useful thing in the world, but thinking about it that way definitely helps draw the line between the instance and the database.

  Some further explanation should help clear up these concepts. An instance is
simply a set of operating system processes, or a single process with many threads, and some memory. These processes can operate on a database, which is just a collection of files (data files, temporary files, redo log files, and control files). At any time, an instance will have only one set of files (one database) associated with it. In most cases, the opposite is true as well: a database will have only one instance working on it. However, in the special case of Oracle Real Application Clusters (RAC), an Oracle option that allows it to function on many computers in a clustered environment, we may have many instances simultaneously mounting and opening this one database, which resides on a set of shared physical disk. This gives us access to this single database from many different computers at the same time.Oracle RAC provides for extremely highly available systems and has the potential to architect extremely scalable solutions.
  Let’s start by taking a look at a simple example. Say we’ve just installed Oracle 11g version 11.2.0.1 on our Linux-based computer.

 sqlplus / as sysdba
SQL>startup

  Oracle used defaults to put everything together and created a database as a set of persistent files. If we close this database and try to open it again, we’ll discover that we can’t:
SQL> alter database close;
Database altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-1619c6: database has been previously opened and closed

  An instance can mount and open at most one database in its life. Remember, the instance consists simply of the processes and shared memory. This is still up and running. All we did was close the database, that is, the physical files. We must discard this instance (shutdown) and create a new one (startup) in order to open this or any other database.
•An instance is a set of background processes and shared memory.
•A database is a collection of data stored on disk.
•An instance can mount and open only a single database, ever.
•A database may be mounted and opened by one or more instances (using RAC)  and the number of instances mounting a single database can fluctuate over time.

The SGA and Background Processes
You’re probably ready now for an abstract picture of what an Oracle instance and database look like, so take a look at Figure 2-1.















Dedicated Server













Shared server










Mechanics of Connecting over TCP/IP

If we are making a dedicated server connection, the listener process will create a dedicated server for us. On UNIX, this is achieved via fork() and exec() system calls (the only way to create a new process after initialization in UNIX is via fork()). The new dedicated server process inherits the connection
established by the listener, and we are now physically connected to the  database. On Windows, the listener process requests the database process to create a new thread for a connection. Once this thread is created, the client is “redirected” to it, and we are physically connected. Diagrammatically in UNIX, it would look as shown in Figure 2-4.