Wednesday, September 9, 2009

Networking Requirements for Oracle Grid Infrastructure

With the release of Oracle 11gR2, Oracle has introduced a whole new concept for the Grid Infrastructure. One of the most eye-catching changes is a new set of network infrastructure requirements. It is no longer possible (except maybe with some hacking and trickery) to just have a couple of Network Interface Cards and a bunch of IP addresses configured through the /etc/hosts file. I will try to summarize the requirements that Oracle has defined, and how to implement them:

Network Hardware Requirements

Oracle states the following requirements:


  • Each node must have at least two network adapters or network interface cards (NICs): one for the public network interface, and one for the private network interface (the interconnect). If you use multiple NICs for the public network or for the private network, the recommendation is to use NIC bonding. Public and private bonds should be kept separate.
  • The interface names Oracle will use for the network adapters for each network must be the same on all nodes. In other words: the public network adapter should, for example, be called eth0 on all nodes, and the private adapter eth1 on all nodes (a quick check follows this list).
  • The public network adapter must support TCP/IP.
  • The private network adapter must support the User Datagram Protocol (UDP). High-speed network adapters and switches that support TCP/IP (minimum requirement 1 Gigabit Ethernet) are required for the private network.
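A quick way to verify the interface name requirement is to run the same command on every node and compare the output. A minimal check (assuming eth0 is the public and eth1 the private interface, as in the example above):

/sbin/ifconfig eth0 | head -1
/sbin/ifconfig eth1 | head -1

The interfaces must be up and carry the same names on every node in the cluster.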

IP Address Requirements
Before starting the installation, you must have at least two interfaces configured on each node: One for the private IP address and one for the public IP address.

Reading the installation manual for Grid Infrastructure (of which, by the way, this article is a summary), it appears that a DNS server and a DHCP server are now required to set up an Oracle Grid Infrastructure.


You can configure IP addresses with one of the following options:

Oracle Grid Naming Service (GNS)
Using a static public node address and dynamically (DHCP) allocated IP addresses for the VIP addresses provided by Oracle Clusterware. Within the cluster, the hostnames are resolved using a multicast domain name server, which is configured as part of Oracle Clusterware. If you plan to use GNS, then you are required to have a DHCP service running on the public network for the cluster, able to provide an IP address for each node's virtual IP, plus 3 IP addresses for the cluster, used by the Single Client Access Name (SCAN).

Fixed IP addresses, assigned on a DNS server for each node.
In other words, you can no longer trick your way around by defining the hosts in the /etc/hosts file. DNS is a requirement, as well as DHCP. Well, that may not entirely be the case, but I still have to test this...

IP Address Requirements with Grid Naming Service
If you enable Grid Naming Service (GNS), name resolution for any node in the cluster is delegated to the GNS server, listening on the GNS virtual IP address. You have to define this address in the DNS domain before you start the installation. The DNS must be configured to delegate resolution requests for cluster names (any names in the subdomain delegated to the cluster) to the GNS. When a request is made, GNS processes the request and responds with the appropriate addresses for the name requested.
In order to use GNS, the DNS administrator must delegate DNS resolution of an entire subdomain to the cluster. This should be done before you install Oracle Grid Infrastructure! If you enable GNS, then you must also have a DHCP service on the public network that allows the cluster to dynamically allocate the virtual IP addresses as required by the cluster.

IP Address Requirements for Manual Configuration
If you choose not to enable GNS, then the public and virtual IP addresses for each node must be static IP addresses, not yet in use before you start the installation. Public and virtual IP addresses must be on the same subnet, but that should be no news, since that has been the case ever since VIPs were introduced (Oracle 10g).
The cluster must have the following IP addresses:


  • A public IP address for each node
  • A virtual IP address for each node
  • A single client access name (SCAN) configured on the domain name server (DNS) for Round Robin resolution to at least one address (three addresses are recommended).
SCAN
The single client access name (SCAN) is a name used to provide service access for clients to the cluster. One of the major benefits of SCAN is that it enables you to add or remove nodes from the cluster without the need to reconfigure clients. This is because the SCAN is associated with the cluster as a whole, rather than with a particular node. Another benefit is location independence for the databases: client configuration does not have to depend on which nodes are running a particular database. Clients can, however, continue to access the cluster as in previous releases. The recommendation is to use SCAN.
The SCAN addresses must be on the same subnet as virtual IP addresses and public IP addresses. For high availability and scalability, you should configure the SCAN to use Round Robin resolution to three addresses. The name for the SCAN cannot begin with a numeral. For installation to succeed, the SCAN must resolve to at least one address.
It is best not to use the hosts file to configure SCAN VIP addresses. Instead, use DNS resolution for SCAN VIPs. If you use the hosts file to resolve the SCAN, then you will only be able to resolve it to one IP address, so you will have only one SCAN address. It could possibly be a workaround for lab circumstances, but again, I haven't tested it this way yet.
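Once the SCAN has been registered in the DNS, you can verify the resolution from any node before starting the installer. A quick check (the SCAN name below is just an example):

nslookup cluster-scan.domain.net

All configured addresses should be listed, and with Round Robin resolution their order rotates on consecutive lookups.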

DNS Configuration for Domain Delegation to Grid Naming Service
If you plan to use GNS, then before grid infrastructure installation, you must configure your DNS server to forward requests for the cluster subdomain to GNS.
In order to establish this, you must use so-called delegation. This is how you configure delegation: In the DNS, create an entry for the GNS virtual IP address. For example:
gns.domain.net: 192.168.146.51 (The address you provide must be routable)
In the DNS, create an entry similar to the following for the delegated domain, where cluster.domain.net is the subdomain you want to delegate:
cluster.domain.net: NS gns.domain.net
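In BIND zone file syntax, these two entries would look something like this (a sketch based on the example names and address above; your zone layout may differ):

; in the zone file for domain.net
gns      IN A  192.168.146.51   ; GNS virtual IP address
cluster  IN NS gns.domain.net.  ; delegate the cluster subdomain to GNS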
You must also add the DNS servers to resolv.conf on the nodes in the cluster.
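For example, /etc/resolv.conf on each node could look like this (the nameserver address is made up for the example):

search cluster.domain.net domain.net
nameserver 192.168.146.1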

Make sure that the nis entry in /etc/nsswitch.conf is at the end of the search list. For example:
hosts: files dns nis

Grid Naming Service Configuration Example
When nodes are added to the cluster, the DHCP server can provide addresses for these nodes dynamically. These addresses are then registered automatically in GNS, and GNS provides delegated resolution within the subdomain to cluster node addresses registered with GNS.
Because allocation and configuration of addresses is performed automatically with GNS, additional configuration is no longer required. Oracle Clusterware provides dynamic network configuration as nodes are added to or removed from the cluster. The following example should make things a bit clearer:
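To give an idea, such a configuration looks something like this (all names and addresses are illustrative):

Identity         Example name                     Example address          Resolved by
GNS VIP          gns.domain.net                   192.0.2.155 (static)     DNS
Node 1 public    node1.cluster.domain.net         192.0.2.101              GNS
Node 1 VIP       node1-vip.cluster.domain.net     192.0.2.104 (DHCP)       GNS
Node 1 private   node1-priv                       192.168.0.1              GNS or hosts file
Node 2 public    node2.cluster.domain.net         192.0.2.102              GNS
Node 2 VIP       node2-vip.cluster.domain.net     192.0.2.105 (DHCP)       GNS
Node 2 private   node2-priv                       192.168.0.2              GNS or hosts file
SCAN VIPs (3)    cluster-scan.cluster.domain.net  192.0.2.201-203 (DHCP)   GNS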


The above table reflects a two node cluster with GNS configured. The cluster name is "cluster", the GNS parent domain is domain.net, the subdomain is cluster.domain.net, 192.0.2 in the IP addresses represents the cluster's public network, and 192.168.0 represents the private IP address subnet.

Manual IP Address Configuration Example
If you decide not to use GNS, then you must configure public, virtual, and private IP addresses before installation. Also, check that the default gateway is reachable.
For example, with a two node cluster where each node has one public and one private interface, and you have defined a SCAN domain address to resolve on your DNS to one of three IP addresses, you might have the configuration shown in the following table for your network interfaces:
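For instance (all names and addresses are illustrative):

Identity         Example name    Example address   Resolved by
Node 1 public    node1           192.0.2.101       DNS
Node 1 VIP       node1-vip       192.0.2.104       DNS and/or hosts file
Node 1 private   node1-priv      192.168.0.1       hosts file (optional)
Node 2 public    node2           192.0.2.102       DNS
Node 2 VIP       node2-vip       192.0.2.105       DNS and/or hosts file
Node 2 private   node2-priv      192.168.0.2       hosts file (optional)
SCAN VIPs (3)    cluster-scan    192.0.2.201-203   DNS (Round Robin)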

You do not need to provide a private name for the interconnect. If you want name resolution for the interconnect, then you can configure private IP names in the hosts file or the DNS. However, Oracle Clusterware assigns interconnect addresses on the interface defined during installation as the private interface (eth1, for example), and on the subnet designated as the private subnet.
The addresses to which the SCAN resolves are assigned by Oracle Clusterware, so they are not fixed to a particular node. To enable VIP failover, the configuration shown in the preceding table defines the SCAN addresses and the public and VIP addresses of both nodes on the same subnet, 192.0.2.

Network Interface Configuration Options
An important thing to be aware of: if you use NAS storage for RAC and this storage is connected through an Ethernet network, you must have a third network interface for the NAS I/O. Otherwise you can face serious performance issues.

Enable Name Service Cache Daemon
You should enable nscd (Name Service Cache Daemon), to prevent network failures with RAC databases using Network Attached Storage or NFS mounts.
To check the current configuration, use: chkconfig --list nscd
To enable nscd in runlevels 3 and 5, enter the following command (as root):
# chkconfig --level 35 nscd on
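To start the daemon right away without a reboot (standard RHEL/OEL service syntax):

# service nscd start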

If you meet the requirements of either one of the above methods (GNS or manual configuration), you are safe to install Oracle Grid Infrastructure.

Source: Oracle 11gR2 Grid Infrastructure Installation Guide

Friday, September 4, 2009

Oracle 11gR2 Grid Infrastructure Installation

One of the new "features" of Oracle 11gR2 is the Grid Infrastructure. Oracle 11gR2 Grid Infrastructure is the collection of infrastructural components provided for Oracle Database and other software.
I tested the installation of Oracle Grid Infrastructure on two virtual machines running OEL 5.3, operating under VMware Server 1.08. I suppose it is not supported this way, but it can give an idea of what to expect during the installation. The screenshots below show the installation procedure. I will add some comments when appropriate, especially to emphasize new features.


The first installation screen. The bullet shows which option I chose here.

Usually the Advanced option is for experienced users. Well, at least I think I am experienced, I suppose...

Languages. Did anyone ever choose additional languages here? I never did.

Nice new feature here: SCAN (Single Client Access Name). All the servers in the cluster can be addressed through a single hostname. This way, the cluster becomes completely transparent to clients.
GNS: If you enable Grid Naming Service (GNS), then name resolution requests to the cluster are delegated to the GNS, which is listening on the GNS virtual IP address. You define this address in the DNS domain before installation. The DNS must be configured to delegate resolution requests for cluster names (any names in the subdomain delegated to the cluster) to the GNS. When a request comes in for the domain, GNS processes the request and responds with the appropriate addresses for the name requested.
To use GNS, the DNS administrator must delegate DNS resolution of a subdomain to the cluster before installation. If you enable GNS, then you must have a DHCP service on the public network that allows the cluster to dynamically allocate the virtual IP addresses as required by the cluster.
Validation is performed... and accepted!

As a result of enabling GNS, I can no longer assign VIP addresses to my cluster node names; these addresses are assigned by GNS from now on. Of course, if I hadn't enabled GNS, I would have been able to assign the addresses in this screen.

Regular SSH connectivity test

Network Interface Usage specification. Should be no surprise for anyone experienced in 10g Clusterware.

Now I come to the choice of my storage solution. Before I started the installation, I added another shared disk, enabled ASMLib on both servers, and created an ASM disk. Therefore, I can now choose ASM for storage.

Choose the diskgroup name and assign disks to the diskgroup.

Passwords. Be careful here! Passwords should comply with Oracle's recommendations; however, you can ignore the warnings. You will be notified when something is wrong or should be reviewed according to Oracle:


I am not using IPMI (Intelligent Platform Management Interface), so I didn't select this option.
IPMI provides a set of common interfaces to computer hardware and firmware that system administrators can use to monitor system health and manage the system. With Oracle 11g release 2, Oracle Clusterware can integrate IPMI to provide failure isolation support and to ensure cluster integrity.
You can configure node-termination with IPMI during installation by selecting a node-termination protocol, such as IPMI. You can also configure IPMI after installation with crsctl commands.
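For example, the BMC address and administrator account can be registered after installation with commands along these lines (syntax as I understand it from the 11.2 documentation; the address and user name are made up, and the first command should prompt for the IPMI password):

$ crsctl set css ipmiadmin bmcadmin
$ crsctl set css ipmiaddr 192.168.10.45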

Here you can assign different groups to specific management options.
ASM Database Administrator (OSDBA)
ASM Instance Administration Operator (OSOPER)
ASM Instance Administrator (OSASM)
Oracle has separated the privileges for each of these groups of administrators. Therefore, when I chose dba as the group for each of these, I got a warning. Under normal (production) circumstances, I should have created three different groups and assigned these groups to different users on my system. Well, I have got only one user in my test environment: oracle, so what is the use of setting up three separate groups? Anyway, the idea is clear, and it is good practice to follow these guidelines.

Specify the locations of the Oracle Base and Oracle Grid Home. Oracle Grid Home should be installed parallel to the Oracle Base, so not inside the Oracle Base!

File locations are being checked on both servers.

New feature: maybe it is better to say there is an enhancement here. The system is thoroughly verified for prerequisites. The result can be seen in the next screenshots:

Uhh. I have got some work to do here...
Nice new feature: the fixup script. For everything that Oracle can fix with a script, the installer provides one. Just run the script, and the failed checks that are marked Fixable are fixed. Don't worry about prerequisite checks before running the installer: just let the installer do the work for you. It'll report everything you need to fix before installing. Just push the "Check Again" button as many times as you wish, until everything is reported as passed.

Execute the runfixup.sh script:

...And Check Again...
Well, I didn't have enough memory and swap. Also, NTP wasn't started with the -x option... Who cares? - We'll see at the end of this weblog...
...summary...

Start the installation by clicking on Finish.
This is what you'll see during installation:
Tap, tap... :-)

...tap...!
We're almost there! Just run these two scripts and we're finished with the installation of Grid Infrastructure.
Well, that is what I thought. It took my virtual machines almost all night to finish the last script (root.sh in the Grid Home). In fact, the script timed out after an hour or so, and the virtual machine practically froze... After I had waited a couple of hours, it even rebooted my entire machine (including the host OS!).
I will have to assign some more memory to the guests and see if I can finish the installation, or whether that is needed at all. However, it is weekend now, my wife and kids will not like me sitting behind my laptop all weekend. We'll be back next week...

11g Release 2 New Features for Grid Infrastructure

Being a RAC fanatic, the new Oracle 11g Release 2 Grid Infrastructure is one of my main focuses of investigation. Here is what I found to be the new features of Grid Infrastructure, which, by the way, comprises Oracle Clusterware and Oracle Automatic Storage Management (ASM). All of them are truly fantastic features that are screaming to be investigated... I'll do my best to cover some of them here in the near future.

Oracle Grid Infrastructure
With Oracle grid infrastructure 11g release 2 (11.2), Automatic Storage Management (ASM) and Oracle Clusterware are installed into a single home directory, which is referred to as the Grid Infrastructure home.

Oracle Automatic Storage Management and Oracle Clusterware files
With this release, Oracle Cluster Registry (OCR) and voting disks can be placed on Automatic Storage Management (ASM).
This feature enables ASM to provide a unified storage solution, storing all the data for the clusterware and the database, without the need for third-party volume managers or cluster filesystems.
For new installations, OCR and voting disk files can be placed either on ASM, or on a cluster file system or NFS system. Installing Oracle Clusterware files on raw or block devices is no longer supported, unless an existing system is being upgraded.

Automatic Storage Management Cluster File System (ACFS)
Oracle Automatic Storage Management Cluster File System (Oracle ACFS) is a new multi-platform, scalable file system and storage management design that extends Oracle Automatic Storage Management (ASM) technology to support all application data. Oracle ACFS provides dynamic file system resizing, and improved performance using the distribution, balancing and striping technology across all available disks, and provides storage reliability through ASM's mirroring and parity protection.

ASM Job Role Separation Option with SYSASM
The SYSASM privilege that was introduced in Oracle ASM 11g release 1 (11.1) is now fully separated from the SYSDBA privilege. If you choose to use this optional feature, and designate different operating system groups as the OSASM and the OSDBA groups, then the SYSASM administrative privilege is available only to members of the OSASM group. The SYSASM privilege also can be granted using password authentication on the Oracle ASM instance.
You can designate OPERATOR privileges (a subset of the SYSASM privileges, including starting and stopping ASM) to members of the OSOPER for ASM group.
Providing system privileges for the storage tier using the SYSASM privilege instead of the SYSDBA privilege provides a clearer division of responsibility between ASM administration and database administration, and helps to prevent different databases using the same storage from accidentally overwriting each other's files.
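As a small illustration of the password authentication route, on the ASM instance something like this should do the trick (user name and password are made up for the example):

$ sqlplus / as sysasm
SQL> create user asmadmin identified by SomePassword1;
SQL> grant sysasm to asmadmin;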

Cluster Time Synchronization Service
Cluster node times should be synchronized. With this release, Oracle Clusterware provides Cluster Time Synchronization Service (CTSS), which ensures that there is a synchronization service in the cluster. If Network Time Protocol (NTP) is not found during cluster configuration, then CTSS is configured to ensure time synchronization.
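After installation you can check which mode CTSS ended up in (an 11.2 crsctl command):

$ crsctl check ctss

If NTP is running, CTSS operates in observer mode; if not, CTSS becomes active and synchronizes the cluster clocks itself.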

Enterprise Manager Database Control Provisioning
Enterprise Manager Database Control 11g provides the capability to automatically provision Oracle grid infrastructure and Oracle RAC installations on new nodes, and then extend the existing Oracle grid infrastructure and Oracle RAC database to these provisioned nodes. This provisioning procedure requires a successful Oracle RAC installation before you can use this feature.
See Also: Oracle Real Application Clusters Administration and Deployment Guide for information about this feature.

Fixup Scripts and Grid Infrastructure Checks
With Oracle Clusterware 11g release 2 (11.2), the installer (OUI) detects when minimum requirements for installation are not completed, and creates shell script programs, called fixup scripts, to resolve many incomplete system configuration requirements. If OUI detects an incomplete task that is marked "fixable", then you can easily fix the issue by generating the fixup script by clicking the Fix & Check Again button.
The fixup script is generated during installation. You are prompted to run the script as root in a separate terminal session. When you run the script, it raises kernel values to required minimums, if necessary, and completes other operating system configuration tasks.
You also can have Cluster Verification Utility (CVU) generate fixup scripts before installation.
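For example, a pre-installation check including fixup script generation can be run along these lines (node names are examples):

$ ./runcluvfy.sh stage -pre crsinst -n node1,node2 -fixup -verbose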

Grid Plug and Play
In the past, adding or removing servers in a cluster required extensive manual preparation. With this release, you can continue to configure server nodes manually, or use Grid Plug and Play to configure them dynamically as nodes are added or removed from the cluster.
Grid Plug and Play reduces the costs of installing, configuring, and managing server nodes by starting a grid naming service within the cluster to allow each node to perform the following tasks dynamically:

  • Negotiating appropriate network identities for itself
  • Acquiring additional information it needs to operate from a configuration profile
  • Configuring or reconfiguring itself using profile data, making hostnames and addresses resolvable on the network

Because servers perform these tasks dynamically, the number of steps required to add or delete nodes is minimized.

Intelligent Platform Management Interface (IPMI) Integration
Intelligent Platform Management Interface (IPMI) is an industry standard management protocol that is included with many servers today. IPMI operates independently of the operating system, and can operate even if the system is not powered on. Servers with IPMI contain a baseboard management controller (BMC) which is used to communicate to the server.
If IPMI is configured, then Oracle Clusterware uses IPMI when node fencing is required and the server is not responding.

Oracle Clusterware Out-of-place Upgrade
With this release, you can install a new version of Oracle Clusterware into a separate home from an existing Oracle Clusterware installation. This feature reduces the downtime required to upgrade a node in the cluster. When performing an out-of-place upgrade, the old and new version of the software are present on the nodes at the same time, each in a different home location, but only one version of the software is active.

Oracle Clusterware Administration with Oracle Enterprise Manager
With this release, you can use the Enterprise Manager Cluster Home page for full administrative and monitoring support for both standalone database and Oracle RAC environments, using High Availability Application and Oracle Cluster Resource Management.
When Oracle Enterprise Manager is installed with Oracle Clusterware, it can provide a set of users that have the Oracle Clusterware Administrator role in Enterprise Manager, and provide full administrative and monitoring support for High Availability application and Oracle Clusterware resource management. After you have completed installation and have Enterprise Manager deployed, you can provision additional nodes added to the cluster using Enterprise Manager.

SCAN for Simplified Client Access
With this release, the single client access name (SCAN) is the hostname to provide for all clients connecting to the cluster. The SCAN is a domain name registered to at least one and up to three IP addresses, either in the domain name service (DNS) or the Grid Naming Service (GNS). The SCAN eliminates the need to change clients when nodes are added to or removed from the cluster. Clients using the SCAN can also access the cluster using EZCONNECT.
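For instance, an EZConnect connection straight to the SCAN could look like this (SCAN name, port, and service name are examples):

$ sqlplus scott/tiger@//cluster-scan.domain.net:1521/orcl
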
SRVCTL Command Enhancements for Patching
With this release, you can use srvctl to shut down all Oracle software running within an Oracle home, in preparation for patching. Oracle grid infrastructure patching is automated across all nodes, and patches can be applied in a multi-node, multi-patch fashion.
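A sketch of the new verb (home path, state file, and node name are made up; the state file records what was stopped so it can be restarted afterwards):

$ srvctl stop home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /tmp/dbhome1_state -n node1
(apply the patch)
$ srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s /tmp/dbhome1_state -n node1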

Typical Installation Option
To streamline cluster installations, especially for those customers who are new to clustering, Oracle introduces the Typical Installation path. Typical installation defaults as many options as possible to those recommended as best practices.

Voting Disk Backup Procedure Change
In prior releases, backing up the voting disks using a dd command was a required postinstallation task. With Oracle Clusterware release 11.2 and later, backing up and restoring a voting disk using the dd command is not supported.
Backing up voting disks manually is no longer required, as voting disks are backed up automatically in the OCR as part of any configuration change and voting disk data is automatically restored to any added voting disks.
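You can still inspect the current voting disk configuration at any time with:

$ crsctl query css votedisk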

Tuesday, September 1, 2009

Oracle Database 11g Release 2 available

I just found out that Oracle Database 11g Release 2 for Linux is available for download.
I am downloading it now, hopefully I will find some time to investigate some of those new features. I have heard there are a bunch in the RAC and ASM area. I can't wait to get my fingers on it...

Be back soon!

Thursday, August 27, 2009

OGh DBA Day in Utrecht, The Netherlands

On November 3, 2009, the OGh (Oracle Gebruikersclub Holland - An independent Oracle user group in The Netherlands) will organize a DBA day. Topic of the day is High Availability.

I will be doing a presentation on the subject of recovering a corrupted database without restoring the backup. A Demo will be included, as well as pros and cons, opportunities and limitations.

The rest of the day will be defined soon, so for the moment no further information is available.
Check back soon for more details!

Thursday, August 20, 2009

New Feature Added: E-Mail Notification

I added a new feature to my blog today: E-Mail Notification.
If you are interested in getting updates from this Weblog, simply subscribe using the widget at the side-bar of this page. You will get an e-mail as soon as new posts are published.
Happy reading and have a wonderful day.

Friday, July 3, 2009

Certified, at last!

Last year I took the beta exam for E-Business Suite Release 12, Install, Patch and Maintain Applications. Unfortunately I failed that exam, but I didn't prepare at all. In fact, I found out about the beta exam one day before the beta program ended...
Now I took some more time to prepare for the exam and passed with an 82% score.
So, as of today I am officially allowed to call myself: Oracle E-Business Suite Release 12 Database Administrator Certified Professional.
The exam was very appropriate: Proper level of difficulty, good questions, some pretty hard, I must say.
If you want to prepare: take the course (Install, Patch and Maintain Applications), learn all the new features of R12 and the new file locations (logfiles, etc.), and make yourself familiar with the management tools around R12. Then you should be able to pass without too many problems.