In our previous article, we discussed how database connectivity was established prior to Oracle Database RAC 11g Release 2. Now we will look at how database connectivity works using SCAN. But in order to understand this, we first have to understand how server pools work.
Traditionally, when you connected to a cluster, you had a dedicated cluster for each specific application. Let's take an example. We have an application called CRM, an application called ERP and an application called DWH. By default you would create a separate database cluster for each of them. When you size a database, you always size for the peak load, and remember that the peak load is not a constant requirement: a peak might occur once a week, once a month, and so on. So when you size your database cluster in terms of resources, you typically keep the peak load in mind. The result is that your infrastructure investment lies idle most of the time.
So in our example, we will have one RAC for CRM, one RAC for ERP and one RAC for DWH:
Under such circumstances, you are not fully using the three nodes in each of the clusters, nine nodes in total. Yet you have to invest in hardware, in software including licenses, and in support, which depends on the number of CPUs or cores in these clusters. This is not the best use of the investment. So what do we mean by the best use? Are these nodes actually going to run at 80% or 90% utilization? Rarely. That is problem number one.
Suppose that on a particular day my CRM cluster loses two nodes, for whatever reason. You cannot share the resources of the ERP or DWH cluster by moving a node over. So resources lie idle in the other two clusters, but you are unable to use them in the cluster that needs them.
So, in 11g Release 2, Oracle introduced the concept of the server pool, and said: don't create dedicated clusters; instead, create a common pool of servers. It's a single cluster shared by the three applications.
This way, we use our resources to the fullest and share them whenever there is a need. So instead of creating three different clusters of three nodes each, we could have a single cluster with nine nodes.
By creating this kind of large cluster, we gain the flexibility to decide who uses how much of it, and to change that dynamically over time. In our example, we are going to dynamically configure server pool parameters. We will create three server pools:
- One server pool used by the CRM application, called CRM_POOL
- One server pool used by the ERP application, called ERP_POOL
- One server pool used by the DWH application, called DWH_POOL
In our example, we have a one-to-one relationship between server pool and application. But in other cases, you could have multiple applications connected to the same server pool.
Given that we are going to create three server pools within one cluster, we need to know the parameters of a server pool, such as:
- What is the minimum number of nodes that will be part of the server pool?
- What is the maximum number of nodes that will be part of each server pool?
- What importance or priority should be given to it?
- Etc.
There are more parameters that we can look at later. Let's say that among these server pools, we give the top priority to our CRM. The importance defines the priority: the larger the number, the more important the pool. So let's say our CRM has an importance of 1000, our ERP, which is the next most important, has an importance of 700, and our DWH, the least important of the three, has an importance of 500. The priority is relative. We are not saying that DWH is not important, but in the context of our ERP and CRM, DWH is lower in priority. That is how we constructed our example. In your organization, it might be different.
So, let's say we need a minimum of 2 nodes for our CRM, 2 nodes for our ERP and 3 nodes for our DWH (because our data warehouse, DWH, does a lot of work, we would like to give it more resources as its minimum number of servers). For the maximum number of nodes, let's say CRM needs 3 nodes, ERP 3 nodes and DWH 3 nodes.
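The pool definitions above can be sketched with srvctl. This is only a sketch using the values from our example; the short option forms shown here follow the 11g Release 2 syntax (-g pool name, -l minimum, -u maximum, -i importance), and may differ in later releases.

```shell
# Create the three server pools with their minimum, maximum and importance
srvctl add srvpool -g CRM_POOL -l 2 -u 3 -i 1000
srvctl add srvpool -g ERP_POOL -l 2 -u 3 -i 700
srvctl add srvpool -g DWH_POOL -l 3 -u 3 -i 500

# Verify the server pool configuration
srvctl config srvpool
```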
Now, how does it work?
Suppose that our cluster is down and the first node starts up. The cluster has to decide which server pool it should belong to. This is determined by the importance. Among the three server pools we have, CRM is the one with the highest importance, so the first node joins the CRM pool.
The second node starts up and has to identify which server pool it should join. Because CRM has an importance of 1000 and a minimum of 2 nodes, the second node also joins the CRM pool.
In the same manner, the next two nodes join the ERP pool, and then the next three nodes join the DWH pool. So, after seven nodes have started, the allocation between the three server pools is 2-2-3.
When the eighth node starts, it sees that all three server pools have met their minimum requirement. So now the maximum requirements are to be met, and it joins the most important pool, which is CRM; the ninth node joins the ERP pool.
Now suppose something goes wrong and two nodes drop out of the cluster, both of them part of the CRM server pool. The CRM server pool now has just one available node.
Since the minimum requirement of the CRM pool is 2 nodes and it is the most important pool, while ERP, the next most important, also needs a minimum of 2 nodes, one of the nodes in the ERP pool is dynamically moved into the CRM pool.
Now all three server pools meet their minimum requirement again. In the same manner, if a node goes down in the ERP pool, one node is moved from the DWH pool to the ERP pool, and so on. But if a node comes back up, it joins the DWH pool, to meet that pool's minimum requirement. You can imagine the other scenarios.
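The placement policy walked through above (fill minimums in order of importance, then grow toward maximums, and backfill from less important pools after a failure) can be sketched as a small simulation. This is purely an illustration of the policy as described, not Oracle's actual clusterware algorithm, and all names are made up.

```python
class ServerPool:
    """A server pool with a name, min/max size and an importance."""
    def __init__(self, name, min_size, max_size, importance):
        self.name = name
        self.min = min_size
        self.max = max_size
        self.importance = importance
        self.nodes = []

def place_node(pools, node):
    """Assign a starting node to a pool: minimums first, then maximums,
    always favoring the most important pool."""
    by_importance = sorted(pools, key=lambda p: -p.importance)
    for pool in by_importance:              # first pass: satisfy minimums
        if len(pool.nodes) < pool.min:
            pool.nodes.append(node)
            return pool
    for pool in by_importance:              # second pass: grow toward maximums
        if len(pool.nodes) < pool.max:
            pool.nodes.append(node)
            return pool
    return None                             # every pool is already at its maximum

def rebalance(pools):
    """After a failure, refill pools below their minimum by taking nodes
    from the least important pool that is still above its own minimum."""
    by_importance = sorted(pools, key=lambda p: -p.importance)
    for needy in by_importance:
        while len(needy.nodes) < needy.min:
            donors = [p for p in reversed(by_importance)
                      if p is not needy and len(p.nodes) > p.min]
            if not donors:
                break
            needy.nodes.append(donors[0].nodes.pop())

# Our example: 9 nodes start one by one
pools = [ServerPool("CRM_POOL", 2, 3, 1000),
         ServerPool("ERP_POOL", 2, 3, 700),
         ServerPool("DWH_POOL", 3, 3, 500)]
for n in range(1, 10):
    place_node(pools, "node%d" % n)
print([len(p.nodes) for p in pools])   # [3, 3, 3]

pools[0].nodes = pools[0].nodes[:1]    # CRM loses two nodes
rebalance(pools)
print([len(p.nodes) for p in pools])   # [2, 2, 3]
```

Running it reproduces the article's walkthrough: the first nine nodes end up 3-3-3, and after CRM loses two nodes, a node is taken from ERP (which can spare one) rather than DWH (which is at its minimum), restoring 2-2-3.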
What have we achieved with this? With the same set of nodes, we have been able to share the load across the different server pools, and in case of failures we can ensure that the most important server pool, and the database running in it, remains available; as the database administrator, I don't need to do anything.
This is the fundamental benefit of server pools.
Now, the next question is: how do we know which pool our database will belong to? When you create a database using the Grid Infrastructure setup, you have to create it either as policy-managed or as administrator-managed.
Administrator-managed is like running prior to Oracle 11g Release 2. When you create that kind of database, you explicitly specify on which nodes in the cluster the database will be available, meaning on which nodes its instances will be started. When you create services, you also specify on which nodes those services will be available.
When you create a policy-managed database, you don't specify on which nodes it will be available. You specify in which server pool it will be available.
Note also that one database can be part of multiple server pools; this means that if you create a new database and make it part of the CRM pool as well as the ERP pool, instances will be available in both pools. There is also a parameter called cardinality that specifies how many instances to bring up in each server pool. So if you say 2 instances, then when the server pool is available, even if it has 5 nodes, only 2 instances are started because the cardinality is 2.
That is how server pools work.
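Creating a policy-managed database and its service can be sketched with srvctl as well. A hedged sketch: the database name, service name and Oracle home path are placeholders, and the short options follow 11g Release 2 syntax.

```shell
# Register a policy-managed database in the CRM pool
# ("crmdb" and the home path are placeholder names)
srvctl add database -d crmdb -o /u01/app/oracle/product/11.2.0/dbhome_1 -g CRM_POOL

# A service on a policy-managed database is tied to a server pool, not to
# node names; UNIFORM runs it on every instance in the pool, SINGLETON on one
srvctl add service -d crmdb -s crm_svc -g CRM_POOL -c UNIFORM
```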
Given all this, what do we do with our applications?
Earlier, prior to 11g Release 2, we listed all the VIPs in the application's connection string. For backward compatibility, you can still do the same in 11g Release 2. But imagine that you have 12 nodes or more in a cluster. Every application that is going to connect must have in its connection string a list of all 12 VIPs or more. That's a pretty long list. And when the application tries to connect, it goes through the 12 VIPs one after another to identify where it can connect.
To make this easy, Oracle introduced the SCAN.
The SCAN is a clusterware-managed resource with a single host name, the SCAN name, which has three associated IP addresses.
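With the SCAN, the client's connect descriptor shrinks to a single address instead of a list of 12 VIPs. A sketch of such a tnsnames.ora entry; the SCAN host name and service name here are made-up examples:

```
CRM =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = mycluster-scan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = crm_svc)
    )
  )
```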
More about the SCAN in the next publication.