
Which one (gfs / ext3) is best for the file system resource in Redhat cluster (RHEL 5)

 
senthil_kumar_1
Super Advisor

Which one (gfs / ext3) is best for the file system resource in Redhat cluster (RHEL 5)

Hi All,

 

I would like to configure a two-node cluster as an active/active cluster for SAP and Oracle services.

 

For that I need to configure a shared cluster file system for the Oracle data.

 

I have some questions about configuring a shared file system for the cluster.

 

1) Normal disk with an ext3 filesystem:

To configure a normal disk (LUN) with ext3 as a shared file system resource under the cluster: we assign the same disk (LUN) to both nodes; on the node where we configure the cluster, we create the ext3 file system on that disk using mkfs.ext3; we create a mount point (an empty directory) on both nodes; after that we configure this as a shared file system resource in the cluster, entering details such as Name, File system type, Mount point, and Device. Is this correct?
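For reference, the commands I would run might look like this (the device name and mount point here are just illustrations):

# on the node doing the cluster configuration: create the filesystem
mkfs.ext3 -L oradata /dev/sdd

# on BOTH nodes: create the empty mount point directory
mkdir -p /shared/oradata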

 

My questions:

 

1.1) How do we solve the issue if the disk (LUN) assigned to both nodes has different device files, say node1 shows it as "/dev/sdd" and node2 shows it as "/dev/sdf"? In this situation, can we specify the label name of the file system, or can we specify the file system ID (the "system ID of the device")?

 

1.2) If yes, what is the command to find the "system ID of the device file"?

 

1.3) If not, what is the purpose of the file system ID option in the shared file system configuration properties of the cluster configuration?

 

1.4) This file system will be mounted and visible on the active node only. Am I correct?

 

2) Using a multipath device file:

 

My questions:

 

2.1) Do I need to configure multipathing on both nodes?

 

2.2) On the node where we do the cluster configuration, we need to create the ext3 file system on the disk (LUN), create an empty directory for the mount point, and use the corresponding multipath device that points to the same LUN; or, if the multipath device file name pointing to the same LUN differs between the nodes, we need to specify the file system ID in the cluster configuration. Am I correct?

 

2.3) What are the known issues if we use multipath devices for a shared cluster file system?

 

2.4) This file system will be mounted and visible on the active node only. Am I correct?

 

 

3)GFS with LVM:

 

If we want to use GFS with LVM, we assign the LUN to both nodes; on the node where we do the cluster configuration, we create a VG and an LV on that LUN; then we create the GFS file system on the LV; and finally we configure this file system in the cluster.
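For example, the commands I have in mind (the multipath device and VG/LV names here are illustrative):

# on the node doing the cluster configuration
pvcreate /dev/mapper/mpath1
vgcreate vg_shared /dev/mapper/mpath1
lvcreate -L 100G -n lv_oradata vg_shared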

 

3.1) How will the VG and LV details be transferred to the other node? Will it be done by the cluster service, or do we need to do it manually? If manually, how? Please explain step by step.

 

3.2) Is it true that we need to use GFS with LVM only? If yes, why?

 

3.3) Can we create a GFS file system directly on a normal device file like "/dev/sdd" or "/dev/mapper/mpath1" (a multipath device file) and use it under the cluster as a shared file system? And will RHN support this type of setup?

 

3.4) Is it true that we cannot use GFS with a normal device file like "/dev/sdd" or "/dev/mapper/mpath1"? If yes, why?

 

3.5) If we use a GFS-based file system, it will be mounted on both nodes all the time. Am I correct?

 

3.6) Where do we need a GFS file system? Please explain some scenarios.

 

 

4) Which one best suits my requirement for the Oracle disk? I need a complexity-free method for the shared file system.

 

 

 

 

 

4 REPLIES
senthil_kumar_1
Super Advisor

Re: Which one (gfs / ext3) is best for the file system resource in Redhat cluster (RHEL 5)

Hi All,

Could anyone please answer my questions?
Matti_Kurkela
Honored Contributor

Re: Which one (gfs / ext3) is best for the file system resource in Redhat cluster (RHEL 5)

1.1) There are many ways to solve it:

  • If you use LVM, it will automatically find the correct disks based on PV and VG UUIDs (those are created when you run pvcreate/vgcreate). LVM does not care about device names.
  • If you don't use LVM but do use multipathing, you can configure custom alias names for your disks in /etc/multipath.conf. (You would have a /dev/mapper/mydisk instead of /dev/mapper/mpathNN)
  • If you use neither LVM nor multipathing, you can use filesystem UUIDs or LABELs in place of device names (see the label example after this list). See "man mount" for more details.
  • Or you could use udev rules to give your disks persistent names, by UUID or whatever attributes your storage system offers.
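
For instance, a minimal sketch of labeling an ext3 filesystem and then mounting it by label; the label name "oradata" and the mount point are just illustrations:

# assign a label to the existing ext3 filesystem (done once, from one node)
e2label /dev/sdd oradata

# either node can then refer to the filesystem without knowing its local device name
mount LABEL=oradata /shared/oradata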

The UUID of an ext3 filesystem can be found with a command like:

tune2fs -l /dev/sda1 | grep 'Filesystem UUID:'
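
On RHEL 5 you can also use blkid, which prints the UUID, the label (if any), and the filesystem type on one line:

blkid /dev/sda1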

 

The "filesystem ID" mentioned in the cluster configuration is not an UUID, but a simple number. See answer 1.3.

 

1.2) You can assign them yourself. Just make sure they're unique. (With NFSv4, filesystem ID number 1 is sort of special, but if you're using NFSv4, you should already know this.)

 

1.3) A filesystem ID is important if your cluster is providing a highly-available NFS service, in particular with NFSv4. To allow the NFS server service to fail over successfully from one node to another, the NFS service must be guaranteed to use the same filesystem IDs on both nodes. If the filesystem ID is different after a failover (which is likely if you let the NFS service auto-assign the filesystem IDs, as is the default behavior), the NFS clients will notice something strange has happened and won't use the NFS share until it is unmounted and then remounted at the client. If you don't use your shared filesystems for NFS service, the filesystem ID is not used for anything and therefore not important.
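
As an illustration only (the fsid attribute belongs to the rgmanager fs resource agent; all names and values below are made up), a fixed filesystem ID can be set on the file system resource in /etc/cluster/cluster.conf like this:

<fs name="oradata" device="/dev/mapper/oradata" mountpoint="/shared/oradata" fstype="ext3" fsid="101"/>

The same fsid is then in effect no matter which node currently runs the service.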

Here's the RHEL 5 documentation on filesystem parameters.

 

1.4) Yes, an ext3 filesystem can be mounted on one node at a time only. If you have a shared ext3 filesystem in a RedHat Cluster, you should *not* put it in /etc/fstab at all; instead, you should configure it as a "File System" resource in the cluster configuration and associate it with the service that uses the filesystem. The cluster daemons will check and mount the filesystem as appropriate.

 

2.1) If both of your nodes have a multipathed connection to the shared disks, then yes - not because of the clustering, but because it allows you to get the maximum benefit of your multiple storage connections (fault tolerance and sometimes also increased storage bandwidth, depending on the features of your storage system).
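
To verify that the multipath devices are assembled on a node (assuming device-mapper-multipath is installed and running), you can list them with:

multipath -ll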

 

2.2) Yes. (Note: you must create the empty directory for a mount point on both nodes, so that the empty directory is ready for mounting the shared filesystem in the event of a failover.)

 

2.3) You say it like it's a bad thing? OK...

  • By assigning custom alias names to your storage LUN WWIDs in /etc/multipath.conf (see the sketch after this list), you can easily make the multipath names be the same on both nodes. This will limit your maximum stress level and reduce the probability that you'll experience the thrill of noticing you're doing operations on a wrong disk :)
  • Having multipathing on both nodes makes your storage connections robust: if a storage admin needs to down a single connection for maintenance, your filesystems will keep running instead of switching to read-only mode and crashing the database. You might miss some emergency pay and won't learn to always be ready for trouble, like a soldier in a war zone :) You might even experience an uninterrupted vacation sometimes.
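
A minimal sketch of such an alias in /etc/multipath.conf; the WWID shown is made up, so substitute the real one reported by "multipath -ll", and keep the file identical on both nodes:

multipaths {
    multipath {
        wwid  360a98000486e2f66426f583133726537
        alias oradata
    }
}

After the multipath configuration is reloaded, the LUN appears as /dev/mapper/oradata on both nodes.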

2.4) Multipathing makes no difference in this regard: if you have a multipathed ext3 filesystem, you can still mount and use it on one node at a time only. If you have a GFS filesystem in your cluster, you can mount it on all cluster nodes simultaneously, whether it's multipathed or not.

 

3.1) Once you've created the VG on node A, you can simply run "vgscan" on node B and the VG will be available for activation on node B too. This is basic Linux LVM functionality, not related to the clustering at all.
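
A minimal sketch, assuming an illustrative VG name vg_shared:

# on node B, after the VG was created on node A
vgscan                    # re-read LVM metadata from the shared disks
vgchange -ay vg_shared    # activate the VG on this node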

 

3.2) RedHat says you must use LVM with GFS/GFS2. See the RHEL 5 Cluster documentation.

"While a GFS file system may be used outside of LVM, Red Hat supports only GFS file systems that are created on a CLVM logical volume". The GFS2 document for RHEL5 has the same sentence in it. The reasoning behind this decision is not documented: perhaps you, as a premium support customer of RedHat, might ask them directly if you need this information?

 

3.3) See 3.2.

 

3.4) See 3.2.

 

3.5) The GFS filesystem will be defined as a special "GFS File System" resource in the cluster configuration, and it will be mounted whenever the service associated with it is running. When you perform maintenance that requires a reboot (kernel updates, firmware updates, hardware maintenance etc.), you may want to fail the applications over to the other node and then have the node being serviced leave the cluster temporarily in a controlled manner. The GFS filesystem requires "dlm" (the Distributed Lock Manager), which is one of the services provided by the RedHat cluster daemons, so GFS must be unmounted before shutting down the cluster daemons.
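
A sketch of that controlled shutdown, roughly following the service stop order in the RHEL 5 cluster documentation (the reverse order brings the node back):

service rgmanager stop   # stop cluster-managed services on this node
service gfs stop         # unmount GFS filesystems
service clvmd stop       # stop the clustered LVM daemon
service cman stop        # leave the cluster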

 

3.6) For example, here is an old whitepaper about running Oracle RAC on GFS.

 

4.) In an earlier thread, you said you plan to run Oracle on one node and SAP on another (in a normal situation). For this, GFS is overkill. This would be a multi-service "failover" or "active/passive" cluster, not really "active/active", even though both nodes do something useful simultaneously: in a true active/active cluster, multiple nodes are running the same service simultaneously. In your cluster, I understood this will not be the case.

 

An active/active cluster often needs a load balancer or some other method of directing the incoming requests to multiple nodes.

 

In RHEL 5, the GFS/GFS2 filesystems support larger filesystem sizes than ext3. If you need large amounts of data storage (multiple terabytes) on RHEL 5, this might be important. Or you might use RHEL 6 and ext4 filesystems instead.

 

These RHKB articles might be useful: (RHN access required)

https://access.redhat.com/kb/docs/DOC-3068

https://access.redhat.com/kb/docs/DOC-17651

MK
senthil_kumar_1
Super Advisor

Re: Which one (gfs / ext3) is best for the file system resource in Redhat cluster (RHEL 5)

Hi Matti,

Thanks a lot for your continued replies to all my doubts...

You told me: "3.1) Once you've created the VG on node A, you can simply run "vgscan" on node B and the VG will be available for activation on node B too. This is basic Linux LVM functionality, not related to the clustering at all." I understand this...

My Question:

1) If we extend a VG on one node (one PV is newly added; before that I added one LUN to both nodes, and that LUN is to be configured as a PV), how do we reflect this on the other node? Please explain the steps to be done.

2) If only an LV is extended, how do we reflect this on the other nodes?

Matti_Kurkela
Honored Contributor

Re: Which one (gfs / ext3) is best for the file system resource in Redhat cluster (RHEL 5)

1.) I assume the LUN is already made visible (and multipathed, if necessary) for all nodes.

If not, do it first before extending the VG, using the instructions in "Online Storage Reconfiguration Guide":

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/index.html

 

If you are using cLVM (which is the only way to have a supported GFS configuration according to RedHat), you don't have to do anything special: as you extend the VG on node A, the clvmd daemon on node A gets notified and relays the information to all the other nodes. The VG is extended on all nodes simultaneously. Yes, this allows you to extend a GFS filesystem while it is mounted and active on multiple nodes. (If one of the nodes does not see the new PV or there is some other problem in extending the VG, the extension operation is rejected on all nodes and you get an error message.)
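
As an illustration (the device and VG/LV names are made up), the whole online extension with cLVM and GFS could look like:

pvcreate /dev/mapper/mpath2             # initialize the new LUN as a PV
vgextend vg_shared /dev/mapper/mpath2   # clvmd propagates this to all nodes
lvextend -L +100G /dev/vg_shared/lv_oradata
gfs_grow /shared/oradata                # grow the mounted GFS filesystem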

 

If you're using ext3, you can use either cLVM or a more classic-style HA-LVM configuration (see https://access.redhat.com/kb/docs/DOC-3068 ). With cLVM, again you don't have to do anything special: the clvmd informs the other nodes automatically.

 

With HA-LVM configuration, you extend the VG in one node just as usual. Then, on the other nodes, you run "vgscan" to make them re-check the VG configuration.

 

2.) The answer is the same as with question 1.

 

Note: if you have configured the /etc/lvm/lvm.conf for cLVM (i.e. changed the locking_type setting from the default value 1 to 3), all the new VGs you create will be in cluster mode by default. You can identify the VGs in cLVM cluster mode by running "vgs" and looking at the Attr field: the VGs in cLVM cluster mode will have a "c" in the 6th position.
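
For example (VG names illustrative), the output might look like:

vgs -o vg_name,vg_attr
  VG          Attr
  vg_shared   wz--nc    <- "c" in the 6th position: clustered
  vg_local    wz--n-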

 

To switch a regular VG into cLVM cluster mode, make sure the locking_type is set correctly in /etc/lvm/lvm.conf and then use "vgchange -c y <VGname>".

 

If you need to create a regular single-host VG in a system that is configured for cLVM, use "vgcreate -c n".

 

By the way, you might want to click on the blue "Kudos!" button if you find a good answer. It does not even have to be your question: you can give kudos to any message that is not written by you.

MK