Wednesday, April 8, 2009

How to create a RAIDZ2 pool with a hot spare

Continuing with the same scenario, we will use Disk 2, Disk 3, Disk 4, Disk 5 and Disk 6 (2.00 GB each) to create a RAIDZ2 pool, and later add Disk 1 as a hot spare. First, let's go ahead and create the pool. Here's how it is done:

root@opensolaris:~# zpool create plrdz2 raidz2 c4d1 c5t0d0 c5t1d0 c5t2d0 c5t3d0
root@opensolaris:~#

Let's check this:

root@opensolaris:~# zpool status plrdz2
pool: plrdz2
state: ONLINE
scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        plrdz2      ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c4d1    ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0

errors: No known data errors
root@opensolaris:~#

The command executed successfully, so we now have a RAIDZ2 pool named plrdz2. This pool can tolerate two disk failures without any data loss: if two disks fail for whatever reason, your data is still in one piece, although the pool will be in a DEGRADED state. To add further resilience, I will now add one disk as a hot spare. This means that if one of the disks in the pool fails, the failed disk is set aside and the hot spare is automatically pulled in and resilvered in its place. Note that this does not make the pool safe against three simultaneous failures; rather, once the spare has finished resilvering after a single failure, the pool can again survive two further failures. Here's how we add a hot spare to our pool:

root@opensolaris:~# zpool add plrdz2 spare c3d1
root@opensolaris:~#

The command completed successfully.

root@opensolaris:~# zpool status plrdz2
pool: plrdz2
state: ONLINE
scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        plrdz2      ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c4d1    ONLINE       0     0     0
            c5t0d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
        spares
          c3d1      AVAIL

errors: No known data errors

Now let's play around a bit. I will go ahead and physically remove Disk 4 (i.e. c5t1d0) and see what happens:

root@opensolaris:~#
root@opensolaris:~# zpool status plrdz2
pool: plrdz2
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-2Q
scrub: resilver completed after 0h0m with 0 errors on Wed Apr 8 15:49:52 2009
config:

        NAME          STATE     READ WRITE CKSUM
        plrdz2        DEGRADED     0     0     0
          raidz2      DEGRADED     0     0     0
            c4d1      ONLINE       0     0     0  67.5K resilvered
            c5t0d0    ONLINE       0     0     0  70K resilvered
            spare     DEGRADED     0     0     0
              c5t1d0  UNAVAIL      0   868     0  cannot open
              c3d1    ONLINE       0     0     0  134M resilvered
            c5t2d0    ONLINE       0     0     0  67.5K resilvered
            c5t3d0    ONLINE       0     0     0  67K resilvered
        spares
          c3d1        INUSE     currently in use

errors: No known data errors
root@opensolaris:~#

If you look carefully, you can see that the system has intelligently pulled in the hot spare disk and the pool is still functioning well.
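Once you physically replace the failed disk (or it comes back online), you would normally resilver it back into the pool and then return the spare to its standby role. A minimal sketch, assuming the replacement shows up under the same device name c5t1d0:

root@opensolaris:~# zpool replace plrdz2 c5t1d0
root@opensolaris:~# zpool detach plrdz2 c3d1

The first command resilvers the new c5t1d0 back into the raidz2 vdev; the second detaches the hot spare from the vdev so that it shows up as AVAIL again under spares.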

Point to note and remember: as far as I know, a RAIDZ pool cannot be upgraded to a RAIDZ2 pool. Also, you need a minimum of three disks to create a RAIDZ2 pool.

Tuesday, April 7, 2009

Monitoring performance of ZFS storage pools using zpool iostat

ZFS comes with a number of built-in monitoring features; we will specifically cover zpool iostat here.

The command syntax is pretty simple and straightforward, and quite close to the older iostat command. Here's how it looks:

root@opensolaris:~# zpool iostat

(With no arguments, zpool iostat prints a one-time summary for every pool on the system. You can also give it a pool name, a sampling interval in seconds and a sample count, as below.)

root@opensolaris:~# zpool iostat zpooldata 1 5

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zpooldata    222K  7.06G      0      0  2.47K  1.45K
zpooldata    222K  7.06G      0      0      0      0
zpooldata    222K  7.06G      0      0      0      0
zpooldata    222K  7.06G      0      0      0      0
zpooldata    222K  7.06G      0      0      0      0

root@opensolaris:~#
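If you want a per-device breakdown rather than pool-level totals, zpool iostat also accepts a -v flag; the two trailing numbers remain the sampling interval in seconds and the number of samples. For example (output omitted here):

root@opensolaris:~# zpool iostat -v zpooldata 5 10

This prints read/write operations and bandwidth for each disk in zpooldata every 5 seconds, 10 times.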

Monday, April 6, 2009

Best Practices: Migrate your non-system data from UFS to ZFS

Consider the following best-practices when migrating non-system-related data from UFS file systems to ZFS file systems:

1. Unshare the existing UFS file systems

If your UFS file systems are being shared as NFS resources (or any other kind of resource), unshare them first. There are plenty of useful tips and tricks on how to do this, which I won't be covering here.

2. Unmount the existing UFS file systems from the previous mount points
Unmount your UFS file systems from their existing mount points. This helps you avoid potential hidden problems you might otherwise run into later.

3. Mount the UFS file systems to temporary unshared mount points
Create a new directory (or directories) somewhere in your file system and mount the UFS file systems on these temporary, unshared mount points.

4. Migrate the UFS data to the new ZFS file systems with parallel instances of rsync (see the sketch after this list)


5. Set the mount points and the sharenfs properties on the new ZFS file systems
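A minimal sketch of steps 4 and 5, using hypothetical names: the UFS data is temporarily mounted under /mnt/ufsdata, the new file system is datapool/home, and it should end up mounted at /export/home and shared over NFS:

# zfs create datapool/home
# rsync -a /mnt/ufsdata/ /datapool/home/
# zfs set mountpoint=/export/home datapool/home
# zfs set sharenfs=on datapool/home

If you have several file systems to migrate, you can run one rsync instance per file system in parallel to speed up the copy.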

Step by Step: Migrate root UFS file system to ZFS

Solaris 10 10/08 has been released, and one of the great features to come with it is ZFS boot.

People have been waiting for this for a long time, and will naturally be eager to migrate their root filesystem from UFS to ZFS. This article will detail how you can do this using Live Upgrade. This will allow you to perform the migration with the least amount of downtime, and still have a safety net in case something goes wrong.

These instructions are aimed at users with systems ALREADY running Solaris 10 10/08 (update 6).

Step 1: Create the Root zpool

The first thing you need to do is create your disk zpool. It MUST exist before you can continue, so create and verify it:

# zpool create rootpool c1t0d0s0
# zpool list
NAME       SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
rootpool    10G   73.5K   10.0G     0%  ONLINE  -
#

If the slice you've selected currently has another filesystem on it, e.g. UFS or VxFS, you'll need to use the -f flag to force the creation of the ZFS pool.

You can use any name you like. I’ve chosen rootpool to make it clear what the pool’s function is.
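Incidentally, if you have a second suitable slice available, you could create the root pool as a mirror right away. A sketch, assuming a hypothetical second slice c1t1d0s0 that also carries an SMI label:

# zpool create rootpool mirror c1t0d0s0 c1t1d0s0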

Step 2: Create the Boot Environments (BE)

Now that we've got our zpool in place, we can create the BEs that will be used to migrate the current root filesystem across to the new ZFS filesystem.

Create the ABE as follows:

# lucreate -c ufsBE -n zfsBE -p rootpool

This command will create two boot environments where:

- ufsBE is the name your current boot environment will be assigned. This can be anything you like and is your safety net. If something goes wrong, you can always boot back to this BE (unless you delete it).
- zfsBE is the name of your new boot environment that will be on ZFS and…
- rootpool is the name of the zpool you create for the boot environment.

This command will take a while to run as it copies your ufsBE to your new zfsBE and will produce output similar to the following if all goes well:

# lucreate -c ufsBE -n zfsBE -p rootpool
Analyzing system configuration.
No name for current boot environment.
Current boot environment is named <ufsBE>.
Creating initial configuration for primary boot environment <ufsBE>.
The device is not a root device for any boot environment; cannot get BE ID.
PBE configuration successful: PBE name <ufsBE> PBE Boot Device .
Comparing source boot environment <ufsBE> file systems with the file
system(s) you specified for the new boot environment. Determining which
file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
The device is not a root device for any boot environment; cannot get BE ID.
Creating configuration for boot environment <zfsBE>.
Source boot environment is <ufsBE>.
Creating boot environment <zfsBE>.
Creating file systems on boot environment <zfsBE>.
Creating file system for </> in zone <global> on <rootpool/ROOT/zfsBE>.
Populating file systems on boot environment <zfsBE>.
Checking selection integrity.
Integrity check OK.
Populating contents of mount point </>.
Copying.
Creating shared file system mount points.
Creating compare databases for boot environment <zfsBE>.
Creating compare database for file system </>.
Updating compare databases on boot environment <zfsBE>.
Making boot environment <zfsBE> bootable.
Creating boot_archive for /.alt.tmp.b-7Tc.mnt
updating /.alt.tmp.b-7Tc.mnt/platform/sun4u/boot_archive
Population of boot environment <zfsBE> successful.
Creation of boot environment <zfsBE> successful.
#

The x86 output is not much different; it will just include information about updating GRUB.

Update: You may get the following error from lucreate:

ERROR: ZFS pool does not support boot environments.

This will be due to the label on the disk.

You need to relabel your root disks and give them an SMI label. You can do this using "format -e": select the disk, then go to "label" and select "[0] SMI Label". This should be all that's needed, but whilst you're at it, you may as well check that your partition table is still as you want it. If not, make your changes and label the disk again.

For x86 systems, you also need to ensure your disk has an fdisk table.

You should now be able to perform the lucreate.

The most likely reason for your disk having an EFI label is that it has probably been used by ZFS as a whole disk before. ZFS uses EFI labels for whole-disk usage; however, you currently need an SMI label on your root disks (I believe this may change in the future).
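For reference, the relabelling session described above looks roughly like this (the disk list and menu numbering will vary on your system):

# format -e
(select your root disk from the list)
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
(answer the confirmation prompts)
format> quit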

Once the lucreate has completed, you can verify your Live Upgrade environments with lustatus:

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
ufsBE                      yes      yes    yes       no     -
zfsBE                      yes      no     no        yes    -
#

Step 3: Activate and Boot from ZFS zpool

We’re almost done. All we need to do now is activate our new ZFS boot environment and reboot:
# luactivate zfsBE
# init 6

NOTE: Ensure you reboot using “init 6” or “shutdown -i6“. Do NOT use “reboot”

Remember, if you’re on SPARC, you’ll need to set the appropriate boot device at the OBP. luactivate will remind you.
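As a rough illustration only (the exact instructions printed by luactivate take precedence), once the OBP boot device points at a disk in rootpool you can list the bootable ZFS datasets and boot the new BE explicitly:

ok boot -L
ok boot -Z rootpool/ROOT/zfsBE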

You can verify you’re booted from the ZFS BE using lustatus:

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
ufsBE                      yes      no     no        yes    -
zfsBE                      yes      yes    yes       no     -
#

At this point you can delete the old ufsBE if all went well. You can also re-use that old disk/slice for anything you want, such as adding it to the rootpool to create a mirror. The choice is yours, but you now have your system booted from ZFS, and all its wonderfulness is available on the root filesystem too.
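A quick sketch of those two options, assuming the old UFS root lived on a hypothetical slice c1t1d0s0: once you are happy with the ZFS boot environment, delete the old one and then attach the freed slice to the root pool to form a mirror (the slice needs an SMI label, and you still have to install the boot block on it, e.g. with installboot on SPARC or installgrub on x86):

# ludelete ufsBE
# zpool attach rootpool c1t0d0s0 c1t1d0s0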

Playing around with ZFS File system using Virtual Disk created from Sun xVM VirtualBox - Working with ZFS hotspares

In our earlier scenario we created a zpool and added extra storage to it. Here we will detail how to create a zpool with a hot spare. But before we can do this, we need to destroy our previous pool. (Note: doing this erases all the data in your zpool.) Here's how we do it:

root@opensolaris:~# zpool destroy zpooldata

So with this, zpooldata is gone.

Now let's go ahead and create one with a hot spare.

In our case, we will use Disk 1 as a hot spare, while Disk 7 and Disk 8 will form part of zpooldata. Here's how we do this:

root@opensolaris:~# zpool create zpooldata c5t4d0 c5t5d0 spare c3d1

(Note: for all practical purposes, make sure that the disk you allocate as a hot spare is at least as large as the largest disk in your pool, so that it can stand in for any of them.)

root@opensolaris:~# zpool status

pool: zpooldata
state: ONLINE
scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpooldata   ONLINE       0     0     0
          c5t4d0    ONLINE       0     0     0
          c5t5d0    ONLINE       0     0     0
        spares
          c3d1      AVAIL

errors: No known data errors
root@opensolaris:~#

The above shows that Disk 1 (c3d1) is a spare.

Now let's go ahead and add another disk (Disk 2, i.e. c4d1) as a hot spare. Here's how we do it:

root@opensolaris:~# zpool add zpooldata spare c4d1
root@opensolaris:~# zpool status

pool: zpooldata
state: ONLINE
scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpooldata   ONLINE       0     0     0
          c5t4d0    ONLINE       0     0     0
          c5t5d0    ONLINE       0     0     0
        spares
          c4d1      AVAIL
          c3d1      AVAIL

errors: No known data errors
root@opensolaris:~#

Now, let's remove both of the hot spare disks:

root@opensolaris:~# zpool remove zpooldata c3d1 c4d1
root@opensolaris:~# zpool status

pool: zpooldata
state: ONLINE
scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpooldata   ONLINE       0     0     0
          c5t4d0    ONLINE       0     0     0
          c5t5d0    ONLINE       0     0     0

errors: No known data errors
root@opensolaris:~#

Remember, adding or removing hot spares won't affect the total available storage in your pool. They are just standby disks that get pulled in when a disk in the main pool fails.
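If you want to convince yourself, check the pool size before and after adding a spare; the SIZE reported should stay the same, since spares are not counted until one actually takes over for a failed disk:

root@opensolaris:~# zpool list zpooldata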

Playing around with ZFS File system using Virtual Disk created from Sun xVM VirtualBox - Creating a zpool and adding storage to it

ZFS organizes physical devices into logical pools called storage pools. Both individual disks and array logical unit numbers (LUNs) visible to the operating system may be included in a ZFS pool. ZFS can also be built on other, less traditional storage structures.

The ideal way to practice and play around with ZFS is to install OpenSolaris as a guest OS on VMware or Sun xVM VirtualBox. My examples are all based on Sun xVM VirtualBox. It is pretty easy and straightforward to add as many virtual disks as you like, and I won't cover that here. So this can be an ideal practice ground: you don't really have to run down to a hardware store and buy a bunch of inexpensive disks just to play around with ZFS.

In my scenario, I have created nine virtual disks in all. Below are the details:

root@opensolaris:~# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c3d0 <DEFAULT cyl 3668 alt 2 hd 128 sec 32>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@0,0
       1. c3d1 <VBOX HAR-ca698f65-ae2b945-0001-5.82GB>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@1,0
       2. c4d1 <VBOX HAR-fdacd661-6e2e166-0001-2.00GB>
          /pci@0,0/pci-ide@1,1/ide@1/cmdk@1,0
       3. c5t0d0 <ATA-VBOX HARDDISK-1.0-2.00GB>
          /pci@0,0/pci8086,2829@d/disk@0,0
       4. c5t1d0 <ATA-VBOX HARDDISK-1.0-2.00GB>
          /pci@0,0/pci8086,2829@d/disk@1,0
       5. c5t2d0 <ATA-VBOX HARDDISK-1.0-2.00GB>
          /pci@0,0/pci8086,2829@d/disk@2,0
       6. c5t3d0 <ATA-VBOX HARDDISK-1.0-2.00GB>
          /pci@0,0/pci8086,2829@d/disk@3,0
       7. c5t4d0 <ATA-VBOX HARDDISK-1.0-4.00GB>
          /pci@0,0/pci8086,2829@d/disk@4,0
       8. c5t5d0 <ATA-VBOX HARDDISK-1.0-3.11GB>
          /pci@0,0/pci8086,2829@d/disk@5,0
Specify disk (enter its number): ^C
root@opensolaris:~#


Disk 0 - my boot disk; I won't be touching this at all.
Disk 1 - 5.82 GB disk.
Disk 2 - 2.00 GB disk.
Disk 3 - 2.00 GB disk
Disk 4 - 2.00 GB disk
Disk 5 - 2.00 GB disk
Disk 6 - 2.00 GB disk
Disk 7 - 4.00 GB disk
Disk 8 - 3.11 GB disk

I will be using all the 2.00 GB disks extensively for creating mirror and RAID-Z ZFS pools. The rest will be used for other purposes.

1. ZFS Pool

Let's begin by creating a ZFS pool named "zpooldata". We will be using Disk 2 and Disk 7 for this, for now. Here's how we do it:

root@opensolaris:~# zpool create zpooldata c4d1 c5t4d0

The command executes successfully, and I now have a new zpool named zpooldata with 5.9 GB of available storage to use.

root@opensolaris:~# df -h
Filesystem Size Used Avail Use% Mounted on
rpool/ROOT/opensolaris
6.2G 2.3G 3.9G 37% /
swap 540M 324K 539M 1% /etc/svc/volatile
/usr/lib/libc/libc_hwcap3.so.1
6.2G 2.3G 3.9G 37% /lib/libc.so.1
swap 539M 12K 539M 1% /tmp
swap 539M 44K 539M 1% /var/run
rpool/export 3.9G 19K 3.9G 1% /export
rpool/export/home 3.9G 19K 3.9G 1% /export/home
rpool/export/home/vishal
3.9G 22M 3.9G 1% /export/home/vishal
rpool 3.9G 72K 3.9G 1% /rpool
zpooldata 5.9G 18K 5.9G 1% /zpooldata
root@opensolaris:~#


Plain and simple: just one command and it's done.

Now let's go ahead and add Disk 8 to our zpool to get some extra storage. Here's how we do it:

root@opensolaris:~# zpool add zpooldata c5t5d0
root@opensolaris:~# df -h
Filesystem Size Used Avail Use% Mounted on
rpool/ROOT/opensolaris
6.1G 2.3G 3.9G 37% /
swap 540M 324K 540M 1% /etc/svc/volatile
/usr/lib/libc/libc_hwcap3.so.1
6.1G 2.3G 3.9G 37% /lib/libc.so.1
swap 540M 12K 540M 1% /tmp
swap 540M 44K 540M 1% /var/run
rpool/export 3.9G 19K 3.9G 1% /export
rpool/export/home 3.9G 19K 3.9G 1% /export/home
rpool/export/home/vishal
3.9G 31M 3.9G 1% /export/home/vishal
rpool 3.9G 72K 3.9G 1% /rpool
zpooldata 9.0G 18K 9.0G 1% /zpooldata

root@opensolaris:~# zpool status

pool: zpooldata
state: ONLINE
scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zpooldata   ONLINE       0     0     0
          c4d1      ONLINE       0     0     0
          c5t4d0    ONLINE       0     0     0
          c5t5d0    ONLINE       0     0     0

errors: No known data errors
root@opensolaris:~#


As of this writing, there is no provision to remove a data disk from an existing zpool; I hear this feature is coming in a future release. However, what you can do is create a zpool with hot spares, and those hot spares can be removed as and when the need arises. I will cover this in the next post.

Monday, August 25, 2008

How to Rename a Solaris Zone?

A few days back I needed to rename one of my Solaris zones from "orazone" to "oraprodzone". I followed the steps below to successfully rename the zone.

STEP 1: Shutdown the zone "orazone"

Issue the following commands from the global zone to shut down orazone.

globalzone# zoneadm list -iv
  ID NAME      STATUS     PATH
   0 global    running    /
   2 orazone   running    /zones/orazone
globalzone# zoneadm -z orazone halt
globalzone# zoneadm list -iv
  ID NAME      STATUS     PATH
   0 global    running    /
   - orazone   installed  /zones/orazone
globalzone#

STEP 2: Rename the Zone from "orazone" to "oraprodzone"

Enter zone configuration from the global zone using the below mentioned commands.

globalzone# zonecfg -z orazone
zonecfg:orazone> set zonename=oraprodzone
zonecfg:orazone> commit
zonecfg:orazone> exit

globalzone# zoneadm list -vc
  ID NAME         STATUS     PATH             BRAND
   0 global       running    /                native
   - oraprodzone  installed  /zones/orazone   native
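Notice that the zonepath still points to /zones/orazone; only the zone's name has changed. If you want the directory to match as well, the supported way (available from Solaris 10 8/07 onwards, as far as I recall) is to move the zonepath while the zone is still halted, along the lines of:

globalzone# zoneadm -z oraprodzone move /zones/oraprodzone

This is optional; the zone will boot fine from the old path.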

STEP 3: Boot the zone

After you have made the above changes, boot the zone from the global zone using the below commands.

globalzone# zoneadm -z oraprodzone boot
globalzone# zoneadm list -iv

  ID NAME         STATUS   PATH
   0 global       running  /
   2 oraprodzone  running  /zones/orazone

Done!

There is another way to rename a zone (not supported, but it worked for me). It is not the recommended approach, but I will mention it as well.

Renaming zone orazone to oraprodzone

Perform all of the below as root in the global zone. First shut down your orazone zone:

globalzone# zoneadm -z orazone halt
globalzone# vi /etc/zones/index

(change orazone to oraprodzone)

globalzone# cd /etc/zones
globalzone# mv orazone.xml oraprodzone.xml
globalzone# vi oraprodzone.xml

(change orazone to oraprodzone)

globalzone# cd /zones
(/zones is where I have stored all my zones)

globalzone# mv orazone oraprodzone

Next, cd into the renamed zone's root (/zones/oraprodzone/root/etc) and modify /etc/hosts, /etc/nodename and /etc/hostname.xxx as required:

globalzone# cd /zones/oraprodzone/root/etc

Finally, boot the renamed zone:

globalzone# zoneadm -z oraprodzone boot

Feel free to leave a comment :)

BLOG Maintained by - Vishal Sharma | GetQuickStart