Technologist

Tech stuff about Cloud, DevOps, SysAdmin, Virtualization, SAN, Hardware, Scripting, Automation and Development


VMware ESXi can take advantage of Flash/local SSDs in multiple ways:

  • Host swap cache (since 5.0): ESXi can use part of an SSD datastore as swap space shared by all VMs. When the host has to swap VM memory, it swaps to the SSD, which is faster than HDD but still much slower than RAM.
  • Virtual SAN (VSAN) (since 5.5 with VSAN licensing): You can combine the local HDDs and local SSDs on each host to create a distributed storage platform. I like to think of it as a RAIN (Redundant Array of Independent Nodes).
  • Virtual Flash/vFRC (since 5.5 with Enterprise Plus): With this method the SSD is formatted with VFFS and can be configured as a write-through read cache for your VMs. It allows ESXi to locally cache virtual machine read I/O and survives VM migrations as long as the destination ESXi host also has Virtual Flash enabled. VMs must be at hardware version 10 to use this feature.

Check if the SSD drives were properly detected by ESXi

From vSphere Web Client

Select the ESXi host with Local SSD drives -> Manage -> Storage -> Storage Devices

See if it shows as SSD or Non-SSD, for example:

flash1

 

From CLI:

~ # esxcli storage core device list
...
naa.60030130f090000014522c86152074c9
 Display Name: Local LSI Disk (naa.60030130f090000014522c86152074c9)
 Has Settable Display Name: true
 Size: 94413
 Device Type: Direct-Access
 Multipath Plugin: NMP
 Devfs Path: /vmfs/devices/disks/naa.60030130f090000014522c86152074c9
 Vendor: LSI
 Model: MRSASRoMB-8i
 Revision: 2.12
 SCSI Level: 5
 Is Pseudo: false
 Status: on
 Is RDM Capable: false
 Is Local: true
 Is Removable: false
 Is SSD: false  <-- Not recognized as SSD
 Is Offline: false
 Is Perennially Reserved: false
 Queue Full Sample Size: 0
 Queue Full Threshold: 0
 Thin Provisioning Status: unknown
 Attached Filters:
 VAAI Status: unsupported
 Other UIDs: vml.020000000060030130f090000014522c86152074c94d5253415352
 Is Local SAS Device: false
 Is Boot USB Device: false
 No of outstanding IOs with competing worlds: 32
...

To enable the SSD option on the SSD drive

At this point you should put your host in maintenance mode because it will need to be rebooted.

If the SSD is not properly detected, you need to use storage claim rules to force it to be flagged as SSD. (This is also useful if you want to make a regular drive appear as an SSD for testing purposes.)

# esxcli storage nmp device list
...
naa.60030130f090000014522c86152074c9   <-- Take note of this device ID for the command below
 Device Display Name: Local LSI Disk (naa.60030130f090000014522c86152074c9)
 Storage Array Type: VMW_SATP_LOCAL
 Storage Array Type Device Config: SATP VMW_SATP_LOCAL does not support device configuration.
 Path Selection Policy: VMW_PSP_FIXED
 Path Selection Policy Device Config: {preferred=vmhba2:C2:T0:L0;current=vmhba2:C2:T0:L0}
 Path Selection Policy Device Custom Config:
 Working Paths: vmhba2:C2:T0:L0
 Is Local SAS Device: false
 Is Boot USB Device: false
...

Add a PSA claim rule to mark the device as SSD (if it is not local (e.g. SAN))

# esxcli storage nmp satp rule add --satp=<SATP_TYPE> --device=<device ID> --option="enable_ssd"

For example (in case this was a SAN attached LUN)

# esxcli storage nmp satp rule add --satp=VMW_SATP_XXX --device=naa.60030130f090000014522c86152074c9  --option="enable_ssd"

 

Add a PSA claim rule to mark the device as Local and SSD at the same time (if the SSD drive is local)

# esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device=<device ID> --option="enable_local enable_ssd"

For the device in my example it would be:

# esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device=naa.60030130f090000014522c86152074c9 --option="enable_local enable_ssd"
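Before rebooting, you can double-check that the rule is actually in place (a quick sanity check; the grep pattern is just an example):

# esxcli storage nmp satp rule list | grep enable_ssd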

Reboot your ESXi host for the changes to take effect.

 

To remove the rule (for whatever reason, including testing and going back)

esxcli storage nmp satp rule remove --satp VMW_SATP_LOCAL --device <device ID> --option=enable_ssd
esxcli storage nmp satp list |grep ssd
esxcli storage core claiming reclaim -d <device ID>
esxcli storage core device list --device=<device ID>

Once the ESXi server is back online verify that the SSD option is OK

From vSphere Web Client

Select the ESXi host with Local SSD drives -> Manage -> Storage -> Storage Devices

See if it shows as SSD or Non-SSD, for example:

flash2

From CLI:

~ # esxcli storage core device list
...
naa.60030130f090000014522c86152074c9
 Display Name: Local LSI Disk (naa.60030130f090000014522c86152074c9)
 Has Settable Display Name: true
 Size: 94413
 Device Type: Direct-Access
 Multipath Plugin: NMP
 Devfs Path: /vmfs/devices/disks/naa.60030130f090000014522c86152074c9
 Vendor: LSI
 Model: MRSASRoMB-8i
 Revision: 2.12
 SCSI Level: 5
 Is Pseudo: false
 Status: on
 Is RDM Capable: false
 Is Local: true
 Is Removable: false
 Is SSD: true  <-- Now it is true
 Is Offline: false
 Is Perennially Reserved: false
 Queue Full Sample Size: 0
 Queue Full Threshold: 0
 Thin Provisioning Status: unknown
 Attached Filters:
 VAAI Status: unsupported
 Other UIDs: vml.020000000060030130f090000014522c86152074c94d5253415352
 Is Local SAS Device: false
 Is Boot USB Device: false
 No of outstanding IOs with competing worlds: 32
...

Exit Maintenance mode.

Do the same on ALL hosts in the cluster.

Configure Virtual Flash

Now that the ESXi server recognizes the SSD drives, we can enable Virtual Flash.

You need to perform the steps below from the vSphere Web Client on all ESXi hosts

ESXi host -> Manage -> Settings -> Virtual Flash -> Virtual Flash Resource Management -> Add Capacity…

flash3

You will see that the SSD device has been formatted with the VFFS filesystem. It can be used to allocate space for the virtual flash host swap cache or to configure Virtual Flash Read Cache for individual virtual disks.
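You can also confirm the Virtual Flash resource from the CLI with the same command used later in the monitoring section (the device should show up with 'Is Used in vflash' set to true):

~ # esxcli storage vflash device list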

flash4

 

Configure Virtual Flash Host Swap

One of the options you have is to use the Flash/SSD as Host Swap Cache, to do this:

ESXi host -> Manage -> Settings -> Virtual Flash -> Virtual Flash Host Swap Cache Configuration -> Edit…

// Enable and select the size of the cache in GB

flash5

 

Configure Flash Read Cache

Flash Read Cache is configured per VM, on a per-VMDK basis. VMs need to be at virtual hardware version 10 in order to use vFRC.

To enable vFRC on a VM’s hard disk:

VM -> Edit Settings -> Expand Hard Disk -> Virtual Flash Read Cache

Enter the size of the cache in GB (e.g. 20)

You can start conservative and increase the size later if needed; I usually start with 10% of the VMDK size. The monitor vFRC section below has tips to right-size your cache.

flash6

 

If you click on Advanced, you can change the block size of the Read Cache (the default is 8KB), which lets you optimize the cache for the specific workload the VM is running.

flash7

The default block size is 8KB, but you may want to right-size it based on the application/workload so the cache is used efficiently.

If the cache block size does not match the workload, you could be hurting the efficiency of the cache:

  • If the workload uses I/Os larger than the configured block size, you will see increased cache misses.
  • If the workload uses I/Os smaller than the configured block size, you will be wasting precious cache space.

Correctly size the block-size of your cache

To correctly size the cache block size, you need to determine the dominant I/O length/size of the workload and use that as the cache block size:

Log in to the ESXi host running the workload/VM for which you want to enable vFRC

 

Find the world ID of the VM and the handle IDs of its virtual disks

~ # /usr/lib/vmware/bin/vscsiStats -l
Virtual Machine worldGroupID: 44670, Virtual Machine Display Name: myvm, Virtual Machine Config File: /vmfs/volumes/523b4bff-f2f2c400-febe-0025b502a016/myvm/myvm.vmx, {
 Virtual SCSI Disk handleID: 8194 (scsi0:0)
 Virtual SCSI Disk handleID: 8195 (scsi0:1)
 }
...

 

Start gathering statistics on World ID // Give it some time while it captures statistics

~ # /usr/lib/vmware/bin/vscsiStats -s -w 44670
 vscsiStats: Starting Vscsi stats collection for worldGroup 44670, handleID 8194 (scsi0:0)
 Success.
 vscsiStats: Starting Vscsi stats collection for worldGroup 44670, handleID 8195 (scsi0:1)
 Success.

Get the IO length histogram to find the most dominant IO length

You want the IO length histogram for the hard disk on which you will enable vFRC, in this case scsi0:1

(-c means compressed output)

~ # /usr/lib/vmware/bin/vscsiStats -p ioLength -c -w 44670
...
Histogram: IO lengths of Write commands,virtual machine worldGroupID,44670,virtual disk handleID,8195 (scsi0:1)
 min,4096
 max,409600
 mean,21198
 count,513
 Frequency,Histogram Bucket Limit
 0,512
 0,1024
 0,2048
 0,4095
 174,4096
 0,8191
 6,8192
 1,16383
 311,16384
 4,32768
 1,49152
 0,65535
 2,65536
 1,81920
 1,131072
 1,262144
 11,524288
 0,524288
...

As you can see, in this specific case, 16384 (16K) is the most dominant IO length, and this is what you should use in the Advanced options.
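If you prefer not to eyeball the histogram, a small awk one-liner can pick out the bucket with the highest frequency (a sketch, assuming the busybox awk on ESXi and the example world ID 44670; note it scans every histogram in the output, so grep the section you care about first if you only want writes):

~ # /usr/lib/vmware/bin/vscsiStats -p ioLength -c -w 44670 | awk -F, '/^ *[0-9]+,[0-9]+ *$/ { gsub(/ /,""); if ($1+0 > max) { max=$1+0; bucket=$2 } } END { print "Most frequent bucket limit:", bucket, "bytes (" max " IOs)" }'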

flash8

Now you are using Virtual Flash Read Cache on that VM’s hard disk, which should improve performance.

Monitor your vFRC

Log in to the ESXi host running the workload/VM for which you enabled vFRC. In the example below it is a 24GB cache with a 4K block size:

# List physical Flash devices
 ~ # esxcli storage vflash device list
 Name Size Is Local Is Used in vflash Eligibility
 -------------------- ----- -------- ----------------- ---------------------------------
 naa.500a07510c06bf6c 95396 true true It has been configured for vflash
 naa.500a0751039c39ec 95396 true true It has been configured for vflash
# Show virtual disks configured for vFRC. You will find the vmdk name for the virtual disk in the cache list:
 ~ # esxcli storage vflash cache list
 vfc-101468614-myvm_2
# Get Statistics about the cache
~ # esxcli storage vflash cache stats get -c vfc-101468614-myvm_2
   Read:
         Cache hit rate (as a percentage): 60
         Total cache I/Os: 8045314
         Mean cache I/O latency (in microseconds): 3828
         Mean disk I/O latency (in microseconds): 13951
         Total I/Os: 13506424
         Mean IOPS: 249
         Max observed IOPS: 1604
         Mean number of KB per I/O: 627
         Max observed number of KB per I/O: 906
         Mean I/O latency (in microseconds): 4012
         Max observed I/O latency (in microseconds): 6444
   Evict:
         Last I/O operation time (in microseconds): 0
         Number of I/O blocks in last operation: 0
         Mean blocks per I/O operation: 0
   Total failed SSD I/Os: 113
   Total failed disk I/Os: 1
   Mean number of cache blocks in use: 5095521

There is a lot of important information here:
The cache hit rate shows the percentage of read I/Os served from the cache; a higher number is better because it means more reads hit the cache instead of the backing disk.
Other important items are the IOPS and latency figures.

These stats also show information that can help you right-size your cache. If you see a high number of cache evictions (Evict -> Mean blocks per I/O operation), it could be an indication that your cache is too small or that the cache block size is incorrectly configured.

To calculate the number of available blocks in the cache, do the following:
SizeOfCache (in bytes) / BlockSizeOfCache (in bytes) = #ofBlocksInvFRC

For the example, a 24GB cache with a 4K block size will have 6291456 blocks in the vFRC:

25769803776 / 4096 = 6291456
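A quick way to do the same math from any shell (a trivial sketch; 24GB and a 4K block size are the values from the example):

# echo $(( 24 * 1024 * 1024 * 1024 / 4096 ))
6291456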

 

In the stats above we see 5095521 as the mean number of cache blocks in use and no evictions, which indicates that a 24GB cache with a 4K block size is correctly sized for this workload.

Keep monitoring your cache to gain as much performance as you can from your Flash/SSD devices.
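To keep an eye on it over time, you can wrap the stats command from above in a simple loop on the ESXi host (a sketch; replace the cache name with the one returned by 'esxcli storage vflash cache list'):

~ # while true; do date; esxcli storage vflash cache stats get -c vfc-101468614-myvm_2 | grep -E "hit rate|Mean number of cache blocks"; sleep 60; done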

If you are running your VMware infrastructure on NetApp storage, you can use NetApp’s Virtual Storage Console (VSC), which integrates with vCenter to provide a strong, fully integrated solution for managing your storage from within vCenter.

With VSC you can discover, monitor health and capacity, provision, perform cloning, backups and restores, as well as optimize your ESX hosts and misaligned VMs.

The use case I will write about is the ability to take a backup of all of your production datastores and initiate a SnapMirror transfer to DR.

Installing NetApp’s Virtual Storage Console

Download the software from NetApp’s website (credentials required), under the software section: VSC_vasavp-5-0.zip (current version as of this post)

Install it on a Windows system (this can be the vCenter server if you are using a Windows vCenter)

There are currently a couple of bugs in version 5.0 that can be worked around by following the articles below (hopefully NetApp will fix them soon):

http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=821600

and

http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=767444

Follow the wizard…

 

smvi1 smvi2

 

Select Backup and Recovery to be able to use these features
smvi3 smvi4 smvi5

 

You may get a warning here; this is where you need to follow the bug fixes mentioned earlier (adding a line to smvi.override)

Then you need to enter the information requested:

Plugin service information: the hostname/IP of the server where you installed VSC (in this case it was the vCenter server)

Then enter the vCenter information

smvi6

Check that the registration was successful

smvi7

Verify that it is installed in the vCenter Web Client

smvi8

 

Configure the NetApp Virtual Storage Console from the vCenter Web Client

On the vCenter Web Client click on the Virtual Storage Console icon

smvi9

Click on ‘Storage Systems’ and add your NetApp controllers, including your DR controllers (you will need them to successfully initiate SnapMirror after backups)

smvi10

Once you have added them, you will be able to see their details and status; take a look at the summary and related objects. Also click on the ‘View Details’ link(s), which provide a wealth of information about your storage

smvi11

Go back to the main page of the Virtual Storage Console and you will see global details

smvi12

With the above setup you can start provisioning storage, create backups/restores, mount snapshots and look at the details of every object from a storage perspective. Take a look at the Datacenter, Datastores and VMs.

smvi13

smvi14

 

Configure Datastore Backups followed by NetApp SnapMirror for Disaster Recovery

Pre-requisites:

You need to have an initialized SnapMirror relationship

prod-filer> vol size vm_datastore
vol size: Flexible volume 'vm_datastore' has size 500g.
dr-filer>  vol create vm_datastore_snapmirror aggr0 500g
dr-filer> vol restrict vm_datastore_snapmirror
dr-filer> snapmirror initialize -S prod-filer:vm_datastore dr-filer:vm_datastore_snapmirror

Create an empty schedule by adding the following line to /etc/snapmirror.conf

prod-filer:vm_datastore   dr-filer:vm_datastore_snapmirror    - - - - -
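Before moving on, you can confirm that the relationship exists and is idle (standard 7-mode command; the exact output will vary, but you want to see the destination in a ‘snapmirrored’ state):

dr-filer> snapmirror status vm_datastore_snapmirror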

Ensure you have added your production NetApp controllers as well as your DR controllers in the Virtual Storage Console in the vCenter Web Client

Configuration:

In vCenter Web Client, go to your Datastores view.

(Optional but recommended) Enable Deduplication in your Datastores

// This will save storage and increase the efficiency of the replication because you will only replicate deduplicated data. To do so:

Right click on a Datastore -> NetApp VSC -> Deduplication -> Enable

Right click on a Datastore -> NetApp VSC -> Deduplication -> Start (Select to scan the entire volume)

smvi15

By default the deduplication process is scheduled daily at midnight; I recommend having it run at least 2 hours before the SnapMirror replication.

For example:

Deduplication: daily at 8pm

SnapMirror: daily at 10pm

To change the default schedule of the deduplication process per volume, run the following on the NetApp controller CLI:

prod-filer> sis config -s sun-sat@20 /vol/vm_datastore
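You can verify both the schedule and the deduplication state afterwards (standard 7-mode commands):

prod-filer> sis config /vol/vm_datastore
prod-filer> sis status /vol/vm_datastore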

Schedule the Backup and SnapMirror Update

Right click on a Datastore -> NetApp VSC -> Backup -> Schedule Backup

smvi16

smvi17

smvi18

smvi19

smvi20

 

smvi21

 

Add other Datastores to the same backup job (remember that for the SnapMirror update to work you need to have pre-created the SnapMirror relationship).

Right click on the other Datastores -> NetApp VSC -> Backup -> Add to Backup Job

You will see the already created backup job (10pm_backup), select it and click ok.

smvi22

At this point, all the Datastores you selected will be deduplicated, backed up and replicated to the DR site.

Restoring on the Prod or DR site

Now that NetApp VSC is set up, backing up and replicating data, we can restore at will from the snapshots.

Restore a VM (entire VM or some of its virtual disks)

Right click on VM -> NetApp VSC -> Restore

Select backup from the list and choose to restore entire VM or just some disks

Restore from Datastore

Right click on Datastore -> NetApp VSC -> Restore

Select backup from the list and choose what to restore

Mount a Snapshot (it will show as another Datastore and you can retrieve files or even start VMs)

Click on a Datastore and go to Related Objects -> Backups

Select Backup, Right-Click and select Mount

You will see the datastore presented and mounted to one ESX host; from there you can retrieve files, start VMs, etc.

Once you are done go back to the Datastore and unmount the Backup.

 

In this guide I will go through the process of upgrading a NetApp cluster’s Data OnTap, RLM, disk and shelf firmware in a non-disruptive manner.

The following process is for a FAS3040 cluster, but it should work on other series.

Environment:
FAS3040 cluster
OS: DOT 8.0.3P2 7-mode
shelves:
– DS14MK2 (both FC and SATA)
– DS4243 (both SAS and SATA)

Information gathering
Do a sysconfig -v and check for the following:

...
System Storage Configuration: Multi-Path HA   /// This tells you that your system is multipathed from a controller to shelf perspective
...
Remote LAN Module           Status: Online
		Part Number:        110-XXXXX
		Revision:           XX
		Serial Number:      XXXXX
		Firmware Version:   4.0       // It is very important to use the latest RLM/SP version (this is your out of band access to the system)
		Mgmt MAC Address:   XXXXXXXXX
		Ethernet Link:      up
		Using DHCP:         no
...
...
                60: NETAPP   X267_HKURO500SSX AB0A 423.1GB (976642092 512B/sect) // Check your disk firmware (AB0A)
                61: NETAPP   X267_HKURO500SSX AB0A 423.1GB (976642092 512B/sect) 
                Shelf 1: AT-FCX  Firmware rev. AT-FCX A: 38  AT-FCX B: 38     // Check your module firmware (AT-FCX A: 38) for FC-connected shelves
		Shelf 2: AT-FCX  Firmware rev. AT-FCX A: 38  AT-FCX B: 38
...
...
                11.22: NETAPP   X308_HMARK03TSSM NA01 2538.5GB (5860533168 512B/sect) // Check your disk firmware (NA01)
                11.23: NETAPP   X308_HMARK03TSSM NA01 2538.5GB (5860533168 512B/sect)
		Shelf   0: IOM3  Firmware rev. IOM3 A: 0132 IOM3 B: 0132      // Check your module firmware (IOM3 A: 0132) for SAS-connected shelves
		Shelf  10: IOM3  Firmware rev. IOM3 A: 0132 IOM3 B: 0132

...
...

Usually when I perform an OnTap upgrade, I take the opportunity (or it may be a requirement) to update the disk and shelf firmware as well.
You need to get the disk, shelf and RLM/SP firmware from NetApp’s support site: support.netapp.com

Steps:
1) Upgrade your RLM/SP
Download the latest RLM/SP (4.1) from: https://support.netapp.com/NOW/download/tools/rlm_fw/

Check your RLM/SP version (in this case it is RLM)

toaster> rlm status
	Remote LAN Module           Status: Online
		Part Number:        110-xxx
		Revision:           xx
		Serial Number:      xxxxx
		Firmware Version:   4.0
		Mgmt MAC Address:   xxxxxxxxxxxxx
		Ethernet Link:      up
		Using DHCP:         no
	IPv4 configuration:
		IP Address:         xxxxxxxxx
		Netmask:            xxxxxxxxx
		Gateway:            xxxxxxxxx

Place the RLM_FW.zip on the NetApp controller, under $etc/software, then:

toaster> software list
..
RLM_FW.zip
...

toaster> software install RLM_FW.zip

toaster> priv set advanced

toaster*> rlm update -f

Note: You must enter the -f option.
...The update takes approximately 30 minutes.
...

When the system prompts you to reboot the RLM, enter y to continue.

Verify:

toaster> rlm status
	Remote LAN Module           Status: Online
		Part Number:        110-xxx
		Revision:           xx
		Serial Number:      xxxxx
		Firmware Version:   4.1
		Mgmt MAC Address:   xxxxxxxxxxxxx
		Ethernet Link:      up
		Using DHCP:         no
	IPv4 configuration:
		IP Address:         xxxxxxxxx
		Netmask:            xxxxxxxxx
		Gateway:            xxxxxxxxx

2) Upgrade the disk firmware for all the disks that are outdated (do this the night before the DOT upgrade)
To perform the disk FW upgrade in the background, check that the following option is enabled:

toaster> options raid.background_disk_fw_update.enable
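If the option comes back off, you can turn it on before dropping the new firmware files in place (background disk firmware updates are the default on current 7-mode releases, so this is usually already set):

toaster> options raid.background_disk_fw_update.enable on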

From the ‘sysconfig -v‘:
11.22: NETAPP X308_HMARK03TSSM NA01 2538.5GB (5860533168 512B/sect)
Disk X308_HMARK03TSSM with firmware NA01 needs to be upgraded to NA04

Download the latest firmware from: http://support.netapp.com/NOW/download/tools/diskfw/
Place the .LOD file under $etc/disk_fw

You will see the disks start upgrading in the background, non-disruptively

3) Upgrade your shelf firmware (same day as DOT upgrade)

Download the latest firmware from: https://support.netapp.com/NOW/download/tools/diskshelf/
Copy the .SFW file and the .FVF file if present to the $etc/shelf_fw and .AFW and its .FVF file to the $etc/acpp_fw directory.

4) Upgrade OnTap
Download ontap from NetApp’s site- in this case 8.1.2
Check its md5 checksum against what NetApp posts on the download page to make sure your image is good.
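For example, from the Linux/Unix box you used for the download (assuming md5sum is available; compare the hash against the value shown on the NetApp download page):

$ md5sum 812_q_image.tgz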

Since we are doing an NDU (non-disruptive upgrade), make sure a single controller can handle your entire load

sysstat -c 10 -x 3  // Check the CPU to make sure it does not go over 50%
toaster> sysstat -c 10 -x 3
 CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
  5%      0      0      0      85       0      0     803     11       0      0    26     92%    5%  Tf    7%       0     58      0     283    831       0      0
  4%      0      0      0     101       0      0    1147   3140       0      0    26     94%   17%  :    10%       0    101      0     449    996       0      0
  4%      1      0      0     105       0      0     576     11       0      0    26     87%    0%  -     6%       0    104      0     315    140       0      0
  3%      1      0      0      59       0      0     371      8       0      0    26     91%    0%  -     7%       0     58      0     379    844       0      0
  6%      0      0      0     111       0      0    2383   4595       0      0     1     93%   37%  T    10%       1     83      0     260     28       0      0
  3%      0      0      0      36       0      0     349      8       0      0     1     91%    0%  -     8%       0     36      0     149    902       0      0
  4%      1      0      0      38       0      0     480     11       0      0     1     90%    0%  -    16%       0     37      0     312    853       0      0
  4%      1      0      0      98       0      0     379     11       0      0     1     92%    0%  -     7%       0     70      0     347   1107       0      0
  5%      0      0      0      65       0      0    1483   3224       0      0     1     95%   24%  T    12%       0     65      0     334    897       0      0
  4%      0      0      0      77       0      0     349     11       0      0     1     86%    0%  -     6%       0     77      0     235     33       0      0

On both NetApp controllers:
Download the system files for 8.1.2 (812_q_image.tgz) from the Support Site. Be sure to download the system files that match your node model.
If you are performing a Data ONTAP NDU (or backout), you must perform this step on both nodes before performing the takeover and giveback steps.

Copy 812_q_image.tgz to $etc/software

Make sure that it is there:

toaster> software list
...
812_q_image.tgz
...

Let NetApp know you are starting the NDU upgrade:

toaster> options autosupport.doit "Staring_NDU 8.1.2"

Start the upgrade (-r prevents automatic reboot)

toaster> software update 812_q_image.tgz -r
software: You can cancel this operation by hitting Ctrl-C in the next 6 seconds.
software: Depending on system load, it may take many minutes
software: to complete this operation. Until it finishes, you will
software: not be able to use the console.
cmd = ngsh -c system image update -node local -package file://localhost/mroot/etc/software/812_q_image.tgz -setdefault true
...
...
Installed MD5 checksums pass
Installing diagnostics and firmware files
Installation complete. image1 updated on node TOASTER

image1 has been set as the default

software: installation of 812_q_image.tgz completed.
Please type "reboot" for the changes to take effect.     // DO NOT TYPE REBOOT, WE WILL TAKEOVER

Check the version

toaster> version -b
/cfcard/x86_64/freebsd/image1/kernel: OS 8.1.2
/cfcard/x86_64/freebsd/image2/kernel: OS 8.0.3
...
...

Now, use this opportunity to update the shelf firmware

toaster> storage download shelf
Downloading disk shelf firmware may take up to 10 minutes,
but will NOT disrupt client access during that time.

Are you sure you want to continue with shelf firmware update? yes
...
...
Tue Feb 26 11:06:11 EST [toaster: sfu.downloadSuccess:info]: [storage download shelf]: Firmware file IOM3.0152.SFW downloaded on 2c.shelf0.
Tue Feb 26 11:06:11 EST [toaster: sfu.downloadSuccess:info]: [storage download shelf]: Firmware file IOM3.0152.SFW downloaded on 2c.shelf10.
Tue Feb 26 11:06:11 EST [toaster: sfu.downloadSuccess:info]: [storage download shelf]: Firmware file IOM3.0152.SFW downloaded on 2c.shelf11.
Tue Feb 26 11:06:11 EST [toaster: sfu.downloadSummary:info]: Shelf firmware updated on 3 shelves.  // You are done
toaster> 

Perform the same process on the other NetApp controller

toaster2> software update 812_q_image.tgz -r
..
toaster2> version -b
..
toaster2> storage download shelf
[storage download shelf]: No shelves eligible for update   // You already did this on the other controller, this is to verify

Now that both controllers have the 8.1.2 DOT version installed, it is time for a takeover in an NDU manner, which will reboot the controller being taken over

From controller1 (toaster)

toaster> cf status
Cluster enabled, toaster2 is up.
toaster> cf takeover
..
..
toaster(takeover)>

You should wait about 10 minutes before giving back, to give the clients an opportunity to stabilize.
On the other controller, you will see (after it reboots):

Waiting for giveback...(Press Ctrl-C to abort wait)

…After 10 minutes

toaster> cf giveback
...
...

Check the second controller(toaster2) to ensure that it is running 8.1.2

toaster2> version
toaster2> sysconfig

Wait about 10 minutes, then from toaster2 takeover toaster

toaster2> cf takeover -n   // The -n option allows takeover when the ONTAP versions are incompatible, in this case 8.0.3 and 8.1.2
cf: ignoring version mismatch as part of NDU takeover
cf: takeover initiated by operator
...
...

You will see on toaster

Waiting for giveback...(Press Ctrl-C to abort wait)

Now it is time to give back services
On toaster2:

toaster2> cf giveback
...
...

Check the first controller (toaster) to ensure that it is running 8.1.2

toaster> version
toaster> sysconfig

Let NetApp know you are done:

toaster> options autosupport.doit "finishing_NDU 8.1.2"

That is it: RLM, disk firmware, shelf firmware and DOT were all upgraded in a non-disruptive manner. You can verify by running ‘sysconfig -v’

Many companies buy wildcard certificates for many reasons: price, management, flexibility, etc.

The following guide shows how to install a wildcard certificate from DigiCert on your NetApp controllers.

You will need the following 3 files in PEM format:
DigiCertCA.pem // This is the Certificate Authority, in this case from DigiCert
wildcard_example_com.pem // This is the wildcard certificate
wildcard_example_com_key.pem // This is the private key

1) Stop SSL on the NetApp controller
filer> secureadmin disable ssl

Now From a Linux/Unix system:

2) mount the NetApp’s vol0
LinuxStation# mkdir /mnt/filer
LinuxStation# mount filer.example.com:/vol/vol0 /mnt/filer

3) Go to the keymgr folder and backup the current certificate and key.

# Backup Certificate
LinuxStation# cd /mnt/filer/etc/keymgr/cert/
LinuxStation:/mnt/filer/etc/keymgr/cert/# mv secureadmin.pem secureadmin.pem.bak

# Backup Key
LinuxStation# cd /mnt/filer/etc/keymgr/key/
LinuxStation:/mnt/filer/etc/keymgr/key/# mv secureadmin.pem secureadmin.pem.bak

4) Create the new files based on the wildcard certificate files, assuming you placed them on /opt/certificates

# Create Certificate
LinuxStation# cd /opt/certificates/
LinuxStation:/opt/certificates/# cat wildcard_example_com.pem DigiCertCA.pem > secureadmin_cert.pem
LinuxStation# mv /opt/certificates/secureadmin_cert.pem /mnt/filer/etc/keymgr/cert/secureadmin.pem

# Create Key
LinuxStation# cd /opt/certificates/
LinuxStation:/opt/certificates/# cat wildcard_example_com_key.pem > secureadmin_key.pem
LinuxStation# mv /opt/certificates/secureadmin_key.pem /mnt/filer/etc/keymgr/key/secureadmin.pem
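Optionally, before re-enabling SSL, verify from the Linux station that the certificate and the private key actually belong together (assumes openssl is installed; the two MD5 hashes must match):

LinuxStation# openssl x509 -noout -modulus -in wildcard_example_com.pem | openssl md5
LinuxStation# openssl rsa -noout -modulus -in wildcard_example_com_key.pem | openssl md5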

5) On the NetApp controller, add the new cert:
filer> secureadmin addcert ssl /etc/keymgr/cert/secureadmin.pem

6) Enable SSL
filer> secureadmin enable ssl

NetApp Data ONTAP 8 comes with a feature called Data Motion, which moves volumes between aggregates with no disruption.
But for environments running ONTAP 7.x that need to migrate volumes from one aggregate to another, there is ndmpcopy or SnapMirror.

I had the task of moving all data from old NetApp shelves onto new shelves, which really meant migrating volumes from the aggregates on the old shelves to aggregates on the new shelves.

For this guide I am going to use SnapMirror, and the task is to migrate the volume ‘oldvol’, sitting on the aggregate ‘oldaggr’, to the volume ‘newvol’, which will sit on the aggregate ‘newaggr’. All of this happens on the same NetApp controller; I am not migrating to another controller in this instance, this is just to decommission the old shelves.

Filer1:oldaggr:oldvol->Filer1:newaggr:newvol

1) Check that you have SnapMirror license

filer> license
snapmirror XXXXXX

* If you don’t have one, you will need to purchase and install it.

2) Add the controller (in this case it is the same controller) to the allowed SnapMirror hosts

options snapmirror.access host=filer1

3) Enable SnapMirror

options snapmirror.enable on

4) Create the SnapMirror destination volume. The size of the destination volume must be at least the same size as the original volume

vol create newvol newaggr 100G
// The original volume oldvol is also 100G

5) Restrict your destination volume to leave it ready for SnapMirror

vol restrict newvol

6) You can schedule replication to happen often; that way, when you are ready to migrate, less data will need to be transferred during the cut-over. I ran scheduled replication every night at 10:00 PM, let it run during the weekdays and cut over to the new location on Saturday morning.

Add the schedule to /etc/snapmirror.conf

FILER1:oldvol FILER1:newvol - 0 2 * *

7) At this point we are ready to start the SnapMirror relationship

snapmirror initialize -S FILER1:oldvol FILER1:newvol

8) Monitor the status of the replication

snapmirror status

9) At this point we are ready to cut over to the new shelves/aggregate. If you have a LUN in the volume, you might want to disconnect the server that attaches to the LUN, either by disconnecting/unmapping the LUN from the server, or by bringing the server down while you perform this maintenance.
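Optionally, run one last manual SnapMirror update right before the cut-over so the migrate step only has a small delta to transfer (same source and destination as above):

snapmirror update -S FILER1:oldvol FILER1:newvol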

10) Now run the migration, which will do the following:

  • Performs a SnapMirror incremental transfer to the destination volume.
  • Stops NFS and CIFS services on the entire storage system with the source volume.
  • Migrates NFS file handles to the destination volume.
  • Makes the source volume restricted.
  • Makes the destination volume read-write.

filer1> snapmirror migrate oldvol newvol

11) snapmirror migrate will migrate the NFS file handles, but you will need to re-establish CIFS connections and remap the igroups to the new LUN paths

NetApp appliances support link aggregation of their network interfaces; NetApp calls the link aggregate a VIF (Virtual Interface), and it provides fault tolerance, load balancing and higher throughput.

NetApp supports the following Link Aggregation modes:

From the NetApp documentation:
Single-mode vif
In a single-mode vif, only one of the interfaces in the vif is active. The other interfaces are on standby, ready to take over if the active interface fails.
Static multimode vif
The static multimode vif implementation in Data ONTAP is in compliance with IEEE 802.3ad (static). Any switch that supports aggregates, but does not have control packet exchange for configuring an aggregate, can be used with static multimode vifs.
Dynamic multimode vif
Dynamic multimode vifs can detect not only the loss of link status (as do static multimode vifs), but also a loss of data flow. This feature makes dynamic multimode vifs compatible with high-availability environments. The dynamic multimode vif implementation in Data ONTAP is in compliance with IEEE 802.3ad (dynamic), also known as Link Aggregation Control Protocol (LACP).

In this guide I will set up a Dynamic multimode vif between the NetApp system and the Cisco switches using LACP.

I am working with following hardware:

  • 2x NetApp FAS3040c in an active-active cluster
    With Dual 10G Ethernet Controller T320E-SFP+
  • 2x Cisco WS-C6509 configured as one Virtual Switch (using VSS)
    With Ten Gigabit Ethernet interfaces

Cisco Configuration:

Port-Channel(s) configuration:
// I am using Port-Channel 8 and 9 for this configuration
// And I need my filers to be in VLAN 10

!
interface Port-channel8
description LACP multimode VIF for filer1-10G
switchport
switchport access vlan 10
switchport mode access
!
interface Port-channel9
description LACP multimode VIF for filer2-10G
switchport
switchport access vlan 10
switchport mode access
!

Interface Configuration:
// Since I am using VSS, my 2 Cisco 6509 look like 1 Virtual Switch
// For example: interface TenGigabitEthernet 2/10/4 means:
// interface 4, on blade 10, on the second 6509

!
interface TenGigabitEthernet1/10/1
description "filer1_e1a_net 10G"
switchport access vlan 10
switchport mode access
channel-group 8 mode active
spanning-tree portfast
!
!
interface TenGigabitEthernet2/10/1
description "filer1_e1b_net 10G"
switchport access vlan 10
switchport mode access
channel-group 8 mode active
spanning-tree portfast
!
!
interface TenGigabitEthernet1/10/2
description "filer2_e1a_net 10G"
switchport access vlan 10
switchport mode access
channel-group 9 mode active
spanning-tree portfast
!
!
interface TenGigabitEthernet2/10/2
description "filer2_e1b_net 10G"
switchport access vlan 10
switchport mode access
channel-group 9 mode active
spanning-tree portfast
!

Check the Cisco configuration

6509-1#sh etherchannel sum
...
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
...
8    Po8(SU)       LACP      Te1/10/1(P)     Te2/10/1(P)     
9    Po9(SU)       LACP      Te1/10/2(P)    Te2/10/2(P)    
...

NetApp Configuration:

filer1>vif create lacp net10G -b ip e1a e1b
filer1>ifconfig net10G 10.0.0.100 netmask 255.255.255.0
filer1>ifconfig net10G up

filer2>vif create lacp net10G -b ip e1a e1b
filer2>ifconfig net10G 10.0.0.200 netmask 255.255.255.0
filer2>ifconfig net10G up

Don’t forget to make the change persistent

Filer1:: /etc/rc
hostname FILER1
vif create lacp net10G -b ip e1b e1a
ifconfig net10G `hostname`-net10G mediatype auto netmask 255.255.255.0 partner net10G
route add default 10.0.0.1 1
routed on
options dns.domainname example.com
options dns.enable on
options nis.enable off
savecore

Filer2:: /etc/rc
hostname FILER2
vif create lacp net10G -b ip e1b e1a
ifconfig net10G `hostname`-net10G mediatype auto netmask 255.255.255.0 partner net10G
route add default 10.0.0.1 1
routed on
options dns.domainname example.com
options dns.enable on
options nis.enable off
savecore

Check the NetApp configuration

FILER1> vif status net10G
default: transmit 'IP Load balancing', VIF Type 'multi_mode', fail 'log'
net10G: 2 links, transmit 'IP Load balancing', VIF Type 'lacp' fail 'default'
         VIF Status     Up      Addr_set 
        up:
        e1a: state up, since 05Nov2010 12:37:59 (00:06:23)
                mediatype: auto-10g_sr-fd-up
                flags: enabled
                active aggr, aggr port: e1b
                input packets 1338, input bytes 167892
                input lacp packets 101, output lacp packets 113
                output packets 203, output bytes 20256
                up indications 13, broken indications 6
                drops (if) 0, drops (link) 0
                indication: up at 05Nov2010 12:37:59
                        consecutive 0, transitions 22
        e1b: state up, since 05Nov2010 12:34:56 (00:09:26)
                mediatype: auto-10g_sr-fd-up
                flags: enabled
                active aggr, aggr port: e1b
                input packets 3697, input bytes 471398
                input lacp packets 89, output lacp packets 98
                output packets 153, output bytes 14462
                up indications 10, broken indications 4
                drops (if) 0, drops (link) 0
                indication: up at 05Nov2010 12:34:56
                        consecutive 0, transitions 17


This post is aimed at helping administrators keep Linux home directories in a centralized location and mount them on demand using the automounter.
NOTE: Each user should have a unique uid/gid

NFS Server:
Any NFS Server will do just fine.
I will use NetApp NFS since this is for a production environment.
filer.example.com

RHEL Client:
RHEL 5.3 64bit
rhelbox.example.com

Users:
john uid=2100 gid=2100
alex uid=2101 gid=2101

NetApp NFS Server Setup:
1) Create a volume to host your home directories

filer> vol create homedirs aggr1 200g

2) Export the volume to the specific RHEL client (exportfs -p persists the rule to /etc/exports):

filer> exportfs -p rw=rhelbox.example.com,root=rhelbox.example.com /vol/homedirs
filer> exportfs -a
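From the RHEL client you can confirm that the export is visible before mounting anything (showmount ships with the nfs-utils package):

[root@rhelbox ~]# showmount -e filer.example.com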

RHEL Client Configuration:

1) As root, mount the volume anywhere on the system. (This is only to create the home directories and assign the proper ownership; we will unmount it afterwards.)
[root@rhelbox ~]# mkdir /mnt/homedirs
[root@rhelbox ~]# mount filer.example.com:/vol/homedirs /mnt/homedirs/
[root@rhelbox ~]# mount

filer:/vol/homedirs on /mnt/homedirs type nfs (rw,addr=rhelbox.example.com)

2) Create the home directories and assign proper ownership
[root@rhelbox ~]# mkdir /mnt/homedirs/{john,alex}

[root@rhelbox ~]# id john
uid=2100(john) gid=2100 groups=2100
[root@rhelbox ~]# chown 2100:2100 /mnt/homedirs/john/

[root@rhelbox ~]# id alex
uid=2101(alex) gid=2101 groups=2101
[root@rhelbox~]# chown 2101:2101 /mnt/homedirs/alex/

3) Copy the files from /etc/skel to the new home directory
[root@rhelbox ~]# for i in john alex; do cp /etc/skel/.* /mnt/homedirs/$i/; done

4) Unmount the temporary folder
[root@rhelbox~]# umount /mnt/homedirs
[root@rhelbox~]# rmdir /mnt/homedirs

5) Configure the Automounter
Enter the following in /etc/auto.master
/home /etc/auto.home --timeout=60

Create /etc/auto.home and populate as follows:
* -fstype=nfs,rw,nosuid,soft filer.example.com:/vol/homedirs/&

6) Restart the automounter
[root@rhelbox ~]# service autofs restart

7) That should be it; let’s give it a try
[root@rhelbox ~]# su - john
[john@rhelbox ~]$ ls -A
.bash_history .bash_logout .bash_profile .bashrc
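You can also confirm that the automounter did its job; you should see an NFS mount for /home/john backed by filer.example.com:/vol/homedirs/john (it appears on first access and expires after the 60-second timeout):

[root@rhelbox ~]# mount | grep homedirs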

This post will help you configure multipathing on RHEL 5.3 for LUNs carved from a NetApp SAN. For this guide I am using an HP C-Class blade system with QLogic HBA cards.

1) Make sure the required packages are installed on RHEL; otherwise install them.

rpm -q device-mapper
rpm -q device-mapper-multipath
yum install device-mapper
yum install device-mapper-multipath

2) Install the QLogic drivers if needed, or use the built-in RHEL drivers. In my case I am using HP C-Class blades with QLogic HBA cards. The HP drivers can be found on the HP site (the driver is called hp_sansurfer). I am using the RHEL built-in drivers, but you can install the HP/QLogic drivers as follows:

rpm -Uvh hp_sansurfer-5.0.1b45-1.x86_64.rpm

3) If you have QLogic HBAs, install the SanSurfer CLI. This is a very useful program for managing QLogic HBA cards; it can be downloaded from the QLogic website and installed as follows:

rpm -Uvh scli-1.7.3-14.i386.rpm

4) Install the NetApp Host Utilities Kit. The package is a tar.gz file; you can find it on the NOW site, http://now.netapp.com.

Extract it and run the install shell script

netapp_linux_host_utilities_5_0.tar.gz

5) Once everything is installed on the host, create the LUN and zone it end to end: NetApp (igroup/LUN mapping), Brocade (SAN fabric zoning) and the host

To find your WWPNs, use the scli as follows:
# scli -i all
// Use the WWPN numbers for the igroup and the Brocade aliases

6) Once the LUN has been zoned and mapped correctly, verify that your RHEL host can see it.

// Rescan HBA for new SAN Luns

# modprobe -r qla2xxx
# modprobe qla2xxx
// Check the kernel can see it
# cat /proc/scsi/scsi
# fdisk -lu

7) Utilize NetApp tools to see LUN connectivity

// Check your host and utilities see the LUNs
[root@server ~]# sanlun lun show
controller:          lun-pathname          device filename  adapter  protocol  lun size             lun state
NETAPPFILER:  /vol/servervol/serverlun  /dev/sdf         host6    FCP       100g (107374182400)  GOOD
NETAPPFILER:  /vol/servervol/serverlun  /dev/sda         host4    FCP       100g (107374182400)  GOOD
NETAPPFILER:  /vol/servervol/serverlun  /dev/sde         host6    FCP       100g (107374182400)  GOOD
NETAPPFILER:  /vol/servervol/serverlun  /dev/sdc         host5    FCP       100g (107374182400)  GOOD
NETAPPFILER:  /vol/servervol/serverlun  /dev/sdd         host5    FCP       100g (107374182400)  GOOD
NETAPPFILER:  /vol/servervol/serverlun  /dev/sdb         host4    FCP       100g (107374182400)  GOOD

8) Use the NetApp tools to check multipathing; it is not configured yet

[root@server ~]# sanlun lun show -p
NETAPPFILER:/vol/servervol/serverlun (LUN 0)                Lun state: GOOD
Lun Size:    100g (107374182400) Controller_CF_State: Cluster Enabled
Protocol: FCP           Controller Partner: NETAPPFILER2
Multipath-provider: NONE
--------- ---------- ------- ------------ --------------------------------------------- ---------------
   sanlun Controller                                                            Primary         Partner
   path         Path   /dev/         Host                                    Controller      Controller
   state        type    node          HBA                                          port            port
--------- ---------- ------- ------------ --------------------------------------------- ---------------
     GOOD  primary       sdf        host6                                            0c              --
     GOOD  secondary     sda        host4                                            --              0c
     GOOD  secondary     sde        host6                                            --              0c
     GOOD  secondary     sdc        host5                                            --              0d
     GOOD  primary       sdd        host5                                            0d              --
     GOOD  primary       sdb        host4                                            0c              --

Time to configure multipathing

9) Start the multipath daemon

# service multipathd start

10) Find your WWID; this will be needed in the configuration if you want to alias it.

Comment out the blacklist in the default /etc/multipath.conf, otherwise you will NOT see anything.

#blacklist {
#        devnode "*"
#}
// Show your devices and paths, and record the WWID of the LUN
# multipath -v3
...
...
===== paths list =====
uuid                              hcil    dev dev_t pri dm_st  chk_st  vend/pr
360a98000486e576748345276376a4d41 4:0:0:0 sda 8:0   1   [undef][ready] NETAPP,
360a98000486e576748345276376a4d41 4:0:1:0 sdb 8:16  4   [undef][ready] NETAPP,
360a98000486e576748345276376a4d41 5:0:0:0 sdc 8:32  1   [undef][ready] NETAPP,
360a98000486e576748345276376a4d41 5:0:1:0 sdd 8:48  4   [undef][ready] NETAPP,
360a98000486e576748345276376a4d41 6:0:0:0 sde 8:64  1   [undef][ready] NETAPP,
360a98000486e576748345276376a4d41 6:0:1:0 sdf 8:80  4   [undef][ready] NETAPP,
...
...

11) Now you are ready to configure /etc/multipath.conf

Exclude (blacklist) all the devices that do not correspond to any
LUNs configured on the storage controller and mapped to your Linux host.
There are 2 methods:
Blacklist by WWID
Blacklist by devnode
In this case I am blacklisting by devnode, since I am using HP hardware and know my devnode regex.
Also configure the device section and, optionally, an alias.
The full /etc/multipath.conf will look like this:


defaults
{
        user_friendly_names yes
        max_fds max
        queue_without_daemon no
}
blacklist {
        ###devnode "*"
           devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
           devnode "^hd[a-z]"
           devnode "^cciss!c[0-9]d[0-9]*"  # Note the cciss, usual in HP
}
multipaths {
        multipath {
                wwid    360a98000486e576748345276376a4d41    # The WWID you found in the previous step
                alias   netapp # This is how you want to name the device in your host
                               # server LUN on NETAPPFILER
        }
}
devices
{
        device
        {
        vendor "NETAPP"
        product "LUN"
        getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
        prio_callout "/sbin/mpath_prio_ontap /dev/%n"
        features "1 queue_if_no_path"
        hardware_handler "0"
        path_grouping_policy group_by_prio
        failback immediate
        rr_weight uniform
        rr_min_io 128
        path_checker directio
        flush_on_last_del yes
}
}

12) Restart multipath and make sure it starts automatically:

// Restart multipath
# service multipathd restart
// Add to startup
# chkconfig --add multipathd
# chkconfig multipathd on

13) Verify multipath is working

//RHEL tools 
[root@server scli]# multipath -l
netapp (360a98000486e576748345276376a4d41) dm-2 NETAPP,LUN
[size=100G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
 \_ 4:0:1:0 sdb 8:16  [active][undef]
 \_ 5:0:1:0 sdd 8:48  [active][undef]
 \_ 6:0:1:0 sdf 8:80  [active][undef]
\_ round-robin 0 [prio=0][enabled]
 \_ 4:0:0:0 sda 8:0   [active][undef]
 \_ 5:0:0:0 sdc 8:32  [active][undef]
 \_ 6:0:0:0 sde 8:64  [active][undef]
//NetApp utilities Tool 
 [root@server scli]# sanlun lun show -p
NETAPPFILER:/vol/servervol/serverlun (LUN 0)                Lun state: GOOD
Lun Size:    100g (107374182400) Controller_CF_State: Cluster Enabled
Protocol: FCP           Controller Partner: NETAPPFILER2
DM-MP DevName: netapp   (360a98000486e576748345276376a4d41)     dm-2
Multipath-provider: NATIVE
--------- ---------- ------- ------------ --------------------------------------------- ---------------
   sanlun Controller                                                            Primary         Partner
   path         Path   /dev/         Host                                    Controller      Controller
   state        type    node          HBA                                          port            port
--------- ---------- ------- ------------ --------------------------------------------- ---------------
     GOOD  primary       sdb        host4                                            0c              --
     GOOD  primary       sdd        host5                                            0d              --
     GOOD  primary       sdf        host6                                            0c              --
     GOOD  secondary     sda        host4                                            --              0c
     GOOD  secondary     sdc        host5                                            --              0d
     GOOD  secondary     sde        host6                                            --              0c
...

14) Now you can access the LUN by using the mapper

 [root@server scli]# ls -l /dev/mapper
total 0
crw------- 1 root root  10, 63 Sep 12 12:32 control
brw-rw---- 1 root disk 253,  2 Sep 16 10:54 netapp
brw-rw---- 1 root disk 253,  0 Sep 12 16:32 VolGroup00-LogVol00
brw-rw---- 1 root disk 253,  1 Sep 12 12:32 VolGroup00-LogVol01

15) Format it to your liking and mount it

# mkdir /mnt/netapp
# mkfs -t ext3 /dev/mapper/netapp
# mount /dev/mapper/netapp /mnt/netapp/
//verify it mounted
# mount
...
...
/dev/mapper/netapp on /mnt/netapp type ext3 (rw)
...

16) If you want it to be persistent across reboots, put it in /etc/fstab and make sure multipathd starts automatically.

# cat /etc/fstab
...
...
/dev/mapper/netapp      /mnt/netapp             ext3    defaults        0 0
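Since this is a SAN-backed device, you may also want to add the _netdev mount option so the mount is deferred until networking and multipathing are up during boot (a common practice; adjust to your environment):

/dev/mapper/netapp      /mnt/netapp             ext3    defaults,_netdev        0 0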

17) If possible, reboot to check that it mounts correctly after a reboot.

You have added a new disk, increased the size of your LUN, or increased the size of a virtual disk in the case of virtual machines, and now you need to grow the partition, the logical volume and the filesystem in order to use the new space.

In this post I go through the steps necessary to make this happen in a RHEL 5.3 system.

The LUN I will increase is 20GB and has an LVM partition on it. I decided to increase the LUN size to 72GB, and this is how it looks now.

[root@server~]# fdisk -lu
Disk /dev/sdb: 77.3 GB, 77309411328 bytes
255 heads, 63 sectors/track, 9399 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 * 1 2611 20971488 8e Linux LVM

I need to perform the following steps in order to be able to use the new space.

1. Increase the size of the partition using fdisk

[root@server ~]# fdisk /dev/sdb

Command (m for help): u //Change the display to sectors
Changing display/entry units to sectors
Command (m for help): p //Print the current partition table for that drive
Disk /dev/sdb: 77.3 GB, 77309411328 bytes
255 heads, 63 sectors/track, 9399 cylinders, total 150994944 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 * 64 41943039 20971488 8e Linux LVM
Command (m for help): d //Delete the partition information, we will recreate
Selected partition 1
Command (m for help): n //Create partition
Command action
e extended
p primary partition (1-4)
p //In this case it is primary
Partition number (1-4): 1 // In this case it is the first partition on the drive
First sector (63-150994943, default 63): 64 //Align partition if used on NetApp
Last sector or +size or +sizeM or +sizeK (64-150994943, default 150994943):
Using default value 150994943
Command (m for help): t //Change type from Linux(default) to Linux LVM
Selected partition 1
Hex code (type L to list codes): 8e //Linux LVM partition type
Changed system type of partition 1 to 8e (Linux LVM)
Command (m for help): p //Print again to double check
Disk /dev/sdb: 77.3 GB, 77309411328 bytes
255 heads, 63 sectors/track, 9399 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 9400 75497440 8e Linux LVM
Command (m for help): w //Write the partition table
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

2. You need to reboot for the changes to take place, or just run

server# partprobe

3. Make LVM acknowledge the new space

[root@server ~]# pvresize /dev/sdb1

4. Check that the Volume group shows the new space

[root@server ~]# vgs
VG #PV #LV #SN Attr VSize VFree
vg0 1 2 0 wz--n- 71.97G 52.00G

5. Extend the logical volume:
making it a total of 28G in this example

[root@server~]# lvresize -L 28G /dev/mapper/vg0-lvhome
Extending logical volume lvhome to 28.00 GB
Logical volume lvhome successfully resized

You can also take all the free space available

[root@server ~]# lvresize -l +100%FREE /dev/mapper/vg0-lvhome
Extending logical volume lvhome to 67.97 GB
Logical volume lvhome successfully resized

6. Or use the remaining space to create another logical volume if you want

[root@server~]# lvcreate -l 100%FREE -n lvdata vg0
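If you went the lvdata route, the new logical volume still needs a filesystem and a mount point before it is usable (a sketch; the names and mount point are just the ones from this example):

[root@server~]# mkfs -t ext3 /dev/mapper/vg0-lvdata
[root@server~]# mkdir /data
[root@server~]# mount /dev/mapper/vg0-lvdata /data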

7. Resize the Filesystem

[root@server~]# resize2fs /dev/mapper/vg0-lvhome
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/mapper/vg0-lvhome is mounted on /home; on-line resizing required
Performing an on-line resize of /dev/mapper/vg0-lvhome to 9953280 (4k) blocks.
The filesystem on /dev/mapper/vg0-lvhome is now 9953280 blocks long.

When deploying VMware virtual machines on VMFS on top of a NetApp SAN, you need to make sure they are aligned properly, otherwise you will end up with performance issues. Filesystem misalignment is a known issue when virtualizing. Also, when deploying LUNs from a NetApp appliance, make sure not to re-partition the LUN, or you will lose the alignment; just create a filesystem on top of it.

NetApp provides a great technical paper about this at: http://media.netapp.com/documents/tr-3747.pdf

In this post I will show you how to align an empty vmdk disk/LUN using the open source utility GParted. This is for new vmdk disks/LUNs; don’t do it on disks that contain data, as you will lose it. This is for golden templates that you want aligned, so subsequent virtual machines inherit the right alignment, or for servers that need a NetApp LUN attached.

The resulting partition works for Linux and Windows, just create a filesystem on top of it.

You can find GParted at: http://sourceforge.net/projects/gparted/files/

1. Boot the VM from the GParted CD/Iso. Click on the terminal icon to open a terminal:

2. Check the partition starting offsets. In this case I have 3 disks; 2 are already aligned (starting at sector 64) and I will align the new disk the same way.

3. Create an aligned partition on the drive using fdisk

gparted# fdisk /dev/sdc

Below is a screenshot of the answers to fdisk; the important option is to start the partition at sector 64, as indicated.
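In case the screenshot is not visible, the dialog is essentially the same as the fdisk session in the LVM-resize post above (a sketch for the empty /dev/sdc; the key answer is the starting sector of 64):

gparted# fdisk /dev/sdc
Command (m for help): u    // change the display to sectors
Command (m for help): n    // new partition
p                          // primary
Partition number (1-4): 1
First sector (63-209715199, default 63): 64    // start at sector 64 for alignment
Last sector or +size or +sizeM or +sizeK (64-209715199, default 209715199):    // accept the default
Command (m for help): w    // write the partition table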

4. Now check again and the partition should be aligned

[root@server ~]# fdisk -lu

Disk /dev/sda: 209 MB, 209715200 bytes
64 heads, 32 sectors/track, 200 cylinders, total 409600 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 64 409599 204768 83 Linux

Disk /dev/sdb: 77.3 GB, 77309411328 bytes
255 heads, 63 sectors/track, 9399 cylinders, total 150994944 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 * 64 41943039 20971488 8e Linux LVM

Disk /dev/sdc: 107.3 GB, 107374182400 bytes
255 heads, 63 sectors/track, 13054 cylinders, total 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System
/dev/sdc1 64 209715199 104857568 83 Linux