DoC Computing Support Group


NetApp maintenance procedures

Overview

This guide describes the most common procedures for maintaining a NetApp system.

  1. Notify NetApp People that some maintenance is being done

  2. Back up the current NetApp config before changing it

  3. Change a failed disk

  4. Generate a new SSL certificate

1. Notify NetApp People that some maintenance is being done

Firstly, remember that a NetApp system phones home to report any problem. Before starting maintenance work, record it in the log system that NetApp support looks at:

  babbler01> options autosupport.doit "-------------YYYYMMDD_maintenance_START------------------"

When the maintenance work is finished, log that as well:

  babbler01> options autosupport.doit "-------------YYYYMMDD_maintenance_END--------------------"
  babbler01> options autosupport

Secondly, note that you can tweak the autosupport settings as follows:

  babbler01> options autosupport.partner.to netapp.support@qassociates.co.uk
  babbler01> options autosupport.enable on

Thirdly, remember to do this on both babbler01 and babbler02.
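
The START and END markers above follow a fixed pattern with the date in YYYYMMDD form. As a minimal sketch (the function name `maintenance_marker` is hypothetical and not part of any NetApp tooling), the two command lines could be generated like this and then pasted into the console of each controller:

```python
from datetime import date


def maintenance_marker(phase: str, day: date) -> str:
    """Build the 'options autosupport.doit' line for a maintenance marker.

    phase must be "START" or "END"; day supplies the YYYYMMDD prefix.
    """
    if phase not in ("START", "END"):
        raise ValueError("phase must be 'START' or 'END'")
    label = f"{day:%Y%m%d}_maintenance_{phase}"
    # The dashes only make the marker easy to spot; their count is arbitrary.
    return f'options autosupport.doit "-------------{label}-------------"'


if __name__ == "__main__":
    today = date.today()
    print(maintenance_marker("START", today))
    print(maintenance_marker("END", today))
```

The script only prints text; it does not connect to the filer.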

2. Back up the current NetApp config before changing it

Before changing anything in the NetApp configuration, back up the current one. For example, on babbler01:

  babbler01> config dump -v 20121102babbler01.cfg

If you want to inspect the dump file, it might be easier from oriole, where it appears under /srv/babbler01_vol0/etc/configs:

  root@oriole:/srv/babbler01_vol0/etc/configs# wc -l 20121102babbler01.cfg
  2475 20121102babbler01.cfg

  root@oriole:/srv/babbler01_vol0/etc/configs# du -hs 20121102babbler01.cfg
  128K  20121102babbler01.cfg
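
The dump file name in the example combines the date (YYYYMMDD) with the controller name. A small sketch of that naming convention follows, assuming the same pattern is wanted for every dump; the helper name `config_dump_command` is hypothetical:

```python
from datetime import date


def config_dump_command(hostname: str, day: date) -> str:
    """Return a 'config dump' line named <YYYYMMDD><hostname>.cfg."""
    return f"config dump -v {day:%Y%m%d}{hostname}.cfg"


if __name__ == "__main__":
    # Print one command per controller, to be typed on that controller's console.
    for host in ("babbler01", "babbler02"):
        print(f"{host}> {config_dump_command(host, date.today())}")
```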

3. Change a failed disk

Note that there is a distinction between a completely failed disk and a partially failed one. The NetApp documentation describes it as follows:

"A disk that is completely failed is no longer counted by Data ONTAP as a usable disk, and you can immediately disconnect the disk from the disk shelf. However, you should leave a partially failed disk connected long enough for the Rapid RAID Recovery process to complete."

Check the status of the disks:

  babbler02> aggr status -f
  Broken disks (empty)

Determine the physical location of the disk you want to remove from the output of the command below:

  babbler02> aggr status -s 
  Spare disks
  RAID Disk     Device          HA  SHELF BAY ... ...
  ---------     ------          ------------- ... ...
  Spare disks for block checksum    
  spare         0a.00.11        0a    0   11  ... ...
  spare         0a.00.13        0a    0   13  ... ...
  ... ...

The location is shown in the columns labeled HA, SHELF, and BAY.
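
When the listing is long, the location columns can be pulled out with a short script. The sketch below assumes the column order shown in the sample above (RAID role, device, HA, shelf, bay) and ignores the elided columns; `DiskLocation` and `parse_spares` are hypothetical names, and the input is simply the console output pasted as text:

```python
from typing import List, NamedTuple


class DiskLocation(NamedTuple):
    device: str
    ha: str
    shelf: int
    bay: int


def parse_spares(output: str) -> List[DiskLocation]:
    """Extract device, HA, shelf and bay from 'aggr status -s' output."""
    disks = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with the lowercase role "spare"; header lines do not.
        if len(fields) >= 5 and fields[0] == "spare":
            disks.append(
                DiskLocation(fields[1], fields[2], int(fields[3]), int(fields[4]))
            )
    return disks


SAMPLE = """\
Spare disks
RAID Disk     Device          HA  SHELF BAY
---------     ------          -------------
Spare disks for block checksum
spare         0a.00.11        0a    0   11
spare         0a.00.13        0a    0   13
"""

if __name__ == "__main__":
    for disk in parse_spares(SAMPLE):
        print(f"{disk.device}: HA {disk.ha}, shelf {disk.shelf}, bay {disk.bay}")
```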

Remove the disk from the disk shelf and replace it with a new one. Suppose the new disk is 0a.00.20. It has to be assigned to a controller:

  babbler02> priv set advanced
  babbler02> disk assign 0a.00.20

Ref: http://seriousbirder.com/blogs/netapp-disk-failure-and-replacement-procedures/

4. Generate a new SSL certificate

On the babbler01 ONTAP CLI (either the main CLI or the service processor can be used):

  priv set advanced
  secureadmin setup -f -q ssl t GB London "South Kensington" \
    "Imperial College of Science, Technology and Medicine" \
    "Computing Department" babbler01.doc.ic.ac.uk \
    help@doc.ic.ac.uk 1024

On babbler02:

  priv set advanced
  secureadmin setup -f -q ssl t GB London "South Kensington" \
    "Imperial College of Science, Technology and Medicine" \
    "Computing Department" babbler02.doc.ic.ac.uk \
    help@doc.ic.ac.uk 1024

Ref: http://community.netapp.com/t5/OnCommand-Storage-Management-Software-Discussions/OnCommand-System-Manager-recieves-error-500/td-p/93621/page/9
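
The two commands above differ only in the host name. As a sketch under that assumption (the helper name `secureadmin_command` is hypothetical, and the argument values are copied from the commands above without change), the line for each controller could be produced from one template to avoid copy-and-paste slips:

```python
def secureadmin_command(hostname: str) -> str:
    """Return the 'secureadmin setup' line for one controller as a single line."""
    args = [
        "GB",
        "London",
        "South Kensington",
        "Imperial College of Science, Technology and Medicine",
        "Computing Department",
        f"{hostname}.doc.ic.ac.uk",
        "help@doc.ic.ac.uk",
        "1024",
    ]
    # Wrap only the arguments containing spaces in double quotes, as above.
    quoted = [f'"{a}"' if " " in a else a for a in args]
    return "secureadmin setup -f -q ssl t " + " ".join(quoted)


if __name__ == "__main__":
    for host in ("babbler01", "babbler02"):
        print(secureadmin_command(host))
```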

 
 

project/privatecloud/hardware/netapp-procedures (last edited 2016-01-14 13:51:51 by ldk)