Tuesday 20 November 2012

Unable to start vApp - Error: Invalid network in property vami.ip0.VM_1.

We recently needed to restart our vCenter Operations Manager vApp, which would normally be a fairly straightforward process (log into each VM, initiate a shutdown of the OS and then shut the vApp down). This time, though, there was an issue when I came to power the vApp back on again, as I was greeted by the following error:

(Invalid network 255.255.0.0 in property vami.netmask0.VM_1.)

...and this one too:
(Invalid network in property vami.ip0.VM_1.)

Now this was not something I had seen before, and it threw me for a while before I finally figured it out.

The vApp is assigned its IP settings from an IP Pool associated with the datacenter; in this case both vms receive IP, Netmask, Gateway and DNS settings from this pool. When checking this in more detail I found that the network associated with this IP Pool was incorrect.
What had happened was that we had migrated the vApp's network from a standard vSwitch to a Distributed vSwitch a few months ago. The port group used on the old standard vSwitch was named slightly differently from the new port group on the VDS, even though they had the same VLAN ID. What this meant was that when the vApp tried to power on again it was still looking for the port group from the old vSwitch, could not find it and could therefore not power on.
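The mismatch can be pictured as a simple name lookup that fails. The sketch below is only an illustration in plain Python: the property keys are taken from the error messages above, but the port group names and the `find_stale_networks` helper are made up for the example and are not part of any VMware tooling.

```python
# Hypothetical illustration: the port group names below are invented.
def find_stale_networks(property_networks, existing_networks):
    """Return {property_key: network_name} for references to missing networks."""
    existing = set(existing_networks)
    return {
        key: network
        for key, network in property_networks.items()
        if network not in existing
    }


# Each vApp property still points at the old standard vSwitch port group.
vapp_property_networks = {
    "vami.ip0.VM_1": "VLAN100-Mgmt",
    "vami.netmask0.VM_1": "VLAN100-Mgmt",
}
# After the migration only the VDS port groups exist (same VLAN, new names).
networks_after_migration = ["dvPG-VLAN100-Mgmt", "dvPG-VLAN200-vMotion"]

stale = find_stale_networks(vapp_property_networks, networks_after_migration)
for key, network in sorted(stale.items()):
    print(f"{key}: network '{network}' no longer exists")
```

Every property that still names the old port group shows up as stale, which is the situation the power-on errors were reporting.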

To resolve this there were two simple settings to change:

First the IP Pool needed to be associated with the correct network again.
To do this, simply go to the datacenter in the vSphere client and then select the IP Pools tab. Then right click the Pool and select properties and then go to the 'Associations' tab and place a tick in the associated network for the IP Pool.


Second, the vApp itself needed to be updated to use the correct network for each of its IP settings.
Select the vApp from the Hosts and Clusters view and then in the summary tab, select 'Edit Settings'. Select 'Advanced' from the left menu and then select the 'Properties' box to reveal the 'Advanced Property Configuration' window.
Next just select each entry in turn and select 'Edit' to change the network to the correct value.


Once these settings were applied, the vApp could be started, its IPs were once again allocated to each vm and all was well!

Simple error and all entirely self-inflicted! Just be aware of vApps and their associated networks, as these settings are not changed when you change the individual vm's network settings.



Thursday 15 November 2012

Storage vMotion Error: The method is disabled by 'SYMC-FULL dd-mm-yyyy...'

I had this error come up the other day whilst trying to SvMotion one of our vms over to a new storage array:


Now this is actually one of those obvious and helpful error messages that you get every now and then and just by looking at the error message I could see what had caused this issue.

We use Backup Exec with the AVVI agent to perform backups of some of our production vms.
The AVVI agent allows us to perform SAN-to-tape backups off host, which means we don't need to do anything special with regards to backup configuration on any of our ESXi hosts. The configuration process is a simple case of enabling the option within the Backup Exec media servers and presenting the ESX datastores to them (with the same LUN IDs etc.), and that's pretty much it. Most of our vms are also running the Backup Exec Remote Agent for their OS (Windows or Linux), which allows us to have granular file recovery from our image-based backups. This is a nice feature, although not as useful when doing your backups to tape rather than disk, as the recovery process still needs to extract the full vmdk from the tape before recovering the individual files to be restored to the vm or elsewhere.

A good guide for setting this configuration up can be found on Symantec's website here:

Now what usually happens when a backup job is run on a vm using this method is this:

· The BE job starts on the media server and talks to vCenter to take a snapshot of the vm's vmdk
· Once completed, the vm is running from the snapshot and the original vmdk is static and only read by the vm
· BE then gets from vCenter the ESXi host and guest virtual machine information it needs for the backup
· BE then opens a connection with the ESXi server to ask for the virtual machine metadata
· BE then informs vCenter to disable Storage vMotion for that vm to ensure that the backups can complete successfully
· Using vStorage APIs, Backup Exec then opens a direct data connection to the ‘unknown’ SAN volumes which have been presented to it and the virtual machine data is offloaded directly to the media server for backup
· Once the backup process has completed the snapshot is deleted and BE disconnects from the ESXi host and informs vCenter to enable Storage vMotion again for the vm
· The backup job then completes


The error above is caused by Storage vMotion being disabled by Backup Exec to run the backups. If, after the backup job completes, the call to vCenter to re-enable it is never made or fails, the vm is left stuck with its Storage vMotion disabled.
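To make the failure mode concrete, here is a small toy model in Python. It is not Backup Exec or vCenter code: `FakeVCenter`, `run_backup` and the VM names are invented for the illustration, and the label is shortened to `SYMC-FULL` from the error text. It only shows how a failed final call leaves the marker behind while the job itself still finishes.

```python
class FakeVCenter:
    """Stand-in for vCenter that tracks which VMs have Storage vMotion disabled."""

    def __init__(self, fail_on_enable=False):
        self.disabled_by = {}              # vm name -> label of whoever disabled it
        self.fail_on_enable = fail_on_enable

    def disable_svmotion(self, vm, label):
        self.disabled_by[vm] = label

    def enable_svmotion(self, vm):
        if self.fail_on_enable:
            raise ConnectionError("call to vCenter failed")
        self.disabled_by.pop(vm, None)


def run_backup(vcenter, vm, label):
    """Disable Storage vMotion, 'back up' the VM, then try to re-enable it."""
    vcenter.disable_svmotion(vm, label)
    # ... snapshot, data transfer and snapshot removal would happen here ...
    try:
        vcenter.enable_svmotion(vm)
    except ConnectionError:
        # The job still finishes, but the VM keeps its disabled marker.
        return False
    return True


healthy = FakeVCenter()
run_backup(healthy, "vm-a", "SYMC-FULL")
print(healthy.disabled_by)   # {}

flaky = FakeVCenter(fail_on_enable=True)
run_backup(flaky, "vm-b", "SYMC-FULL")
print(flaky.disabled_by)     # {'vm-b': 'SYMC-FULL'}
```

In the second run nothing visibly goes wrong with the backup, yet `vm-b` is left flagged, which is why the problem tends to surface only later.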

The trouble with this is that you often don't know it is an issue until you go to perform a Storage vMotion, or until vms inside an SDRS cluster fail to migrate to other datastores.

You can, however, identify these vms by performing a lookup within the vCenter database as described in this VMware KB article:

Luckily this is a known issue and there are two very easy ways to address it.
The first, and often easiest, way is to shut down the vm and remove it from the inventory. Then browse the datastore where it resides, locate the vmx file and add it to the inventory again.
This approach basically gives the vm a new ID within vCenter and thus removes any customised settings, allowing it to SvMotion again.
The drawback is that you will need downtime on your vm, although very short, in order to resolve this.

The other approach, as detailed by VMware in the KB above, is to manually edit the settings within the vCenter DB for the affected vm. Whilst this does not require a vm outage, it does require vCenter to be stopped whilst you access the DB, and in some instances (environments with vCloud Director, SRM, Lab Manager etc.) this has more impact than one vm being shut down for a couple of minutes. Finding a quiet evening or weekend to shut the vm down is therefore my preferred approach, and this can easily be scripted anyway to save those long hours from building up!
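As a rough idea of what such a script might look like, here is a minimal Python sketch of the shutdown, remove and re-add sequence. It assumes the `vm` argument behaves like a pyVmomi `vim.VirtualMachine` object (the property and method names follow the vSphere API as I understand it), while the `reregister_vm` helper, its parameters and its timeout handling are my own invention. Treat it as a starting point to test in a lab rather than a finished tool.

```python
import time


def reregister_vm(vm, poll_seconds=5, timeout_seconds=600, sleep=time.sleep):
    """Shut down a VM, remove it from the inventory and register it again.

    `vm` is assumed to behave like a pyVmomi vim.VirtualMachine object.
    Returns the task object produced by the register call.
    """
    # Record everything needed for re-registration before the VM is removed.
    vmx_path = vm.config.files.vmPathName   # datastore path of the .vmx file
    name = vm.name
    folder = vm.parent                      # assumes the VM sits in a VM folder
    pool = vm.resourcePool
    host = vm.runtime.host

    if vm.runtime.powerState != "poweredOff":
        vm.ShutdownGuest()                  # clean guest shutdown, needs VMware Tools
        waited = 0
        while vm.runtime.powerState != "poweredOff":
            if waited >= timeout_seconds:
                raise TimeoutError(f"{name} did not power off in time")
            sleep(poll_seconds)
            waited += poll_seconds

    vm.UnregisterVM()                       # removes it from inventory, files stay on disk
    return folder.RegisterVM_Task(
        path=vmx_path, name=name, asTemplate=False, pool=pool, host=host
    )
```

The caller would still need to wait for the returned task to finish and then power the vm back on. The `sleep` parameter exists only so the polling loop can be exercised without real delays.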

This is not restricted to Symantec btw. I have also seen this issue with Veeam backup software, and as yet I'm not aware of any definitive solution to prevent it from happening from time to time. It pays to keep an eye on this if you are running a similar backup technology in your environment.