Thursday 5 December 2013

Getting around vCenter 512bit certificates

Can't get the vCenter Web Client to work on your new Windows 8.0/8.1 machine because the certificate is not trusted?

The issue arises when your vCenter Server was originally deployed at version 4.0, where the self-signed certificate generated by vCenter defaulted to a 512-bit key. If you have never replaced your certificate then it will still be 512 bits. Even if you have upgraded your vCenter to 4.1, 5.0, 5.1 or 5.5, unless you have replaced the certificate along the way it will still be the same 512-bit certificate it was when you first started out.

There was nothing very wrong with this setup until Microsoft released KB2661254, which raised the default minimum accepted RSA key length from 512 bits to 1024 bits to nudge people up the security ladder a little. The result is that these old vCenter certificates are no longer accepted by patched clients, which blocks access to the vCenter web portal.

Now, the correct way to deal with this is to generate a new vCenter certificate with at least a 1024-bit key (preferably 2048 bits). This will not only allow the updated clients to function again, but will also give you the warm fuzzy feeling that only running your environment at a higher security level can achieve. This is, however, easier said than done.
The process of replacing certificates within vCenter is not straightforward. VMware have significantly improved it by way of the SSL Automation Tool for vCenter 5.0 and above, but it is still a fairly lengthy process which carries a real risk of breaking your vCenter deployment. It needs to be planned and tested, with adequate backup and recovery processes put in place, before you proceed with it on a mature production environment.

A short-term fix to get around this is to once again trust the 512-bit key and proceed as you were.
The command below can be run from a command prompt (with administrative rights, as it writes to HKLM) on a Windows client to revert the KB's effect:

certutil -setreg chain\minRSAPubKeyBitLength 512

Obviously, doing this will also result in the client trusting ALL 512-bit keys out there, so this should only be viewed as a short-term fix whilst you plan the certificate upgrades for vCenter as recommended by VMware.
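If you want to check the current minimum (or confirm the override has taken effect), certutil should also be able to read the value back for you - just a quick sanity check rather than anything official:

certutil -getreg chain\minRSAPubKeyBitLength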

Once you have resolved the certificate issues and are sporting a shiny new 1024-bit (or 2048-bit) certificate, don't forget to revert the change above to secure your client(s) again. This can easily be done by removing the registry entry that the above command creates here:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\OID\EncodingType 0\CertDllCreateCertificateChainEngine\Config\MinRsaPubKeyBitLength
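Alternatively, certutil should be able to remove the override for you in one line (the same effect as deleting the registry entry above):

certutil -delreg chain\minRSAPubKeyBitLength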



Thursday 28 November 2013

vCloud Director sysprep files

Had some fun standing up a vCD server this past week, so I thought I'd post a quick memo about a change between vCD 5.1 and vCD 5.5 regarding Sysprep files.

I had been following some excellent blogs on the vCD 5.1 install process from Kendrick Coleman (Install vCD 5.1 & vCD Networking) and applying them to my vCD 5.5 installation. When I tried to follow the process to copy the Sysprep files over to the vCD cell I hit a snag, as there was no script to run to package the Sysprep files. It turns out that in 5.5 this process has been improved: you now simply create the directories, place the Sysprep files into them, and away you go. Not even a service restart is required to start customizing older OSes through vCD.

The folder locations in vCD 5.5 should be as follows (extract taken from the VMware install document for vCD 5.5 - which I should have read more keenly, it seems!):

Procedure:

  1. Log in to the target server as root.
  2. Change directory to $VCLOUD_HOME/guestcustomization/default/windows.
    [root@cell1 /]# cd /opt/vmware/vcloud-director/guestcustomization/default/windows
  3. Create a directory named sysprep.
    [root@cell1 /opt/vmware/vcloud-director/guestcustomization/default/windows]# mkdir sysprep
  4. For each guest operating system that requires Sysprep binary files, create a subdirectory of
    $VCLOUD_HOME/guestcustomization/default/windows/sysprep.
    Subdirectory names are specific to a guest operating system and are case sensitive.
    • Windows 2003 (32-bit) should be called svr2003
    • Windows 2003 (64-bit) should be called svr2003-64
    • Windows XP (32-bit) should be called xp
    • Windows XP (64-bit) should be called xp-64
  5. Copy the Sysprep binary files to the appropriate location on each vCloud Director server in the server group.
  6. Ensure that the Sysprep files are readable by the user vcloud.vcloud.
    Use the Linux chown command to do this.
    [root@cell1 /]# chown -R vcloud.vcloud $VCLOUD_HOME/guestcustomization
When the Sysprep files are copied to all members of the server group, you can perform guest customization
on virtual machines in your cloud. You do not need to restart vCloud Director after the Sysprep files are copied.
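For reference, on the cell itself the whole thing boils down to a handful of commands. This is only a sketch of the procedure above: the /tmp/sysprep-src paths are hypothetical staging directories for the Sysprep binaries taken from your OS media, and you only need to create the subdirectories for the guest OSes you actually use.

cd /opt/vmware/vcloud-director/guestcustomization/default/windows
mkdir -p sysprep/svr2003 sysprep/svr2003-64 sysprep/xp sysprep/xp-64
cp /tmp/sysprep-src/svr2003/* sysprep/svr2003/          # hypothetical source path
cp /tmp/sysprep-src/xp/* sysprep/xp/                    # hypothetical source path
chown -R vcloud.vcloud /opt/vmware/vcloud-director/guestcustomization

Remember to repeat this on every vCD cell in the server group.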

So there you go...simple if you read the manuals properly in the first place :)

Thursday 14 November 2013

VMworld 2013 - some thoughts...

I had the very good fortune of attending the VMworld 2013 conference in Barcelona in October [for free too, courtesy of one of our IT suppliers :-)] and so thought I'd post a few thoughts and impressions gathered from the conference whilst I still remember them fresh(ish).

I had previously been to one other VMworld, Cannes in 2009, and had been very impressed with the conference and the general quality of the break-out sessions, so I was looking forward to this conference immensely, especially given some of the new technologies revealed during the US event a couple of months prior, such as vSphere 5.5, vFRC (vSphere Flash Read Cache) and the awesome-looking VSAN.

The venue, having now moved to Barcelona, was new but the quality of the event was still top notch!
The break-out sessions are the real reason to go to these conferences and they did not disappoint one bit. Close to the start of the event it seemed that many of the sessions I wanted to attend were fully booked up. At first I was annoyed with this but soon realised that just going to the session and waiting outside before it started pretty much guaranteed you a place in the room anyway (although probably at the back) and I ended up not missing a single session all week.
My favourite sessions were on VSAN, flash caching and some of the new cloud automation suites that VMware are now doing. Flash, btw, was everywhere at this event.  If you were in any doubt about how things are progressing with flash technology, you were left in no doubt at this event that flash is going to be EVERYWHERE pretty soon (if it's not made it into your datacentre already).

VMworld had released a mobile app for your smartphone where you could register for sessions and plan your day's activities, and this was really useful to have, especially when trying to navigate around the enormous conference suite. They had provided maps, social feeds and even an interactive game in the app. This was a really good improvement over the last VMworld I'd been to, and even though there were large screens displaying all of the session info almost everywhere you looked, it was so handy when you were sitting in a quiet spot in the 'hang-space' trying to plan where to go later that day.

I remember being impressed by the Labs at the 2009 conference, and again I really liked the accessibility and ease with which you can get first-hand experience of so much of the new tech coming out of VMware. This was a popular part of the conference, especially on the first day, but later in the event it was fairly easy to get a desk and get onto any lab that you wanted.
They had even provided BYOD lab areas where you would use your own laptop to connect to the lab environment which I thought was a great idea (except that I'd only brought my old Android tablet out with me which wasn't really up for the challenge).

The Solutions Exchange was where all of the vendors pitched up to show off their wares, and it had all of the usual suspects you would expect. One very noticeable exception, though, was Symantec. I had hoped they would be attending (as they had in 2009) because we use Symantec backup products and I had a few things I wanted to discuss around vSphere backups and virtual machine AV protection. From what I gathered this was probably a political withdrawal: their backup products were a little late to the vSphere 5.1 support party (by nearly a year) and they probably didn't want to be on the receiving end of too much public bashing from the people who really felt those issues.
Having said that, I read recently that Symantec are offering support for vSphere 5.5 and future releases within 90 days of GA. This is a great response to the problem and, if they keep it up, they will surely keep vSphere backup customers and gain new ones too! 90 days is a very acceptable time frame by which you would start to think about deploying an upgrade to the GA release of a new mission-critical infrastructure platform such as vSphere.

Some of the solutions exchange highlights I saw this year were these (in no particular order):

  • Tintri - VM-aware storage promising great performance at a price point that makes it worth seriously questioning your next SAN upgrade.
  • NetApp Flash Accel integration with VSC 5.0 - This is something which I am currently looking to deploy into production, and probably the subject of my next blog post too!  A great product (which is free to existing NetApp customers) and now fully integrated into the vSphere Web Client.  Looked very slick and adds to the already excellent VSC product.
  • FlashSoft - Flash caching for physical and virtual environments.  Reasonably priced and, even though the vendor is SanDisk, it works with any SSD or PCIe flash device.
  • Infinio - VM caching solution which uses ESXi host RAM instead of SSD devices.  Very nice concept and another sweet price point too (albeit with the requirement to have significant memory free in each ESXi host, which is not that typical in my experience).
There were many great products and demos and I've certainly missed out loads of good ones.  These were just some that I was particularly impressed with and liked what they were doing. 
As I said earlier, flash and storage caching solutions were everywhere in the solutions exchange and this is a space where there will be a huge change to how we are mostly all doing our virtual deployments at present.  It's getting cheaper and the solutions are getting smarter too.  Always a good combination!






Friday 26 July 2013

Failed to open (The parent virtual disk has been modified since the child was created)

Error:
  • Failed to open (The parent virtual disk has been modified since the child was created).
This error came up the other day on a couple of our virtual machines when we tried to power them on after they died over a weekend.
This issue is in fact covered extremely well by the following KB article here, and I would highly recommend that you read through the article and get to grips with how the various files which make up the virtual server, its disks, snapshots etc. fit together, as it will help no end when trying to fix this or similar issues.


Now, it turns out that this issue was being caused by our backup software trying to take a weekly tape copy of some virtual machines whilst, at the same time, a NetApp SnapManager for Virtual Infrastructure (SMVI) backup and replication job was trying to run.
The two snapshot commands seem to have overlapped: whilst one snapshot was being deleted the other was trying to create a new snapshot, and so the disk descriptor files ended up pointing to different snapshot delta files and referencing the wrong parent CID (this all makes more sense when you read the KB article, trust me!).
I'm not too sure why this is allowed to occur, but it has now happened around five times in our environment over weekends, to different VMs, and as such we have had to be more selective about when we schedule the tape backups to avoid the regular NetApp snapshots. (We only do both because we do not hold long disk retention policies offsite and so require tape backups to supplement our disk backup strategy for long-term retention... a pain, but just the way it is at present.)
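To illustrate what the KB describes, the top of a snapshot's descriptor file looks something like the extract below (the CID values here are made up for illustration). The parentCID in each delta descriptor must match the CID of the disk it chains to (named in parentFileNameHint), and it is this chain that gets broken when the snapshot jobs trip over each other:

# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=4d8a1b2c
parentFileNameHint="Virtualserver.vmdk"

In this illustrative example, 4d8a1b2c would need to equal the CID= line inside Virtualserver.vmdk for the chain to be valid.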

To fix this issue the article recommends connecting to the host and manually opening, reading and possibly editing these files using vi, but that is not easy when you are trying to compare multiple files and cross-reference CIDs and parent CIDs across potentially three, four, five or more disk descriptor files, depending on the number of snapshots and disks the VM has.

My approach is to follow the steps below and use free third-party tools to make things easier on yourself.

Process:

  1. Enable SSH on the ESXi host and open the host's firewall port for the SSH server if not already allowed (do this through vCenter for ease!)
  2. Connect to the ESXi host using WinSCP – this is much easier than going through the command line or the vMA appliance as detailed in the KB
  3. Copy the following files to your local machine to identify the issue:
    1. Virtualserver.log (typically named vmware.log in the VM's directory) – use this to identify which disk and which snapshot file is reporting the issue
    2. Virtualserver.vmx – use this to identify which snapshots are currently identified as in use
    3. Virtualserver.vmdk – this is the base disk descriptor file containing the first parent CID
    4. Virtualserver-00001.vmdk – this will be the first snapshot delta disk descriptor file and should have the base disk's CID as its parent (there may be more than one snapshot file per disk, such as 00002.vmdk and/or 00003.vmdk etc., which should each reference the preceding snapshot as their parent until they eventually lead back to the base disk's CID - see the example descriptor extract above)
  4. Use Notepad++ or similar to view all of the files (this utility is excellent for formatting these files into a more readable state and also maintains the file's formatting when you modify it, which you are likely to have to do!). If you prefer, you can also dump the relevant CID lines directly on the host - see the sketch after this list
  5. Make a copy of the files unedited on your machine in case the resolution doesn't work (IMPORTANT!!!)
  6. Make the required changes to the disk descriptor files or the .vmx file, using the information in the KB article, in order to resolve the issue. For reference, if a snapshot delta file does not contain any data (16 MB or less, for example) it may be best to just edit it out of the .vmx file and point to an earlier snapshot, or the base disk itself, in order to bring the VM back online again.
  7. Copy the edited file(s) back to the original location and overwrite as needed using WinSCP
  8. Power on the VM and cross those fingers! :)
  9. If all is good then be sure to delete any unused snapshot descriptor, delta and checkpoint files from the virtual server's directory so as not to affect any future snapshots and to keep things clean.
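As an optional shortcut, you can also dump the relevant lines from every descriptor file directly on the host over the SSH session from step 1, rather than copying everything off first. This is only a convenience sketch (the datastore path shown is hypothetical - use your VM's actual directory); it skips the large flat/delta data files and only reads the small descriptor files:

cd /vmfs/volumes/datastore1/Virtualserver      # hypothetical path to the VM's directory
for f in *.vmdk; do
  case "$f" in *-flat.vmdk|*-delta.vmdk|*-sesparse.vmdk) continue ;; esac   # skip the large data files
  echo "== $f"; grep -E '^(CID|parentCID|parentFileNameHint)' "$f"
done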
This is a good and fairly straightforward resolution to the issue. Key to getting it right, though, is understanding how the descriptor files work and mapping out (often on a piece of paper if need be) the relationship between each base disk and its snapshot(s) before making any changes.  As mentioned, keep a copy of these files, as you may be able to revert any changes made in error just by putting them back.  Ideally, though, if you are not certain, always ensure that you have a full backup of all of the files (especially the flat files) before making any changes, as per best practice!

Good luck.

Thursday 2 May 2013

RemoteApp international keyboard layouts

Whilst publishing some RemoteApps to an internationally dispersed group of users, we needed to be able to support more than the default (well, default to most of the English-speaking world anyway) QWERTY keyboard layout.

Initially I was thinking that this could be a big deal; my first thought was that it would involve creating custom profile setups for different groups, with the correct language and keyboard layout defined. As it turned out, this was not the case: it was simply a matter of installing the additional keyboard layouts using the good old 'Region and Language' option within Control Panel, then going to the 'Keyboards and Languages' tab and adding the additional layout required.


After that, in the 'Text Services and Input Languages' window, go to 'Advanced Key Settings' and verify or change the option to toggle between the two keyboard layouts.  The default is to use Left Alt+Shift but this can be changed to something else if you wish.



Once set, when you launch the RemoteApp you can now toggle the keyboard layout by simply pressing the toggle command you set and away you go!  Simple....as IT should be! :)

Friday 18 January 2013

Missing 'Unknown Media Changer' from device manager

We recently performed an expansion of one of our tape libraries, a Quantum Scalar i500 unit.  These are very flexible units which range from 2 to 18 tape drives depending on configuration.
We were expanding this unit from 4 to 10 LTO4 drives, which meant that we had to add two new drives to the existing enclosure and then bolt on a 9U expansion unit to house the other four drives.

All of this went well, and after running all of the library tests and everything coming up good, we were ready to re-present the library to the backup servers so it could be re-incorporated into our backup policies.
It was here that we ran into a problem which I suspect is quite common in this situation.
When we brought our backup server online again it could now see 10 LTO4 drives, but it could no longer see the 'unknown media changer' which appears within Device Manager representing the library.

After being sent on a bit of a wild goose chase by a Quantum article stating that this is caused by the wrong driver being installed on the fibre HBA (the article states that this is the case when a Storport driver is used instead of a SCSIport or FCport driver, and to check with the HBA vendor for the correct driver to resolve the issue), it turned out that the issue was a lot simpler to fix (isn't that always the case?).

On the library configuration portal there is an option within the 'Setup' menu called 'Control Path'. This had nothing configured when we checked our library after the upgrade, and as this setting nominates one of the tape drives to present the library configuration through to the connected server, this was obviously our issue.
To resolve this we simply selected one of the drives to be the control path and applied the setting.  A quick re-scan of the devices within the media server and up popped our library and all of the correct slots etc.

Why this setting was no longer there was worrying to me, though, so I did some digging; it turns out that the way drive numbers are allocated can change when you add additional drives to the library, even if the original drives are not moved from their current slots.
Also, keep in mind that if you have more than one partition on your library, you will need to set the control path for each partition within the library.

PS: once you get the media changer presented to your server, be sure to install the correct driver for the changer, as this will then name the media changer with the make/model of the library etc.