Last week I was at a customer site trying to find out why their entire Hyper-V cluster had crashed earlier that week. The environment was a cluster with three Hyper-V nodes, about 10 Cluster Shared Volumes (CSVs), and SCVMM as the management tool.
Because “it crashed” is not a very clear description in my opinion (to me, a crash involves a BSOD and therefore a memory dump), I talked to the customer for a bit before I started troubleshooting.
Now, what happened… She started creating a VHD through SCVMM and then realized that creating a 500GB VHD through SCVMM may take quite some time… So she cancelled the job in SCVMM and started to create the VHD with the VHDTool instead. But here comes the gotcha… When you create a VHD through SCVMM, it sets a lock on that location.
So if you create a VHD in the root of a CSV, the entire CSV gets locked. Let’s say that some critical component of the cluster is stored on that location, something like the quorum disk… the cluster won’t like it when that gets locked. Also, if there are virtual machines stored on that location, the cluster nodes will, again, not like this.
In the case of my customer, the cluster service crashed and therefore the entire cluster went down.
FYI: Some of the symptoms I’ve seen in the logs:
1) Cluster node was removed from active membership.
2) Storage locations were not reachable.
3) Disks defined as a cluster resource were offline.
4) MPIO errors were logged… lots of them!
Conclusion: Never ever create a VHD in the root of a cluster shared volume!
Note: This is very well documented by the way… so if you RTFM this issue is something you won’t experience.
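If you script your VHD creation, a small pre-flight check can catch this mistake before anything gets created. The sketch below is only an illustration, not something SCVMM or the VHDTool provides: the function name is_in_csv_root is hypothetical, and it assumes your CSVs are mounted under the default C:\ClusterStorage folder. It only inspects the path string, so it can run on any machine.

```python
from pathlib import PureWindowsPath

# Assumption: CSVs are mounted under the default C:\ClusterStorage folder.
DEFAULT_CSV_MOUNT_ROOT = r"C:\ClusterStorage"


def is_in_csv_root(vhd_path, csv_mount_root=DEFAULT_CSV_MOUNT_ROOT):
    """Return True if vhd_path sits directly in the root of a CSV.

    Hypothetical helper: it does pure string/path analysis and never
    touches the file system or the cluster.
    """
    path = PureWindowsPath(vhd_path)
    mount = PureWindowsPath(csv_mount_root)
    # A file in a CSV root looks like <mount>\<volume>\<file>, so the
    # folder two levels up from the file is the mount folder itself.
    # PureWindowsPath compares case-insensitively, like Windows does.
    return path.parent.parent == mount


if __name__ == "__main__":
    for candidate in (
        r"C:\ClusterStorage\Volume1\data.vhd",       # root of a CSV: refuse
        r"C:\ClusterStorage\Volume1\VM01\data.vhd",  # subfolder: fine
    ):
        verdict = "REFUSE" if is_in_csv_root(candidate) else "ok"
        print(f"{verdict}: {candidate}")
```

The check deliberately does not assume the volume folders are called Volume1, Volume2 and so on, since they can be renamed. It only looks at how deep the file sits below the mount folder.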