How to troubleshoot Azure with a custom Debian VHD; error message “provisioning failed”?

I have built a customised Debian (Jessie/8) VHD which I am trying to get running on Azure. According to my research (and the MS docs I’ve read), it meets the Azure Linux VHD requirements. I have tried to err on the side of caution to reduce or eliminate possible causes of issues. Here’s some general info about my VHD:

  • fixed (static) VHD file (some sources suggest that dynamic ones are supported, but others say they’re not).
  • VHD is in VPC format and aligned to 1 MB.
  • no LVM (again some sources say that it is supported, others say it’s not).
  • a single bootable primary partition (no swap) that uses all of the space on the VHD (some sources note that this is a requirement).
  • grub has been tweaked per the requirements in https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-debian-create-upload-vhd/
  • waagent installed and configured, and the server has been “deprovisioned” (see the snippet just after this list).
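
To be concrete, the grub/waagent steps in the last two bullets look roughly like this (I’m paraphrasing from the MS doc linked above, so the exact kernel parameters and waagent flags may differ slightly from what that page says):

    # /etc/default/grub - serial console / rootdelay parameters per the Azure doc
    GRUB_CMDLINE_LINUX="console=ttyS0,115200n8 earlyprintk=ttyS0,115200 rootdelay=300"

    # regenerate the grub config
    sudo update-grub

    # deprovision via the agent before shutting down and capturing the VHD
    sudo waagent -force -deprovision
    export HISTSIZE=0
    logout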

I don’t have access to MS Hyper-V, so I have only tested it locally in VirtualBox. FWIW it works fine there.

I am using azure-cli (why you would write a command-line client in NodeJS blows my mind, but that’s a whole other story…!). I am also using the newer “Resource Manager” mode (as opposed to the old “Classic” one).

The upload goes fine and I have already worked around a few issues (like the .vhd file extension being removed by Azure), but now I’m stuck. When I try to provision the server, it chugs away for ages (literally hours) and then eventually fails. azure-cli doesn’t give any useful error messages; just something like “provisioning failed”.

I have a fair bit of experience working with AWS and Linux, but basically none with Azure. Regardless, surely it shouldn’t be this hard!?

So how do I troubleshoot what my issue is?

Is there somewhere within Azure (or an undocumented debug switch for azure-cli) that I can get more detailed error messages?
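
For what it’s worth, the only switches I can find in the CLI’s help are the generic verbosity/JSON ones, and I don’t know whether they actually surface anything extra for provisioning failures (so treat the flags below as my reading of the help output, not a known-good recipe):

    azure vm create ... -v           # verbose
    azure vm create ... -vv --json   # even more verbose, with raw JSON responses

    # and afterwards, dumping the VM's state:
    azure vm show "<my-resource-group>" "<vm-name>" --json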

Is there a list of common issues somewhere that I could work through?

Any help would be deeply appreciated! Thanks in advance.

[update] Here are the commands that I’m using:

azure storage blob upload <path-to-vhd> <container-name> <vhd-on-azure-name> -k <access-key> -a <storage-account-name>

FWIW the storage account used is in the same resource group as the VM.
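
To make the placeholders concrete, a filled-in upload looks like this (all names below are made up for illustration):

    azure storage blob upload ./debian8-custom.vhd vhds debian8-custom.vhd -k <access-key> -a mydiskstorage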

Then this is the problematic part (that goes forever and eventually fails):

azure vm create -u <admin-username> -p <admin-password> --location "AustraliaEast" -g "<my-resource-group>" "<vm-name>" -Q <full-vhd-urn> -f <nic-name> -F <vnet-name> -P 10.0.0.0/8 -j <subnet-name> -k 10.0.0.0/24 -o <storage-account-name> -y linux
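
For reference, here is the same create command with made-up values substituted in (I’ve shown the full URL of the uploaded blob as the -Q value, purely for illustration):

    azure vm create -u adminuser -p '<admin-password>' --location "AustraliaEast" -g "my-resource-group" "debian8-test" -Q https://mydiskstorage.blob.core.windows.net/vhds/debian8-custom.vhd -f debian8-test-nic -F debian8-test-vnet -P 10.0.0.0/8 -j debian8-test-subnet -k 10.0.0.0/24 -o mydiskstorage -y linux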

The NIC, vnet and subnet were originally auto-created by the first vm create that failed. I have retried since, but to avoid polluting the Azure account with more auto-created resources I just took note of their names for re-use.

Additional info:

  • image initially built as raw (and then converted to VHD) on Debian Jessie.
  • to test a newer version of QEMU, I resized the original raw image and converted it to VHD on Arch Linux using QEMU 2.7, with this command (fuller sequence after this list):
    qemu-img convert -f raw -O vpc -o subformat=fixed <source-raw> <destination-vhd>

  • also tested both with and without “force_size” to maintain alignment.

  • all attempts work fine in VirtualBox, and all fail with Azure.
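
For completeness, the resize-and-convert sequence from the second bullet above looks roughly like this (filenames and the padded size are illustrative; rounding the raw image up to an exact multiple of 1 MiB is what I mean by keeping it aligned):

    # check the current virtual size in bytes
    qemu-img info disk.raw

    # pad the raw image up to the next 1 MiB boundary (10240 MiB here as an example)
    qemu-img resize disk.raw 10240M

    # convert to a fixed-size VHD (VPC format); force_size stops QEMU from rounding
    # the size up to CHS geometry (tested with and without it, as noted above)
    qemu-img convert -f raw -O vpc -o subformat=fixed,force_size disk.raw disk.vhd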

[another update]
Thanks @Sam for the suggestion to try using “Classic”. It seems that the classic interface has better error reporting. It says that there is something wrong with the VHD (although it doesn’t give any clues as to what). I’ll try tweaking it some more and post back if I discover anything. Thanks guys.
