support for using clones+cloud-init #34

Merged: 5 commits, Jul 29, 2020

travisghansen (Contributor) commented Jul 19, 2020

This adds support for using clone+cloud-init images.

The basic requirements for the template are:

  • qemu-guest-agent installed
  • cloud-init installed

Currently only two flags need to be set to make it work:

  • --proxmoxve-provision-strategy clone
  • --proxmoxve-vm-clone-vmid <vmid>
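
As a rough illustration of the two flags above, the sketch below only prints a `docker-machine create` command rather than running it. The helper name `clone_create_cmd`, the template VMID `9007`, and the machine name `docker-clone` are example values chosen here; a real invocation also needs the Proxmox host and credential flags for your environment.

```shell
#!/bin/bash
# Hypothetical helper: print (do not execute) a minimal clone-mode command.
# Only the two --proxmoxve-* flags come from the PR description; the VMID
# and machine name passed in below are example values.
clone_create_cmd() {
  local template_vmid="$1" machine_name="$2"
  printf 'docker-machine create --driver proxmoxve --proxmoxve-provision-strategy clone --proxmoxve-vm-clone-vmid %s %s\n' \
    "$template_vmid" "$machine_name"
}

clone_create_cmd 9007 docker-clone
```

Printing the command first makes it easy to review the flags before adding host, user, and password options and running it for real.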

I've not tested on full clones yet; my storage setup supports shallow cloning, so I enjoy the space savings.
Update: full clones tested.

There are some other minor updates/fixes/additions as well.

@travisghansen
Contributor Author

Notes I have (still evolving):

  • support full clone? how to allow full manipulation of all the opts?

  • sockets?

  • instructions for use?

    • image requirements (guest agent, cloud-init, docker optional, maybe want to make sure cloud-init doesn't do a full update on boot)
  • strategy param?

  • proper value for onboot?

  • does proxmox need to be so sticky to a specific node? can we tell it to just deploy to the 'cluster' and let it go from there?

  • support updating the network card/bridge with clone mode

  • use cloud-init drive with rancher-os (for ssh key injection instead of hacky ssh commands)?

  • remove ssh password logic and associated code

  • protection flag on VMs

  • flag for citype

@travisghansen travisghansen changed the title WIP: support for using clones+cloud-init support for using clones+cloud-init Jul 22, 2020
@travisghansen
Contributor Author

@lnxbil I consider this complete and ready for review. It should support all the previous cdrom use cases, along with many improvements across the board on top of the obvious cloud-init support.

I've run many tests locally on my development machine and also created several clusters from Rancher, including nodes backed by Ubuntu, CentOS, and Rancher OS. In its current state it appears to be robust enough to handle failure scenarios and other generally bad conditions (such as a very overtaxed Proxmox host taking many hours to install).

Open to review/suggestions at this point.

@lnxbil
Owner

lnxbil commented Jul 23, 2020

Wow, thank you. I haven't had time to build and review the changes yet, but hopefully I will on the weekend.

@lnxbil
Owner

lnxbil commented Jul 28, 2020

So I had a better look at the code and also built a test environment. After reverting to eth0 as the NIC naming scheme I got a little further, but it is not quite running. It stopped at:

&{[-F /dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o LogLevel=quiet -o PasswordAuthentication=no -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null [email protected] -o IdentitiesOnly=yes -i /Users/andreas/.docker/machine/machines/docker-clone/id_rsa -p 22] /usr/bin/ssh <nil>}

So there is still something missing. Can you describe which arguments you used?

@travisghansen
Contributor Author

@lnxbil what are you using for the template image? My guess is that it's failing due to root username there but can't be sure (and yes, eth0 is a requirement across the board currently...I considered adding an option to pick the nic name).

@lnxbil
Owner

lnxbil commented Jul 28, 2020

> @lnxbil what are you using for the template image? My guess is that it's failing due to root username there but can't be sure (and yes, eth0 is a requirement across the board currently...I considered adding an option to pick the nic name).

root was my guess: before that there was no username at all, and @<IP> (without a username) is a syntax error, so it halted there.

My system is a plain old buster + cloudinit + docker that I set up for this test.
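
The username problem discussed here can be sketched as a small guard that refuses to build an SSH target when the username is empty. This is only an illustration of the failure mode, not the driver's actual code; the helper name `ssh_target` and the address `192.0.2.10` (a documentation-only IP) are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: build the user@host argument for ssh and fail early
# when no username is given, instead of producing an invalid "@<IP>" target.
ssh_target() {
  local user="$1" host="$2"
  if [ -z "$user" ]; then
    echo "ssh username is empty; set --proxmoxve-ssh-username" >&2
    return 1
  fi
  printf '%s@%s\n' "$user" "$host"
}

# Example values: "debian" is the default user of the Debian cloud image.
ssh_target debian 192.0.2.10
```

The username has to match the account that cloud-init injects the SSH key into, which is why the template image matters here.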

@travisghansen
Contributor Author

I normally use the various 'cloud images' from the different vendors but haven't done so for debian. Did you build it manually? Which user gets the ssh key injected by cloud-init?

@travisghansen
Contributor Author

I'll build a viable image with buster cloud image real quick and send over all the details..

@travisghansen
Contributor Author

OK, worked like a charm first try for me. Here's my script to create the image/template (alter to your liking, it simply has the baseline stuff I use for k8s scenarios):

#!/bin/bash

set -x
set -e

export IMGID=9007
export BASE_IMG="debian-10-openstack-amd64.qcow2"
export IMG="debian-10-openstack-amd64-${IMGID}.qcow2"
export STORAGEID="bitness-nfs"

if [ ! -f "${BASE_IMG}" ];then
  wget https://cloud.debian.org/images/cloud/OpenStack/current-10/debian-10-openstack-amd64.qcow2
fi

if [ ! -f "${IMG}" ];then
  cp -f "${BASE_IMG}" "${IMG}"
fi

# prepare mounts (create the mount point if it does not exist yet)
mkdir -p /mnt/tmp
guestmount -a "${IMG}" -m /dev/sda1 /mnt/tmp/
mount --bind /dev/ /mnt/tmp/dev/
mount --bind /proc/ /mnt/tmp/proc/

# get resolving working
mv /mnt/tmp/etc/resolv.conf /mnt/tmp/etc/resolv.conf.orig
cp -a --force /etc/resolv.conf /mnt/tmp/etc/resolv.conf

# install desired apps
chroot /mnt/tmp /bin/bash -c "apt-get update"
chroot /mnt/tmp /bin/bash -c "DEBIAN_FRONTEND=noninteractive apt-get install -y net-tools curl qemu-guest-agent nfs-common open-iscsi lsscsi sg3-utils multipath-tools scsitools"

# https://www.electrictoolbox.com/sshd-hostname-lookups/
sed -i 's:#UseDNS no:UseDNS no:' /mnt/tmp/etc/ssh/sshd_config

sed -i '/package-update-upgrade-install/d' /mnt/tmp/etc/cloud/cloud.cfg

cat > /mnt/tmp/etc/cloud/cloud.cfg.d/99_custom.cfg << '__EOF__'
#cloud-config

# Install additional packages on first boot
#
# Default: none
#
# if packages are specified, this apt_update will be set to true
#
# packages may be supplied as a single package name or as a list
# with the format [<package>, <version>] wherein the specific
# package version will be installed.
#packages:
# - qemu-guest-agent
# - nfs-common

ntp:
  enabled: true

# datasource_list: [ NoCloud, ConfigDrive ]
__EOF__

cat > /mnt/tmp/etc/multipath.conf << '__EOF__'
defaults {
    user_friendly_names yes
    find_multipaths yes
}
__EOF__

# enable services
chroot /mnt/tmp systemctl enable open-iscsi.service || true
chroot /mnt/tmp systemctl enable multipath-tools.service || true

# restore the image's original resolv.conf
mv /mnt/tmp/etc/resolv.conf.orig /mnt/tmp/etc/resolv.conf

# umount everything
umount /mnt/tmp/dev
umount /mnt/tmp/proc
umount /mnt/tmp

# create template
qm create ${IMGID} --memory 512 --net0 virtio,bridge=vmbr0
qm importdisk ${IMGID} ${IMG} ${STORAGEID} --format qcow2
qm set ${IMGID} --scsihw virtio-scsi-pci --scsi0 ${STORAGEID}:${IMGID}/vm-${IMGID}-disk-0.qcow2
qm set ${IMGID} --ide2 ${STORAGEID}:cloudinit
qm set ${IMGID} --boot c --bootdisk scsi0
qm set ${IMGID} --serial0 socket --vga serial0
qm template ${IMGID}

# set host cpu, ssh key, etc
qm set ${IMGID} --scsihw virtio-scsi-pci
qm set ${IMGID} --cpu host
qm set ${IMGID} --agent enabled=1
qm set ${IMGID} --autostart
qm set ${IMGID} --onboot 1
qm set ${IMGID} --ostype l26
qm set ${IMGID} --ipconfig0 "ip=dhcp"

After that, I launched the machine with:

docker-machine --debug create \
  --driver proxmoxve \
  --engine-install-url https://get.docker.com \
  --proxmoxve-provision-strategy clone \
  --proxmoxve-proxmox-host 172.29.2.1 \
  --proxmoxve-proxmox-node cloud01 \
  --proxmoxve-proxmox-user-name root \
  --proxmoxve-proxmox-user-password password \
  --proxmoxve-proxmox-realm pam \
  --proxmoxve-vm-storage-size 20 \
  --proxmoxve-vm-cpu-sockets 2 \
  --proxmoxve-vm-cpu-cores 2 \
  --proxmoxve-vm-memory 8 \
  --proxmoxve-vm-storage-path '' \
  --proxmoxve-vm-image-file bitness-nfs:iso/rancheros-proxmoxve-autoformat-v1.5.6.iso \
  --proxmoxve-vm-clone-vmid 9007 \
  --proxmoxve-vm-clone-full 2 \
  --proxmoxve-vm-start-onboot 1 \
  --proxmoxve-vm-protection 0 \
  --proxmoxve-vm-citype nocloud \
  --proxmoxve-ssh-username debian \
  --proxmoxve-ssh-password '' \
  --proxmoxve-debug-resty \
  --proxmoxve-debug-driver \
  docker-rancher

Some of the args above are irrelevant for clone mode and/or generally optional, but that should get you going.

@lnxbil
Owner

lnxbil commented Jul 29, 2020

Thank you, that is also great as an example for the README.md.

In the end I discovered what my problem was: I forgot to add the cloudinit drive to the VM I was cloning. Yet without providing the ssh-username, it still was not able to run, so I threw away my container and went with your script. After changing nfs to ZFS, it worked out of the box and very, very fast.
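
A missing cloud-init drive like the one described above could be caught before cloning with a check along these lines. It is a sketch under assumptions: it expects `qm config <vmid>` output on stdin and assumes the cloud-init volume name contains `cloudinit`; the helper name `has_cloudinit_drive` and the sample config lines are illustrative, not captured from a real host.

```shell
#!/bin/bash
# Hypothetical helper: read VM config text on stdin and succeed only if a
# disk slot (ide/sata/scsi/virtio) references a volume named "...cloudinit...".
has_cloudinit_drive() {
  grep -Eq '^(ide|sata|scsi|virtio)[0-9]+: .*cloudinit'
}

# Illustrative sample of what the relevant config lines might look like.
sample_config='ide2: bitness-nfs:9007/vm-9007-cloudinit.qcow2,media=cdrom
scsi0: bitness-nfs:9007/vm-9007-disk-0.qcow2'

if printf '%s\n' "$sample_config" | has_cloudinit_drive; then
  echo "cloud-init drive present"
else
  echo "cloud-init drive missing"
fi
```

On a Proxmox host the same check would be fed from the real configuration, for example `qm config 9007 | has_cloudinit_drive`.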

@travisghansen
Contributor Author

Nice! I have similar scripts for all the major distros and even the rancher os image as well. I do need to clean them up a bit and could probably organize them a little better but they work. I’ll commit them all to a github repo and if we want we can point to that here or even just copy them and include if desired.

@lnxbil lnxbil merged commit ee5f9ca into lnxbil:master Jul 29, 2020
@benosman

> Nice! I have similar scripts for all the major distros and even the rancher os image as well. I do need to clean them up a bit and could probably organize them a little better but they work. I’ll commit them all to a github repo and if we want we can point to that here or even just copy them and include if desired.

@travisghansen - I wondered if you ever got the chance to commit the scripts for other distros? I'm mainly interested in an Ubuntu one.

@travisghansen
Contributor Author

@benosman there are a few bits that are specific to my env, but it should be pretty easy to alter to your needs.

@nayrnet

nayrnet commented Mar 26, 2021

I use cloudinit to make templates off minimal cloud images: https://gist.github.com/nayrnet/066e3963397de02594d4963a9258e22f

Unfortunately, right now with Proxmox it's either/or: you use a custom YAML or the Proxmox UI/API for cloud-init, not both. I've asked them to implement vendordata, which would allow both to be used together. The YAML approach still works for creating templates; afterwards, remove the snippet to switch the template over to Proxmox's cloud-init.
https://bugzilla.proxmox.com/show_bug.cgi?id=2429#c5
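
The attach-then-remove workflow described above might look roughly like the following. `--cicustom` and `--delete` are existing `qm set` options, but the VMID `9007`, the snippet volume `local:snippets/user-data.yaml`, and the helper names are assumptions for illustration. The script defaults to a dry run that only prints the commands.

```shell
#!/bin/bash
# Dry-run wrapper: with DRY_RUN=1 (the default) commands are printed,
# with DRY_RUN=0 they are passed to the real qm binary.
qm_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "qm $*"
  else
    qm "$@"
  fi
}

# Hypothetical helpers: point the VM at a custom user-data snippet while
# building the template, then drop it to fall back to Proxmox's own cloud-init.
attach_snippet() { qm_cmd set "$1" --cicustom "user=$2"; }
detach_snippet() { qm_cmd set "$1" --delete cicustom; }

attach_snippet 9007 local:snippets/user-data.yaml
detach_snippet 9007
```

Keeping the dry run as the default means the commands can be reviewed on any machine before being run on a Proxmox node with `DRY_RUN=0`.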

@benosman

Thanks @travisghansen and @nayrnet, I will try those out tomorrow.

@nayrnet: I like your suggestion of using the vendordata, I hope they take it up. Proxmox's cloudinit does seem quite limited compared to other hypervisors and cloud platforms.
