VMware/ESX
Installation
Booting the ESX DVD on a SunFire v40z worked...until the boot screen came up, and we only had a serial console attached to it. We could type blindly or cheat and go to the box and attach a monitor to it:
* press DOWN to select "text install"
* press BACKSPACE 5 times to remove "quiet" from the boot arguments
* type "console=ttyS0,115200n8"
* press ENTER
Or we just edit the ISO, so that it'll boot in textmode with the correct arguments. Yeah, we're so l337, baby :-)
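If we go the ISO route, the change to the boot arguments is the same one we typed by hand: drop "quiet" and append the serial console. A minimal sketch of that rewrite follows. The helper name and the sample argument line are made up for illustration, and the actual boot configuration on the ESX DVD is not reproduced here:

```shell
# Rewrite one line of boot arguments: drop "quiet", append the serial console.
# add_serial_console is a hypothetical helper, not part of ESX.
add_serial_console() (
    set -f                      # no globbing while the line is split into words
    out=
    for word in $1; do
        [ "$word" = quiet ] && continue
        out="${out:+$out }$word"
    done
    printf '%s console=ttyS0,115200n8\n' "$out"
)

# Sample input only; the real argument line on the DVD will differ.
add_serial_console 'initrd=initrd.img mem=512M quiet'
```

The function runs in a subshell, so neither "set -f" nor the temporary variables leak into the calling shell.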
If all goes well, the installation should proceed:
ESX 4.0 -- Virtual Infrastructure for the Enterprise------------
Welcome to the ESX Text Installer
Release 4.0
This wizard will guide you through the installation of ESX.
Postinstall
Serial Console
First, we have to set up the physical part of the serial console. This might be a real terminal or an emulated terminal, accessible from the service processor of the host system. On this particular system (Sun V40z), this can be done in the system's ILOM:
> platform set console --speed 115200
> platform get console
Rear Panel   Console Redirection   Speed    Pruning   Log Trigger
SP Console   Enabled               115200   No        244 KB
This will allow us to access the bootloader (GRUB). Here, we'll configure the kernel parameter which will allow us to watch Linux booting. During bootup, press "e" to edit the boot entries and modify the "kernel" line as follows:
title VMware ESX 4.0
#vmware:autogenerated esx
root (hd0,0)
uppermem 307200
kernel /vmlinuz ro root=/dev/sda2 mem=300M console=ttyS0,115200n8 text
initrd /initrd.img
Note that the "text" parameter was also added, so that VMware would start into a text install properly.
However, after booting has finished, the kernel will load /sbin/init, which is not (yet) configured for our serial console, so we lose control over the serial line at that point. We could log in via SSH to the newly installed ESX server, but we only created the root user during the installation process and root access is NOT possible via SSH right now. Luckily, KB 8375637 describes how to enable root access for SSH connections:
- Login with the vSphere Client directly on the ESX server (NOT on the vCenter server!)
- Once logged in, there will be a "Users & Groups" tab. Create a new, temporary user with a password and shell access
- When the user has been created, login via SSH with the username we just created
- Use "su" to gain root access.
Now we can add the following to /etc/inittab to support a serial console:
S:2345:respawn:/sbin/agetty 115200 ttyS0
Reload init with init q. Add ttyS0 to securetty to allow logins from the serial console:
$ tail -3 /etc/securetty
tty10
tty11
ttyS0
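Both edits (the inittab entry and the securetty line) are plain appends, so running them twice would duplicate the line. A small sketch of an idempotent append follows. ensure_line is a hypothetical helper name, and the usage lines are commented out because they modify system files:

```shell
# Append a line to a file only if that exact line is not already present.
# ensure_line is a hypothetical helper, not an ESX tool.
ensure_line() {
    # $1: file, $2: exact line to add
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage on the ESX service console (as root):
# ensure_line /etc/inittab 'S:2345:respawn:/sbin/agetty 115200 ttyS0'
# ensure_line /etc/securetty ttyS0
# init q
```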
From that point on, login via the serial console should be possible. While we're at it, enable root logins via SSH:
$ grep ^Permit /etc/ssh/sshd_config
PermitRootLogin yes
$ service sshd reload
With all that in place, it might be a good idea to remove the (temporary) user we just created.
Firewall
Inbound SSH traffic is already enabled, but we also want to allow outbound SSH client connections. Outbound NetBIOS and NFS might come in handy too:
esxcfg-firewall -e sshClient
esxcfg-firewall -e smbClient
esxcfg-firewall -e nfsClient
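With more than a handful of services, a loop saves some typing. This is only a sketch: enable_fw_services and the DRY_RUN switch are made up for this page, and only the "esxcfg-firewall -e" call itself comes from the commands above. With DRY_RUN set, the commands are printed instead of executed:

```shell
# Enable several named firewall services in one go.
# enable_fw_services and DRY_RUN are hypothetical, for illustration only.
enable_fw_services() {
    for svc in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            printf 'esxcfg-firewall -e %s\n' "$svc"
        else
            esxcfg-firewall -e "$svc" || return 1
        fi
    done
}

# Print what would be run, without touching the firewall:
DRY_RUN=1 enable_fw_services sshClient smbClient nfsClient
```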
As a shortcut, (temporarily) allow all outgoing traffic:
esxcfg-firewall --allowOutgoing
Be sure to disable it again:
esxcfg-firewall --blockOutgoing
To allow a specific service, e.g. outbound syslog messages:
esxcfg-firewall -o 514,udp,out,syslog
Software
DAG
Import their public keys:
$ wget http://apt.sw.be/RPM-GPG-KEY.dag.txt
$ rpm --import RPM-GPG-KEY.dag.txt
$ rpm -qa gpg-pubkey*
gpg-pubkey-6b8d79e6-3f49313d
Let's see how our installation names itself:
$ rpm -qf /bin/ls
coreutils-5.97-14.el5
This "el5" stands for "Enterprise Linux 5" - more specifically for RHEL 5. Thus we can install packages from an "el5"-repository:
$ export REL=el5
$ rpm -hiv http://apt.sw.be/redhat/$REL/en/`uname -i`/extras/RPMS/rsync-3.0.9-1.$REL.rfx.x86_64.rpm
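Instead of reading the "el5" off the package name by eye, the release tag can be cut out of the rpm output. A minimal sketch, assuming the tag always has the form ".elN" as in the package name shown above (dist_tag is a hypothetical helper name):

```shell
# Extract the "elN" release tag from an RPM package name.
# dist_tag is a hypothetical helper; it prints nothing if no tag is found.
dist_tag() {
    printf '%s\n' "$1" | sed -n 's/.*\.\(el[0-9][0-9]*\).*/\1/p'
}

dist_tag "coreutils-5.97-14.el5"    # prints: el5

# On the ESX host itself:
# REL=$(dist_tag "$(rpm -qf /bin/ls)")
```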
pkgs.org
pkgs.org is a meta-site that indexes packages from many repositories. We can install from one of the mirror sites it lists:
rpm -hiv http://mirror.centos.org/centos/5/os/`uname -i`/CentOS/rsync-3.0.6-4.el5_7.1.x86_64.rpm
Disable .vswp files
To disable (the use of) .vswp files, set "Memory Reservation" to the configured memory size of the VM. This way an empty .vswp file will be created.
vSphere Client
Once VMware ESX is installed, the client can be downloaded from the ESX server:
https://ESX_SERVER.example.com/client/VMware-viclient.exe
vSphere client needs at least the following ports to connect to vCenter Server and/or ESXi/ESX hosts:
443/tcp
902/tcp
903/tcp
SSH forwarding should work just fine with these ports. In some cases, Windows might need a hint where to connect to:
vSphere Client could not connect to vCenter Server client01 Details: A connection failure occured (Unable to connect to remote server)
- Add "127.0.0.1 esx01" to %systemroot%\system32\drivers\etc\hosts.
- Point vSphere Client to esx01 and the connection should work now.
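For an OpenSSH client, the forwarding options for the three ports can be generated rather than typed. This is a sketch under assumptions: ssh_forward_args is a made-up helper, the gateway name in the usage line is a placeholder, and binding local ports below 1024 usually requires elevated privileges on the client:

```shell
# Build the -L options that forward the vSphere Client ports to one ESX host.
# ssh_forward_args is a hypothetical helper.
ssh_forward_args() {
    # $1: name of the ESX host as seen from the SSH gateway
    args=
    for port in 443 902 903; do
        args="${args:+$args }-L $port:$1:$port"
    done
    printf '%s\n' "$args"
}

ssh_forward_args esx01

# Usage (gateway.example.com is a placeholder):
# ssh $(ssh_forward_args esx01) user@gateway.example.com
```

Together with the hosts entry mapping esx01 to 127.0.0.1, the client then talks to the local end of the tunnel.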
Syslog
Configure a syslog server:
$ tail -1 /etc/syslog.conf
*.*;auth,authpriv,cron.warning @loghost
$ service syslog restart
Update
ESX updates can be applied in different ways:
- vihostupdate (must be installed)
- esxupdate
- VMware vCenter Update Manager (only for ESXi hosts)
Before the actual upgrade, we'll shut down or migrate all running VMs and enter the maintenance mode:
$ vimsh -n -e /hostsvc/maintenance_mode_enter
$ vimsh -n -e /hostsvc/runtimeinfo | grep inMaintenanceMode
inMaintenanceMode = true,
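In a script we'd rather test the maintenance state than read it. A small sketch that cuts the value out of the runtimeinfo line shown above. maintenance_state is a hypothetical helper and only assumes the "inMaintenanceMode = <value>," format seen in that output:

```shell
# Print "true" or "false" from the runtimeinfo output read on stdin.
# maintenance_state is a hypothetical helper.
maintenance_state() {
    sed -n 's/.*inMaintenanceMode = \([a-z]*\).*/\1/p'
}

# Sample line, as printed by the grep above:
printf '   inMaintenanceMode = true,\n' | maintenance_state

# Usage on the ESX host:
# [ "$(vimsh -n -e /hostsvc/runtimeinfo | maintenance_state)" = true ] || exit 1
```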
vihostupdate
Download vSphere CLI, then unpack:
sha1sum ../VMware-vSphere-CLI-*.tar.gz    # Verify the checksum!
tar -xzf ../VMware-vSphere-CLI-*.tar.gz
cd vmware-vsphere-cli-distrib
Install:
$ ./vmware-install.pl --prefix=/opt/vmware/cli
Creating a new vSphere CLI installer database using the tar4 format.
Installing vSphere CLI.
Installing version 253290 of vSphere CLI
which: no ld in (/bin:/usr/bin:/sbin:/usr/sbin)
No Crypt::SSLeay Perl module or linker could be found on the system. Please either install SSLeay from your distribution or install a development toolchain and run this installer again for encrypted connections.
The following Perl modules were found on the system but may be too old to work with vSphere CLI:
Compress::Zlib
Please wait while copying vSphere CLI files...
The installation of vSphere CLI 4.0.0 build-253290 for Linux completed successfully. You can decide to remove this software from your system at any time by invoking the following command: "/opt/vmware/cli/bin/vmware-uninstall-vSphere-CLI.pl".
Add /opt/vmware/cli/bin to our PATH:
$ printf 'PATH=$PATH:/opt/vmware/cli/bin\nexport PATH\n' >> /etc/profile.d/local.sh
$ . /etc/profile.d/local.sh
$ vihostupdate --version
VI Perl Toolkit version: 4.0
Script 'vihostupdate' version: 4.0
esxupdate
Note: ESX updates are meant to be cumulative. However, patches are organized into "sets", and not every set is included in the next update. For example:
- Patch_01 updates the following sets: VMkernel, hostd and Tools
- Patch_02 updates the following sets: VMkernel, hostd
- → Patch_02 contains all available updates for "VMkernel" and "hostd" but leaves out "Tools". So, although patches are cumulative we will have to apply them one after another after all :-\
After downloading the updates they need to be transferred to the ESX host. Alternatively we could also access them via a network share:
mount -t cifs -o ro,user=guest //server/updates /mnt/cdrom
cd /mnt/cdrom
List all updates included in the package:
esxupdate --bundle update-from-esx4.0-4.0_update04.zip info | less
After reviewing the output, perform the actual update:
esxupdate --bundle update-from-esx4.0-4.0_update04.zip update
sync
reboot
After the reboot (and possibly further updates), let's review all installed updates:
$ esxupdate query
----Bulletin ID---- -----Installed----- ----------------Summary----------------
ESX400-Update04     2012-05-17T11:35:27 VMware ESX 4.0 Complete Update 4
ESX400-201203402-SG 2012-05-17T12:26:39 Updates Python package
ESX400-201203403-SG 2012-05-17T12:26:39 Updates Curl RPM
ESX400-201203404-SG 2012-05-17T12:26:39 Updates samba RPM and libsmbclient
ESX400-201203405-SG 2012-05-17T12:26:39 Updates popt, rpm, rpm-libs, rpm-python
ESX400-201203406-SG 2012-05-17T12:26:39 Updates libuser
ESX400-201203407-SG 2012-05-17T12:26:39 Updates Kerberos RPMs
ESX400-201203408-BG 2012-05-17T12:26:39 Updates tzdata
ESX400-201205401-SG 2012-05-17T12:38:28 Updates VMkernel, VMX, and others
The specific update path for this machine was:
esxupdate --bundle update-from-esx4.0-4.0_update04.zip update    # Released 2011-11-17
esxupdate --bundle ESX400-201112401.zip update                   # Released 2011-12-13
esxupdate --bundle ESX400-201203001.zip update                   # Released 2012-03-30
esxupdate --bundle ESX400-201205001.zip update                   # Released 2012-05-03
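Since the bundles have to be applied one after another, a loop that stops at the first failure is a natural fit. This is a sketch only: apply_bundles and DRY_RUN are made up for this page, and it ignores that some bundles may want a reboot before the next one is applied. With DRY_RUN set, the commands are printed instead of executed:

```shell
# Apply update bundles in the given order; stop at the first failure.
# apply_bundles and DRY_RUN are hypothetical, for illustration only.
apply_bundles() {
    for bundle in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            printf 'esxupdate --bundle %s update\n' "$bundle"
        else
            esxupdate --bundle "$bundle" update || return 1
        fi
    done
}

# Print the update path without applying anything:
DRY_RUN=1 apply_bundles update-from-esx4.0-4.0_update04.zip ESX400-201112401.zip
```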
After the update has been completed, exit the maintenance mode:
$ vimsh -n -e /hostsvc/maintenance_mode_exit
$ vimsh -n -e /hostsvc/runtimeinfo | grep inMaintenanceMode
inMaintenanceMode = false,
vmware-cmd
Get status of each registered virtual machine:
$ for v in `vmware-cmd -l`; do printf '%s ' "$v"; vmware-cmd "$v" getstate; done
/vmfs/volumes/3cfe21dd-c4f646d3-063f-00013d143b12/netbsd0/netbsd0.vmx getstate() = on
/vmfs/volumes/3cfe21dd-c4f646d3-063f-00013d143b12/fedora0/fedora0.vmx getstate() = on
/vmfs/volumes/3cfe21dd-c4f646d3-063f-00013d143b12/gentoo0/gentoo0.vmx getstate() = off
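The status listing is easy to filter, e.g. to find every VM that is powered off. A minimal sketch, assuming the one-line-per-VM output shown above and .vmx paths without spaces (vms_in_state is a hypothetical helper name):

```shell
# Print the .vmx paths of all VMs in the given state.
# vms_in_state is a hypothetical helper; stdin is the status listing.
vms_in_state() {
    # $1: state to look for, e.g. "on" or "off"
    awk -v state="$1" '$NF == state { print $1 }'
}

# Sample input (shortened paths):
printf '%s\n' \
    '/vmfs/volumes/x/fedora0/fedora0.vmx getstate() = on' \
    '/vmfs/volumes/x/gentoo0/gentoo0.vmx getstate() = off' |
    vms_in_state off
```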
Start virtual machine:
export VMX=/vmfs/volumes/3cfe21dd-c4f646d3-063f-00013d143b12/gentoo0/gentoo0.vmx
vmware-cmd $VMX start
Stop virtual machine, even when no VMware tools are installed:
vmware-cmd $VMX stop hard
Reset virtual machine, even when no VMware tools are installed:
vmware-cmd $VMX reset hard
Create snapshot:
vmware-cmd $VMX createsnapshot gentoo-20120928 "My first snapshot" 1 0    # QuiesceFilesystem=1, IncludeMemory=0
vim-cmd
vim-cmd can also be used to access the virtual machines.
$ vim-cmd vmsvc/getallvms
Vmid   Name      File                          Guest OS       Version
128    fedora0   [v40z2] fedora0/fedora0.vmx   rhel6Guest     vmx-07
16     debian1   [v40z2] debian1/debian1.vmx   debian5Guest   vmx-07
160    netbsd0   [v40z2] netbsd0/netbsd0.vmx   freebsdGuest   vmx-07
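The vim-cmd subcommands all want the numeric Vmid, so a lookup by name is handy. A small sketch, assuming the column layout shown above and VM names without spaces (vmid_of is a hypothetical helper name):

```shell
# Print the Vmid for a VM name, reading "vim-cmd vmsvc/getallvms" output on stdin.
# vmid_of is a hypothetical helper.
vmid_of() {
    # $1: VM name as shown in the "Name" column
    awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}

# Sample input taken from the listing above:
printf '%s\n' \
    'Vmid Name File Guest OS Version' \
    '128 fedora0 [v40z2] fedora0/fedora0.vmx rhel6Guest vmx-07' \
    '160 netbsd0 [v40z2] netbsd0/netbsd0.vmx freebsdGuest vmx-07' |
    vmid_of netbsd0

# Usage on the ESX host:
# vim-cmd vmsvc/getallvms | vmid_of netbsd0
```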
Start/shutdown virtual machine:
vim-cmd vmsvc/power.on 160
vim-cmd vmsvc/power.shutdown 160    # Works only with VMware Tools installed
Power off virtual machine:
vim-cmd vmsvc/power.off 160
List snapshots of one virtual machine:
$ vim-cmd vmsvc/snapshot.get 160
Get Snapshot:
|-ROOT
--Snapshot Name : 2011-08-06
--Snapshot Desciption :
--Snapshot Created On : 8/6/2011 13:56:37
--Snapshot State : powered off
List snapshots of all virtual machines:
$ for vm in `vim-cmd vmsvc/getallvms | awk '!/^Vmid/ {print $1}'`; do
    printf "VMID: $vm"
    vim-cmd vmsvc/get.summary $vm | grep name
    vim-cmd vmsvc/snapshot.get $vm
    echo
  done | less
Create snapshot:
vim-cmd vmsvc/snapshot.create 160 2012-09-28 "test" 0 1 # includeMemory=0, quiesced=1
Revert to snapshot:
$ vim-cmd vmsvc/snapshot.revert 160 0 0 0    # suppressPowerOff=0, snapshotLevel=0, snapshotIndex=0
Remove Snapshot:
|-ROOT
--Snapshot Name : 2012-12-16
--Snapshot Desciption :
--Snapshot Created On : 12/16/2012 23:19:29
--Snapshot State : powered off
Remove snapshot:
vim-cmd vmsvc/snapshot.remove 160 0 # removeChildren=0
Note: unlike vmware-cmd, vim-cmd returns to the command prompt immediately while the issued task continues in the background! Use vmsvc/get.tasklist to see running tasks.
Known Issues
Repeating characters in VMware Console
On a slow link, the console sometimes repeats characters, making it hard to type correctly. Add this to your VM configuration:
keyboard.typematicMinDelay = 2000000
This will set the repeat time to 2000000µs, or 2 seconds.
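The value is given in microseconds, so other delays are a simple multiplication. A tiny sketch that prints the config line for a delay in whole seconds (typematic_line is a hypothetical helper name; only integer seconds are handled):

```shell
# Print the VM configuration line for a delay given in whole seconds.
# typematic_line is a hypothetical helper; 1 second = 1000000 microseconds.
typematic_line() {
    printf 'keyboard.typematicMinDelay = %s\n' "$(( $1 * 1000000 ))"
}

typematic_line 2    # prints: keyboard.typematicMinDelay = 2000000
```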
IPMI hangs
Booting might hang on IPMI:
* ipmi ... [ !! ]
* vmci ... [ ok ]
In /var/log/messages we see:
sfcb[3843]: RawIpmiProvider::initialize: No IPMI Interface. Will not be polling. \
            Error Message: File /dev/ipmi0 not found
According to vm-help.com this most likely indicates a server or BIOS issue. To disable ipmi:
sed -i.orig '/Exec/s/^/return ${SUCCESS} # disable IPMI\n\n/' /etc/vmware/init/init.d/72.ipmi
mv /etc/vmware/init/init.d/72.ipmi.orig /var/tmp
Timed out waiting for vmware-aam to startup
During bootup, this happens:
Starting vmware-aam:Timed out waiting for vmware-aam to startup... backgrounding[FAILED]
The AAM (Automated Availability Manager) logfile (/var/log/vmware/aam/vmware_bob.log) shows:
Info FT Fri May 27 10:37:51 2011 By: FT/Agent on Node: bob
MESSAGE: Need to reconfigure heartbeat settings. Not yet set up.
===================================
Info FT Fri May 27 10:37:51 2011 By: FT/Agent on Node: bob
MESSAGE: Starting reconfiguration of heartbeat settings.
===================================
Error FT Fri May 27 10:37:52 2011 By: FT/Agent on Node: alice
MESSAGE: ftProcMon failed. Being restarted
===================================
Info FT Fri May 27 10:37:53 2011 By: FT/Agent on Node: bob
MESSAGE: Finished reconfig of heartbeat settings in 2 seconds.
===================================
Info NODE Fri May 27 10:37:53 2011 By: FT/Agent on Node: alice
MESSAGE: Node v40z1 is running.
===================================
Info PROC Fri May 27 10:38:01 2011 By: FT/Agent on Node: alice
MESSAGE: Started process VMap_bob on bob [pid = 9312]
And indeed, booting continued and AAM seems to be running:
# /etc/init.d/vmware-aam status
vmware-aam is running [ OK ]
Unable to get COS default route
During boot, this happens:
Starting VMware ESX services:
'IpSecConfig' warning] Ipv6 not Enabled
'RoutingInfo' warning] Unable to get COS default route
'RoutingInfo' warning] Unable to restore VMkernel default gateway (10.0.0.1): \
                       Unable to set VMkernel gateway address. Please verify your IP settings and try again
I only found KB 1002729 where this happened in combination with iSCSI and DHCP enabled. The solution was to specify a static default route:
# echo "GATEWAY=10.0.0.1" >> /etc/sysconfig/network
# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=bob.example.com
IPV6_AUTOCONF=no
NETWORKING_IPV6=no
GATEWAY=10.0.0.1
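Appending with echo adds a second GATEWAY line if one is already there. A minimal sketch that replaces an existing entry and appends only when none exists. set_gateway is a hypothetical helper, and the sed call leaves a ".bak" copy of the file behind:

```shell
# Set GATEWAY=<address> in a sysconfig-style file, replacing any existing entry.
# set_gateway is a hypothetical helper, not an ESX tool.
set_gateway() {
    # $1: file, $2: gateway address
    if grep -q '^GATEWAY=' "$1"; then
        sed -i.bak "s/^GATEWAY=.*/GATEWAY=$2/" "$1"
    else
        printf 'GATEWAY=%s\n' "$2" >> "$1"
    fi
}

# Usage on the ESX host (as root):
# set_gateway /etc/sysconfig/network 10.0.0.1
```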
Changing the boot order for a VM
According to "Changing the default boot sequence for newly created virtual machines", this can only be done via the VM's virtual BIOS, which can be accessed by pressing F2 at POST.
Linux/x86-64 as a guest VM
We're currently unable to boot Linux/x86-64 as a guest VM (Host: AMD Opteron 848).
Status of other host hardware objects
Sometimes both ESX hosts are tagged with the following error:
v40z1.int.consol.us Warning Status of other host hardware objects 11/22/2012 1:31:54 AM
...which isn't really helpful. The "Hardware Status" tab should list the root cause for this, but sometimes the tab is not visible. Select "Plug-ins" → "Manage Plug-ins" and try to enable the "vCenter Hardware Status" and the "vCenter Service Status" plugin. But maybe this isn't working either:
vCenter Hardware Status VMware, Inc. 4.0 Disabled Displays the hardware status of hosts (CIM monitoring)
The following error occured while downloading the script plugin from https://natascha:8443/cim-ui/scriptConfig.xml: Unable to connect to the remote server
Check if the "VMware VirtualCenter Management Webservices" service is running (and set to "Automatic"). Once started, try to enable the plugins again. Now the "Hardware Status" tab should be visible and the real warning should be printed:
System Management Software 0 Event Logging: Log full,out of 94 sensors
Aha! :-) Login to the ILOM and clear the SEL:
ilom$ ipmi clear sel
- vCenter Service Status and Hardware Status errors when trying to update host data (1014213)
- Resolve Hardware Status Alert SEL_FULLNESS
vmware-webAccess
Accessing the vSphere Web Access URL (https://esx01.example.org/ui/) might generate an error:
503 Service Unavailable
Looking at the latest /var/log/vmware/hostd-8.log logfile, we can see the following:
Connection to localhost:8308 failed with error N7Vmacore15SystemExceptionE(Connection refused).
The vmware-webAccess service might not be running:
$ service vmware-webAccess status
webAccess is stopped
$ chkconfig --list vmware-webAccess
vmware-webAccess 0:off 1:off 2:off 3:off 4:off 5:off 6:off
Let's enable it for runlevels 3, 4 and 5 and start it now:
$ chkconfig --level 345 vmware-webAccess on
$ chkconfig --list vmware-webAccess
vmware-webAccess 0:off 1:off 2:off 3:on 4:on 5:on 6:off
$ service vmware-webAccess start
$ service vmware-webAccess status
webAccess (pid 26395) is running...
Now the Web Access URL should be working.
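To check the runlevels in a script rather than by eye, the chkconfig listing can be reduced to the levels that are switched on. A small sketch, assuming the "N:on/N:off" format shown above (enabled_levels is a hypothetical helper name):

```shell
# Print the runlevels marked "on" in a "chkconfig --list <service>" line on stdin.
# enabled_levels is a hypothetical helper.
enabled_levels() {
    awk '{
        out = ""
        for (i = 2; i <= NF; i++) {
            split($i, kv, ":")
            if (kv[2] == "on") out = out (out == "" ? "" : " ") kv[1]
        }
        print out
    }'
}

# Sample line, as printed by chkconfig after enabling the service:
printf 'vmware-webAccess 0:off 1:off 2:off 3:on 4:on 5:on 6:off\n' | enabled_levels

# Usage on the ESX host:
# chkconfig --list vmware-webAccess | enabled_levels
```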
Links
- esxupdate manual
- Native vs. ESX - some I/O benchmarks (e.g. bonnie++)
- Troubleshooting the VMware VirtualCenter Server service when it does not start or fails on vCenter Server
- Diagnosing the vSphere Client when it fails to connect to vCenter Server
- After installing vCenter Server, the VMware VirtualCenter Server service fails to start
- vCenter Service Not Starting - Service Dependancies such as SQL Server
- VirtualCenter Server will not start after server reboot
- Event ID 7024 - The VMware VirtualCenter Server service terminated with service-specific error 2 (0x2) upon starting up VC Server
Files
These kernel configurations should be able to start an ESX virtual machine: