Yet another pxe boot solution
Hopefully this isn’t just “yet another PXE boot solution”. It may not be as smart as Puppet or Chef, but hopefully you will come to see that it is at least semi-automated and a bit clever. So what sets it apart from so many other PXE/kickstart solutions?
Well, first off, it is a custom build, meaning it is tailored to the environment. In this case that means HP C7000 enclosures and network documentation kept in Excel spreadsheets. So if you run HP C7000 enclosures and your network team just loves Excel spreadsheets, this might be a solution for you as well.
The server
This is installed on a VM in the management cluster that services the VMware infrastructure, which is why I feel comfortable running it as a VM. You could run it on a physical server if you wanted to, but why do that if your infrastructure is properly protected? The OS itself is Ubuntu 12.04 LTS, which was the release available at the time the PXE solution was created. On top of that, the following services were installed.
Component | Version
Syslinux | 3.86
tftpd-hpa | 5.2-1ubuntu
isc-dhcp-server (DHCPD3) | 4.1.ESV-R4
apache2 | 2.2.22
Again, version-wise this was what was available at the time the PXE server was created. (There is no reason not to use the latest versions.)
That’s the story of the software used. Hardware-wise, as I’ve mentioned, it’s a VM with the following specs.
Attribute | Specification
HW version | 7
Memory | 2 GB
vCPU | 1
Hard disk 1 (sda) | 16 GB
Hard disk 2 (sdb) | 40 GB
Network | The network for the ESXi hosts’ VMkernel port
There are two hard disks: the first is used for the OS, logs etc. and the second is used for ISOs etc. The second hard disk is mounted under its own directory in the root file system, and it needs to be sized according to the amount of data you expect to store.
Paths
Hard disk 2 is mounted under /media and all files are stored under /media/pxeserver
ESXi image: /media/pxeserver/tftpboot/esxi/
Kickstart files: /media/pxeserver/www/kickstart/
PXE menu files: /media/pxeserver/tftpboot/pxelinux.cfg/
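As a minimal sketch, the layout above could be created like this. The function name make_pxe_layout is made up for illustration, and the example deliberately builds the tree in a scratch directory instead of under /media, so it can be tried without touching a real server.

```shell
#!/bin/bash
# Hypothetical helper: create the three directories listed above under a
# given root directory (on the real server that would be /media/pxeserver).
make_pxe_layout() {
    local root="$1"
    mkdir -p "$root/tftpboot/esxi" \
             "$root/tftpboot/pxelinux.cfg" \
             "$root/www/kickstart"
}

# Try it in a temporary directory (an assumption for this sketch).
demo_root="$(mktemp -d)/pxeserver"
make_pxe_layout "$demo_root"
find "$demo_root" -type d | sort
```

On the actual server you would pass /media/pxeserver as the root once hard disk 2 is mounted.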
Not Yet another ESXi pxe boot solution
As I stated in the beginning, this is very much a custom solution fitted to the needs of this organization. To understand why it came to be this way, I’m going to talk about some of the challenges this solution set out to fix (and maybe in another blog post expand on the other challenges it grew to cover). Before I do that I’d like to set the stage. Before this solution came to be, installations were done manually by following a Word document, which was not written in a strict step-by-step manner and which used a few different technologies: Perl, PowerShell, esxcli etc. This made the procedure complicated, slow and error-prone. It was these challenges I set out to fix with this custom PXE solution. At its core the PXE setup isn’t much different from any other solution out there; what sets it apart is the way it is done, with a little help from some custom scripts.
Because the organization mainly works in a Windows environment, the file used to kick off the process of updating the PXE server with the latest info is a bat file. Again, this is only there to simplify the execution of the scripts needed. So first you execute the bat file, which runs a PowerShell script. This PowerShell script does two important things: first it reads two spreadsheets with network information about the host management network and the blade enclosures, and second it logs into every blade enclosure and gets a complete inventory, after which it creates a CSV file with host information including name, IP, UUID, bay position and enclosure. This file gets uploaded to the PXE server, and a custom bash script is then executed locally on the PXE server. The bash script removes the old boot menu files and kickstart files, after which it creates new ones based on the CSV file that was uploaded, all custom tailored to the blade they are going to be used for. For this to work, the blade’s UUID is used as the means for PXELINUX to show the custom boot menu, which in turn points at that specific blade’s kickstart file, which as mentioned is tailored to the exact blade. This makes the installation 100% seamless and automated.
The bat script only needs to be executed when changes are made to the overall setup, such as a blade being physically moved between enclosures, a new enclosure being deployed etc.
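The reason a file named after the UUID is enough: PXELINUX looks for its configuration under pxelinux.cfg/ in a fixed order, starting with the client UUID, then the MAC address, then the IP address in hex with one digit dropped at a time, and finally "default". Here is a small sketch that prints that order for a given host. The function name and the sample values are only there for illustration.

```shell
#!/bin/bash
# Print the file names PXELINUX tries under pxelinux.cfg/, in order.
# Arguments: UUID (as the PXE firmware reports it), MAC address, IPv4 address.
pxelinux_lookup_order() {
    local uuid="$1" mac="$2" ip="$3"
    local o1 o2 o3 o4 hexip

    # 1. the client UUID in lowercase hex
    echo "$uuid" | tr 'A-F' 'a-f'

    # 2. ARP hardware type (01 = Ethernet) plus the MAC, lowercase, dashes
    echo "01-$(echo "$mac" | tr 'A-F:' 'a-f-')"

    # 3. the IP address as uppercase hex, dropping one digit at a time
    IFS=. read -r o1 o2 o3 o4 <<< "$ip"
    hexip=$(printf '%02X%02X%02X%02X' "$o1" "$o2" "$o3" "$o4")
    while [ -n "$hexip" ]; do
        echo "$hexip"
        hexip="${hexip%?}"
    done

    # 4. the fallback file
    echo "default"
}

# Sample values, made up for this example.
pxelinux_lookup_order "04030201-0605-0807-090A-0B0C0D0E0F10" "88:99:AA:BB:CC:DD" "192.168.2.91"
```

Because the UUID file is tried first, a per-blade menu always wins over anything MAC- or IP-based that might also be lying around.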
The scripts
The bat file just ties it all together. There is not much to say other than that you need a way to handle the username/password securely, and that it uses plink.exe to communicate with the PXE server.
powershell.exe powershell ../Powershell/PXE_OA.ps1
plink.exe -ssh username@host -pw PASSWORD "rm run.sh; rm OA"
plink.exe -ssh username@host -pw PASSWORD "cat > OA" < C:\Temp\Host_UUID.csv
plink.exe -ssh username@host -pw PASSWORD "cat > run.sh" < "..\Bash\run.sh"
plink.exe -ssh username@host -pw PASSWORD "chmod +x run.sh; dos2unix run.sh; ./run.sh"
PXE_OA.ps1
This script does two important things: first it reads the spreadsheets for data, and second it logs into all the OAs to get an inventory. The last part is somewhat overkill today, as there is a web server running on every OA from which you can get the same info without needing to log in to each one. But as I haven’t got around to correcting it, this is still how it’s done in this version.
The script contains two functions which I didn’t write, one for importing Excel spreadsheets and one for making TCP connections. Both authors deserve credit for their work, but it seems that I have shamelessly copied just the parts I needed and not the scripts in full. For that I apologize, whoever you are.
From line 160 onwards (the part after the “#Script Starting” comment) there are things that need to be changed to fit your environment; please do so accordingly.
#Moving to a folder where we have privileges
CD c:\temp

#Function for importing Excel files and convert to "CSV"
Function Import-Excel
{
    param (
        [string]$FileName,
        [string]$WorksheetName,
        [bool]$DisplayProgress = $true
    )

    if ($FileName -eq "") {
        throw "Please provide path to the Excel file"
        Exit
    }
    if (-not (Test-Path $FileName)) {
        throw "Path '$FileName' does not exist."
        exit
    }

    $FileName = Resolve-Path $FileName
    $excel = New-Object -com "Excel.Application"
    $excel.Visible = $false
    $workbook = $excel.workbooks.open($FileName)

    if (-not $WorksheetName) {
        Write-Warning "Defaulting to the first worksheet in workbook."
        $sheet = $workbook.ActiveSheet
    } else {
        $sheet = $workbook.Sheets.Item($WorksheetName)
    }
    if (-not $sheet) {
        throw "Unable to open worksheet $WorksheetName"
        exit
    }

    $sheetName = $sheet.Name
    $columns = $sheet.UsedRange.Columns.Count
    $lines = $sheet.UsedRange.Rows.Count
    Write-Warning "Worksheet $sheetName contains $columns columns and $lines lines of data"

    #The first row holds the field names
    $fields = @()
    for ($column = 1; $column -le $columns; $column ++) {
        $fieldName = $sheet.Cells.Item.Invoke(1, $column).Value2
        if ($fieldName -eq $null) {
            $fieldName = "Column" + $column.ToString()
        }
        $fields += $fieldName
    }

    #Every following row becomes one object
    $line = 2
    for ($line = 2; $line -le $lines; $line ++) {
        $values = New-Object object[] $columns
        for ($column = 1; $column -le $columns; $column++) {
            $values[$column - 1] = $sheet.Cells.Item.Invoke($line, $column).Value2
        }
        $row = New-Object psobject
        $fields | foreach-object -begin {$i = 0} -process {
            $row | Add-Member -MemberType noteproperty -Name $fields[$i] -Value $values[$i]; $i++
        }
        $row
        $percents = [math]::round((($line/$lines) * 100), 0)
        if ($DisplayProgress) {
            Write-Progress -Activity:"Importing from Excel file $FileName" -Status:"Imported $line of total $lines lines ($percents%)" -PercentComplete:$percents
        }
    }
    $workbook.Close()
    $excel.Quit()
}

#Function for creating a TCP session to a host/OA etc.
Function TCP-Connect
{
    param(
        [string] $remoteHost = "localhost",
        [int] $port = 23,
        [string[]] $commands
    )

    $TempLogFilePath = "Temp.log"
    Start-Transcript -Path "$TempLogFilePath"
    try {
        ## Open the socket, and connect to the computer on the specified port
        write-host "Connecting to $remoteHost on port $port"
        $socket = new-object System.Net.Sockets.TcpClient($remoteHost, $port)
        if($socket -eq $null) {
            throw ("Could Not Connect")
        }
        $stream = $socket.GetStream()
        $writer = new-object System.IO.StreamWriter($stream)
        $buffer = new-object System.Byte[] 1024
        $encoding = new-object System.Text.AsciiEncoding

        #Loop through $commands and execute one at a time.
        for($i=0; $i -lt $commands.Count; $i++) {
            ## Allow data to buffer for a bit
            start-sleep -m 500
            ## Read all the data available from the stream, writing it to the
            ## screen when done.
            while($stream.DataAvailable) {
                $read = $stream.Read($buffer, 0, 1024)
                write-host -n ($encoding.GetString($buffer, 0, $read))
            }
            write-host $commands[$i]
            ## Write the command to the remote host
            $writer.WriteLine($commands[$i])
            $writer.Flush()
        }
        #runs CheckLogs.ps1 script and sends in the output from the telnet emulation and searches for HTML string
        #.\CheckLogs.ps1 -LogFile "$TempLogFilePath" -SearchStrings @('HTML')
        if($LASTEXITCODE -eq 0) {
            # If string wasnt found then an error is thrown and caught
            throw ("Text Not found")
        }
    } catch {
        #When an exception is thrown catch it and output the error.
        #this is also where you would send an email or perform the code you want when its classed as down.
        write-host $error[0]
        $dateTime = get-date
        $errorOccurence = "Error occurred connecting to $remoteHost on $port at $dateTime"
        write-host $errorOccurence
    } finally {
        ## Close the streams
        ## Cleans everything up.
        $writer.Close()
        $stream.Close()
        stop-transcript
    }
}

#Script Starting
#Define Array to contain output
$Details = @()

#Read the Excel files and the given Sheet
$oob1 = Import-Excel "PATH TO EXCEL" -WorksheetName:"NAME OF SHEET"
$oob2 = Import-Excel "PATH TO EXCEL" -WorksheetName:"NAME OF SHEET"
$IP1 = Import-Excel "PATH TO EXCEL" -WorksheetName:"NAME OF SHEET"
$IP2 = Import-Excel "PATH TO EXCEL" -WorksheetName:"NAME OF SHEET"

#Join the results to one file
$oob = $oob1 + $oob2
#Join the results to one file
$IPALL = $IP1 + $IP2

#Get OA IP and Name
$OAs = $oob | where {$_.Column2 -match "OA"}
Foreach($OAline in $OAs){
    If (Test-Connection $OALine.Column1 -quiet -count 1){
        TCP-Connect $OALine.Column1 23 @("USERNAME","PASSWORD","show Server info all","quit")
        $OAfile = Get-Content .\Temp.log
        $OAfiles += $OAfile
        $InBayLoop = $FALSE
        Foreach($line in $OAfile){
            if($InBayLoop){
                if($line -match "UUID:"){
                    $Report = "" | Select Bay,UUID,OA,Name,IP
                    #Strip the "UUID:" label with a regex. TrimStart("UUID: ") trims a set of
                    #characters and would also eat a leading "D" of the UUID itself.
                    $Report.UUID = ($line -replace '^\s*UUID:\s*', '').Trim()
                    $Report.Bay = $Bay
                    $Report.OA = $OALine.Column2
                    $Report.Name = "ESX-"+($OALine.Column2 -replace('OA[0-9]',''))
                    $Report.Name += if(($Bay.ToString()).Length -eq "1"){"0"+$Bay}elseif(($Bay.ToString()).Length -eq "2"){$Bay}
                    $Temp = ($IPALL | where {$_.Column2 -match "ESX-" -and $_.Column2 -match $Report.Name} | where {$_."VLAN1" -or $_."VLAN2"})
                    if($Temp."VLAN1"){
                        $Report.IP = $Temp."VLAN1"
                    }elseif($Temp."VLAN2"){
                        $Report.IP = $Temp."VLAN2"
                    }
                    $Bay = ""
                    $InBayLoop = $FALSE
                    $Details += $Report
                }
            }
            if($line -match "Server Blade #"){
                $Bay = (($line).TrimEnd(" Information:")).TrimStart("Server Blade #")
                $InBayLoop = $TRUE
            }
        }
    }
}
rm .\Temp.log
$Details | Export-Csv Host_UUID.csv -NoTypeInformation
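The inventory parsing at the end of the script boils down to a tiny state machine: remember the bay number when a "Server Blade #N Information:" line shows up, then emit bay and UUID when the next "UUID:" line is seen. As a rough sketch, the same idea in bash/awk could look like the following. The function name is made up and the sample text is a hand-written stand-in that only contains the two markers the PowerShell script matches on; it is not real OA output.

```shell
#!/bin/bash
# Read OA "show server info all" style text on stdin and print "bay,uuid"
# for every blade that reports a UUID.
parse_oa_inventory() {
    awk '
        /Server Blade #[0-9]+ Information:/ {
            bay = $0
            sub(/.*Server Blade #/, "", bay)    # drop everything up to the number
            sub(/ Information:.*/, "", bay)     # drop the trailing label
            next
        }
        bay != "" && /UUID:/ {
            uuid = $0
            sub(/.*UUID:[ \t]*/, "", uuid)      # strip the label, not a character set
            sub(/[ \t\r]+$/, "", uuid)          # strip trailing blanks and CR
            printf "%s,%s\n", bay, uuid
            bay = ""                            # wait for the next blade header
        }
    '
}

# Made-up sample input for illustration.
parse_oa_inventory <<'EOF'
Server Blade #1 Information:
    Type: Server Blade
    UUID: 01020304-0506-0708-090A-0B0C0D0E0F10
Server Blade #2 Information:
    Type: Server Blade
    UUID: D4C3B2A1-0000-1111-2222-333344445555
EOF
```

Stripping the label with a pattern rather than a set of characters matters for UUIDs like the second one, which starts with a "D".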
Run.sh
This is the last script, which does the formatting and creation of the custom files the PXE solution needs to work as designed. Again there will be things that need to be changed to fit your setup. It is fairly easy to read, so I won’t go into too much detail. Line four is all about housekeeping, lines 6-8 restructure the UUID to match the format PXELINUX uses for the boot menu file name, and line sixteen customizes the kickstart config file to make every install unique.
#!/bin/bash
rm /media/pxeserver/www/kickstart/*
rm /media/pxeserver/tftpboot/pxelinux.cfg/*
sed s/\"//g /home/USER/OA | sed s/\@{IP=//g | sed s/\}//g | grep -v "Bay" | grep -v "00000000-0000-0000-0000-000000000000" | while IFS=, read col1 col2 col3 col4 col5;
do
  if [[ $col5 =~ [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ]]; then
    col2=${col2:6:2}${col2:4:2}${col2:2:2}${col2:0:2}-${col2:11:2}${col2:9:2}-${col2:16:2}${col2:14:2}-${col2:19:2}${col2:21:2}-${col2:24:12}
    col2=$( echo "$col2" | tr '[:upper:]' '[:lower:]' )  # no -s here: squeezing would collapse repeated hex digits
    cp -f /home/USER/pxelinux.cfg /media/pxeserver/tftpboot/pxelinux.cfg/$col2;
    echo "LABEL Install Standard Image" >> /media/pxeserver/tftpboot/pxelinux.cfg/$col2;
    echo " KERNEL /esxi/esxi-5.0.0-update1-623860-hp-5.21.17/mboot.c32" >> /media/pxeserver/tftpboot/pxelinux.cfg/$col2;
    echo " APPEND -c /esxi/esxi-5.0.0-update1-623860-hp-5.21.17/boot.cfg ks=http://FQDNorIP/kickstart/$col4.cfg" >> /media/pxeserver/tftpboot/pxelinux.cfg/$col2;
    echo " TEXT HELP" >> /media/pxeserver/tftpboot/pxelinux.cfg/$col2;
    echo " Install the host automated based on the host UUID" >> /media/pxeserver/tftpboot/pxelinux.cfg/$col2;
    echo " ENDTEXT" >> /media/pxeserver/tftpboot/pxelinux.cfg/$col2;
    sed s/IPADDR=\"1.5.1.20\"/IPADDR=\"$col5\"/g /home/admpxe/kickstart.cfg | sed s/HOSTNAME=FQDN/HOSTNAME=$col4.DOMAINFQDN/g > /media/pxeserver/www/kickstart/$col4.cfg;
  fi
done
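The UUID restructuring is the least obvious part of run.sh, so here it is pulled out as a stand-alone sketch. The first three groups of the UUID get their bytes reversed and the last two groups are left alone, after which everything is lowercased. The function name pxe_uuid and the sample UUIDs are mine, for illustration only.

```shell
#!/bin/bash
# Convert a UUID as listed in the inventory into the file name form used
# under pxelinux.cfg/: byte-swap groups 1-3, keep groups 4-5, lowercase.
pxe_uuid() {
    local u="$1"
    u="${u:6:2}${u:4:2}${u:2:2}${u:0:2}-${u:11:2}${u:9:2}-${u:16:2}${u:14:2}-${u:19:4}-${u:24:12}"
    # plain tr, without -s, so repeated hex digits such as "ff" survive
    printf '%s\n' "$u" | tr '[:upper:]' '[:lower:]'
}

pxe_uuid "01020304-0506-0708-090A-0B0C0D0E0F10"
# prints 04030201-0605-0807-090a-0b0c0d0e0f10
pxe_uuid "AABBCCDD-EEFF-0011-2233-445566778899"
# prints ddccbbaa-ffee-1100-2233-445566778899
```

The second sample shows why squeezing repeats while lowercasing would be a problem: "AABBCCDD" has to come out as eight characters, not four.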
The files
If you look at the run.sh script, there are two files being copied repeatedly: pxelinux.cfg and kickstart.cfg. pxelinux.cfg is the boot menu you see once the host PXE boots. The reason I am copying a file instead of creating one from scratch is that I can have a default static menu which is then extended with the dynamic entries needed. An example of a static boot menu option could be stress testing, but also firmware patching, iLO setup, BIOS settings etc. The last part is something I have done using HP’s Scripting Toolkit, but that’s something for a different blog post.
The kickstart file is the automation part of the ESXi install. It makes sure that I don’t have to lift a finger during the installation and it guarantees the correct install and settings every time. That last part is what is important: to have a simple yet efficient way of guaranteeing a perfect install every single time! In my view that’s what PXE and kickstart are all about.
Pxelinux.cfg
Below is an example of how a boot menu is structured. The first five lines are global. Then comes LABEL, which is what a selectable boot option is called, followed by how/what to boot, and lastly some text that gives the menu option a meaningful description. All of these are static menu options; the run.sh script then adds the custom options for the ESXi install, as that part isn’t static.
DEFAULT vesamenu.c32
TIMEOUT 600
ONTIMEOUT BootLocal
PROMPT 0
NOESCAPE 1

LABEL BootLocal
  localboot 0
  TEXT HELP
  Boot to local disk
  ENDTEXT

LABEL HPSUM 2014.02.0
  MENU LABEL Automatic Firmware Update Version 2014.02.0
  kernel /spp2013020/vmlinuz
  append initrd=/spp2013020/initrd.img media=net rw root=/dev/ram0 ramdisk_size= init=/bin/init loglevel=3 ide=nodma ide=noraid nopat pnpbios=off vga=791 splash=silent hp_fibre showopts noexec32=off numa=off nox2apic iso1=nfs://pxe-server-ip/export/nfs/HP_Service_Pack_for_Proliant.iso iso1mnt=/mnt/bootdevice
  TEXT HELP
  This boots HPSUM and updates firmware
  Supported for ESXi 4.1U2, 5.0U3, 5.1U2 and 5.5
  ENDTEXT

LABEL HP Scriptig Tool kit
  MENU LABEL HP Scriptig Tool kit
  kernel /HP_boot_files/vmlinuz
  append initrd=/HP_boot_files/initrd.img root=/dev/ram0 rw ramdisk_size=498800 ide=nodma ide=noraid pnpbios=off numa=off media=net iso1=nfs://pxe-server-ip/export/nfs/hp_stk sstk_conf=toolkit.conf sstk_script=/shell.sh hostname=customprefix-
  TEXT HELP
  Test
  ENDTEXT

LABEL HP Scriptig Tool kit2
  MENU LABEL HP Scriptig Tool kit2
  kernel /HP_boot_files2/vmlinuz
  append initrd=/HP_boot_files2/initrd.img root=/dev/ram0 rw ramdisk_size=498800 ide=nodma ide=noraid pnpbios=off numa=off media=net iso1=nfs://pxe-server-ip/export/nfs/hp_stk sstk_conf=toolkit.conf sstk_script=/shell.sh hostname=customprefix-
  TEXT HELP
  Test
  ENDTEXT
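The dynamic ESXi entry that gets added to this static menu can also be written with a single here-document instead of one echo per line, which some may find easier to keep in sync with the menu format. This is only a sketch of that alternative. The function name, the image directory and the kickstart URL in the example are placeholders I made up.

```shell
#!/bin/bash
# Append an ESXi install entry to an existing per-host menu file.
# Arguments: menu file, image directory name under /esxi/, kickstart URL.
append_esxi_entry() {
    local menu_file="$1" image_dir="$2" ks_url="$3"
    cat >> "$menu_file" <<EOF
LABEL Install Standard Image
  KERNEL /esxi/${image_dir}/mboot.c32
  APPEND -c /esxi/${image_dir}/boot.cfg ks=${ks_url}
  TEXT HELP
  Install the host automated based on the host UUID
  ENDTEXT
EOF
}

# Example against a scratch file (placeholder values).
demo_menu=$(mktemp)
echo "DEFAULT vesamenu.c32" > "$demo_menu"
append_esxi_entry "$demo_menu" "my-esxi-image" "http://pxe.example.local/kickstart/ESX-ENC01.cfg"
cat "$demo_menu"
```

The static part of the file is left untouched, since the function only ever appends.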
Kickstart.cfg
This is an early example. I’m not going to explain it, as it should be straightforward.
accepteula
install --firstdisk --overwritevmfs
rootpw --iscrypted skdskdpspmdpsmp39
keyboard Danish
reboot
#dryrun

%include /tmp/networkconfig

%pre --interpreter=busybox
# mask all FC storage before the installer runs
localcli storage core claimrule add -r 2012 -P MASK_PATH -t transport -R fc
localcli storage core claimrule load
localcli storage core claiming unclaim -t plugin -P NMP
localcli storage core claimrule run

# extract network info from bootup
IPADDR="1.5.1.20"
NETMASK="255.255.255.0"
GATEWAY="1.3.0.1"
DNS="1.3.8.1,1.3.8.1"
HOSTNAME=FQDN/HOSTNAME
echo "network --bootproto=static --addvmportgroup=false --device=vmnic0 --ip=${IPADDR} --netmask=${NETMASK} --gateway=${GATEWAY} --nameserver=${DNS} --hostname=${HOSTNAME}" > /tmp/networkconfig

%firstboot --interpreter=busybox
# enable & start remote ESXi Shell (SSH)
#vim-cmd hostsvc/enable_ssh
#vim-cmd hostsvc/start_ssh

# enable & start ESXi Shell (TSM)
#vim-cmd hostsvc/enable_esx_shell
#vim-cmd hostsvc/start_esx_shell

# supress ESXi Shell shell warning
#esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1

## SATP CONFIGURATIONS ##
esxcli storage nmp satp set --satp VMW_SATP_SVC --default-psp VMW_PSP_RR
esxcli storage nmp satp set --satp VMW_SATP_DEFAULT_AA --default-psp VMW_PSP_RR

# set default domain lookup name
esxcli network ip dns search add --domain=DOMAIN.local

# enter maintenance mode
vim-cmd hostsvc/maintenance_mode_enter

# enable lockdown mode
#vim-cmd -U dcui vimsvc/auth/lockdown_mode_enter

# remove FC claim rule to present storage back to the hosts
cat >> /etc/rc.local << __CLEANUP_MASKING__
localcli storage core claimrule remove -r 2012
__CLEANUP_MASKING__

cat > /etc/init.d/maskcleanup << __CLEANUP_MASKING__
sed -i 's/localcli.*//g' /etc/rc.local
rm -f /etc/init.d/maskcleanup
__CLEANUP_MASKING__

chmod +x /etc/init.d/maskcleanup

# set keyboard layout to Danish
#esxcli system settings keyboard layout set -l Danish

# rename local datastore
vim-cmd hostsvc/datastore/rename datastore1 "local-$(hostname -s)"

# installs the ssl certificate
mv -f /etc/vmware/ssl/rui.crt /etc/vmware/ssl/rui.crt.bak
mv -f /etc/vmware/ssl/rui.key /etc/vmware/ssl/rui.key.bak
wget http://FQDN/SecureSocketLayer/rui.key -O /etc/vmware/ssl/rui.key
wget http://FQDN/SecureSocketLayer/rui.crt -O /etc/vmware/ssl/rui.crt

# restarts the services so the certificates get reloaded
services.sh restart

# Needed for configuration changes that could not be performed in esxcli
reboot
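To make it clearer how a template like this becomes a per-host file, here is a minimal sketch of the substitution step. Note that it is a variation on what run.sh does: instead of searching for the exact placeholder values, it replaces whatever stands after IPADDR= and HOSTNAME= at the start of a line. The function name, the template content and the host values are made up for the example.

```shell
#!/bin/bash
# Render a per-host kickstart file from a template on stdout.
# Arguments: template file, IP address, fully qualified host name.
# Assumes the values contain no "/" (true for IPs and host names),
# since "/" is used as the sed delimiter.
render_kickstart() {
    local template="$1" ip="$2" fqdn="$3"
    sed -e "s/^IPADDR=.*/IPADDR=\"${ip}\"/" \
        -e "s/^HOSTNAME=.*/HOSTNAME=${fqdn}/" "$template"
}

# Tiny stand-in template (not the full kickstart file).
demo_template=$(mktemp)
cat > "$demo_template" <<'EOF'
IPADDR="1.5.1.20"
NETMASK="255.255.255.0"
HOSTNAME=FQDN
EOF

render_kickstart "$demo_template" "10.0.0.15" "ESX-ENC01.example.local"
```

Matching on the variable name rather than on the placeholder value means the template can be changed without having to touch the script, which is the main reason I would consider this variant.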
Operational procedures
I’ve been through a lot of configs and setup, but lastly I will show you how to create a custom build of ESXi that fits the needs of your exact install, and how to update the PXE server with a new ESXi image.
Image builder
The “script” below is for ESXi 5.1 and HP blades, and I have made it as generic as possible for ease of management. The first two lines add VMware’s and HP’s ESXi depots.
The $ESXi variable holds the latest 5.1 image profile
$Profile is a writable copy of $ESXi, which we will use to customize the image
The next lines first install all HP packages for 5.0, then remove all 5.0 packages that also exist for 5.1, and lastly install the 5.1 packages. For some reason HP hasn’t felt like updating the version numbers to match the different builds.
The last two lines export the custom image, first to ISO and then to zip (bundle). The bundle can be loaded again if changes need to be made, just like the online depots.
Add-EsxSoftwareDepot https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml
Add-EsxSoftwareDepot http://vibsdepot.hp.com/hpq/latest/index-drv.xml
$ESXi = (Get-EsxImageProfile -Name "ESXi-5.1.*" | where {$_.name -match (Get-Date -format yyyy) -and $_.Name -notmatch "s-standard" -and $_.Name -notmatch "no-tools"} | Sort Name -Descending)[0]
$Profile = New-EsxImageProfile -CloneProfile $ESXi -Name ($ESXi.Name+'-CUSTOM') -Vendor $ESXi.Vendor
Add-EsxSoftwarePackage -SoftwarePackage (Get-EsxSoftwarePackage | sort Vendor,name | where {$_.vendor -ne "vmware" -and $_.version -match "500"}) -ImageProfile $Profile
(Get-EsxSoftwarePackage | sort Vendor,name | where {$_.vendor -ne "vmware" -and $_.version -match "510"}) | ForEach-Object {Remove-EsxSoftwarePackage -SoftwarePackage $_.Name -ImageProfile $Profile}
Add-EsxSoftwarePackage -SoftwarePackage (Get-EsxSoftwarePackage | sort Vendor,name | where {$_.vendor -ne "vmware" -and $_.version -match "510"}) -ImageProfile $Profile
Export-EsxImageProfile -ImageProfile $Profile -ExportToISO -FilePath ('c:\temp\'+$Profile.name+'.iso')
Export-EsxImageProfile -ImageProfile $Profile -ExportToBundle -FilePath ('c:\temp\'+$Profile.name+'.zip')
Update pxe image
The next part covers the small changes that need to be made for a new image to work with the PXE server. As there are comments below, I’m not going to elaborate any further.
1. Unpack the ISO and transfer the files to the PXE server. The files are to be placed in a folder in the following directory:
   /media/pxeserver/tftpboot/esxi/
2. Change all the filenames to lowercase:
   find /media/pxeserver/tftpboot/esxi/PATH/ -depth -exec rename 's/(.*)\/([^\/]*)/$1\/\L$2/' {} \;
3. Edit the boot.cfg file.
   The line below removes all "/" (edited in place; appending the output to the same file would only add a second copy of its content):
   sed -i 's/\///g' /media/pxeserver/tftpboot/esxi/PATH/boot.cfg
   The line below adds the prefix to boot.cfg:
   echo "prefix=/esxi/PATH/" >> /media/pxeserver/tftpboot/esxi/PATH/boot.cfg
4. Make the files executable:
   chmod +x /media/pxeserver/tftpboot/esxi/PATH/*
5. Edit run.sh to work with the new image (edit the current menu entry or create a new one).
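Steps 2 and 3 above can be wrapped into one small function, which makes them easier to repeat for every new image. The following is a sketch under a few assumptions: the function name is made up, it only lowercases the files in the top level of the image folder, and the example runs against a throwaway directory with a two-line stand-in for boot.cfg.

```shell
#!/bin/bash
# Prepare an unpacked ESXi image folder for PXE boot:
# lowercase the file names, strip "/" from boot.cfg, append the prefix line.
# Arguments: image directory, prefix as seen from the TFTP root.
prepare_esxi_image() {
    local dir="$1" prefix="$2" f base lower
    for f in "$dir"/*; do
        [ -e "$f" ] || continue
        base=$(basename "$f")
        lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
        [ "$base" = "$lower" ] || mv "$f" "$dir/$lower"
    done
    # Strip the slashes first, then add the prefix so it keeps its own slashes.
    sed 's/\///g' "$dir/boot.cfg" > "$dir/boot.cfg.tmp"
    mv "$dir/boot.cfg.tmp" "$dir/boot.cfg"
    echo "prefix=$prefix" >> "$dir/boot.cfg"
}

# Example with a scratch directory and a stand-in boot.cfg.
demo_dir=$(mktemp -d)
printf 'kernel=/tboot.b00\nmodules=/b.b00 --- /useropts.gz\n' > "$demo_dir/BOOT.CFG"
prepare_esxi_image "$demo_dir" "/esxi/demo/"
cat "$demo_dir/boot.cfg"
```

The order inside the function matters: if the prefix line were added before the slashes are stripped, the prefix itself would lose its slashes too.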
Wrap up
This post became somewhat longer than expected and also took somewhat longer to finish than I had planned. I just looked at the revisions of this post: I first started writing it on the 10th of December 2013, and I hope it doesn’t show. More importantly, a lot has changed since this solution was created. I still find it a good option, but I also see other good options which can make for a more complete solution. Which brings me to the things a PXE solution can’t do, which in short is everything that has to do with vCenter “services”, such as vDS, AD integration etc. These things need to be handled some other way, which calls for a better approach; that approach could be vRO. But if vCenter “services” are not needed, things like vSwitch configuration can easily be part of the kickstart, which would make the installation complete.
I don’t think I have more to say at this point, but if you got this far, thank you for reading.