The deeper the technology stack runs, the more administrators and engineers struggle to keep everything in good working order. VMWare offers a wonderful suite of benefits, but virtualization opens up a real can of worms, and with the host OS trying to be clever some really frustrating issues can appear.
At some point in the last few months, a VM on my laptop (an Ubuntu Intrepid Ibex image for some contract work I’m doing) started behaving really poorly. I wasn’t sure whether it was an issue on the host OS (which went from Jaunty to Koala), an issue on the guest OS (on which I installed X), or just a matter of time marching on (fragmentation).
After trying a few different fixes, I am of the opinion that my problem stemmed from CPU frequency scaling on the host OS playing havoc with VMWare. As best I can tell, this was causing the guest OS to be CPU starved, presumably due to a VMWare bug. That said, I (unscientifically) have done four total things since going from a state of fail to a state of grace…
- found a way to set the scaling governor on the host CPUs
- tweaked the VM to use only a single CPU
- defragmented the VM’s image
- disabled memory page trimming for the VM
Under Ubuntu (Karmic Koala, anyway) one can use the cpufreq-selector program to tweak the scaling governor for the host CPUs. For me, this meant running the following two commands on my dual CPU system…
cpufreq-selector -c 0 -g performance
cpufreq-selector -c 1 -g performance
It’s easy enough to validate that things got set…
If you need a list of possible values for the scaling governor…
To make VMWare not suck, I changed the values from “ondemand” to “performance”. The other possibilities were “conservative”, “userspace” and “powersave”, none of which I have yet explored. The downside to “performance” of course is that your laptop is going to blow through its battery. Of course, if you’re plugged into the wall, you probably don’t care as much. You’re just destroying the planet.
At some point before I found the CPU scaling frequency fix I had momentarily convinced myself that the other three tweaks had fixed things. I have the feeling that I was fooled as the result of my laptop’s power management software fiddling with the scaling governors; as confirmation, I note that other folks on the Internet state they have resorted to having scripts that set the governors after they return from a hibernate/suspend operation.
I am happy to have (apparently) fixed my problem, but I’m left with a nagging sensation of our technology stack gradually getting out of control. It has so many moving parts, and in this feature hungry world not nearly enough time gets allocated to robustness and clarity.
It’s digital turtles all the way down.