I've been working on applications for my PC that use the Wii remote lately. To get going, I found an open source library called "cwiid" with a Python wrapper, "python-cwiid". Yesterday, while coding, I kept running into a peculiar problem. Every time I ran my application (which merely connected to the nearest Wii remote and started receiving accelerometer and key press information from it), my laptop got extremely hot and shut down.
I quickly realized that this had something to do with the CPU being pushed into thermal cutoff. Interestingly, an application I had written earlier came in handy. My TempMonitor application let me compare CPU usage and temperature visually against time. So I decided to see for myself how brutally my CPU was being ravaged by the Wii application I had made. The results were quite astounding. My application pushed my CPU up to 100 degrees centigrade in under 15 minutes (coupled with a few other apps, like games or presentations, for which the remote was supposed to act as a new interface) and caused it to shut down.
So I took a deeper look at my code. My app was essentially a wrapper for the "Wiimote" class provided by python-cwiid. It allowed me to run a thread that received data from the Wii remote and invoked a callback routine to handle the processing. This was essential, as it was the only way to abstract the nitty-gritty of the Wii remote away from the actual application. After an hour or two of breaking my head, it hit me like a ton of bricks.
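For readers who want a picture of that kind of wrapper before the punchline, here is a minimal sketch in Python 3. It is not the original code and it does not touch python-cwiid at all: `PollingWrapper`, `read_fn`, `callback` and `collect_readings` are hypothetical names, and `read_fn` simply stands in for whatever call returns the remote's latest state.

```python
import threading
import time


class PollingWrapper(threading.Thread):
    """Polls a data source in the background and hands each reading to a callback."""

    def __init__(self, read_fn, callback, interval=0.01):
        super().__init__()
        self._read_fn = read_fn      # hypothetical: returns the latest reading
        self._callback = callback    # application code that processes a reading
        self._interval = interval    # seconds to pause between polls
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            self._callback(self._read_fn())
            # Event.wait sleeps, but returns early once stop() is called
            self._stop_event.wait(self._interval)

    def stop(self):
        self._stop_event.set()


def collect_readings(read_fn, duration=0.1):
    """Run the wrapper for `duration` seconds and return what the callback saw."""
    seen = []
    wrapper = PollingWrapper(read_fn, seen.append)
    wrapper.start()
    time.sleep(duration)
    wrapper.stop()
    wrapper.join()   # wait for the thread to finish before returning
    return seen
```

The application only ever sees the callback, which is the abstraction described above. Note that `collect_readings` stops and joins the thread before it returns.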
My main routine started the thread and died, making an "orphaned zombie". (Now this might sound strange, since I am borrowing both terms loosely: an orphan thread is one whose parent thread has died, and a zombie process is a defunct or redundant entry in the process table. In our case, the chain of calls is Python interpreter --> main --> background thread. Here, the main routine dies, but the interpreter doesn't. So the background thread has lost its parent but still has a grandparent, viz. the interpreter. And technically, since the application is over after the death of the main routine, the Python interpreter is a redundant entry that continues to live on in the process table. Hence the nomenclature!) I quickly added a semaphore to my main routine that kept it alive until the background task was complete, and "peace was once again restored to the empire". But then I decided to put my theory to the test. I wrote a script that mimicked the structure of my Wii application.
It had a background task with some heavy processing (well, not as heavy as my app, but enough to put a little squeeze on the core), and in order to test its effect on the CPU load and temperature, I created three kinds of main routines:
- the first one died after starting the background task, orphaning it
- the second one had a semaphore but was a bigger time hog than the background task (i.e. it never slept, so it did not share time fairly with the background task)
- the third one had a semaphore that made the main routine doze every once in a while and provide more time to the background task
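The third variant, a main routine that stays alive and dozes while the worker runs, can be sketched roughly as follows in Python 3. This is an illustration under assumptions, not the test script itself: `Worker`, `main_proper`, `target` and `nap` are made-up names, and the worker only counts instead of doing real processing.

```python
import threading
import time


class Worker(threading.Thread):
    """Stand-in for the background task: counts until asked to stop."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            self.count += 1
            time.sleep(0.001)   # the worker also yields between iterations

    def stop(self):
        self._stop_event.set()


def main_proper(target=20, nap=0.01):
    """Wait for the worker to reach `target`, napping between checks."""
    worker = Worker()
    worker.start()
    while worker.count < target:
        time.sleep(nap)   # without this nap the loop would spin on the CPU
    worker.stop()
    worker.join()         # do not return while the thread is still running
    return worker.count
```

Removing the `time.sleep(nap)` line turns this into the second variant, and removing the whole wait loop along with `stop()` and `join()` gives the first.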
The processing was not too heavy, as I wanted to put just enough load to cause a steady rise in temperature. The background task's job was to open a JPEG and read out every byte in the file (values from 0 to 255). Now the question to ask was, "what should the threshold temperature be?" I wanted each of the cases to eventually break down, and yet give comprehensive results on the effect a simple "sleep statement" can have. The normal operating range of most cores is between 55 and 90 degrees centigrade. The average load on my computer is between 65% and 70% (which is higher than for most regular computer users), and it usually runs between 60 C and 70 C. So after a little deliberation, I chose 70 degrees centigrade as my threshold. The load that the application was going to put on my 2.8 GHz CPU was going to be well above ordinary and, more importantly, it was going to be persistent. (Yes, that's something we normally don't face. Most applications make asynchronous demands that are resource heavy, but only for brief periods of time. Therefore, we don't really feel the CPU running out of steam or heating up too much. Persistent applications are the best kind of load to make your CPU sweat!)
Here is the script:
import sys
import time
from threading import Thread
#---------------------------------------------------------------------------
class backgroundTask(Thread):
    """ simple background task """
    def __init__(self):
        """ simple thread constructor """
        Thread.__init__(self)
        # named stopRequested so that it does not shadow the stop() method
        self.stopRequested = False
        self.cnt = 0
    #------------------------------------------------------------------
    def bizarreProcessing(self, num):
        """ some heavy duty processing for testing purposes """
        # binary mode, since a JPEG is not text
        fileRead = open('APPLE.jpg', 'rb')
        image = fileRead.read()
        # bytearray yields integers 0-255 on both Python 2 and Python 3
        for eachbyte in bytearray(image):
            print(eachbyte)
        fileRead.close()
        print("_"*30)
    #------------------------------------------------------------------
    def run(self):
        """ worker routine for the thread """
        print('starting background task')
        while not self.stopRequested:
            self.bizarreProcessing(self.cnt)
            self.cnt += 1
            time.sleep(0.1)
        print('exiting background task')
    #------------------------------------------------------------------
    def stop(self):
        """ This is the kill switch """
        self.stopRequested = True
    #------------------------------------------------------------------
    def getCount(self): return self.cnt
#---------------------------------------------------------------------------
def mainZombie():
    bgTaskZombie = backgroundTask()
    bgTaskZombie.start()
    #now the parent thread, i.e. this function will die, but the
    #task bgTaskZombie is still going to be running
    print('\nMainZombie is dying now')
#---------------------------------------------------------------------------
def mainHog():
    bgTaskHog = backgroundTask()
    bgTaskHog.start()
    while bgTaskHog.getCount() < 20000: pass
    #the wait loop above (the "semaphore") keeps the parent thread from
    #dying before the child thread, therefore not making it a zombie, but
    #no sleep in the loop causes the main routine to hog cpu
    bgTaskHog.stop()
    print('\nMainHog is dying now')
#---------------------------------------------------------------------------
def mainProper():
    bgTaskProper = backgroundTask()
    bgTaskProper.start()
    while bgTaskProper.getCount() < 20000: time.sleep(1)
    #same as the mainHog routine, only with a little nap for the main
    #routine. This causes the main routine to share time and stay alive
    bgTaskProper.stop()
    print('\nMainProper is dying now')
#---------------------------------------------------------------------------
if __name__ == '__main__':
    arg = int(sys.argv[1])
    if arg == 1: mainZombie()
    elif arg == 2: mainHog()
    elif arg == 3: mainProper()
The results are shown below. My temperature monitor application uses the Google Charts API to generate graphs of CPU load (blue trace) and CPU temperature (green trace) against time. It records the changes in these two quantities over the last 400 seconds. I ran each of the main routines and combined the images for a clear comparison. I ran each main routine more than five times to make sure my readings were consistent. Here are the plots for the last run:

As you can see, the most properly structured application takes the longest to reach the threshold, while the one with the zombie hanging around touches the threshold temperature in no time.
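The effect of the sleep statement can also be checked without a temperature monitor, by measuring how much CPU time a wait loop consumes. The sketch below is written for Python 3 and is only an illustration: `cpu_seconds_while_waiting` and its parameters are made-up names, and the durations are arbitrary.

```python
import time


def cpu_seconds_while_waiting(wall_seconds, nap=None):
    """Wait for `wall_seconds` and return the CPU time this process used."""
    cpu_start = time.process_time()
    deadline = time.monotonic() + wall_seconds
    while time.monotonic() < deadline:
        if nap is not None:
            time.sleep(nap)   # sleeping hands the CPU back to the scheduler
    return time.process_time() - cpu_start


busy = cpu_seconds_while_waiting(0.5)               # spins like mainHog
napping = cpu_seconds_while_waiting(0.5, nap=0.05)  # dozes like mainProper
print("busy wait used %.3f s of CPU, napping wait used %.3f s" % (busy, napping))
```

Both calls wait for the same wall-clock time, but the busy loop keeps a core occupied for the whole wait while the napping loop spends most of it asleep, so it uses far less CPU time.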
Problems like this may go undetected and surface in an untimely fashion to make your life a living hell. So structure your threads well. Make sure all the babies have their parents and everyone gets enough sleep.