A while ago, Jeff Atwood talked about actual performance and perceived performance, and funnily enough, just after reading his post, I stumbled upon a similarly deceptive performance indicator myself. It is really unbelievable how deceiving performance numbers can be, which seems to be a result of the bad sense of time we human beings have. I believe that not only do most of us suck at estimating how much time a certain task will take, but we also suck at estimating how long a task actually took.
It all started with a new parallel version of a certain algorithm we were working on. A very important part of my daily work is optimization, and to improve the speed of our software we use different forms of concurrency. Lately, we had been parallelizing this algorithm using multithreading. To compare the performance of a single thread against multiple threads, we used an existing test application. This test executes the algorithm on a given input and reports the time it spent running the algorithm. As soon as we had a preliminary version of the new multithreaded algorithm working, we started testing whether we could improve performance by using both cores of a dual-core machine.
We ran the test on a large input, using only a single thread on a dual-core machine. Reported time: 16:38. We ran the test on the same input, using two threads on the same machine. Reported time: 16:41. Two threads were taking more time than one. Okay, a bit strange, but it is possible, considering there might be more overhead as a result of locking. We had introduced some mutexes where necessary, and the OS does its own locking too, for example for memory management. Anyway, nothing seemed to be executed in parallel, otherwise we would have seen a decrease in time.
We reconsidered the mutexes and reran the test. Reported times were again around 16 minutes, and the multithreaded version was again the slower one. We tried a different input: same result, with times around 20 minutes. It was only when we started investigating the multithreaded algorithm’s performance on smaller inputs that the actual problem started surfacing.
It was in one of those cases, where the test reported around 2 minutes for the dual-threaded run, that my co-worker started distrusting the reported time. “Hey, that didn’t feel like two minutes had already passed!”, he cried. “What is this test reporting, CPU time or wall clock time?” Rerunning the test with a good old stopwatch gave us the final answer: the test application reported CPU time, which adds up the time spent by every thread. That means we had already almost doubled the performance by using two threads with the new algorithm, compared to the single-threaded version. We were simply deceived by the reported times; the very first two-threaded test reported 16:41, but it actually took only about 8:20. We couldn’t believe our eyes, so we ran it again and confirmed it with our stopwatch approach.
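To make the difference between the two clocks concrete, here is a minimal Python sketch. It is not our actual test application: `busy_work`, `measure` and `to_seconds` are hypothetical names, and the hashing call is only a stand-in for the real algorithm. It assumes CPython, where (as far as I know) `hashlib.pbkdf2_hmac` releases the global interpreter lock while it computes, so two threads can really occupy two cores.

```python
import hashlib
import threading
import time


def busy_work(rounds=200_000):
    """CPU-bound stand-in for the real algorithm (hypothetical workload)."""
    # Assumption: in CPython this call releases the GIL while hashing,
    # so several threads can run it on different cores at the same time.
    hashlib.pbkdf2_hmac("sha256", b"input", b"salt", rounds)


def measure(num_threads):
    """Run busy_work on num_threads threads; return (wall, cpu) in seconds."""
    threads = [threading.Thread(target=busy_work) for _ in range(num_threads)]
    wall_start = time.perf_counter()  # wall-clock time, like a stopwatch
    cpu_start = time.process_time()   # CPU time of the whole process, all threads
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def to_seconds(mmss):
    """Convert a 'minutes:seconds' string such as '16:41' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


if __name__ == "__main__":
    for n in (1, 2):
        wall, cpu = measure(n)
        print(f"{n} thread(s): wall {wall:.2f} s, CPU {cpu:.2f} s")

    # Speedup from the numbers in this post: 16:38 for one thread,
    # about 8:20 on the stopwatch for two threads.
    speedup = to_seconds("16:38") / to_seconds("8:20")
    print(f"speedup: {speedup:.2f}x")
```

On a machine with at least two free cores, I would expect the two-thread run to report a CPU time of roughly twice its wall-clock time, which is exactly the trap we fell into. The last lines redo the arithmetic for our own numbers, under the assumption that the single-threaded run was CPU-bound so that its reported 16:38 is close to its wall-clock time: 998 seconds against 500 seconds, or almost a factor of two.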
So yes, I too believe that “performance is determined largely by the user’s perception rather than actual wall-clock time.”