[Python] Fast Loop
Apr. 9, 2021

In this video, James Murphy compares the performance of several ways to write a loop in Python. He points out that Python is a fairly slow programming language on its own, and he reaches a thorough conclusion about what the fastest way to loop might be.
To summarize, each version of the loop sums the numbers from 0 to n-1 (n=100_000_000). The table below shows the running time of each approach, measured with timeit.
| loop | time (s) |
|---|---|
| while | 14.7800181 |
| for | 10.090439 |
| sum range | 6.3937911000000001 |
| numpy sum | 0.73299640000000047 |
| math sum | 2.80000000042993634e-06 |
We can see that the while loop has the worst performance, since every step runs in pure Python, and that performance improves as more of the work is handed to functions that are implemented in a faster language such as C and optimized. James concludes that, where possible, we should use Python's built-in functions, or a computing module built especially for speed such as numpy. Ultimately, the best option is to avoid the loop altogether. This helps not only because it sidesteps the overhead of the language itself, but also because of the complexity of a loop: a loop over n items does work proportional to n, while a direct formula does not.
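The comparison can be sketched roughly as follows. This is not the code from the video: the function names, the smaller `n`, and the closed-form formula used for the "math sum" row are my own assumptions, and the numpy variant is left out so the sketch runs with the standard library only.

```python
import timeit


def while_loop(n):
    # Pure Python: the condition, the addition and the increment
    # are all executed by the interpreter on every iteration.
    total = 0
    i = 0
    while i < n:
        total += i
        i += 1
    return total


def for_loop(n):
    # range() handles the counting, so less work is done in Python.
    total = 0
    for i in range(n):
        total += i
    return total


def sum_range(n):
    # The built-in sum() runs the whole loop in C.
    return sum(range(n))


def math_sum(n):
    # Assumed closed form for 0 + 1 + ... + (n - 1); no loop at all.
    return (n - 1) * n // 2


def benchmark(n):
    # Time a single call of each variant and print the result in seconds.
    for fn in (while_loop, for_loop, sum_range, math_sum):
        elapsed = timeit.timeit(lambda: fn(n), number=1)
        print(f"{fn.__name__:>10}: {elapsed:.6f} s")


if __name__ == "__main__":
    # Much smaller than the 100_000_000 used in the video, to keep it quick.
    benchmark(1_000_000)
```

The absolute numbers will differ from the table depending on the machine and Python version, but the ordering should be similar.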
There are some other implicit lessons that I took from this video. First of all, good profiling tools are very important when optimizing, and even something as simple as timeit can be very useful at times. Secondly, there is the question of where to look first when you set out to optimize your code. Loops often account for a large share of a program's work, so they are a natural starting point. Looking at them will lead you to think about how to optimize each loop, hopefully steer you away from using a loop at all, or make you rethink the data structure that you are using. Understanding the complexity of data structure operations will help you choose an appropriate data structure, which in turn reduces the complexity of your loop.
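As a small illustration of that last point (my own example, not something from the video), here is a sketch where the loop stays the same and only the data structure changes. The function names and the sample data are hypothetical. A membership test on a list scans the elements one by one, while a membership test on a set is a hash lookup that takes roughly constant time on average.

```python
import timeit


def count_common_with_list(queries, known):
    # Each "q in known_list" scans the list, so the loop as a whole
    # does work proportional to len(queries) * len(known).
    known_list = list(known)
    count = 0
    for q in queries:
        if q in known_list:
            count += 1
    return count


def count_common_with_set(queries, known):
    # Each "q in known_set" is a hash lookup, so the loop as a whole
    # does work roughly proportional to len(queries) + len(known).
    known_set = set(known)
    count = 0
    for q in queries:
        if q in known_set:
            count += 1
    return count


if __name__ == "__main__":
    known = range(0, 4_000, 2)   # even numbers below 4000
    queries = range(2_000)       # half of these are in `known`

    for fn in (count_common_with_list, count_common_with_set):
        elapsed = timeit.timeit(lambda: fn(queries, known), number=1)
        print(f"{fn.__name__}: {elapsed:.6f} s")
```

Both functions return the same count; only the cost of each iteration differs.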