If you want to optimize the performance of your Python script, you first need to be able to analyze it. The best source of information I found on the web is the PerformanceTips page on the Python wiki. We are going to describe two types of performance analysis in Python. The first type uses a stopwatch to time the repeated execution of a specific piece of code. This allows you to change or replace the code and see whether or not the change improved the performance. The second type enables a profiler that tracks every function call the code makes. These calls can then be related, aggregated and visually represented. This type of profiling allows you to identify which part of your code is taking the most time. We will show how to do both, starting with the stopwatch type.
Simple stopwatch profiling
You can apply basic stopwatch-style profiling using the “timeit” module. It reports the total time, in seconds, that a snippet of code takes to execute a specified number of times; the default is one million. You can specify a setup statement that is executed once and is not counted in the execution time, the actual statement to time, and the number of times it needs to be executed. You can also specify the timer function if you do not want wall clock time but, for example, want to measure CPU time.
import timeit

# Sample input (an assumption, not part of the original listing):
# eleven parts, because formatMethod uses eleven %s placeholders.
stringParts = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k']

def lazyMethod(stringParts):
    # Concatenate one part at a time.
    fullString = ''
    for part in stringParts:
        fullString += part
    return fullString

def formatMethod(stringParts):
    # Fill all eleven placeholders in a single formatting operation.
    fullString = "%s%s%s%s%s%s%s%s%s%s%s" % tuple(stringParts)
    return fullString

def joinMethod(stringParts):
    return ''.join(stringParts)

# The setup statement runs once and is not included in the measured time.
setup = 'from __main__ import joinMethod, formatMethod, lazyMethod, stringParts'
print('Join Time: ' + str(timeit.timeit('joinMethod(stringParts)', setup)))
print('Format Time: ' + str(timeit.timeit('formatMethod(stringParts)', setup)))
print('Lazy Time: ' + str(timeit.timeit('lazyMethod(stringParts)', setup)))
The output should be something like this:
Join Time: 0.358200073242
Format Time: 0.646985054016
Lazy Time: 0.792141914368
This shows us that the join method is more efficient in this specific case.
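The setup statement, the repetition count and the timer described earlier can all be passed explicitly. The following is a minimal sketch, assuming Python 3.3 or later (where time.process_time is available as a CPU-time timer). The statement, the setup string and the count of 100,000 are illustrative choices and do not come from the listing above.

```python
import time
import timeit

# Setup runs once per measurement and is not included in the reported time.
SETUP = "parts = [str(i) for i in range(11)]"
# The statement that is executed `number` times.
STMT = "''.join(parts)"

# Wall clock time (the default timer), for 100,000 executions.
wall_seconds = timeit.timeit(STMT, setup=SETUP, number=100000)

# CPU time of the current process instead of wall clock time.
cpu_seconds = timeit.timeit(STMT, setup=SETUP, number=100000,
                            timer=time.process_time)

# Repeat the whole measurement three times and keep the fastest run,
# which is usually the one least disturbed by other processes.
runs = timeit.repeat(STMT, setup=SETUP, number=100000, repeat=3)
best = min(runs)

print('Wall clock: ' + str(wall_seconds))
print('CPU time: ' + str(cpu_seconds))
print('Best of 3: ' + str(best))
```

Because the data is built inside the setup string, this variant does not need to import anything from __main__. All values are totals in seconds for the given number of executions, so divide by the count to get the time per call.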
Advanced profiling using cProfile
To identify what takes how much time within an application, we first need an application. Let us profile a very simple “Hello World” web application written in Flask; its code is below. We replaced “app.run()” with “app.test_client().get('/')” so that the application handles a single request and then exits.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    # app.run()
    app.test_client().get('/')
Running the application with the profiler enabled can be done from the command line, so there is no need to change the code. The command is:
python -m cProfile -o flask.profile flaskapp.py
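A profile file like the one written by this command can also be produced and inspected from Python itself, using the standard “cProfile” and “pstats” modules. This is a minimal sketch, assuming Python 3. The function handle_request is a hypothetical stand-in for the Flask request, and the file is written to a temporary directory instead of flask.profile.

```python
import cProfile
import io
import os
import pstats
import tempfile


def handle_request():
    # Hypothetical stand-in for the work done while serving one request.
    return "".join(str(i) for i in range(1000))


# Record every function call made between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Write the raw statistics to a file, like the -o option does.
profile_path = os.path.join(tempfile.mkdtemp(), "example.profile")
profiler.dump_stats(profile_path)

# Load the file again and print the five entries with the highest
# cumulative time into a string buffer.
buffer = io.StringIO()
stats = pstats.Stats(profile_path, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that, together with everything they call, take the longest at the top of the report.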
Visualizing cProfile results
“RunSnakeRun is a GUI tool by Mike Fletcher which visualizes profile dumps from cProfile using square maps. Function/method calls may be sorted according to various criteria, and source code may be displayed alongside the visualization and call statistics.” – source: Python PerformanceTips
We now analyze the generated “flask.profile” file by running the “runsnake” tool with the profile file as its argument (runsnake flask.profile).
It gave us some really nice insights:
Picture 1: The visual output of RunSnakeRun
Picture 2: The list of function calls shows 77 calls to the regex library (re.py), accounting for only 0.5 ms of the 79 ms.
Picture 3: A map showing all calls; the rectangle in the upper right (testing.py) is the test client running.
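Per-function call counts like the ones in Picture 2 can also be read without a GUI, by restricting the “pstats” report to matching function names. This is a rough sketch, assuming Python 3. The functions parse_item and handle_batch are hypothetical examples and have nothing to do with the Flask application.

```python
import cProfile
import io
import pstats


def parse_item(text):
    # Hypothetical helper that is called once per item.
    return text.strip().upper()


def handle_batch(items):
    # Hypothetical entry point that calls the helper repeatedly.
    return [parse_item(item) for item in items]


# runcall() profiles a single function call and returns its result.
profiler = cProfile.Profile()
result = profiler.runcall(handle_batch, [" a ", " b ", " c "])

# A string passed to print_stats() is treated as a regular expression,
# and only entries whose name matches it are listed.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats("parse_item")
report = buffer.getvalue()
print(report)
```

The first column of the matching line is the number of calls, followed by the total and cumulative times in seconds.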
We showed you how to profile your Python application; now go practice and optimize your code. One piece of advice though: go for low-hanging fruit only, because over-optimized code is not Pythonic.