Python performance optimization techniques

If you want to optimize the performance of your Python script, you first need to be able to analyze it. The best source of information I found on the web is the PerformanceTips page on the Python wiki. We are going to describe two types of performance analysis in Python. The first uses a stopwatch to time the repeated execution of a specific piece of code, so that you can change or replace the code and see whether the change improved performance. The second enables a profiler that tracks every function call the code makes. These calls can then be related, aggregated and visually represented, which allows you to identify which part of your code takes the most time. We will show how to do both, starting with the stopwatch type.

Simple stopwatch profiling

You can apply basic stopwatch-style profiling using the “timeit” module. It reports the total time, in seconds, that a snippet of code takes to execute a specified number of times; the default is one million. You can specify a setup statement that is executed once and not counted in the execution time, the actual statement to time, and the number of times it needs to be executed. You can also specify the timer function if you do not want wall clock time but, for example, want to measure CPU time.

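import timeit

# Sample input (assumed; the original list is not shown): eleven short strings.
stringParts = ['part%d' % i for i in range(11)]

def lazyMethod(stringParts):
  # Build the string by repeated concatenation.
  fullString = ''
  for part in stringParts:
    fullString += part
  return fullString

def formatMethod(stringParts):
  # Build the string with a single format operation.
  fullString = "%s%s%s%s%s%s%s%s%s%s%s" % (stringParts[0], stringParts[1],
    stringParts[2], stringParts[3],
    stringParts[4], stringParts[5],
    stringParts[6], stringParts[7],
    stringParts[8], stringParts[9],
    stringParts[10])
  return fullString

def joinMethod(stringParts):
  # Build the string with str.join.
  return ''.join(stringParts)

# The setup statement imports the function and the sample input; it is not timed.
print('Join Time: ' + str(timeit.timeit('joinMethod(stringParts)', 'from __main__ import joinMethod, stringParts')))
print('Format Time: ' + str(timeit.timeit('formatMethod(stringParts)', 'from __main__ import formatMethod, stringParts')))
print('Lazy Time: ' + str(timeit.timeit('lazyMethod(stringParts)', 'from __main__ import lazyMethod, stringParts')))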

The output should be something like this:

Join Time: 0.358200073242
Format Time: 0.646985054016
Lazy Time: 0.792141914368

This shows us that the join method is more efficient in this specific case.
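The setup statement, repetition count and timer object described earlier can be sketched as follows. This is only an illustration and assumes Python 3.3 or later, where time.process_time is available; the "parts" list and the count of 100000 are arbitrary choices made for the example, not values from the measurements above.

```python
import time
import timeit

# Hypothetical sample input, created once by the setup statement (not timed).
setup = "parts = [str(i) for i in range(11)]"
stmt = "''.join(parts)"

# number= controls how many times the statement runs (default: 1,000,000).
wall = timeit.timeit(stmt, setup=setup, number=100000)

# timer= swaps the clock; time.process_time counts CPU time of this process.
cpu = timeit.timeit(stmt, setup=setup, number=100000, timer=time.process_time)

# repeat() runs the whole measurement several times; the minimum is usually
# the run least disturbed by other activity on the machine.
best = min(timeit.repeat(stmt, setup=setup, number=100000, repeat=3))

print("wall clock: %.4f s" % wall)
print("CPU time:   %.4f s" % cpu)
print("best of 3:  %.4f s" % best)
```

All three values are totals in seconds for the given number of executions, so they are only comparable when the same number is used.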

Advanced profiling using cProfile

To identify what takes how much time within an application, we first need an application. Let us profile a very simple “Hello World” web application written in Flask; its code is below. We replaced the line that normally starts the server with “app.test_client().get('/')” so that the application handles exactly one request and then exits.

from flask import Flask

app = Flask(__name__)

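@app.route("/")
def hello():
  return "Hello World!"

if __name__ == "__main__":
  # Issue a single request through the test client instead of starting the server.
  app.test_client().get('/')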
Running the application with the profiler enabled can be done from the command line, so there is no need to change the code. The command is as follows, with the path of the script to profile appended as the last argument:

python -m cProfile -o flask.profile
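The same kind of dump can also be produced and inspected from within Python using the standard-library cProfile and pstats modules. The following is a minimal sketch, not the setup used for the Flask example: the work() function and the work.profile file name are hypothetical stand-ins so that the example runs without Flask installed.

```python
import cProfile
import os
import pstats
import re
import tempfile


def work():
    # Hypothetical workload: compile and apply a few regular expressions.
    total = 0
    for i in range(200):
        pattern = re.compile("a{%d}" % (i % 5 + 1))
        total += len(pattern.findall("a" * 50))
    return total


# Collect statistics for a single call to work().
profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Write the same kind of dump file that "-o" produces on the command line.
dump_path = os.path.join(tempfile.mkdtemp(), "work.profile")
profiler.dump_stats(dump_path)

# Load the dump and print the five entries with the highest cumulative time.
stats = pstats.Stats(dump_path)
stats.sort_stats("cumulative").print_stats(5)
```

Sorting by cumulative time puts the functions that dominate the run, including everything they call, at the top of the listing.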

Visualizing cProfile results

“RunSnakeRun is a GUI tool by Mike Fletcher which visualizes profile dumps from cProfile using square maps. Function/method calls may be sorted according to various criteria, and source code may be displayed alongside the visualization and call statistics.” – source: Python PerformanceTips

We now analyze the generated “flask.profile” file by running the “runsnake” tool with the following command:

runsnake flask.profile

It gave us some really nice insights:


Picture 1: The visual output of RunSnakeRun


Picture 2: The list of function calls shows 77 calls to the regex library, accounting for only 0.5 ms of the 79 ms.


Picture 3: A map showing all calls; the rectangle in the upper right is the test client running.

We showed you how to profile your Python application; now go practice and optimize your code. One piece of advice, though: go for low-hanging fruit only, because over-optimized code is not Pythonic.

