Late last year I finally began migrating some old scripts from PHP to Python, and while I was happy to have my PHP days behind me, I noticed the script execution speed was disappointing. On average, a Python CGI script would run 20-80% slower than an equivalent PHP script. At first I chalked it up to slower libraries, but even basic scripts that didn’t rely on a database or anything fancy still seemed to incur a performance hit.
Yesterday I happened to come across mention of WSGI, which is essentially a Python-specific replacement for CGI. I realized the overhead of CGI probably explained why my Python scripts were slower than PHP. So I wanted to give WSGI a spin and see if it could help.
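To get a rough feel for where that overhead comes from, here is a minimal sketch that compares starting a fresh Python interpreter for every "request" (roughly what CGI does) with calling an already-loaded function in a long-lived process (roughly what WSGI allows). The handle_request function and the run count are made up for illustration, and the numbers it prints will vary by machine; it is not a benchmark of Apache itself.

```python
import subprocess
import sys
import time

RUNS = 5  # arbitrary; kept small so the sketch finishes quickly


def handle_request():
    # Hypothetical stand-in for the body of a script
    return "hello"


def time_fresh_interpreter(runs=RUNS):
    """Average seconds per request when each one starts a new interpreter (CGI-style)."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run([sys.executable, "-c", "print('hello')"],
                       check=True, capture_output=True)
    return (time.perf_counter() - start) / runs


def time_in_process(runs=RUNS):
    """Average seconds per request when the code is already loaded (WSGI-style)."""
    start = time.perf_counter()
    for _ in range(runs):
        handle_request()
    return (time.perf_counter() - start) / runs


cgi_style = time_fresh_interpreter()
wsgi_style = time_in_process()
print(f"fresh interpreter per request: {cgi_style * 1000:.2f} ms")
print(f"in-process function call:      {wsgi_style * 1000:.4f} ms")
```

The gap only reflects interpreter startup; a real CGI script also pays for re-importing its libraries on every hit, so the difference in practice may be larger.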
Like PHP, WSGI support in Apache comes from a module (mod_wsgi) that is not included in many pre-packaged versions. So the first step is to install it.
On Debian/Ubuntu:
sudo apt-get install libapache2-mod-wsgi-py3
The install process should auto-activate the module, which you can confirm by checking for its symlinks:
cd /etc/apache2/mods-enabled/
ls -la wsgi*
lrwxrwxrwx 1 root root 27 Mar 23 22:13 wsgi.conf -> ../mods-available/wsgi.conf
lrwxrwxrwx 1 root root 27 Mar 23 22:13 wsgi.load -> ../mods-available/wsgi.load
On FreeBSD, the module does not get auto-activated and must be loaded via a config file:
sudo pkg install ap24-py37-mod_wsgi
# Create /usr/local/etc/apache24/Includes/wsgi.conf
# or similar, and add this line:
LoadModule wsgi_module libexec/apache24/mod_wsgi.so
Like CGI, the directory with the WSGI script will need special permissions. As a security best practice, it’s a good idea to have scripts located outside of any DocumentRoot, so the scripts can’t accidentally get served as plain files.
<Directory "/var/www/scripts">
Require all granted
</Directory>
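As for the WSGI script itself, it’s similar to AWS Lambda in that the server calls a pre-defined function, which mod_wsgi expects to be named application by default. However, it returns an iterable of bytes objects (typically a list) rather than a dictionary, and it sends the status and headers through the start_response callback. Here’s a simple one that will just spit out the host, path, and query string as JSON: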
import json
import traceback


def application(environ, start_response):
    try:
        request = {
            'host': environ.get('HTTP_HOST', 'localhost'),
            'path': environ.get('REQUEST_URI', '/'),
            'query_string': {}
        }
        if '?' in request['path']:
            # Split on the first '?' only, in case the query string contains another one
            request['path'], query_string = request['path'].split('?', 1)
            for pair in query_string.split('&'):
                if not pair:
                    continue
                # partition() copes with a missing '=' and with '=' inside the value
                key, _, value = pair.partition('=')
                request['query_string'][key] = value
        # Encode first so Content-Length counts bytes rather than characters
        output = json.dumps(request, sort_keys=True, indent=2).encode('utf-8')
        response_headers = [
            ('Content-Type', 'application/json'),
            ('Content-Length', str(len(output))),
            ('X-Backend-Server', 'Apache + mod_wsgi')
        ]
        start_response('200 OK', response_headers)
        return [output]
    except Exception:
        response_headers = [('Content-Type', 'text/plain')]
        start_response('500 Internal Server Error', response_headers)
        error = traceback.format_exc()
        return [error.encode('utf-8')]
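A WSGI callable can also be exercised without Apache, which is handy for a quick check before touching the server config. The sketch below is a simplified variation, not the script above: it assumes the standard PATH_INFO and QUERY_STRING environ keys rather than REQUEST_URI, uses urllib.parse.parse_qsl for the query string, and relies on wsgiref.util.setup_testing_defaults to fill in a fake environ. The call_app helper and the sample path and query are made up for illustration.

```python
import json
from urllib.parse import parse_qsl
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Simplified variation that reads the standard WSGI environ keys
    request = {
        'host': environ.get('HTTP_HOST', 'localhost'),
        'path': environ.get('PATH_INFO', '/'),
        'query_string': dict(parse_qsl(environ.get('QUERY_STRING', ''),
                                       keep_blank_values=True)),
    }
    body = json.dumps(request, sort_keys=True, indent=2).encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'application/json'),
        ('Content-Length', str(len(body))),
    ])
    return [body]


def call_app(app, path='/', query=''):
    """Hypothetical helper: call a WSGI app in-process, return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a WSGI server would provide
    environ['PATH_INFO'] = path
    environ['QUERY_STRING'] = query
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app sends instead of writing it to a socket
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(application, '/myapp', 'a=1&b=two')
print(status)
print(body.decode('utf-8'))
```

Note that dict(parse_qsl(...)) keeps only the last value when a key repeats; parse_qs would be the choice if repeated keys matter.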
The last step is to route certain paths to the WSGI script. This is done in the Apache VirtualHost configuration:
WSGIPythonPath /var/www/scripts
<VirtualHost *:80>
    ServerName python.mydomain.com
    ServerAdmin nobody@mydomain.com
    DocumentRoot /home/www/html
    Header set Access-Control-Allow-Origin: "*"
    Header set Access-Control-Allow-Methods: "*"
    Header set Access-Control-Allow-Headers: "Origin, X-Requested-With, Content-Type, Accept, Authorization"
    WSGIScriptAlias /myapp /var/www/scripts/myapp.wsgi
</VirtualHost>
Upon migrating a test URL from CGI to WSGI, the page load time dropped significantly:

The improvement comes from a 50-90% reduction in “wait” and “receive” times, as measured by ThousandEyes:

Next I’d like to look at more advanced Python web frameworks like Flask, Bottle, WheezyWeb and Tornado. Django is of course a popular option too, but I know from experience it won’t be the fastest. Flask isn’t the fastest either, but it is the framework for Google SAE, which I plan to learn after mastering AWS Lambda.