Let’s say you have a web server behind a load balancer that acts as a reverse proxy.  Since the load balancer is likely changing the source IP of the packets with its own IP address, it stamps the client’s IP address in the X-Forwarded-For header and then passes it along to the backend server.  Assuming the web server has been configured to log this header instead of client IP, a typical log entry will look like this:

198.51.100.111, 203.0.113.222 – – [10/Mar/2020:01:15:19 +0000] “GET / HTTP/1.1” 200 3041 “-” “Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36”

198.51.100.111 is the client’s IP address, and 203.0.113.222 is the Load Balancer’s IP address.   Pretty simple.  One would assume that it’s always the first entry that’s the client’s IP address, right?

Well no, because there’s an edge case.  Let’s say the client is behind a proxy server that’s already stamping X-Forward-For and thus leaking the client’s IP addresses to the internet.  When the load balancer receives the HTTP request, it will often pass the X-Forwarded-For header unmodified to the web server, which then logs the request like this:

192.168.1.49, 198.51.100.111, 203.0.113.222 – – [10/Mar/2020:01:15:05 +0000] “GET /  HTTP/1.1” 200 5754 “-” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36”

192.168.1.49 is the client’s true internal IP, but we don’t care about that since it’s RFC-1918 and not of any practical use.  So it’s actually the second to last entry (not necessarily the first!!!) that contains the client’s public IP address and the one that should be used for any Geo-IP functions.

Here’s some sample Python code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function
import os

remote_addr = os.environ.get('REMOTE_ADDR', '127.0.0.1')
x_fwd_for = os.environ.get('HTTP_X_FORWARDED_FOR', '')

if ", " in x_fwd_for:
    client_ip = x_fwd_for.split(", ")[-2]
else:
    client_ip = remote_addr

print("Status: 200\n\nHello, " + client_ip + "\n")