How to Use Proxies with Python Requests: A Guide

Using proxies with Python Requests is essential for web scraping and automating workflows. Proxies allow you to hide your real IP address, bypass geographic restrictions, rotate IPs to avoid getting blocked, and more.

In this comprehensive guide, you'll learn how to configure, authenticate, and rotate proxies with the Python Requests library.

Contents

Why Use Proxies with Python Requests?
Prerequisites
Basic Proxy Setup
Using SOCKS Proxies
Authenticating Proxies
Using Proxy Sessions
Setting Proxy Environment Variables
Rotating Proxies with Python Requests
Conclusion

Why Use Proxies with Python Requests?

Here are some of the main reasons you may want to use proxy servers with Requests:

Avoid getting blocked – Websites have bot detection systems that can blacklist your IP if you send too many requests. Proxies allow you to rotate IPs and avoid getting blocked.
Bypass geographic restrictions – Some sites restrict content based on location. Proxies let you appear from different countries and access geo-restricted content.
Hide your identity – Your real IP reveals your location and identity. Proxies add a layer of anonymity to your web requests.
Access the internet from anywhere – If you are in a network with firewalls or internet restrictions, proxies allow you to bypass them.
Improve performance – Spreading requests across multiple IPs lets you send more of them in parallel without hitting per-IP rate limits, which can speed up scraping and data collection.
Debug requests – Routing traffic through a logging proxy makes it easier to monitor and troubleshoot your scripts.

Prerequisites

Before using proxies with Python Requests, you'll need:

Python 3 – A recent Python 3 release. Current versions of Requests do not support Python 2.
Requests Module – Install it via pip install requests
A code editor – Any editor like VS Code, Atom, Sublime, etc.
Proxy addresses – A list of proxy IPs and credentials. You can get free public proxies or purchase private ones.

Basic Proxy Setup

Setting up a proxy with Requests involves just passing the proxy URL with your request:

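import requests

# Keys are the scheme of the target URL; values are the proxy to use for it.
# The proxy itself is reached over plain HTTP, even for HTTPS targets.
proxies = {
    'http': 'http://192.168.1.1:8000',
    'https': 'http://192.168.1.1:8000',
}

response = requests.get('https://example.com', proxies=proxies)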

Here we define a dictionary called proxies with the proxy URLs for HTTP and HTTPS traffic. The IP address and port will depend on your specific proxy server.

Then we pass proxies as a parameter in the requests.get() method to route the request through the proxy.

Only this call is routed through the proxy. The target site sees the proxy's IP address instead of your own.
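A proxy can be slow or unreachable, so it usually helps to set a timeout and catch proxy failures. The following is a minimal sketch rather than a definitive implementation: the helper names make_proxies and fetch_via_proxy and the 10-second timeout are assumptions made for illustration, not part of Requests.

```python
import requests


def make_proxies(proxy_url):
    """Build the proxies dict Requests expects, using one proxy for both schemes."""
    return {"http": proxy_url, "https": proxy_url}


def fetch_via_proxy(url, proxy_url, timeout=10):
    """Fetch url through proxy_url; return the response, or None on failure."""
    try:
        response = requests.get(url, proxies=make_proxies(proxy_url), timeout=timeout)
        response.raise_for_status()  # treat 4xx/5xx responses as failures too
        return response
    except requests.exceptions.ProxyError as exc:
        print(f"Could not connect through proxy: {exc}")
    except requests.exceptions.Timeout as exc:
        print(f"Request timed out: {exc}")
    except requests.exceptions.RequestException as exc:
        print(f"Request failed: {exc}")
    return None
```

Calling fetch_via_proxy('https://example.com', 'http://192.168.1.1:8000') then returns None instead of raising when the proxy is down.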

Using SOCKS Proxies

Besides basic HTTP proxies, Requests also supports SOCKS protocols like SOCKS4 and SOCKS5. SOCKS support is an optional extra, so install it first with pip install "requests[socks]".

Here's how to use a SOCKS5 proxy:

import requests

proxies = {
    'http': 'socks5://192.168.1.1:8000',
    'https': 'socks5://192.168.1.1:8000',
}

requests.get('https://example.com', proxies=proxies)

Notice the URL scheme is now socks5:// instead of http://. The rest of the code remains the same.
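One detail worth knowing is where hostnames get resolved. With socks5:// the DNS lookup happens on your machine, while the socks5h:// scheme asks the proxy to resolve the hostname. The sketch below shows one way to switch between the two; the helper name socks_proxies and its remote_dns parameter are hypothetical.

```python
def socks_proxies(host, port, remote_dns=True):
    """Return a proxies dict for a SOCKS5 proxy.

    socks5h:// asks the proxy to resolve hostnames; socks5:// resolves them locally.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    proxy_url = f"{scheme}://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


proxies = socks_proxies("192.168.1.1", 8000)
print(proxies["https"])  # socks5h://192.168.1.1:8000
```

The resulting dict is passed to requests.get() through the proxies parameter in the same way as before.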

Authenticating Proxies

Some proxies require authentication with a username and password before use.

To authenticate proxies, include the credentials in the proxy URL like so:

# user, password, and the host are placeholders for your own proxy details
proxies = {
    'http': 'http://user:password@192.168.1.1:8000',
    'https': 'http://user:password@192.168.1.1:8000',
}

Make sure to use your actual username and password configured on the proxy server.

This will authenticate all requests made through this proxy.
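Credentials that contain characters such as @, : or / would break the URL, so they need to be percent-encoded first. Here is a minimal sketch using the standard library; the helper name proxy_url_with_auth and the sample credentials are made up for illustration.

```python
from urllib.parse import quote


def proxy_url_with_auth(username, password, host, port, scheme="http"):
    """Build a proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


# Hypothetical credentials, used only to show the encoding
url = proxy_url_with_auth("user", "p@ss:word", "192.168.1.1", 8000)
print(url)  # http://user:p%40ss%3Aword@192.168.1.1:8000
```

The encoded URL can then be used as the value for both the 'http' and 'https' keys of the proxies dictionary.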

Using Proxy Sessions

When you need to make multiple requests with the same proxy, it's better to use sessions instead of passing the proxies dict every time.

Sessions allow connection reuse and are faster:

import requests

session = requests.Session()

proxies = {
    'http': 'http://192.168.1.1:8000',
    'https': 'http://192.168.1.1:8000',
}

session.proxies = proxies

response = session.get('https://example.com')

Here we create a new Session() object and set the proxies on it. All further requests made through this session will route through the assigned proxy.

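This avoids reconfiguring the proxy on each request. One caveat: if proxy environment variables are set, they can override session.proxies, so pass proxies explicitly on the request when you need to be certain which proxy is used.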
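The session pattern can be wrapped in small helpers so the proxy is configured in one place and the connections are closed when the work is done. This is a sketch under stated assumptions: make_proxy_session and fetch_all are hypothetical names, and the 10-second timeout is an arbitrary choice.

```python
import requests


def make_proxy_session(proxy_url):
    """Return a Session whose requests default to the given proxy."""
    session = requests.Session()
    # update() merges into the session's existing proxies mapping
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    return session


def fetch_all(urls, proxy_url, timeout=10):
    """Fetch several URLs over one proxied session and return their status codes."""
    # The with block closes the session's pooled connections when done
    with make_proxy_session(proxy_url) as session:
        return [session.get(url, timeout=timeout).status_code for url in urls]
```

Using the session as a context manager is a design choice that releases pooled connections even if one of the requests raises.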

Setting Proxy Environment Variables

Hardcoding proxy credentials in your script isn't ideal, especially in teams. A better practice is using environment variables.

Here's how to configure proxy environment variables:

On Windows (Command Prompt):

set http_proxy=http://username:password@192.168.1.1:8000
set https_proxy=http://username:password@192.168.1.1:8000

On Linux/macOS:

export http_proxy=http://username:password@192.168.1.1:8000
export https_proxy=http://username:password@192.168.1.1:8000

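Requests reads these variables automatically, so a plain requests.get() call already goes through the proxy. If you prefer to pass them explicitly, access them in Python:

import os
import requests

proxies = {
    'http': os.environ['http_proxy'],
    'https': os.environ['https_proxy'],
}

requests.get('https://example.com', proxies=proxies)

This keeps the proxy credentials separate from code.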
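Looking a variable up with os.environ[...] raises KeyError when it is not set, which makes a script fail on machines without a proxy configured. A more tolerant lookup might look like the sketch below; the function name proxies_from_env and its environ parameter are assumptions introduced here for illustration.

```python
import os


def proxies_from_env(environ=None):
    """Collect proxy settings from the environment, skipping unset variables."""
    environ = os.environ if environ is None else environ
    proxies = {}
    for scheme in ("http", "https"):
        # Check the lowercase name first, then the uppercase variant
        value = environ.get(f"{scheme}_proxy") or environ.get(f"{scheme.upper()}_PROXY")
        if value:
            proxies[scheme] = value
    return proxies
```

The returned dict can be passed as the proxies parameter of requests.get(). Accepting a mapping as an argument also makes the helper easy to check without touching the real environment.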

Rotating Proxies with Python Requests

Rotating proxies helps distribute requests across multiple IPs and avoid getting blocked by websites.

To rotate proxies:

Create a list of proxy URLs you want to rotate through:

# Hosts and credentials below are placeholders for your own proxies
proxy_list = [
    'http://user:password@192.168.1.1:8000',
    'http://user:password@192.168.1.2:8000',
    'socks5://user:password@192.168.1.3:8000',
]

On each request, randomly select a proxy:

import random
import requests

url = 'https://example.com'

# Select a random proxy and use it for both HTTP and HTTPS traffic
proxy = random.choice(proxy_list)
proxies = {'http': proxy, 'https': proxy}

response = requests.get(url, proxies=proxies)

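Alternatively, you can iterate through the list sequentially. itertools.cycle starts over at the first proxy once the list runs out, so it works even when there are more URLs than proxies:

import itertools

proxy_cycle = itertools.cycle(proxy_list)

# url_list is your own list of target URLs
for url in url_list:
    proxy = next(proxy_cycle)
    proxies = {'http': proxy, 'https': proxy}
    response = requests.get(url, proxies=proxies)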

This way your requests are spread across different proxies, making it harder to detect and block your traffic.

Make sure to use a large pool of proxies and implement automatic proxy rotation in your code for seamless scraping at scale.
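Rotation is often combined with retries, so that a dead proxy is skipped instead of failing the whole request. The following is one possible sketch, not a definitive implementation: the name fetch_with_rotation, the default of three attempts, and the get parameter (which exists only so the request function can be swapped out) are all assumptions.

```python
import itertools

import requests


def fetch_with_rotation(url, proxy_list, max_attempts=3, timeout=10, get=requests.get):
    """Try the request through successive proxies until one succeeds."""
    if not proxy_list:
        raise ValueError("proxy_list must not be empty")
    proxy_cycle = itertools.cycle(proxy_list)
    last_error = None
    for _ in range(max_attempts):
        proxy = next(proxy_cycle)
        proxies = {"http": proxy, "https": proxy}
        try:
            response = get(url, proxies=proxies, timeout=timeout)
            response.raise_for_status()  # an error status also triggers the next proxy
            return response
        except requests.exceptions.RequestException as exc:
            last_error = exc  # remember the failure and move on
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

A call such as fetch_with_rotation('https://example.com', proxy_list) returns the first successful response and raises RuntimeError only after every attempt has failed.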

Conclusion

Configuring proxies in Python Requests is easy and offers many advantages like anonymity, geo-unblocking, and preventing bans.

The key steps are:

Define a proxy dictionary with URLs
Pass the proxies parameter to route requests through them
Use proxy sessions for multiple requests
Authenticate if required
Rotate proxies randomly or sequentially

With this guide, you should be able to start using proxies for Python web scraping and automation. Combine Requests with a good proxy service to extract data faster and more reliably.