sylvain durand

Use Tor with Python

This page shows how to route a Python script's requests through Tor in order to access data anonymously. This is particularly useful if you are writing a scraper and want to avoid having your IP address banned by the target server.

Tor installation

How to install Tor depends on your system and is detailed on the official website. On Debian or Raspbian, we use:

sudo apt-get install tor

To launch Tor, just run:

sudo service tor start

To check if it works, simply run the following command from a terminal:

curl --socks5-hostname localhost:9050 -s https://check.torproject.org/ | grep -m 1 Congratulations | xargs

This command will display:

Congratulations. This browser is configured to use Tor.
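
From a script, it can help to first confirm that something is listening on the SOCKS port at all. The sketch below is only a reachability test, not proof that traffic actually goes through Tor. The helper name is_port_open is hypothetical, and 9050 is the port used in the curl command above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError if the port is closed or unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9050 is Tor's SOCKS port, as used in the curl command.
    print("Tor SOCKS port reachable:", is_port_open("127.0.0.1", 9050))
```

If this prints False, Tor is probably not running or listens on another port.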

Usage with Python

With requests library

To request a page, use the requests library. If you do not have it, just install it:

pip install requests
pip install requests[socks]
pip install requests[security]

If the last command fails, try installing the build dependencies of cryptography:

sudo apt-get install build-essential libssl-dev libffi-dev python3-dev

We then use, in Python:

import requests

You can check your IP address without Tor with the command:

requests.get('https://ident.me').text

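To use Tor, we tell requests to go through the local SOCKS proxy. The socks5h scheme makes Tor resolve host names too; with plain socks5, DNS lookups are done locally and bypass Tor:

proxies = {
    'http': 'socks5h://127.0.0.1:9050',
    'https': 'socks5h://127.0.0.1:9050'
}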

requests.get(url, proxies=proxies).text

You should now see a different IP address with:

requests.get('https://ident.me', proxies=proxies).text
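
When several pages are fetched, the proxy settings can be attached once to a requests.Session instead of being passed to every call. The following is a minimal sketch under a few assumptions: the helper names tor_proxies and make_tor_session are hypothetical, and it uses the socks5h scheme, which asks the proxy to resolve host names so that DNS lookups also go through Tor.

```python
def tor_proxies(host: str = "127.0.0.1", port: int = 9050) -> dict:
    """Build a requests-style proxies mapping for a local Tor SOCKS port."""
    # socks5h (rather than socks5) lets the proxy resolve host names.
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


def make_tor_session(host: str = "127.0.0.1", port: int = 9050):
    """Return a requests.Session whose traffic is sent through Tor."""
    # Imported here so tor_proxies() stays usable without requests installed.
    import requests

    session = requests.Session()
    session.proxies.update(tor_proxies(host, port))
    return session
```

With Tor running, make_tor_session().get('https://ident.me', timeout=30).text should return the address of a Tor exit node rather than your own.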

Obtaining a new identity

To obtain a new identity, and therefore a new IP address, install stem:

pip install stem

Tor's control port must also be enabled so that the script can request a new identity. Open the configuration file:

sudo nano /etc/tor/torrc

And set the following parameters:

ControlPort 9051
CookieAuthentication 1

Then restart Tor to apply these changes:

sudo service tor restart

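In Python, we can now request a new identity:

from stem import Signal
from stem.control import Controller

# Connect to the control port configured in torrc
with Controller.from_port(port=9051) as c:
    c.authenticate()         # uses the authentication cookie
    c.signal(Signal.NEWNYM)  # ask Tor for new circuits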

To check that it worked, see whether we get a new IP address with:

requests.get('https://api.ipify.org', proxies=proxies).text
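
A new circuit does not guarantee a different exit address, and Tor limits how often NEWNYM is honoured, so a script may want to retry until the address actually changes. The sketch below keeps this logic independent of Tor by taking two callables. The names renew_until_changed, get_ip and send_newnym are hypothetical, and the attempt count and delay are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


def renew_until_changed(
    get_ip: Callable[[], str],
    send_newnym: Callable[[], None],
    attempts: int = 5,
    delay: float = 10.0,
) -> Optional[str]:
    """Request new identities until get_ip() returns a different value.

    Returns the new address, or None if it never changed.
    """
    old_ip = get_ip()
    for _ in range(attempts):
        send_newnym()
        # Give Tor time to build a new circuit before checking again.
        time.sleep(delay)
        new_ip = get_ip()
        if new_ip != old_ip:
            return new_ip
    return None
```

Here get_ip could wrap the requests.get call through the proxy, and send_newnym the stem block that sends Signal.NEWNYM.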

Strengthen anonymity by changing the User-Agent

If anonymity is required, it may be useful to change the User-Agent header, which by default tells the server that the request comes from python-requests. To do this, install fake_useragent:

pip install fake_useragent

We can then use, in Python:

from fake_useragent import UserAgent
headers = { 'User-Agent': UserAgent().random }
requests.get(url, proxies=proxies, headers=headers).text
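
Where an extra dependency is unwanted, the same idea can be sketched with a hand-maintained list of browser strings. This is a swapped-in alternative, not how fake_useragent works internally; the function name random_headers and the two sample strings below are illustrative assumptions.

```python
import random

# Illustrative sample values only; keep such a list up to date in practice.
SAMPLE_USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def random_headers(user_agents=SAMPLE_USER_AGENTS, rng=random):
    """Return a headers dict with a User-Agent chosen at random."""
    return {"User-Agent": rng.choice(user_agents)}
```

The result can be passed as headers= to requests.get in the same way as the dictionary above.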

Automation with Cron

If your Python script runs regularly as a cron job, it may be useful to add a random delay so that the access times are not too regular:

import random
import time

# Wait for a random duration between 0 and 2 hours
wait = random.uniform(0, 2*60*60)
time.sleep(wait)
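
The delay can also be wrapped in a small function so that the upper bound is explicit and the scraping code only starts once the wait is over. A sketch, with hypothetical names random_delay and main:

```python
import random
import time


def random_delay(max_seconds: float, rng=random) -> float:
    """Pick a delay uniformly between 0 and max_seconds."""
    return rng.uniform(0, max_seconds)


def main(max_seconds: float = 2 * 60 * 60) -> None:
    """Sleep for a random time, then run the actual job."""
    wait = random_delay(max_seconds)
    print(f"Sleeping {wait:.0f} s before starting")
    time.sleep(wait)
    # ... the requests made through Tor would go here ...
```

The script launched by cron would then simply call main() at the end.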