Unconventional Load Testing: Leveraging Python Locust for PostgreSQL Load Tests
Many people have probably used Locust or a tool built on top of it (like Load Forge) in the past. These tools are primarily used for web application load testing by hitting public endpoints and monitoring how they respond. In this article we review how to use Locust to test a PostgreSQL database.
Last Updated: 7/21/24
Locust Documentation: https://docs.locust.io/en/stable/index.html
My code for this: https://github.com/guel-codes/db_loadtesting
Conference Talk about this: https://youtu.be/C2PggXqIJtE?si=WZivDDWI1ageXUsB
Prerequisites:
- Python 3.9 or higher
- psycopg (3.1 or higher) — this is because greenlet supports version 3 out of the box
- locust
** Before we dive in, I think it is important to note that this article is highlighting a very simple example of Locust. I quickly created a PostgreSQL database inside of my local Docker server and used that to test for this article. This is also being run on MacOS**
What is database load testing?
Database load testing is the process of simulating an expected load on your database, such as concurrent users and queries to see how it responds. It is essential for engineering and product teams to know how their system will respond during normal and peak traffic. This information gives the team better insight on the user experience and helps mitigate costly downtime of the application.
Locust — How is it typically used?
Locust is an open-source load testing tool written in Python. The best part about this tool is that it is easily extendable and allows you to do some really cool things as long as you have the time and patience. In most cases you will see Locust used to stress test websites and specific API endpoints. However, due to Locust being simple and hackable, you can create your own client to test nearly any system that you want. 😊
That is what we are doing in this article. We will be creating our own Postgres client in order to stress test a database instead of just the application, of which, the database is connected. Let’s first introduce you to Locust and some of the things it can do out of the box.
Getting Setup with Locust 🦗
Install Locust — https://docs.locust.io/en/stable/installation.html#
import time
from locust import HttpUser, task, between
# This is an example class with example methods/tasks
class QuickstartUser(HttpUser):
wait_time = between(1, 5)
@task
def hello_world(self):
self.client.get("/hello")
self.client.get("/world")
@task(3)
def view_items(self):
for item_id in range(10):
self.client.get(f"/item?id={item_id}", name="/item")
time.sleep(1)
def on_start(self):
self.client.post("/login", json={"username":"foo", "password":"bar"})
We need to create a QuickstartUser
class that inherits attributes from the the HttpUser
class. This class is important because it allows each simulated user to utilize the client
attribute, which establishes as instance of the HttpSession
class. HttpSession is used to make HTTP requests to endpoints of our running application.
The simulated users are controlled by the @task decorator
, which you can see above the functions/methods in the QuickstartUser
class. The hello_world
and view_items
methods are making call the the endpoints inside of their self.client.get()
methods.
One really cool thing that Locust has is the on_start and on_stop methods. These allow you to define the behavior of each user when the test starts and stop. This makes it easy to test users logging in and out of the the website, making load test that much more realistic.
Running the test:
Now it is time to run the code. This can be done 2 different ways:
- Run
locust
in the terminal, if you named the filelocustfile.py
, but if you didn’t, then it would belocust -f name_of_file.py
. This will show something along the lines of this:
If you go to http://0.0.0.0:8089 in your browser, then you will see the Locust UI which allows you to customize all the parameters for your test in your browser. We will touch more on this UI element when we test our database.
2. The other way to run these tests is to do it entirely inside of your terminal. You can configure everything you need, using command line arguments
locust -f loadtest.py --users=500 --spawn-rate=10 --run-time=’3m’ --autostart --autoquit 3
This will still start the web interface but will also give you the output of your test in the command line
Load Testing PostgreSQL 🐘
This testing file is used for testing a Postgres database, so I am leveraging psycopg
as the adapter. However, just about any SQL database could be tested by making a couple changes. Now, let’s implement the code below and start testing our database!
import psycopg
from locust import User, TaskSet, task, between, events
import time
def create_conn(conn_string):
print("Connect and Query PostgreSQL")
conn = psycopg.connect(conn_string)
return conn
def execute_query(conn_string, query):
db_conn = create_conn(conn_string)
db_query = db_conn.cursor().execute(query)
return db_query
class PostgresClient:
def __getattr__(self, name):
def request_handler(*args, **kwargs):
start_time = time.time()
try:
res = execute_query(*args, **kwargs)
response_time = int((time.time() - start_time) * 1000)
events.request.fire(
request_type="postgres",
name=name,
response_time=response_time,
response_length=0,
)
except Exception as e:
response_time = int((time.time() - start_time) * 1000)
events.request.fire(
request_type="postgres",
name=name,
response_time=response_time,
response_length=0,
exception=e,
)
print("error {}".format(e))
return request_handler
class CustomTaskSet(TaskSet):
# update this connection string with your specific on
conn_string = "postgresql://postgres:postgres@localhost:5432/loadtesting_db"
@task # this task decorator is used so each spawned users knows which method to run
def run_query(self):
self.client.execute_query(
self.conn_string,
f"SELECT * FROM loadtesting.user",
)
# This class will be executed when you run locust
class PostgresLocust(User):
min_wait = 0
max_wait = 1
tasks = [CustomTaskSet]
wait_time = between(min_wait, max_wait)
def __init__(self, *args):
self.client = PostgresClient()
Code breakdown:
- First we import all of the packages that we need.
- Define our
create_conn
andexecute_query
functions. These will be used to connect to the database and run our queries. - The
PostgresClient
class is a custom client to interact with the Postgres database. The__getattr__
method intercepts attribute calls and is run if the expected attribute is not found. Therequest_handler
method will execute the database query and measures the response time. Then it fires a Locust event with the relevant metrics - The
CustomTaskSet
class will hold the tasks that will be run by each simulated Locust user during the test. My code is just doing one task for the test but there can be multiple by defining the behavior in a method and then tagging it with the@task decorator
- The
PostgresLocust
class represents a Locust user that will simulate load on the database. It sets the min and max wait times between each task that is run. It also initializes thePostgresClient
, which will be used by each Locust user to execute queries on the database
After the code above is updated with a new connection string and any additional task methods, it should be ready to run. Again, we can run the test in 2 different ways either in the browser or in the terminal
locust -f loadtest.py --users=500 --spawn-rate=10 --run-time=’3m’ --autostart --autoquit 3
The test should look something like this in the UI. It will track total requests per second, response times and number of users
Confirm the test is actually hitting the database:
Connect to your database in PGAdmin4, DBeaver or in your terminal and run SELECT pid, query_start, state, query FROM pg_stat_activity
. The pg_stat_activity table shows the queries that are actively running on the database. You might have to run the query a couple times to catch it, but you should see the same queries in this table that are in your Locust testing file.
Conclusion/Next Steps
It was fun figuring this out. I spent a few days reading the Locust and greenlet docs really trying to understand the coroutine aspects a little more. Next, I would love to make a couple more articles with some other database clients. Maybe I could use pymongo
and do some load testing for a MongoDB database. 🤯