When SQL injections come up in conversation, I often hear: “how are those still a thing!?” The answer is the same as why firewalls get configured with default passwords: security is difficult to implement at scale because humans tend to do what’s easiest. While frameworks continue to bake in SQL injection prevention, real-world developers have combinations of preferences and legacy code that lead to many possible implementations.
SQL injections are difficult to prevent and there are many ways of implementing protections, all of which have various pros and cons. It’s important to keep in mind that every development team is different and there are many reasons why there is no universal “right way” to fix anything. This article will help you identify mitigations that may be in place and understand the pros and cons of various prevention methods.
The examples we’ll use are in Python but apply to most languages.
Here’s an example of some vulnerable code:
import sqlite3
# Connect to DB
connection = sqlite3.connect('your_database.db')
# Get customer name from request
customer_name = request.GET.get('customerName')
# Create Query to fetch the account balance
query = f"SELECT account_balance FROM user_data WHERE user_name = '{customer_name}'"
try:
cursor = connection.cursor()
cursor.execute(query)
results = cursor.fetchall()
for row in results:
print(row)
except sqlite3.Error as e:
print(f"An error occurred: {e}")
finally:
cursor.close()
connection.close()
The recommended protection against SQL injection that applies the most widely is parameterized queries. SQL injections most commonly occur due to developers hand crafting their own SQL. In these use cases, replacing an f-string like from the vulnerable code with a parameterized query is the smallest code change to fix the issue.Here’s an example:
Vulnerable f-string:
query = f"SELECT account_balance FROM user_data WHERE user_name = '{customer_name}'"
Parameterized Fix:
query = "SELECT account_balance FROM user_data WHERE user_name = ?"
cursor.execute(query, (customer_name,))
Here’s what the fix would look like all together:
import sqlite3
#Connect to SQL DB
connection = sqlite3.connect('your_database.db')
#Get customer name from request
customer_name = request.GET.get('customerName')
#Create Query to fetch the account balance
query = "SELECT account_balance FROM user_data WHERE user_name = ?"
try:
cursor = connection.cursor()
# This passes in the customer name as a parameter
cursor.execute(query, (customer_name,))
results = cursor.fetchall()
for row in results:
print(row)
except sqlite3.Error as e:
print(f"An error occurred: {e}")
finally:
cursor.close()
connection.close()
To better understand what’s happening with a parameterized query, let’s say our attacker friend used this as their customerName:
' OR '1'='1
For a non-parameterized query, this turns into the following SQL statement:
SELECT account_balance FROM user_data WHERE user_name = '' OR '1'='1'
Because this statement now evaluates to true for all users, the attacker would receive everyone’s account balance.
Conversely, a parameterized query turns into this SQL statement:
SELECT account_balance FROM user_data WHERE user_name = ?
And uses the value (' OR '1'='1'
) as a literal string for comparison, preventing injection.
Overall, this is the easiest way to fix SQL injections; however, in today’s world of sprawling frameworks and methodologies, here are some other fixes that you might see in the wild, to help you spot if something’s a real vulnerability.
Pros | Cons |
Prevents SQL injection by treating inputs as data. | Requires understanding the specific libraries and how they implement queries properly |
Simple and effective for preventing SQL injection in raw SQL queries. | Slightly more verbose than non-parameterized queries. |
Widely supported by all major database libraries and frameworks. | Does not prevent other types of attacks (e.g., logic errors, over-permissive queries). |
Don’t need to import new data related libraries | Still easy for a developer to make a mistake |
Low effort on rewriting the code | Can become more complicated when constructing complex queries |
Slight performance improvement |
Another approach to fixing SQL injections is using Object-Relational Mappers (ORMs). ORMs make it easy for developers to interact with databases by translating between the database schema and object-oriented code. This reduces the need to write raw SQL queries directly and helps prevent SQL injections by default, as the ORM frameworks usually handle parameterization internally.
For instance, let’s consider SQLAlchemy, a popular ORM for Python. Here’s an example:
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine
from mymodels import UserData # assuming UserData is a mapped class
engine = create_engine('postgresql://user:password@localhost/mydatabase')
Session = sessionmaker(bind=engine)
session = Session()
customer_name = request.args.get('customerName')
account_balance = session.query(UserData.account_balance).filter(UserData.user_name == customer_name).first()
print(account_balance)
In this example, the ORM is obfuscating the SQL queries in the form of object-oriented code, preventing SQL injection by default. Under the hood, sqlalchemy is also using parameterized queries to prevent the attack, but it’s being obfuscated away by ORM. Some of the pro’s to this approach are: it’s harder to create insecure queries. The con’s to this approach are: all queries need to fully be rewritten, there can be a steep learning curve, can be difficult to get super performant queries in complex situations.
Pros | Cons |
Automatically handle parameterization and prevent SQL injection by default. | Can add complexity and overhead to the project. |
Simplifies database interactions by abstracting SQL queries into object-oriented code. | May have a steeper learning curve for developers unfamiliar with ORM concepts. |
Enhances code readability and maintainability. | Performance may be impacted compared to raw SQL in certain specific cases. |
Provides additional features like schema migrations and data validation. | Debugging can be more difficult due to abstraction. |
Makes it more difficult to write insecure queries, which mostly prevents developers from accidentally introducing vulnerabilities | Migrating to ORM is a heavy lift as the previous queries don’t just need a little change, they need to be fully rewritten from scratch in a new notation |
Escaping and validation are common neutralization implementations for developers because they seem to solve the immediate problem, like making sure certain characters just aren’t allowed. You might see examples like this implemented, but they carry with them many issues:
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine
from mymodels import UserData # assuming UserData is a mapped class
import re
# Create a database engine
engine = create_engine('postgresql://user:password@localhost/mydatabase')
Session = sessionmaker(bind=engine)
session = Session()
# Get the customer name from the request
customer_name = request.args.get('customerName')
# Flawed validation allowing alphanumeric, underscores, and spaces
if not re.match("^[a-zA-Z0-9_ ]*$", customer_name):
raise ValueError("Invalid username")
# Query the database using the validated customer name
account_balance = session.query(UserData.account_balance).filter(UserData.user_name == customer_name).first()
print(account_balance)
Beyond performance issues with validation (where the query time increases with complexity), this approach also puts you in a dangerous cat-and-mouse game with attackers. It is also dependent on your database schema and implementation. For example, this payload: (admin' OR '1'='1
) would pass our validation, while commenting out the rest of our query.
Overall, for ad hoc fixes, parameterized queries offer a great way to quickly fix existing SQL queries. However, if you’re starting from scratch, beginning with an ORM is a great way to prevent SQL injection with many benefits for usability.
SQL injections can be reliably detected from static application security testing (SAST) tools, even free open source tools. In a quick scan, you can discover if you’re vulnerable to these common attacks. SQL injections are the costliest types of attacks, because applications typically run with privileged read and write access to your data, which also includes all sensitive information. Usually, companies consider these kinds of attacks as the highest priority to stop and fix.