wk11 - SQL
Offsec wk11 - SQL database
Tools
General reminders
end your injected query with
;and comment out whatever might follow using--(we’re blind to)the docker container resets every 4 hrs
Advice:
Before automating…
Use burp suite. Look at the response packets
Do you want to use response codes as feedback?
Or do you
Start big, narrow it down
Look at the schema from
SELECT * from sqlite_schema;
figure out:
how many tables does this database have?
SELECT * from sqlite_schema where type=’table’ > get the count
what table are we in?
SELECT name FROM pragma_table_info(’Users’)
gives us all the columns for the table ‘Users’
Brute-force
the table name
normal query AND (comparison that checks if the first letter of the table name is equal to a certain letter)
normal query could be: SELECT * from Users where username=’<USER INPUT>’
where our injected query is. Contains the subquery we design
If the normal query executes > we know the comparison passed
If the normal query does not execute > we know the comparison failed
Signal is whether the normal query executes
The whole point is: as a client, entering the data won’t allow you to see the result of a custom query. You can only chain the original, intended query with a subquery and see if the original query executes. Through a long series of yes or no, you can find out facts!
Recall - day after lecture
SQL as database language (structured query language)
there are many variants of SQL
SQLite 3 was the focus
Others: MySQL, PostgresQL,
Characteristics: database contains tables
vs other types of databases (relational), which lack tables?
How to attack?
Entry point: user input
e.g. when the website prompts for user log-in, entering username and password. If the inputs are not sanitized and assumed safe, as a malicious user, we can enter whatever’s prompted PLUS custom queries.
Key point: we control the end of the query
What if we don’t actually control the end of the query? in other words what if there are more checks happening in the backdrop unbeknownst to us? Ans: we can be clever and insert comment tags such that whatever follows get commented out
Intended query (not coded by you; invisible to you) + custom que
Flavours of attacks
Blind
You enter a custom query (which returns a boolean) on top of the intended query. You don’t see the direct result from the custom query. Rather, your signal is to gauge whether the intended query executed. If it does, it means your custom query returned true.
E.g. if it executes, maybe it redirects you to another page, where the HTTP status code returned to us would be 320. Else, it’s a 400?
In essence, you work things out by asking the server a bunch of implied True/False questions using SQL injections
Not blind
You enter a custom query and the results get returned to you directly.
Databases form the backbone of web, mobile, and desktop applications. They store user info—which might potentially be private or sensitive—and perhaps store user secrets—which are hopefully hashed and salted>](https://en.wikipedia.org/wiki/Salt_(cryptography).)). Database leaks in the form of breaches are all too common in modern cybersecurity. It is consequently imperative we stress test database design and queries to ensure the data is secured from nefarious attackers.
Database design and programming is its own complex discipline. Choosing the correct database for an application or service can be the difference between correct and incorrect responses, and can significantly impact query speed. We don't need to be database engineers to attack web applications, but we must be familiar with their structures and how to interact with them.
We'll first take a look at how to interact with a database, how those interactions can be abused, and practice attack strategies against a modern database.
Table of Contents
[[#SQL Database Structure and Queries]]
[[#SQL Injection]]
[[#Blind SQLi]]
[[#Blind SQLi Techniques]]
[[#Schema]]
[[#Practical Example]]
[[#UNION queries]]
[[#Recitation]]
[[#Next Steps]]
SQL Database Structure and Queries
Databases are generally split into two categories: relational or non-relational. Relational databases have a primary key that distinguishes each entry from another. Examples are the ubiquitous MySQL and PostgreSQL. Non-relational databases are more flexible and useful for storing blobs of information with different structures. The popular MongoDB is an example of a non-relational database.
Understanding the differences between these databases is a full course in itself, so we will restrict this lesson to a single relational database: SQLite.
It's useful to create our own local database research environment to improve our understanding of the system and test hypotheses. One option are SQL "fiddles,” which are small online sandboxes to test SQL queries against. These are very quick to spin up without any of the supporting command line tools or front-end wrappers. It's important to note that the fiddles vary based on SQL database type, as syntax can be a bit different between implementations. Let's use this SQLite fiddle to create our own database and queries.
First, we must input some data. We can create a table using the CREATE TABLE command and indicate which columns we want in that table. Next, we can insert rows into the table using the INSERT INTO command followed by VALUES that match the required columns.
In the second pane we run database queries, such as SELCECTing the data column for the admin user so long as the provided admin_password matches the expected password in the database:
We see that this query filters the database to all rows that match the username of admin with the admin_password password. It then SELECTs the data column for matching rows, outputting a single column with a single row:
data
Admin data
SQL Injection
We can interact with a database from a variety of front-end languages, so let's consider an example using a PHP web server. The following code uses a GET request's parameters to log into a website, querying the database to see if the username and password match. The database's query is defined as a string with the GET request parameters injected into it. It's important to note that the input parameters are wrapped in single quotes to treat them as strings themselves, as required by the SQL syntax we saw previously.
Given normal parameters, this query will work as intended. But, as we saw in binary exploitation, we are interested in undefined behavior. What if we know there is an admin user and we want to leak its data, but do not know its password? We could try doing something like putting a SQL comment character in our username to try and bypass the password check. We want the query to be something like:
But unfortunately the single quotes in the query prevent this; if we try and login with admin--, the query is actually:
This fails because admin-- is not in the database. However, what if we include a single quote, ', at the end of the username and before the comment? This can close the substring so that the query takes the form:
Trying this in our fiddle confirms that it is a valid query which returns the data for the admin user! This is our first example of SQL injection to bypass query checks. In production, servers should ensure that the input is validated against known malicious characters, as reinforced by this classic xkcd comic.
![[exploits_of_a_mom_2x.png]]
There are plenty of resources online with extensive lists of database payloads, such as this list from PayloadAllTheThings. These are useful for orienting our testing and making sure that we don't leave anything out.
Blind SQLi
when you make query but the server does not deliver the output of the query
you’re forced to ask the server a series of true/false questions and rig facts from those
What happens when we do not know the target data, or have a query that returns data that is not useful to us? It's unlikely a login query will return secrets, like a user's password. In this situation we must use more complex queries to leak information, sometimes byte by byte, to enumerate the database and its information. This is a technique called blind SQLi, as we'll use feedback from the server—not the output displayed on the page—to leak information.
For this, let's use the sqlite command line tool to create our own database and test our queries. We can quickly install with our operating system's package manager and spin it up with sqlite3. The query format is the same as the online fiddle, so we can create the same database locally for our next tests:
Blind SQLi Techniques
There are two main strategies for leaking information through blind SQLi: timing-based and error/response code-based. Timing-based attacks use a conditional check that will execute in a noticeably different amount of time based on the result of the check. These queries may leverage SLEEP to suspend the query for a number of seconds or execute a computational expensive operation (such as EXP(1000000)) to help distinguish the result. Conversely, error or response code-based blind SQLi uses a differentiated response from the server to leak info. This can include the HTTP response codes or the presence of errors. The key is establishing a baseline of server behavior, whether that be a 200 or 302 redirect on a true condition, and then force a different behavior with the false condition.
If using Python's requests for blind SQLi, it is important to note that the library's default behavior is to follow 302 redirects and return the final response code in response.response_code parameter after all the redirects are resolved. This can be a problem if our blind SQLiI depends on a 200 versus 302 response. Use response.history to see if the server generated any intermediate responses, including 302 redirects, that requests automatically resolved.
For both blind SQLi strategies, we need conditional queries that can differentiate a chosen query as true or false. Some SQL languages have IF THEN statements. In SQLite we can use the CASE WHEN and a valid (true) condition followed by an invalid (false) condition to force a response change. We can use this with some sort of conditional check, such as counting the number of rows in a table:
Understand the syntax:
SELECT CASE WHEN (condition) THEN (A) ELSE (B)
check condition: if condition is fulfilled, return A, else B
END AS result FROM users;
AS <alias for virtual column>
We create a temp column called “result” to store the returned value from the CASE-Conditional
Why? Such that when the results of this SELECT query is displayed, we can see CASE results under a custom column
Note: This column is not actually added to the table. It’s just there for display. It only exists in the display of the query’s results.
COUNT(*) > 1
does this explicitly count the number of entries returned?
Ans: number of rows returned by a SELECT … WHERE
That's great, but what if we do not control the whole query, like with SQLi? We need some sort of data that we can return from this comparison that distinguishes true from false. There are a handful of ways to do this, such as UNION attacks, which we cover in a bit. Another particularly slick way is using the AND operator, which is SQL's logical && we are already familiar with. Using a numeric value in a conditional check passes as true, while a string or failing comparison results as false. Note how the first query in the snippet below returns the table data with a numerical value in the second half of the query, but the output is suppressed in the second query because the conditional check fails with a string.
We will combine CASE WHEN and AND with true and false conditions to force the server into revealing whether the condition we checked passed or failed. But first, we need to know what to check.
Schema
A blind attack generally means we do not have any understanding of the underlying database structure and contained data. We need to leak these specifics if we have any hope of finding our desired information. For example, the database's secrets might not even be stored in the same table as the one operated on by the current query! We need table metadata, such as the list of table names and the structure of those tables. SQLite stores this sort of information in its sqlite_schema table. This is a table of tables (though it doesn't contain itself as an entry), which is very helpful as a starting point. Dumping the tbl_name column of all type='table' entries shows the list of tables in the database.
We'll start by targeting this database to get an understanding of the database as a whole before focusing on a specific table, column, and data cell of choice.
Practical Example
count the number of tables in the database
With blind SQLi it is extremely important to set up tests locally to ensure our queries accomplish their intended goal. If we don't, we could be sending invalid or incorrect queries to the server and interpreting the failures as false conditions! Let's set up a local environment and test leaking a ton of different information to enumerate anything we want in the target SQLite database.
Each of our queries assumes the database query starts with something out of our control, specifically
SELECT * from <unknown tablename> WHERE username='We use the Users table for the sake of the demonstration, however it could be any table as it is out of our control. We also assume that the query continues with some AND password=' clause, but we know from the beginning of this module that we can bypass the second half of the query using comments. The comment character(s) aren't shown below, for the sake of brevity, but in an attack it/they would be required.
For each goal (enumerating the database's tables, a table's columns, or a column's data) it is useful to quantify the number of entries before moving on. We can do this with the COUNT(*) > X conditional and the aforementioned CASE WHEN clause to generate true and false conditions based on the check. By incrementing X, we will eventually run into a false condition which selects a string 'no', which fails the query. This signals that our count is too large by failing to return any data, as shown below:
Chain with AND to leak more info
Great, so we have a single table in the database! We can easily get that database name using SELECT tbl_name FROM sqlite_schema WHERE ='table'. But what happens if there was more than one table returned? We need to break down our queries into single comparisons, and we cannot compare two results at the same time. The answer is to limit the results using LIMIT X where X should always be 1 in our case. That will automatically return only the first row so we can use that as our target. But what if we end up wanting the second (or other) row? LIMIT can be combined with OFFSET X to index into a different row in the returned results. OFFSET is 0 by default, but increasing it to 1 returns results starting with the second row.
We'll use LIMIT 1 OFFSET 0, for the sake of posterity, and target the first table in the database. But we do not know its name (we likely cannot see the query when attacking a remote server, so we could not ascertain that the Users table is what the query references), so we need to dump it in a way that creates a true/false condition. We could guess common names, but this would fail for anything non-standard.
A better strategy is brute forcing the name letter by letter using a substring comparison. We can limit the comparison to a single (unknown) character with the SUBSTR function to isolate the test. SUBSTR takes a string as its first argument, so we need to select the table name as the input. Putting together our knowledge of the sqlite_schema table and LIMIT with OFFSET, we can return a subquery containing the database name for the comparison:
If we iterate through the possible alpha-numeric characters and symbols in the comparison, eventually one should pass, at which point we know the first character in the name. This should be done programmatically using a tool like Python's requests or Burp's Intruder (which is unfortunately a paid feature).
Once we match the first character, we can either move to the second character on its own or increase the number of characters compared against and iterate through a larger substring. The latter is shown below with a two-character comparison using the leaked U as the known first character.
Great! Using this query programmatically yields the table Users. Since that is the only table in the database, we can focus our attention there. Unfortunately we are useless without knowing the table's structure, specifically what columns hold what data. We can query table information with pragma_table_info, which contains metadata on the columns. We care specifically about the name field, so let's just SELECT that.
We can use this subquery just as we did the sqlite_schema subquery to do a count of the columns to know what we're working with. Again, we want to LIMIT and OFFSET:
This informs us we have three columns, which we can iterate through in the same way as the table name:
For this target, we need to iterate through the OFFSET to figure out the column of interest. Changing offset to 1 shows that the column name is password, which definitely looks of interest! Before moving on, let's check how many entries we have in the table by SELECT * from Users and using the result in our COUNT comparison:
Ok good, we only have two rows. If we had more rows it might be useful to iterate through the users to find one of interest before simply dumping passwords. In this case we might surmise that the target is the admin user, so we can add WHERE username='admin' to our subquery when performing the string comparisons. This takes the same form as before, using SELECT password as the column of interest.
Eventually, our brute force will get to the point where the following query with admin_password passes the comparison and no further appended letters succeed. We can conclude this is the password and try it for ourselves!
This is another good moment to step back and appreciate what we accomplished. We turned a simple SQLi vulnerability into configurable primitives: conditional checks that can be used to count and find strings. We used those strings to leak the tables in the database, the columns in the tables, and the data in the columns. We could enumerate the entire database with enough time! This is what makes blind SQLi such a powerful attack vector.
UNION queries
UNION attacks are another powerful way to leak information about and in the database. Let's take a look at how UNION queries work. First, we need to create a second table in our database, such as a secrets table:
UNION allows us to join queries and append the results of the second to the first. Let's merge the results of two separate queries from two separate tables together:
We see that the UNION does not care about data type or column name during the merge, it just stacks data on top of one another. But there is one important consideration, the number of columns returned by all united queries must be equal. Trying to merge a single column result with two columns results in an error. In a web query, this usually results in an Internal Sever Error or 500 response code.
Use UNION to leak the number of columns in a table
Fact: cannot UNION if you’re unionizing two things with different amount of columns
Iterate through different test columns.
This advantageous to us, as we can use it to enumerate the number of columns returned by the beginning of a query, such as our SELECT X from Users WHERE username=' login query:
Without knowing the beginning of the query we can still ascertain that it results in three columns. This works if we know a column and a table name to end the query with, which we may not. In this case we can use the generic NULL column to find a match:
We could also accomplish this with an ORDER BY query, which orders the results by column index. Once the query fails we know we have gone one index too far:
sqlite> SELECT * from Users WHERE username='admin' ORDER BY 1; admin|admin_password|Admin data sqlite> SELECT * from Users WHERE username='admin' ORDER BY 2; admin|admin_password|Admin data sqlite> SELECT * from Users WHERE username='admin' ORDER BY 3; admin|admin_password|Admin data sqlite> SELECT * from Users WHERE username='admin' ORDER BY 4; Parse error: 1st ORDER BY term out of range - should be between 1 and 3
If we know the table and maybe a column, we can select data from that table and move it into the query results, as shown above! If the webpage displays all results in the page HTML then we successfully leaked database information. If it only chooses a certain number of rows, then we must filter the results with filtering clauses such as LIMIT, OFFSET, and/or ORDER BY to get the leaked data into the parsed result.
Recitation
Recitation 11.0
Different databases have different comment operators. What are some common symbols to try when performing a SQL injection attack against an unknown database schema?
/* */ multiline comment
--single-line comment (until the end of the line)
Recitation 11.1
Practice making requests with Python's requests library. pip3 install requests and create a short script that makes a GET and POST request to one of the challenges on the challenge server.
Hint: you'll need to set the CHALBROKER_USER_ID cookie to your NetID to access the challenge routes.
CHALBROKER_USER_ID= hw3172
steps
burp suite
open chromium on burp
navigate to our challenge page
try to send something (log-in) and get a response back
go back to burp and inspect http history
Look at the POST request packet
Copy this, tell ChatGPT to write up a Python script using requests to write
Recitation 11.2
Practice making a request to one of the SQL challenges on the challenge server using requests. Output the response code to check if the request succeeded or failed.
Next Steps
This has been a crash course in SQL injection in SQLite, and these strategies work on a variety of relational databases. But each language has its own quirks and some have special techniques to account for differences in query options and design. We will also look at exploitation of a new database concept, NoSQL databases, in our next lesson.
Last updated