Monday 8 October 2018

What's the best method for sanitizing user input with PHP?

It's a common misconception that user input can be filtered. PHP even has a (now deprecated) "feature", called magic-quotes, that builds on this idea. It's nonsense. Forget about filtering (Or cleaning, or whatever people call it).
What you should do, to avoid problems, is quite simple: whenever you embed a string within foreign code, you must escape it, according to the rules of that language. For example, if you embed a string in some SQL targeting MySql, you must escape the string with MySql's function for this purpose (mysqli_real_escape_string). (Or, in case of databases, using prepared statements are a better approach, when possible)
Another example is HTML: If you embed strings within HTML markup, you must escape it with htmlspecialchars. This means that every single echo or print statement should use htmlspecialchars.
A third example could be shell commands: If you are going to embed strings (Such as arguments) to external commands, and call them with exec, then you must use escapeshellcmd and escapeshellarg.
And so on and so forth ...
The only case where you need to actively filter data, is if you're accepting preformatted input. Eg. if you let your users post HTML markup, that you plan to display on the site. However, you should be wise to avoid this at all cost, since no matter how well you filter it, it will always be a potential security hole.



No. You can't generically filter data without any context of what it's for. Sometimes you'd want to take a SQL query as input and sometimes you'd want to take HTML as input.
You need to filter input on a whitelist -- ensure that the data matches some specification of what you expect. Then you need to escape it before you use it, depending on the context in which you are using it.
The process of escaping data for SQL - to prevent SQL injection - is very different from the process of escaping data for (X)HTML, to prevent XSS.



No, there is not.
First of all, SQL injection is an input filtering problem, and XSS is an output escaping one - so you wouldn't even execute these two operations at the same time in the code lifecycle.
Basic rules of thumb
  • For SQL query, bind parameters (as with PDO) or use a driver-native escaping function for query variables (such as mysql_real_escape_string())
  • Use strip_tags() to filter out unwanted HTML
  • Escape all other output with htmlspecialchars() and be mindful of the 2nd and 3rd parameters here.






What you are describing here is two separate issues:
  1. Sanitizing / filtering of user input data.
  2. Escaping output.
1) User input should always be assumed to be bad.
Using prepared statements, or/and filtering with mysql_real_escape_string is definitely a must. PHP also has filter_input built in which is a good place to start.
2) This is a large topic, and it depends on the context of the data being output. For HTML there are solutions such as htmlpurifier out there. as a rule of thumb, always escape anything you output.
Both issues are far too big to go into in a single post, but there are lots of posts which go into more detail:



There's no catchall function, because there are multiple concerns to be addressed.
  1. SQL Injection - Today, generally, every PHP project should be using prepared statements via PHP Data Objects (PDO) as a best practice, preventing an error from a stray quote as well as a full-featured solution against injection. It's also the most flexible & secure way to access your database.
    Check out (The only proper) PDO tutorial for pretty much everything you need to know about PDO. (Sincere thanks to top SO contributor, @YourCommonSense, for this great resource on the subject.)
  2. XSS - Sanitize data on the way in...
    • HTML Purifier has been around a long time and is still actively updated. You can use it to sanitize malicious input, while still allowing a generous & configurable whitelist of tags. Works great with many WYSIWYG editors, but it might be heavy for some use cases.
    • In other instances, where we don't want to accept HTML/Javascript at all, I've found this simple function useful (and has passed multiple audits against XSS):
      /* Prevent XSS input */ function sanitizeXSS () { $_GET = filter_input_array(INPUT_GET, FILTER_SANITIZE_STRING); $_POST = filter_input_array(INPUT_POST, FILTER_SANITIZE_STRING); $_REQUEST = (array)$_POST + (array)$_GET + (array)$_REQUEST; }
  3. XSS - Sanitize data on the way out... unless you guarantee the data was properly sanitized before you add it to your database, you'll need to sanitize it before displaying it to your user, we can leverage these useful PHP functions:
    • When you call echo or print to display user-supplied values, use htmlspecialchars unless the data was properly sanitized safe and is allowed to display HTML.
    • json_encode is a safe way to provide user-supplied values from PHP to Javascript
  4. Do you call external shell commands using exec() or system() functions, or to the backtick operator? If so, in addition to SQL Injection & XSS you might have an additional concern to address, users running malicious commands on your server. You need to use escapeshellcmd if you'd like to escape the entire command OR escapeshellarg to escape individual arguments.



Just wanted to add that on the subject of output escaping, if you use php DOMDocument to make your html output it will automatically escape in the right context. An attribute (value="") and the inner text of a <span> are not equal. To be safe against XSS read this: OWASP XSS Prevention Cheat Sheet



There is the filter extension (howto-linkmanual), which works pretty well with all GPC variables. It's not a magic-do-it-all thing though, you will still have to use it.



The best BASIC method for sanitizing user input with PHP:

    function sanitizeString($var)
    {
        $var = stripslashes($var);
        $var = strip_tags($var);
        $var = htmlentities($var);
        return $var;
    }

    function sanitizeMySQL($connection, $var)
    {
        $var = $connection->real_escape_string($var);
        $var = sanitizeString($var);
        return $var;
    }

0 comments:

Post a Comment