There are a number of good practices that you should follow when developing web applications in PHP. Most of these are extremely easy to pick up and some of them will even apply to web application development in general.
1. Redirect after a successful POST request.
This is not PHP-specific. To avoid a situation where the user refreshes their browser and submits the same form data twice, you should always use the Post/Redirect/Get (PRG) pattern. A basic example of this:
```php
<?php
//Process form data here.

//If the form submission was successful.
if($success){
    //Redirect the user.
    header('Location: page.php?msg=success');
    exit;
}
```
2. Don’t use the mysql_* functions.
As of PHP 5.5, the mysql_* functions have been officially deprecated, and the extension was removed entirely in PHP 7.0. If that doesn’t persuade you to find an alternative, then you should also consider the fact that it lacks support for a number of MySQL features. Most notably:
- Prepared statements.
- Transactions.
- Stored procedures.
- Asynchronous queries.
- Multiple statements.
The fact is, this ageing extension was built for MySQL version 3.23. Since then, very little has actually been added in the way of features. To put all of this into perspective, the current MySQL version is at 5.6, with 3.23 having been released back in 1999!
OK, so what do I use instead?
Two good alternatives are PDO and MySQLi. Personally, I prefer using PDO as it provides a data-access abstraction layer, which basically means that you can use the same functions to access other databases as well (PostgreSQL, SQLite, etc).
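To give a rough idea of what that looks like, here is a minimal sketch of a PDO query. It assumes that the pdo_sqlite driver is installed and it uses an in-memory SQLite database purely so that it runs without a database server. The users table and its columns are made up for the example.

```php
<?php
// Assumption: the pdo_sqlite driver is installed. An in-memory SQLite database
// is used only so that this sketch runs without a database server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table, created here purely for illustration.
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, name TEXT)');
$db->exec("INSERT INTO users (username, name) VALUES ('jdoe', 'John Doe')");

// The same prepare / execute / fetch calls are used regardless of the driver.
$stmt = $db->prepare('SELECT name FROM users WHERE username = :username');
$stmt->execute([':username' => 'jdoe']);
$name = $stmt->fetchColumn();

echo $name, PHP_EOL;
```

To point the same code at MySQL or PostgreSQL, only the DSN (and credentials) passed to the PDO constructor should need to change.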
3. Do not close your PHP tags.
A lot of developers will (often religiously) place a closing PHP tag at the end of their files like so:
```php
<?php
class MyClass{
    public function test(){
        //do something, etc.
    }
}
?>
```
The problem with this is that it can introduce whitespace or newline characters if the developer is not careful. This can cause headaches later on when headers are interrupted or whitespace characters inexplicably appear in the output (this actually happened to a co-worker of mine last week).
Ok, so what do I do?
This is perfectly acceptable:
```php
<?php
class MyClass{
    public function test(){
        //do something, etc.
    }
}
```
To be honest, the only time you should really close your PHP tags is when you are templating with PHP and HTML:
```php
<h1><?php echo $title; ?></h1>
<p><?php echo $description; ?></p>
```
4. Guard against XSS!
XSS (aka Cross-site scripting) is a vulnerability that allows attackers to execute client-side code on your website. For example: If I enter some JavaScript into a comment form and you display that comment without sanitizing it, the code in question will execute whenever a user loads the page. To defend against this type of vulnerability, you should sanitize user-submitted data before it is printed out onto the page. To achieve this, you can use the function htmlspecialchars:
```php
<?php
echo htmlspecialchars($userComment, ENT_QUOTES, 'utf-8');
```
This function will convert special characters into their relevant HTML entities so that they are safe for display.
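To see what that conversion looks like in practice, here is a small sketch. The helper name e() is my own invention (a few frameworks use something similar), so treat it as an assumption rather than a built-in function.

```php
<?php
// Hypothetical helper: escapes a value so that it is safe to print inside HTML.
function e($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// A comment that contains a script tag, as an attacker might submit it.
$userComment = '<script>alert("XSS");</script>';
$safeComment = e($userComment);

// Prints: &lt;script&gt;alert(&quot;XSS&quot;);&lt;/script&gt;
echo $safeComment, PHP_EOL;
```

The browser will display the escaped string as plain text instead of executing it.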
5. Don’t echo out HTML!
Do not echo out HTML! This is a horrible practice! It looks messy, your IDE will fail to highlight the relevant HTML elements and designers who aren’t confident with PHP will find it difficult to edit or add new features. Instead, you should do something similar to the example that was shown in point three.
6. Separate your logic from your output!
There is nothing more daunting than trying to work on a sprawling code base where the logic is intertwined with the output. To separate your logic from your presentation, you can use an MVC framework such as Laravel or a templating engine such as Twig. There are a lot of different options out there, so be sure to shop around. At the very least, you should always try and enforce this principle while you are building your web apps.
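If you are not ready to adopt a framework or templating engine, the principle can still be sketched in plain PHP. In the example below, the function names getArticle and renderArticle are made up and the article data is hard-coded; in a real application the first function would talk to your database and the second would live in its own template file.

```php
<?php
// Logic: fetches or builds the data. No HTML in here.
function getArticle()
{
    return [
        'title'       => 'Hello & welcome',
        'description' => 'A short description of the article.',
    ];
}

// Presentation: turns the data into HTML. No business logic in here.
function renderArticle(array $article)
{
    ob_start();
?>
<h1><?php echo htmlspecialchars($article['title'], ENT_QUOTES, 'UTF-8'); ?></h1>
<p><?php echo htmlspecialchars($article['description'], ENT_QUOTES, 'UTF-8'); ?></p>
<?php
    return ob_get_clean();
}

$html = renderArticle(getArticle());
echo $html;
```

Because the two halves only share an array, either one can be changed or tested without touching the other.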
7. Learn what DRY is.
DRY stands for Don’t Repeat Yourself. i.e. You should avoid a situation where you are repeating the exact same code. This can be done via the usage of includes, functions and classes. For example, if I have a piece of code that calculates a person’s age, I can create a function like so:
```php
<?php
function calculateAge($dateOfBirth){
    $birthday = new DateTime($dateOfBirth);
    $interval = $birthday->diff(new DateTime);
    return $interval->y;
}
```
Now, whenever I want to calculate a user’s age, I can just call the function above; as opposed to repeating the same logic. This is advantageous because:
- My code base is smaller as a result.
- If I need to tweak the logic, I can edit the calculateAge function, as opposed to trawling through my code and editing several redundant instances.
8. Never trust your users!
As highlighted in point four, users can be malicious. This means that you will have to build your web applications with the assumption that your visitors will actively attempt to exploit any vulnerabilities that they can find in your code. This frame of mind is needed, especially when you are developing sites that are open to the general public. In some cases, these vulnerabilities might be discovered by vulnerability scanners or automated web crawlers.
9. Do not run queries inside loops!
Running a query inside a large loop can be extremely expensive. In a lot of cases, the loop will grow in size, leading to more and more queries. This can lead to slow-loading web pages and a database that is under far more strain than it should be. Most of the time, these “looped queries” are used because the developer in question hasn’t learned about the importance of SQL joins.
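As a rough illustration, the sketch below replaces a “one query per comment” loop with a single JOIN. It assumes that pdo_sqlite is available (an in-memory database keeps it self-contained), and the users and comments tables are hypothetical.

```php
<?php
// Assumption: pdo_sqlite is installed. The tables below are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)');
$db->exec("INSERT INTO users (id, name) VALUES (1, 'Alice'), (2, 'Bob')");
$db->exec("INSERT INTO comments (user_id, body) VALUES (1, 'First!'), (2, 'Nice post'), (1, 'Thanks')");

// Instead of selecting the comments and then running an extra
// "SELECT name FROM users WHERE id = ?" query for every comment in a loop,
// a single JOIN returns each comment together with its author's name.
$sql = 'SELECT c.body, u.name
        FROM comments c
        INNER JOIN users u ON u.id = c.user_id
        ORDER BY c.id';
$rows = $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    echo $row['name'], ': ', $row['body'], PHP_EOL;
}
```

With three comments, the looped version would have run four queries; the JOIN runs one, and that number stays at one no matter how many comments there are.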
10. Hash user passwords!
User passwords should be hashed, not stored in plain text or base64 encoded (seriously, I’ve actually encountered this before). A hashing function is a one-way street. Once the plain text password has been passed through it, there is no way of getting it back (hence the reason we use the term “hashing”, not “encryption”). If you are using PHP >= 5.5, you should use the function password_hash. If you’re stuck on an earlier version, then you can make use of the password_compat library on GitHub.
NOTE! Functions such as md5 and sha1 are not fit for this purpose! A good password hashing function should be slow to the point where trying to crack a hash by brute force is impractical, and each hash should be salted so that precomputed rainbow tables are useless!
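A minimal sketch of how password_hash and its companion password_verify fit together (the example password is obviously made up):

```php
<?php
// Hash the password when the user registers. PASSWORD_DEFAULT lets PHP pick
// the algorithm; a random salt is generated and stored inside the hash.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// At login, compare the submitted password against the stored hash.
$validLogin   = password_verify('correct horse battery staple', $hash);
$invalidLogin = password_verify('wrong password', $hash);

var_dump($validLogin, $invalidLogin);
```

Only the value of $hash needs to be stored in your users table; there is no separate salt column to manage.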
11. Use prepared statements!
If you look back at point two, you’ll see that I recommended using PDO over the mysql_* functions. One of the major advantages of using PDO is that it allows you to avail of prepared statements. In the past, PHP developers were forced to use functions such as mysql_real_escape_string like so:
```php
<?php
$username = mysql_real_escape_string($username, $connection);
$result = mysql_query("SELECT name FROM users WHERE username = '$username'");
```
This function escapes special characters that might interfere with your SQL. i.e. It helps to protect against the scourge of SQL injection. However, what it won’t do is protect you from attacks that do not involve special characters such as \x00, \n, \r, \, ', " and \x1a. There’s also an issue where not setting the correct charset can render it useless against certain attacks.
Fortunately, SQL injection is no match for prepared statements. With prepared statements, the SQL statement is sent to the server before the data is, thus keeping them independent of one another. This means that the database knows what statement it needs to execute well before any potentially dangerous characters are sent through. An example of selecting rows with the PDO object:
```php
<?php
//We prepare the SQL statement. At this stage it is sent off to the database server.
$stmt = $db->prepare("SELECT name FROM users WHERE username = :username");

//We bind our parameters / data.
$stmt->bindParam(':username', $username);

//We execute the statement.
$stmt->execute();
```
Important note: With the PDO extension, you will need to manually enable the use of “native” prepared statements. For the purpose of portability, this extension uses emulated prepared statements by default. To switch off emulated prepared statements, you can use the following code:
```php
<?php
$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
```
12. “or die()” needs to die…
If you’ve decided to completely ignore all of the points about switching to PDO, then I think it is fair to say that you’re probably handling your failed queries like so:
```php
<?php
mysql_query($sql, $conn) or die(mysql_error($conn));
```
The problem with the above approach is that you cannot catch the error or log it. You also can’t control whether or not it is outputted to the screen. In the eyes of the die function, a development server and a live server are exactly the same thing! It can’t be controlled via ini settings or a site-wide configuration file.
A better approach to this would be to use exceptions, simply because they can be caught or handled:
```php
<?php
$res = mysql_query($sql, $conn);
if($res === false){
    throw new Exception(mysql_error($conn));
}
```
The exception above can be caught using a try/catch block or handled with a custom exception handler. This gives you far more control over how errors are dealt with. Of course, if you are using the PDO extension, you could have your SQL errors throw exceptions by setting the following attribute:
```php
<?php
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
```
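To show why this is more flexible than die(), here is a small sketch of a failed query being caught, logged and hidden from the visitor. It assumes that pdo_sqlite is available, and the query fails on purpose because the table does not exist.

```php
<?php
// Assumption: pdo_sqlite is installed. The failing query is deliberate.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$caught = false;
try {
    // This table does not exist, so PDO throws a PDOException.
    $pdo->query('SELECT * FROM missing_table');
} catch (PDOException $e) {
    $caught = true;
    // Log the details for the developer...
    error_log('Query failed: ' . $e->getMessage());
    // ...and show the visitor a generic message instead.
    echo 'Sorry, something went wrong.', PHP_EOL;
}
```

Where the message ends up is now decided by your error_log configuration, not by the line of code that ran the query.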
13. Email validation.
Avoid using regular expressions to validate email addresses. The problem with most of the regex examples floating around on the Internet is that they fail to conform to RFC 822 grammar. i.e. They will actually reject valid email addresses. The best approach is to use the filter_var function like so:
```php
<?php
if(!filter_var($emailAddress, FILTER_VALIDATE_EMAIL)) {
    //Email is invalid. Show the user an error message.
}
```
Note that many websites validate email addresses by sending out a token-based email to the address in question.
14. Avoid short tags.
You should avoid using short tags unless you are sure that you will be deploying your code to a server that you can configure yourself. On some shared hosts, you’ll find that short tags are disabled by default and that you don’t have the ability to change any of PHP’s configuration settings. Therefore, in order to ensure some level of portability, you should stick to using <?php, as opposed to <?. It might save you a lot of headaches in the future.
15. Avoid micro-optimizations.
During my time on some of the various help forums, I’ve come across several instances of beginner PHP developers stressing out over insignificant micro-optimizations. For example, you’ll often come across questions such as “Does copying variables slow down my application?” The answer is: It probably won’t matter. If you’ve reached the point where copying variables becomes a performance issue, then you’ve got much bigger problems at hand! Instead, you should follow best practices and focus on writing clean, readable code. Most of these things are so small and insignificant that they will never become an issue.
16. Learn about database normalization.
Database Normalization is something that all developers should know about, regardless of what language they are using. A bad database design can lead to bad code, simply because the programmer in question may be forced to compensate for the lack of proper structure / design. If you are a web developer, then it is highly probable that you will be forced to create a relational database at some point or another. When that time comes, you’ll be forced to become acquainted with the process of normalizing entities and building relationships.
17. Be consistent.
Consistency is one of the most important attributes that a programmer can have. Having to rummage through a code base that was written by an inconsistent coder is similar to running through a mine field. Folder structures that are messed up. Classes that have functions that are out of place. Camel case and underscores used interchangeably throughout the project. It can be a complete mess.
18. Version control.
It doesn’t matter if your application is a one-page website or a sprawling complex monster; using version control software such as Git is a must in this day and age. The advantages?
- You can easily revert to older versions of your application.
- It makes it much easier to maintain multiple different versions, allowing you to experiment with certain features.
- It helps prevent situations where you overwrite somebody else’s work.
- It gives other people the ability to submit changes to your code base.
- It can help you figure out what changes led to the introduction of a new bug.
19. Bytecode caching.
By default, PHP will always execute your code as if it is new (it isn’t a compiled language). On each request, PHP will parse your code and turn it into bytecode so that it can be executed. This will happen, regardless of whether or not your code has been left untouched for months. Obviously, this can be a waste of server resources.
Fortunately, you can easily implement a bytecode cache if you have full control over the server that your application is hosted on. If you’re running PHP 5.5 or higher, you can avail of OPcache, which is bundled with PHP. If you’re unfortunate enough to be stuck on an older version, you can avail of another bytecode cache called APC. These work by storing the resulting bytecode in memory so that it can be re-used.
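If you are not sure whether a bytecode cache is actually active on your server, a quick check along the following lines can help. The function name isOpcacheEnabled is my own; opcache_get_status is the function that the OPcache extension provides. Note that OPcache is often disabled for the command line, so run this through your web server.

```php
<?php
// Hypothetical helper: reports whether OPcache is loaded and enabled
// for the current request.
function isOpcacheEnabled()
{
    if (!function_exists('opcache_get_status')) {
        return false; // The extension is not loaded at all.
    }
    // Passing false skips the per-script details, which are not needed here.
    $status = @opcache_get_status(false);
    return is_array($status) && !empty($status['opcache_enabled']);
}

echo isOpcacheEnabled() ? 'OPcache is enabled' : 'OPcache is not enabled', PHP_EOL;
```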
20. Learn about common design patterns.
Design patterns are reusable solutions to common problems. Knowing about the ins and outs of proven design patterns can help you speed up the development process. It can also make it easier for other developers to read and understand your code. Fortunately for us, there are plenty of PHP-specific examples on Github. There, you’ll see PHP implementations of patterns such as the Factory Pattern and Dependency Injection.
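To give you a taste of the two patterns mentioned above, here is a compact sketch. Every class and interface name in it is hypothetical; it is one possible way of wiring these patterns together, not a canonical implementation.

```php
<?php
// All class and interface names below are made up for this example.
interface Logger
{
    public function log($message);
}

class MemoryLogger implements Logger
{
    public $messages = [];

    public function log($message)
    {
        $this->messages[] = $message;
    }
}

class EchoLogger implements Logger
{
    public function log($message)
    {
        echo $message, PHP_EOL;
    }
}

// Factory pattern: one place decides which concrete class gets built.
class LoggerFactory
{
    public static function create($type)
    {
        switch ($type) {
            case 'memory':
                return new MemoryLogger();
            case 'echo':
                return new EchoLogger();
        }
        throw new InvalidArgumentException("Unknown logger type: $type");
    }
}

// Dependency injection: the service is handed its logger via the constructor
// instead of creating one itself.
class SignupService
{
    private $logger;

    public function __construct(Logger $logger)
    {
        $this->logger = $logger;
    }

    public function register($username)
    {
        $this->logger->log("Registered $username");
    }
}

$logger  = LoggerFactory::create('memory');
$service = new SignupService($logger);
$service->register('jdoe');
```

Because SignupService only depends on the Logger interface, you can swap in the EchoLogger (or a test double) without editing the service.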
21. When in doubt, use UTF-8!
Not sure what charset to use on your webpage? Use UTF-8:
```html
<meta charset="utf-8">
```
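Unsure about what character set to use for your database tables? Use UTF-8! In MySQL, that means the utf8mb4 character set with a collation such as utf8mb4_unicode_ci, as MySQL’s older utf8 character set only stores characters of up to three bytes. Connecting to your database? Well, uh, you get the point:
```php
<?php
$db = new PDO('mysql:host=localhost;dbname=database;charset=utf8mb4', 'root', '');
```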
Basically, you will need to use UTF-8 throughout your entire application! And remember: A chain is only as strong as its weakest link! Fail to use UTF-8 somewhere in your application and you might end up with garbled strings!
22. Know about the advantages of using an MVC Framework.
You might not need to use an MVC framework, but you should learn about the advantages of using one.
- Enforces the separation of concerns design principle.
- Can allow for better re-use of code.
- A lot of MVC Frameworks come pre-packaged with many helpful utility classes and libraries that help you quickly address common problems such as user authentication and sending emails. This can help you speed up development.
- As touched on in point 20: Developers who understand MVC frameworks will find your code easier to understand (as opposed to a custom design that you conjured up yourself).
There are a lot of great PHP MVC frameworks out there; Laravel, which was mentioned in point six, is one example.
23. Get a grasp on some of the fundamentals of web application security.
XSS and SQL injection are not the only vulnerabilities that you need to be aware of. The security risks posed by CSRF and Session Fixation should also be at the forefront of your mind. If you’re new to developing in PHP, you should probably take a look at the official site’s page on Security. There, you’ll find a lot of helpful information about the different types of security vulnerabilities that you will need to take into account while you are developing your applications. Take the issue relating to null bytes as an example, where $_GET['file'] is "../../etc/passwd\0":
```php
<?php
$file = $_GET['file'];
if (file_exists('/home/wwwrun/'.$file.'.php')) {
    // file_exists will return true as the file /home/wwwrun/../../etc/passwd exists
    include '/home/wwwrun/'.$file.'.php';
    // the file /etc/passwd will be included
}
```
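Here, you can see that an attacker could potentially force your web application to include the /etc/passwd file: the underlying C functions treat the null byte as the end of the string, so the appended “.php” is ignored. Since PHP 5.3.4, the file system functions reject paths that contain null bytes, but the “../” directory traversal in the example remains a risk whenever user input is used to build a path.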
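One common way of closing this kind of hole is to never build a path from user input at all and to look the requested page up in a whitelist instead. The sketch below is only an illustration: the function name resolvePage and the two page entries are made up.

```php
<?php
// Hypothetical whitelist lookup: maps a requested page name to a known file.
// Anything that is not on the list is rejected.
function resolvePage($requested)
{
    $allowed = [
        'home'    => '/home/wwwrun/home.php',
        'contact' => '/home/wwwrun/contact.php',
    ];

    if (is_string($requested) && isset($allowed[$requested])) {
        return $allowed[$requested];
    }
    return null;
}

$file = isset($_GET['file']) ? $_GET['file'] : 'home';
$path = resolvePage($file);

if ($path === null) {
    echo 'Page not found.', PHP_EOL;
} elseif (is_file($path)) {
    include $path;
}
```

With this approach, a value such as "../../etc/passwd" never reaches include, because it simply isn’t a key in the list.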
For further reading on web application security, be sure to check out OWASP.
24. Know what database column types to use.
In the past, I’ve come across multiple instances of developers using incorrect data types. Sometimes, it is a bit disheartening to see a great project being weighed down by bad database-related decisions. Some will store their price data in a VARCHAR column. Others won’t know the difference between SMALLINT, INT and BIGINT. Example: Did you know that the 3 in TINYINT (3) has absolutely nothing to do with the storage size of the column? Did you know that storing dates in a VARCHAR column is completely stupid, and that you won’t be able to avail of the date functions as a result? Did you know that a BIGINT can be wasteful, simply because it is unlikely that you will ever need to store a number that is as big as 18446744073709551615? Will your text column require you to use TEXT or MEDIUMTEXT? All of these are questions that can be answered by having a quick read of the manual. Basically, RTFM!
25. Don’t parse HTML with regular expressions.
Why use regular expressions to parse HTML when you can use a DOM parser such as PHP’s DOM extension (DOMDocument)? Parsing HTML with regex can be tricky at best and it can ultimately lead to code that is bulky and unmaintainable. What if a newly-added element attribute breaks your code? Just use an HTML / XML parser like a sane person.
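For example, here is a short sketch that pulls the link targets out of a snippet of HTML with DOMDocument. It assumes that the DOM extension is available (it is enabled by default in most PHP builds), and the HTML string is made up.

```php
<?php
// Assumption: the DOM extension is available.
$html = '<p>Visit <a href="/about" class="nav">About</a> or <a href="/contact">Contact</a>.</p>';

// Collect parser warnings internally instead of printing them, as
// real-world markup is rarely perfect.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Collect the href attribute of every anchor element.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

Note that the extra class attribute on the first link makes no difference here, whereas it could easily trip up a hand-written regular expression.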
26. var_dump, don’t echo.
I’ve often come across developers using echo to “debug” their variables, despite the fact that echo will leave out a lot of significant information. var_dump, on the other hand, will tell you what type of variable you are dealing with. It will also help you locate whitespace and newline issues. Run the following example:
```php
$str = 'Hello '; //Example whitespace issue.
echo $str;
var_dump($str);
```
Which one is more useful?
27. Testing your application.
Unit Testing is one of the most popular methods of testing. Basically, it involves breaking your code base down into smaller pieces (typically, functions and object methods) so that you can test them in an isolated fashion. As it stands, PHPUnit is the most popular PHP testing framework. Be sure to have a run through their start-up guide. Doing a few examples and getting your hands dirty will help you understand it better.
For some further reading, be sure to have a look at the Wiki article on Test-driven development.
28. Storing uploaded images.
Uploaded images should be stored on the file system and then referenced via the database. i.e. Upload the file to your web server and then store the file path of the image in a table column. You should NOT be storing images in your database unless it is completely necessary! Why?
- In many cases, you’ll find that database storage is more expensive / limited than file system storage. This is the case with many hosting solutions.
- It may add an (unnecessary) strain to your database.
- Accessing an image in a database may be noticeably slower than accessing an image on the file system.
- The database becomes much larger. i.e. Backups take longer and the complexity of maintaining the database typically increases.
- High-traffic websites such as Facebook prefer file system storage.
- You will not be able to take advantage of any cloud storage solutions.
- No extra coding / processing is needed to access images on a web server.
- If you store your images in a database, you may lose out on OS-based optimizations such as sendfile.
29. Re-size your images on upload.
Using PHP to re-size images on-the-fly can be extremely resource intensive. In most cases, you will use more of your CPU and RAM on re-sizing an image than you would on serving a typical PHP web page. To make matters worse, the impact of re-sizing images on-the-fly will worsen with each thumbnail that is being displayed. A more-robust solution is to re-size images as soon as they have been uploaded. i.e. Re-size the image and create one or two differently-sized copies. Disk space is cheap. CPU power and RAM? Not so much.
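A rough sketch of what resizing at upload time could look like is shown below. Both function names are my own, and createThumbnail assumes that the GD extension is installed and that the uploaded file is a JPEG; a real application would need to handle other formats and failed reads.

```php
<?php
// Hypothetical helper: scales the dimensions down to a maximum width
// while preserving the aspect ratio.
function thumbnailSize($width, $height, $maxWidth)
{
    if ($width <= $maxWidth) {
        return [$width, $height]; // Already small enough.
    }
    $ratio = $maxWidth / $width;
    return [$maxWidth, (int) round($height * $ratio)];
}

// Assumption: the GD extension is installed and $sourcePath is a JPEG.
function createThumbnail($sourcePath, $targetPath, $maxWidth)
{
    list($width, $height) = getimagesize($sourcePath);
    list($newWidth, $newHeight) = thumbnailSize($width, $height, $maxWidth);

    $source = imagecreatefromjpeg($sourcePath);
    $thumb  = imagecreatetruecolor($newWidth, $newHeight);
    imagecopyresampled($thumb, $source, 0, 0, 0, 0, $newWidth, $newHeight, $width, $height);

    // Write the thumbnail to disk with a JPEG quality of 85.
    return imagejpeg($thumb, $targetPath, 85);
}
```

You would call createThumbnail once, right after the upload has been moved into place, and then serve the saved thumbnail as a static file from then on.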
30. Documentation.
Pick a commenting “style” and then stick with it (see point 17 about being consistent). For example: phpDocumentor is a tool that allows you to automatically generate documentation for your code, provided you stick to their style / syntax:
```php
/**
 * Description of the function goes here.
 *
 * @param int $num Small description about this parameter.
 * @return boolean Small description about the return value.
 */
public function isOne($num){
    if($num === 1){
        return true;
    }
    return false;
}
```
31. Understand the difference between == and ===
As a PHP developer, you should definitely take the time to read the official documentation page on comparison operators. Knowing the difference between == and === is vital. Consider the following:
```php
<?php
$a = 1;
$b = "01";
if($a == $b){
    echo 'True!';
}
```
The above IF statement will evaluate to true, despite the fact that $b is a string and $a is an integer. This is because of type juggling. Basically, $a and $b are considered to be equal because PHP will convert the numeric string in $b into an integer before making the comparison (in this case, we consider it to be a “loose” comparison). ===, on the other hand, will only evaluate to true if both variables are equal AND they are of the same type. Run the following piece of code and you’ll find that the output will be “False!” This is because $a and $b are not of the same type:
```php
<?php
$a = 1; //integer
$b = "01"; //string

//This will equate to FALSE because $a is an int and $b is a string.
if($a === $b){
    echo 'True!';
} else{
    echo 'False!';
}
```
It is fair to say that not knowing the difference between == and === will inevitably lead to buggy code. Take the following example / pit-fall:
```php
<?php
$a = false;
$b = "";
if($a == $b){
    echo 'Both $a and $b are considered to be false!';
}
```
A while back, I came across an issue where one of our existing cron scripts was throwing false errors. Basically, the script in question was making an HTTP request to a particular URL. After the request was completed, the custom error handler would report that it had failed, despite the fact that we all knew that it was being completed successfully. After digging into the code, I discovered that the person who had written the error handler had used a loose comparison like so:
```php
<?php
$res = file_get_contents($url);
if($res == false){
    //request failed
}
```
The problem with the code above is that the request will be considered a failure if the output of the $url is blank (in this case, it was). A message on the documentation page for file_get_contents warns us about this:
This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE. Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.
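The difference is easy to reproduce. In the sketch below, an empty temporary file stands in for the URL that returned a blank response (that substitution is my own, purely to keep the example self-contained):

```php
<?php
// An empty temporary file stands in for a URL that returns a blank response.
$path = tempnam(sys_get_temp_dir(), 'res');
$res  = file_get_contents($path);
unlink($path);

// Loose comparison: an empty string is considered equal to false,
// so a successful (but blank) read looks like a failure.
$looseSaysFailed = ($res == false);

// Strict comparison: only a real boolean FALSE counts as a failure.
$strictSaysFailed = ($res === false);

var_dump($looseSaysFailed, $strictSaysFailed);
```

Swapping == for === in the error handler was all that was needed to stop the false alarms.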
32. Object Caching.
In many cases, it can be beneficial to cache the results of expensive database queries by storing them in memory (retrieving data from memory is a lot faster than retrieving data from disk). Think about the following scenario:
- You own a social networking website.
- On your home page, you display a list of newly-registered users.
- Your home page is accessed 500 times per minute.
- As a result, the database query that selects these users is executed 30,000 times per hour.
It is fair to say that all of this is a bit wasteful, simply because:
- It is not an important feature. If you think about it: Its only real purpose is to let other users know that the website is active.
- Visitors aren’t going to care if the list in question is five or ten minutes old.
To cache the result of this query in memory, we could use an object caching system such as Memcached or Redis. Take the following example, which uses PHP’s Memcached Extension:
```php
//Attempt to get the newly-registered list from Memcached.
$userList = $memcached->get('newly_registered');

//If $userList is FALSE, it means that our list doesn't exist in Memcached.
if($userList === false){
    //Select the last 10 users from our database.
    $userList = $this->User->getNewlyRegistered(10);
    //Store the result in Memcached for 5 minutes so that any subsequent
    //visitors in the next 5 minutes will be able to access it via
    //the cache.
    $memcached->set('newly_registered', $userList, time() + 300);
}

var_dump($userList);
```
The code above will attempt to retrieve the user list from Memcached. If the list doesn’t exist, it will run the query and then store the result in Memcached for 5 minutes (once the 5 minutes are up, Memcached will evict the data from the cache and Memcached::get will return FALSE again). This means that our database query will only be executed if the key has expired and the data has been automatically “evicted.” i.e. We now run 12 queries per hour instead of 30,000.
33. HTML does NOT provide validation!
This ties back into my point about not trusting your users! On too many occasions, I’ve come across developers who seem to think that the value of a SELECT element doesn’t need to be validated by the server.
```html
<select name="gender">
    <option value="1">Male</option>
    <option value="2">Female</option>
</select>
```
“The value can only be 1 or 2, r-right?!” Wrong!
A lot of beginner developers seem to be under the impression that the value of a SELECT element can be trusted because it “restricts” the user and forces them to select from a predefined list of values.
Unfortunately, this isn’t the case, as an attacker can easily edit the values of a SELECT element by opening up Firebug or Chrome Developer Tools. This goes for hidden form fields as well! What if I were to open up Firebug and change “1” to “Hello”? What if I were to replace “2” with “3”? What would happen if I modified the value of your hidden input field? How will your application react?
Conclusion: HTML can be edited by anyone. Form values can be tampered with and fields can be deleted with ease.
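On the server, the fix is usually a simple whitelist check. Here is a minimal sketch for the gender field shown earlier; the function name isValidGender is made up.

```php
<?php
// Hypothetical validator for the "gender" SELECT element.
function isValidGender($value)
{
    // Form values arrive as strings. The strict flag (third argument) stops
    // in_array() from type-juggling unexpected input into a match.
    return in_array($value, ['1', '2'], true);
}

$submitted = isset($_POST['gender']) ? $_POST['gender'] : null;

if (!isValidGender($submitted)) {
    // Reject the form instead of trusting the value.
    echo 'Please select a valid option.', PHP_EOL;
}
```

The same idea applies to hidden fields, radio buttons and checkboxes: compare what was submitted against the list of values that you actually offered.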
34. JavaScript validation is not a substitute for server-side validation!
The number of web forms that rely solely on client-side validation is absolutely frightening! What happens if the user decides to disable JavaScript? What happens if an attacker decides to open up Chrome Developer Tools and modify your code? To put it bluntly: JavaScript should be treated in the same way as HTML. Both are sent to the browser. Both can be modified by the end-user. Neither of them can be trusted.
35. Learn about error reporting in PHP.
There are a number of different types of errors in PHP. Some of the most common ones are:
- E_ERROR: This is a fatal runtime error. Your application cannot recover from an E_ERROR. Therefore, the script is halted. Example cause: Attempting to call a function that doesn’t exist.
- E_PARSE: This occurs whenever PHP fails to parse / compile your code. Your script will not run as a result. Example cause: Failing to close your brackets properly.
- E_WARNING: This is a runtime warning that does not prevent the rest of your application from running. Example: Trying to access a file or URL that doesn’t exist.
- E_NOTICE: An E_NOTICE occurs whenever PHP encounters something that may indicate an error. Example: Trying to access an array index that doesn’t exist.
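- E_STRICT: Occurs whenever PHP suggests changes that would improve the interoperability and forward compatibility of your code. Example cause: Calling a non-static method statically. (Since PHP 5.3, using functions or language features that have been deprecated raises a separate E_DEPRECATED notice.)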
Development Environment
In a development environment, you should display ALL errors. Hiding warnings and notices in a development environment is bad practice, simply because you should be fixing the root causes; not attempting to cover them up! Sweeping “dirt” like this under the rug may lead to the appearance of annoying bugs that are difficult to identify! To display all of the possible PHP errors, you can insert the following directives into your php.ini file:
```ini
display_errors = On
display_startup_errors = On
error_reporting = -1
log_errors = On
```
If you do not have access to your php.ini file, you can place the following piece of code at the top of your script:
```php
error_reporting(-1);
ini_set("display_errors", 1);
```
Production / Live Environment
In a production / live environment, errors should be logged, but not displayed to the end user. Displaying PHP errors in a live environment is not recommended. This is because:
- They are not user friendly.
- Warnings and notices can break the display / layout of your website.
- They can provide an attacker with critical information about the internal workings of your application.
To hide PHP errors, you can insert the following directives into your php.ini file:
```ini
display_errors = Off
display_startup_errors = Off
error_reporting = E_ALL
log_errors = On
```
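If you do not have access to your php.ini file, you can place the following code at the top of your script. Note that calling error_reporting(0) would switch off error reporting altogether, which means that nothing would be logged either; turning off display_errors hides the errors from visitors while still allowing them to be logged:
```php
ini_set('display_errors', 0);
ini_set('log_errors', 1);
```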