Unbuffered queries for large data sets: MYSQLI_USE_RESULT
So far we've been using the mysqli_query() function to do all our data searching in PHP, and it works well enough for the vast majority of cases. However, consider this: how does the mysqli_num_rows() function know how many rows mysqli_query() returned? The answer is simple: mysqli_query() runs the query, and fetches and buffers it all so that it has the complete result set available. The mysqli_num_rows() function then has access to all the rows, and so can return the true row count.
But, what do you do if you have a large number of rows and don't want to wait before MySQL has finished fetching them all before you start using them? In this scenario, passing a special parameter to mysqli_query() causes it to execute the query and return a resource pointing to the result of the query while MySQL is still working, which means you can start reading before the query has finished. This is called an unbuffered query, and is triggered using the special query mode MYSQLI_USE_RESULT.
Earlier I used the example of a golfscores table with 1,000,000 rows. It takes MySQL about 20 seconds to return all that data to PHP when using mysqli_query(), and if you consider that on top of that PHP might do another 20 seconds of work to format and print out that data, the total is 40 seconds of work, of which you see nothing for the first 20 seconds. Using mysqli_query() in its default mode therefore has several obvious disadvantages:
- PHP must wait while the entire query is executed and returned before it can start processing.
- In order to return the whole result to PHP at once, all the data must be held in RAM. Thus, if you have 100MB of data to return, the PHP variable to hold it all will be 100MB.
The disadvantages of mysqli_query() happen to be the advantages of these unbuffered queries:
- The PHP script can parse the results immediately, giving immediate feedback to users.
- Only one row at a time need be held in RAM.
One nice feature of unbuffered queries is that, internally to PHP, it is almost identical to mysqli_query() – you just provide an extra parameter. As a result, you can almost use them interchangeably inside your scripts. For example, this script works fine in either mode:
$db = mysqli_connect("localhost", "php", "alm65z", "phpdb");
$result = mysqli_query($db, "SELECT ID, Name FROM conferences;", MYSQLI_USE_RESULT);
while ($row = mysqli_fetch_assoc($result)) {
extract($row, EXTR_PREFIX_ALL, "conf");
print "$conf_Name\n";
}
Before you rush off to make all your queries unbuffered, be aware that there are drawbacks:
- You must read all rows from the return value. If you're thinking of using as a quick way to find something then stop processing the rows part way through, you're way off track - sorry!
- If you issue another query before you finish processing all the rows from the previous query, PHP will issue a warning. SELECTs within SELECTs are not possible with unbuffered queries.
- Functions such as mysqli_num_rows() only return the number of rows read so far. This will be 0 as soon as the query returns, but as you call mysqli_fetch_assoc() it will increment until it has the correct number of rows at the end.
- Between the time you start your unbuffered query and your processing the last row, the table remains locked by MySQL and cannot be written to by other queries. If you plan to do lengthy processing on each row, this is not good.
Choosing whether you want to run a buffered query or not takes a little thinking - if you're not sure what's best, you should almost certainly stay with the standard buffered queries.
0 comments:
Post a Comment