Wednesday 8 August 2018

Creating A “What’s Hot” Algorithm With PHP And MySQL

This is a small guide on how to implement a Reddit-style “what’s hot” algorithm with PHP and MySQL. There are two primary benefits to using a “what’s hot” algorithm:
  1. It gives newer posts a fighting chance. The problem with many “most popular” pages is that they give older posts and items a huge advantage over their younger counterparts. This leads to a situation where superior content is losing out to mediocre content that was created months beforehand.
  2. It keeps things “fresh” and prevents content stagnation. Most users will quickly lose interest in a website that only displays a “most popular” list. This is because the content does not change enough to warrant any kind of long-term attention.
For this example, we are going to assume that we are running a website that provides users with the functionality to “thumbs up” or “thumbs down” videos that have been submitted.

Our table design.

Note that I am going to try to keep our database design as simple as possible, lest we allow other topics to creep into the article. Here is the structure of an example table. Note that the three columns that are most important to us are “thumbs_up”“thumbs_down” and “created.” Without these columns, we cannot implement a “what’s hot” page:

Our SQL query.

Personally, I think that it is better to implement the “what’s hot” algorithm in SQL, and not in PHP. i.e. Let the database handle the heavy work. This is because:
  1. It allows us to avail of any indexes that we created.
  2. We can cache the results of the query in memory if we need to.
The query looks like this (based on the Reddit algorithm):
As you can see, the algorithm is implemented in the ORDER BY clause and we’ve limited the results to 100. Here is some quick test data, if you need it:
After adding a few videos to your table and running the “what’s hot” query above, you’ll see that a row’s total score does not guarantee it the top spot. Instead, our algorithm takes both the total score and the time that it was created into account.

Tying it together with PHP.

Because the SQL query does all of the hard work, tying it into PHP is as simple as running any kind of SELECT statement.
Here’s an example with PDO:
If, for whatever reason, you are forced to use the mysql extension:

Extra Pointers.

  • Your vote buttons should send off an Ajax request instead of redirecting the user or reloading the page. Websites like Youtube and Reddit have conditioned users to believe that their vote is in “real time.” Sending Ajax requests is extremely easy with JavaScript libraries such as JQuery.
  • If performance is a concern, be sure to read up on indexes. You might also want to look into the possibility of using an object caching daemon such as Memcached.
  • To prevent the user from voting multiple times, you should store votes against the User’s ID. There are other methods of preventing vote rigging, but they fall outside the scope of this article.

0 comments:

Post a Comment