Friday, 10 August 2018

PHP Paragraph Regular Expression

I quite often find the need to extract a section of text from the beginning of a blog post or similar to be used as the excerpt. I normally use a function that will count the number of whole words available and return the string containing those words.
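The word-counting function itself is not shown in this post. A minimal sketch of that approach might look like the following; the name word_excerpt, the default limit of 30 words and the trailing "..." are all assumptions made for illustration.

```php
<?php
// Hypothetical helper (not the function referred to above):
// returns the first $limit whole words of $text as plain text.
function word_excerpt(string $text, int $limit = 30): string
{
    // Drop any markup, then split on runs of whitespace.
    $words = preg_split('/\s+/', trim(strip_tags($text)), -1, PREG_SPLIT_NO_EMPTY);

    // Short texts are returned whole, with whitespace normalised.
    if (count($words) <= $limit) {
        return implode(' ', $words);
    }

    // Otherwise keep the first $limit words and mark the cut.
    return implode(' ', array_slice($words, 0, $limit)) . '...';
}

echo word_excerpt('The quick brown fox jumps over the lazy dog', 4), "\n";
// prints "The quick brown fox..."
```

Because the text is cut at a word boundary rather than a character count, the excerpt never ends in the middle of a word.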
A good alternative to this, although only applicable if the original post is in HTML, is to use a regular expression to extract the contents. The following code will take a string and extract just the first paragraph of text.
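  $intro = '';
  // Capture the contents of the first <p>...</p> pair (non-greedy).
  preg_match("/<p>(.*?)<\/p>/is", $string, $matches);
  if (isset($matches[1])) {
      // Remove any inline HTML and the surrounding whitespace.
      $intro = trim(strip_tags($matches[1]));
  }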
If the regular expression matches a pair of paragraph tags, the code strips any HTML from the captured text and trims the leading and trailing whitespace, so the final output is plain text. The ? after .* makes the match non-greedy, so it stops at the first closing paragraph tag rather than the last one in the string. The i modifier makes the matching case insensitive, so <P> is matched as well as <p>. The s modifier makes the "." match all characters, including newlines. Without the s modifier, a paragraph whose text contains a newline character would not match, so $intro would stay empty unless a later paragraph sits on a single line.
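The same idea can be wrapped in a reusable function. The sketch below is one possible variation, not the code from this post: the name first_paragraph is hypothetical, and the pattern is loosened with <p\b[^>]*> so that an opening tag carrying attributes (for example class="lead") is also accepted. The last two lines show the effect of the s modifier described above.

```php
<?php
// Hypothetical helper: returns the plain text of the first <p> element
// found in $html, or an empty string when there is none.
function first_paragraph(string $html): string
{
    // <p\b[^>]*> allows attributes on the opening tag;
    // the \b keeps the pattern from matching tags such as <pre>.
    if (preg_match('/<p\b[^>]*>(.*?)<\/p>/is', $html, $matches) === 1) {
        return trim(strip_tags($matches[1]));
    }
    return '';
}

$html = "<h1>Title</h1>\n"
      . "<P class=\"lead\">First <b>paragraph</b>\nover two lines.</P>\n"
      . "<p>Second.</p>";

// Prints the first paragraph as plain text, with its inner newline kept.
echo first_paragraph($html), "\n";

// Effect of the s modifier on a paragraph that contains a newline.
var_dump(preg_match('/<p>(.*?)<\/p>/i', "<p>a\nb</p>"));  // int(0)
var_dump(preg_match('/<p>(.*?)<\/p>/is', "<p>a\nb</p>")); // int(1)
```

A regular expression like this is fine for simple, well-formed markup such as a blog post body. For nested or malformed HTML a proper parser is usually the safer choice.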
