Tuesday, July 12, 2011

PHP truncate words (shorten text string)

We want to truncate (shorten) a string of text to a specific number of characters and add three dots (...) to the end. For example, we have the string "For he was looking forward to the city with foundations, whose architect and builder is God." and we want to show it as "For he was looking forward to the city with foundations..."

This requirement can be made more generic: we want to truncate the text to at most $max characters, without cutting a word in half, and add $symbol to the end. For our example, $max = 60 and $symbol = '...'.

Let's see how we can do it.

function truncate($text, $max, $symbol)
{
    // Step 1: take at most $max characters from the start of the text
    $sub = substr($text, 0, $max);
    // Step 2: find the position of the last space in that piece;
    //   this is where we cut, so that no word is split in half
    $last = strrpos($sub, ' ');
    // Step 3: keep everything from the beginning up to the cut position
    $sub = substr($sub, 0, $last);
    // Step 4: append $symbol to the shortened string and return it
    return $sub . $symbol;
}

We can try this function:
$text = "For he was looking forward to the city with foundations, whose architect and builder is God.";
echo truncate($text, 28, '...');
The result is: For he was looking forward...
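
To see what each step contributes, here is a small sketch that runs the same three calls outside the function and prints the intermediate values for the example above. Nothing new is assumed here beyond using var_dump for the output.

```php
<?php
$text = "For he was looking forward to the city with foundations, whose architect and builder is God.";
$max = 28;

// Step 1: keep at most $max characters; this may cut a word in half.
$sub = substr($text, 0, $max);
var_dump($sub);   // string(28) "For he was looking forward t"

// Step 2: offset of the last space inside that piece (offsets start at 0).
$last = strrpos($sub, ' ');
var_dump($last);  // int(26)

// Step 3: drop the partial word that follows the last space.
$sub = substr($sub, 0, $last);
var_dump($sub);   // string(26) "For he was looking forward"
```

The stray "t" from "to" disappears in step 3, which is why the result ends on a whole word.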

The truncate function serves well in most cases and is probably the most widely used approach, but it does have a minor flaw. If we try:
$text = "For he was looking forward to the city with foundations, whose architect and builder is God.";
echo truncate($text, 60, '...');
The result is: For he was looking forward to the city with foundations,...

See the comma just before '...'? If that is acceptable, we are done. But if we don't want punctuation or another non-word character left as the final character, we need to modify our truncate function a bit.

function truncate($text, $max, $symbol)
{
    // Step 1: take at most $max characters from the start of the text
    $sub = substr($text, 0, $max);
    // Step 2: find the position of the last space in that piece;
    //   this is where we cut, so that no word is split in half
    $last = strrpos($sub, ' ');
    // Step 3: keep everything from the beginning up to the cut position
    $sub = substr($sub, 0, $last);
    // Step 4: remove a trailing non-word character, if there is one,
    //   then append $symbol and return
    $sub = preg_replace("/([^\w])$/", "", $sub);
    return $sub . $symbol;
}

The difference is the line $sub = preg_replace("/([^\w])$/", "", $sub); It replaces a single non-word character ([^\w], that is, anything other than a letter, digit or underscore) at the end ($) of the string with an empty string (""). Now let's try again:
$text = "For he was looking forward to the city with foundations, whose architect and builder is God.";
echo truncate($text, 60, '...');
The result is: For he was looking forward to the city with foundations...
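
A few inputs still trip up the function above. A text that is already shorter than $max loses its last word and still gets the symbol. A text with no space in its first $max characters comes back as just the symbol. Only one trailing punctuation character is removed. And substr counts bytes, so it can split a multibyte character. The following is only a sketch of one way to handle these cases, not a definitive version: the name truncate_safe is made up for this post, and it assumes UTF-8 input and that the mbstring extension is available.

```php
<?php
// Hypothetical variant of truncate(); assumes UTF-8 text and mbstring.
function truncate_safe($text, $max, $symbol)
{
    // The text already fits: return it unchanged, without the symbol.
    if (mb_strlen($text, 'UTF-8') <= $max) {
        return $text;
    }
    // Count characters rather than bytes when taking the first $max.
    $sub = mb_substr($text, 0, $max, 'UTF-8');
    $last = mb_strrpos($sub, ' ', 0, 'UTF-8');
    // No space found (false): keep the hard cut instead of an empty string.
    if ($last !== false) {
        $sub = mb_substr($sub, 0, $last, 'UTF-8');
    }
    // Strip a whole run of trailing characters that are not letters,
    // digits or underscores; the u flag makes the pattern UTF-8 aware.
    $sub = preg_replace('/[^\p{L}\p{N}_]+$/u', '', $sub);
    return $sub . $symbol;
}

echo truncate_safe("Short text", 60, '...'), "\n";                     // Short text
echo truncate_safe("Supercalifragilistic", 5, '...'), "\n";            // Super...
echo truncate_safe("Really?! Yes, really.", 12, '...'), "\n";          // Really...
echo truncate_safe("Crème brûlée, s'il vous plaît", 15, '...'), "\n";  // Crème brûlée...
```

For the example text from this post with $max = 60, this sketch gives the same result as the function above.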
