preg_replace does not work correctly with UTF-8 characters?

I am using this function to replace bad words in phrases. It works well with English letters, but not with UTF-8 characters.

I found that the \b word boundary does not work properly with UTF-8 characters. Is there an alternative way to do this?

I had to add '\b' because I need to replace the exact word only. For example, I don't want popo_one to be touched; I only want the standalone word popo replaced with p***o. I hope that is clear.

public function wordfilter($phrase) {
    $filter = array('/popo\b/i', '/blabla\b/i');
    $replace = array('p***o', 'b***a');
    $newphrase = preg_replace($filter, $replace, $phrase);
    return $newphrase;
}

Any ideas appreciated.

Answers


\b (a word boundary) is the position between a character from the \w character class and a character outside it, or between a \w character and the limit of the string (beginning or end).

By default \w contains only [a-zA-Z0-9_], but if you use the u modifier, the \w character class contains all Unicode letters and digits (and is equivalent to [\p{L}\p{N}_]). So with this modifier the meaning of \b changes too.
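To make the effect on \b concrete, here is a minimal sketch. It assumes the file is saved as UTF-8 and that no locale has been set with setlocale() (a non-default locale can change what \w matches without the u modifier). The word "café" and the variable names are only illustrative.

```php
<?php
// Assumption: this source file is UTF-8, so "é" is the two bytes 0xC3 0xA9.
$subject = 'café au lait';

// Without u: the last byte of "é" is not in \w and neither is the space,
// so there is no word boundary after "café" and the pattern fails.
$withoutU = preg_match('/café\b/', $subject);

// With u: "é" counts as a letter (\w), the space does not,
// so \b matches between them.
$withU = preg_match('/café\b/u', $subject);

var_dump($withoutU, $withU);
```

This is the situation from the question: a filtered word that ends in a non-ASCII letter is never replaced unless the u modifier is present.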

The u modifier has a second effect. With it, the pattern and the subject string are no longer treated as ASCII strings, but as UTF-8 strings.
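A small sketch of this second effect, counting how many times the dot matches in a one-character string. The variable names are illustrative; the character is written with byte escapes so the example does not depend on the file encoding.

```php
<?php
// "é" encoded as UTF-8: two bytes, one code point.
$subject = "\xC3\xA9";

// Without u the dot consumes one byte per match.
$byteMatches = preg_match_all('/./', $subject);

// With u the dot consumes one whole UTF-8 character per match.
$charMatches = preg_match_all('/./u', $subject);

var_dump($byteMatches, $charMatches);
```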

The u modifier is a combination of two directives: (*UCP), which changes the meaning of the shorthand character classes (\w, \d, \s...), and (*UTF8), which makes the pattern and subject strings be read as UTF-8 strings. These directives can be placed directly in the pattern, at the very beginning, instead of using the u modifier.
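Applied to the function from the question, a possible fix could look like the sketch below. It is written as a standalone function rather than a class method, and it makes two assumptions beyond the original code: a leading \b is added so that only whole words match, and the unfiltered phrase is returned if preg_replace() fails (it returns null, for example, when the subject is not valid UTF-8). The name wordfilter_directives is hypothetical and only shows the directive form.

```php
<?php
// Variant using the u modifier.
function wordfilter($phrase)
{
    // i: case-insensitive, u: UTF-8 mode with Unicode-aware \w and \b.
    $filter  = array('/\bpopo\b/iu', '/\bblabla\b/iu');
    $replace = array('p***o', 'b***a');
    $newphrase = preg_replace($filter, $replace, $phrase);

    // Assumption: fall back to the input when preg_replace() reports an error.
    return $newphrase === null ? $phrase : $newphrase;
}

// Same idea with the directives placed at the start of each pattern.
function wordfilter_directives($phrase)
{
    $filter  = array('/(*UTF8)(*UCP)\bpopo\b/i', '/(*UTF8)(*UCP)\bblabla\b/i');
    $replace = array('p***o', 'b***a');
    $newphrase = preg_replace($filter, $replace, $phrase);

    return $newphrase === null ? $phrase : $newphrase;
}

echo wordfilter('POPO and popo_one and popoé'), "\n";
```

With these patterns, popo_one is left alone because "_" belongs to \w, and a word such as popoé is left alone because "é" now counts as a letter.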

