Utilities for processing "special" characters in UTF-8. "Special" largely means anything that would be regarded as a non-word character, such as ASCII control characters and punctuation. The list has a "Roman" bias: it is unaware of modern Chinese "punctuation" characters, for example.
Note: requires utils/unicode.php to be loaded.
Checks whether a string contains only word characters.
This is logically equivalent to the \w PCRE meta character. Note that this is not a 100% guarantee that the string contains only alphanumeric characters, just that common non-alphanumeric characters, including ASCII device control characters, are not in the string.
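The check described above can be sketched as a single negated character-class match. This is a minimal illustration, not the library's implementation: the function name `sketch_is_word_chars` and the abbreviated list of special characters are hypothetical, the control range 0x00 to 0x19 follows the range stated in this file's documentation, and PHP 7.0 or later is assumed for the `\u{...}` escapes and scalar type hints.

```php
<?php
// Hypothetical helper: returns true when $str contains none of the
// "special" characters. The list below is a short illustrative subset,
// not the full $UTF8_SPECIAL_CHARS array.
function sketch_is_word_chars(string $str): bool
{
    // ASCII punctuation and space, plus a few non-ASCII examples
    // (no-break space and the two guillemets).
    $specials = '!"#$%&\'()*+,-./:;<=>?@[\\]^`{|}~ ' . "\u{00A0}\u{00AB}\u{00BB}";

    // preg_quote() escapes characters that are significant to PCRE;
    // the /u modifier makes the pattern and subject be treated as UTF-8.
    $pattern = '/[\x00-\x19' . preg_quote($specials, '/') . ']/u';

    // preg_match() returns 0 when nothing matched and false on error
    // (for example invalid UTF-8), so only an exact 0 counts as "clean".
    return preg_match($pattern, $str) === 0;
}
```

Under these assumptions, `sketch_is_word_chars('naïve')` is true while `sketch_is_word_chars('Hello, World')` is false, because the comma and the space are both in the special list. Letters outside the list, such as accented characters, pass untouched, which is why the check is not a guarantee of strictly alphanumeric content.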
Used internally. Builds a PCRE pattern from the $UTF8_SPECIAL_CHARS array defined in this file.
$UTF8_SPECIAL_CHARS should contain all special characters (non-letter/non-digit) defined in the various local charsets; it is not a complete list of non-alphanumeric characters in UTF-8. It is not perfect but should match most cases of special characters. This function adds the control characters 0x00 to 0x19 to the array of special characters (they are not included in $UTF8_SPECIAL_CHARS).
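How such a pattern might be assembled can be sketched as follows. The sketch assumes the special-character array holds Unicode code points as integers (an assumption, not confirmed by this page), and it uses two hypothetical helpers, `sketch_codepoint_to_utf8` and `sketch_specials_pattern`, in place of anything provided by utils/unicode.php.

```php
<?php
// Hypothetical stand-in for a code-point-to-UTF-8 conversion routine.
function sketch_codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);                                  // 1 byte
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))                     // 2 bytes
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))                    // 3 bytes
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))                        // 4 bytes
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical builder: turns a list of code points into a PCRE
// character class that also covers the control range 0x00 to 0x19.
function sketch_specials_pattern(array $codePoints): string
{
    $chars = implode('', array_map('sketch_codepoint_to_utf8', $codePoints));
    return '/[\x00-\x19' . preg_quote($chars, '/') . ']/u';
}

// Illustrative subset only: ! , . ? and the two guillemets.
$pattern = sketch_specials_pattern([0x0021, 0x002C, 0x002E, 0x003F, 0x00AB, 0x00BB]);
```

Passing the joined characters through `preg_quote()` keeps characters such as `.` and `?` from being read as pattern syntax, and the `/u` modifier lets the multi-byte characters sit inside the character class as whole code points.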
Removes special (non-alphanumeric) characters from a UTF-8 string.
This can be useful as a helper for sanitizing a string for use as something like a file name or a unique identifier. Be warned, though, that it does not handle all possible non-alphanumeric characters and is not intended as some kind of security / injection filter.
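A stripping step of this kind can be sketched with `preg_replace()`. The function name `sketch_strip_specials`, its optional replacement parameter, and the abbreviated special-character list are all hypothetical and chosen for illustration; PHP 7.1 or later is assumed for the nullable return type.

```php
<?php
// Hypothetical helper: replaces every special character in $str with
// $repl (removed entirely by default). Returns null if PCRE reports an
// error, for example when $str is not valid UTF-8.
function sketch_strip_specials(string $str, string $repl = ''): ?string
{
    // Illustrative subset of special characters: ASCII punctuation,
    // space, and the two guillemets.
    $specials = '!"#$%&\'()*+,-./:;<=>?@[\\]^`{|}~ ' . "\u{00AB}\u{00BB}";

    // Same character-class construction as a word-character check,
    // including the control range 0x00 to 0x19.
    $pattern = '/[\x00-\x19' . preg_quote($specials, '/') . ']/u';

    return preg_replace($pattern, $repl, $str);
}

echo sketch_strip_specials('My Report (final).txt'), "\n";      // MyReportfinaltxt
echo sketch_strip_specials('My Report (final).txt', '_'), "\n"; // My_Report__final__txt
```

Each special character is replaced individually, so adjacent specials produce repeated replacement characters. As the warning above says, anything outside the list passes through unchanged, so the result still needs proper validation before it is used in a security-sensitive context.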
Documentation generated on Thu, 08 Jan 2009 17:48:33 +0100 by phpDocumentor 1.4.0a2