File: /ext/utf8/utils/specials.php

Description

Utilities for processing "special" characters in UTF-8. "Special" here largely means anything that would be regarded as a non-word character, such as ASCII control characters and punctuation. The list has a "Roman" bias: it is unaware of modern Chinese punctuation characters, for example.

Note: requires utils/unicode.php to be loaded

Functions
utf8_is_word_chars (line 106)

Checks whether a string contains only word characters. This is logically equivalent to the \w PCRE meta character. Note that this is not a 100% guarantee that the string contains only alphanumeric characters, just that common non-alphanumeric characters, including ASCII device control characters, are not in the string.

boolean utf8_is_word_chars (string $str)
  • string $str: to check
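
As an illustration of the check described above, here is a minimal self-contained sketch. It does not call the library; sketch_is_word_chars() is a hypothetical stand-in, and its short ASCII punctuation list is an assumption made for the example (the real $UTF8_SPECIAL_CHARS array is much longer and its exact contents are not reproduced here).

```php
<?php
// Hypothetical stand-in for utf8_is_word_chars(); not the library's code.
function sketch_is_word_chars(string $str): bool
{
    // Assumed, shortened list of "special" characters (ASCII punctuation only).
    $specials = ' !"#$%&\'()*+,-./:;<=>?@[\\]^`{|}~';

    // Control characters 0x00-0x19 plus the escaped list, matched in UTF-8 mode.
    $pattern = '/[\x00-\x19' . preg_quote($specials, '/') . ']/u';

    // preg_match() returns 0 when nothing matched, and false on invalid UTF-8,
    // so only a clean "no match" counts as "word characters only".
    return preg_match($pattern, $str) === 0;
}

var_dump(sketch_is_word_chars('Zürich2009'));   // bool(true)
var_dump(sketch_is_word_chars('hello world'));  // bool(false), contains a space
var_dump(sketch_is_word_chars("tab\there"));    // bool(false), contains 0x09
```

Treating a failed match on malformed UTF-8 as "not word characters" is a choice made for this sketch, not documented behaviour of the library.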
utf8_specials_pattern (line 30)

Used internally. Builds a PCRE pattern from the $UTF8_SPECIAL_CHARS array defined in this file. $UTF8_SPECIAL_CHARS should contain all special (non-letter, non-digit) characters defined in the various local charsets; it is not a complete list of non-alphanumeric characters in UTF-8. It is not perfect but should match most cases of special characters. This function also adds the control characters 0x00 to 0x19 to the pattern, since they are not included in $UTF8_SPECIAL_CHARS.

string utf8_specials_pattern ()
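
The following sketch shows one way such a pattern could be assembled from a list of code points. It is an assumption-laden illustration, not the library's implementation: the code point list is a tiny hypothetical sample, both function names are invented for the example, and the sketch takes the list as a parameter where the real function takes none.

```php
<?php
// Hypothetical, heavily shortened stand-in for $UTF8_SPECIAL_CHARS:
// space, "!", ".", section sign, horizontal ellipsis.
$sketchSpecialCodePoints = [0x0020, 0x0021, 0x002E, 0x00A7, 0x2026];

// Encode a single code point below 0x10000 as UTF-8 bytes.
function sketch_codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xE0 | ($cp >> 12))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Build a character-class pattern from the code points.
function sketch_specials_pattern(array $codePoints): string
{
    $chars = '';
    foreach ($codePoints as $cp) {
        $chars .= sketch_codepoint_to_utf8($cp);
    }
    // preg_quote() escapes PCRE metacharacters such as "." and "!";
    // the control characters 0x00-0x19 are prepended as a range.
    return '/[\x00-\x19' . preg_quote($chars, '/') . ']/u';
}

$pattern = sketch_specials_pattern($sketchSpecialCodePoints);
var_dump(preg_match($pattern, 'abc'));  // int(0)
var_dump(preg_match($pattern, 'a…b'));  // int(1)
```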
utf8_strip_specials (line 127)

Removes special (non-alphanumeric) characters from a UTF-8 string.

This can be useful as a helper for sanitizing a string for use as something like a file name or a unique identifier. Be warned, though, that it does not handle all possible non-alphanumeric characters and is not intended as any kind of security / injection filter.

string utf8_strip_specials (string $string, [string $repl = ''])
  • string $string: The UTF-8 string to strip of special characters
  • string $repl: (optional) String to replace each special character with
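
To make the stripping behaviour concrete, here is a minimal self-contained sketch using the same signature shape. sketch_strip_specials() is a hypothetical stand-in, and its ASCII-only punctuation list is an assumption for the example, so results from the real function may differ.

```php
<?php
// Hypothetical stand-in for utf8_strip_specials(); not the library's code.
function sketch_strip_specials(string $string, string $repl = ''): string
{
    // Assumed, shortened list of "special" characters (ASCII punctuation only).
    $specials = ' !"#$%&\'()*+,-./:;<=>?@[\\]^`{|}~';
    $pattern  = '/[\x00-\x19' . preg_quote($specials, '/') . ']/u';

    // preg_replace() returns null on error (for example invalid UTF-8);
    // this sketch returns an empty string in that case.
    $result = preg_replace($pattern, $repl, $string);
    return $result ?? '';
}

echo sketch_strip_specials('Report: Q1/2009 (final).txt'), "\n"; // ReportQ12009finaltxt
echo sketch_strip_specials('Report: Q1', '_'), "\n";             // Report__Q1
```

Each special character is replaced individually, so adjacent specials produce repeated replacement strings. As the warning above says, this kind of filtering is a convenience helper and should not be relied on for security.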

Documentation generated on Thu, 08 Jan 2009 17:48:33 +0100 by phpDocumentor 1.4.0a2