Define UTF8_CORE as required
Unicode-aware replacement for strlen(). Returns the number of characters in the string (not the number of bytes) by replacing each multibyte character with a single-byte equivalent. utf8_decode() converts characters that are not in ISO-8859-1 to '?', which is fine for the purpose of counting, and it is much faster than iconv_strlen().
Note: this function does not count bad UTF-8 bytes in the string.
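The counting trick described above can be sketched as follows. This is a minimal illustration, not necessarily the library's exact code: the name sketch_utf8_strlen is hypothetical, the input is assumed to be valid UTF-8, and utf8_decode() is deprecated as of PHP 8.2, so newer runtimes will emit a deprecation notice.

```php
<?php
/**
 * Hypothetical sketch: count characters by collapsing every UTF-8
 * sequence to a single byte and then measuring the byte length.
 * Assumes $str is valid UTF-8.
 */
function sketch_utf8_strlen(string $str): int
{
    // utf8_decode() maps each character to one ISO-8859-1 byte;
    // characters outside ISO-8859-1 become '?', which is still one byte.
    return strlen(utf8_decode($str));
}

// Usage: strlen() reports bytes, the sketch reports characters.
$word = "Iñtërnâtiônàlizætiøn";
printf("%d bytes, %d characters\n", strlen($word), sketch_utf8_strlen($word));
```

The result only reflects a character count; the decoded string itself is discarded, so the '?' substitutions never reach the caller.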
UTF-8 aware alternative to strpos
Find the position of the first occurrence of a string.
Note: this gets a lot slower if offset is used.
Note: requires utf8_strlen and utf8_substr to be loaded.
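One way to see why an offset costs extra is the sketch below. It is an assumption about the general approach, not the library's confirmed implementation: the names sketch_char_count and sketch_utf8_strpos are hypothetical, the input is assumed to be valid UTF-8, and a non-negative offset is assumed. Without an offset, a byte-level search plus one character count of the prefix is enough; with an offset, the string must first be cut at a character boundary.

```php
<?php
/** Hypothetical helper: number of UTF-8 characters in a valid string. */
function sketch_char_count(string $str): int
{
    return (int) preg_match_all('/./su', $str);
}

/**
 * Hypothetical sketch of a character-based strpos().
 * Returns the character position of $needle, or false if not found.
 */
function sketch_utf8_strpos(string $str, string $needle, ?int $offset = null)
{
    if ($offset === null) {
        // Byte search is safe on valid UTF-8: a valid needle cannot
        // match starting in the middle of another character.
        $bytePos = strpos($str, $needle);
        if ($bytePos === false) {
            return false;
        }
        // Convert the byte position into a character position.
        return sketch_char_count(substr($str, 0, $bytePos));
    }

    // With an offset, drop the first $offset characters (extra work),
    // search the remainder, then add the offset back.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $rest  = implode('', array_slice($chars, $offset));
    $pos   = sketch_utf8_strpos($rest, $needle);

    return $pos === false ? false : $pos + $offset;
}

// Usage: the second call skips the first two characters before searching.
var_dump(sketch_utf8_strpos("abcäbc", "b"));
var_dump(sketch_utf8_strpos("abcäbc", "b", 2));
```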
UTF-8 aware alternative to strrpos
Find the position of the last occurrence of a character in a string.
Note: this gets a lot slower if offset is used.
Note: requires utf8_substr and utf8_strlen to be loaded.
UTF-8 aware alternative to strtolower
Make a string lowercase.
Note: the concept of a character's "case" only exists in some alphabets, such as Latin, Greek, Cyrillic, Armenian and archaic Georgian. It does not exist in Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.
Note: requires utf8_to_unicode and utf8_from_unicode.
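The decode, map, re-encode idea behind a code-point-based case conversion can be sketched as below. Everything here is illustrative: sketch_utf8_strtolower is a hypothetical name, the mapping table covers only a handful of code points, and mb_ord()/mb_chr() (mbstring, PHP 7.2+) stand in for the library's utf8_to_unicode and utf8_from_unicode helpers, whose signatures are not shown in this page.

```php
<?php
/**
 * Hypothetical sketch: lowercase a valid UTF-8 string by mapping
 * code points through a (deliberately tiny) lookup table.
 */
function sketch_utf8_strtolower(string $str): string
{
    // Illustrative table only: upper-case code point => lower-case code point.
    static $map = [
        0x00C4 => 0x00E4, // LATIN CAPITAL LETTER A WITH DIAERESIS
        0x0391 => 0x03B1, // GREEK CAPITAL LETTER ALPHA
        0x0414 => 0x0434, // CYRILLIC CAPITAL LETTER DE
    ];

    $out = '';
    foreach (preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $cp = mb_ord($char, 'UTF-8');
        if ($cp >= 0x41 && $cp <= 0x5A) {
            $cp += 0x20;              // ASCII A-Z to a-z
        } elseif (isset($map[$cp])) {
            $cp = $map[$cp];          // table lookup for non-ASCII letters
        }
        // Code points with no case (e.g. CJK ideographs) pass through unchanged.
        $out .= mb_chr($cp, 'UTF-8');
    }
    return $out;
}

// Usage: the ideograph U+65E5 has no case and is returned as-is.
echo sketch_utf8_strtolower("\u{00C4}\u{0391}\u{0414}\u{65E5}A"), "\n";
```

A table-driven design keeps the conversion independent of the current locale, at the cost of having to maintain the table.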
UTF-8 aware alternative to strtoupper
Make a string uppercase.
Note: the concept of a character's "case" only exists in some alphabets, such as Latin, Greek, Cyrillic, Armenian and archaic Georgian. It does not exist in Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.
Note: requires utf8_to_unicode and utf8_from_unicode.
UTF-8 aware alternative to substr
Return part of a string given a character offset (and optionally a length).
Note on arguments: compared to substr, if offset or length are not integers, this version will not complain but will instead massage them into integers.
Note on returned values: the substr documentation states that false can be returned in some cases (e.g. offset > string length). mb_substr never returns false; it returns an empty string instead. This function adopts the mb_substr approach.
Note on implementation: PCRE only supports repetition counts of less than 65536. In order to accept values up to MAXINT for offset and length, a group of 65535 characters is repeated when needed.
Note on implementation: calculating the number of characters in the string is a relatively expensive operation, so it is only carried out when necessary. It isn't necessary for positive offsets with no specified length.
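The two implementation notes can be illustrated together with a sketch of the simplest case: a non-negative offset and no length. The names sketch_exactly_n_chars and sketch_utf8_tail are hypothetical, and the code is an illustration under those assumptions rather than the library's confirmed source. Counts above 65535 are expressed as a repeated group of 65535 characters plus a remainder, and no character count of the whole string is ever computed.

```php
<?php
/**
 * Hypothetical helper: build a PCRE fragment that matches exactly $n
 * characters, keeping every individual repetition count below 65536.
 */
function sketch_exactly_n_chars(int $n): string
{
    $groups = intdiv($n, 65535); // number of full 65535-character groups
    $rest   = $n % 65535;        // leftover characters

    $pattern = '';
    if ($groups > 0) {
        $pattern .= '(?:.{65535}){' . $groups . '}';
    }
    if ($rest > 0) {
        $pattern .= '.{' . $rest . '}';
    }
    return $pattern;
}

/**
 * Hypothetical sketch: everything from character $offset onward.
 * Assumes valid UTF-8 and $offset >= 0. Returns '' (never false)
 * when the offset lies beyond the end of the string.
 */
function sketch_utf8_tail(string $str, int $offset): string
{
    // /s lets '.' match newlines, /u switches PCRE to UTF-8 mode.
    $pattern = '/^' . sketch_exactly_n_chars($offset) . '(.*)/su';
    if (preg_match($pattern, $str, $m) === 1) {
        return $m[1];
    }
    return '';
}

// Usage: an offset larger than 65535 needs the repeated group.
$long = str_repeat("\u{00E9}", 70000) . 'xyz';
echo sketch_exactly_n_chars(70000), "\n";
echo sketch_utf8_tail($long, 70000), "\n";
```

Negative offsets or an explicit length are where the character count becomes necessary, since positions must then be resolved relative to the end of the string.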
Documentation generated on Thu, 08 Jan 2009 17:40:14 +0100 by phpDocumentor 1.4.0a2