PHP Cross Reference of phpBB 3.0 Beta 3 |
(no description)
Copyright: | (c) 2006 phpBB Group |
License: | http://opensource.org/licenses/gpl-license.php GNU Public License |
Version: | $Id: utf_tools.php,v 1.26 2006/11/12 14:29:32 naderman Exp $ |
File Size: | 1012 lines (31 kb) |
Included or required: | 0 times |
Referenced: | 0 times |
Includes or requires: | 0 files |
utf8_encode($str) X-Ref |
Implementation of PHP's native utf8_encode for people without XML support. This function exploits the fact that the 256 ISO-8859-1 byte values coincide with the first 256 Unicode code points. param: string $str ISO-8859-1 encoded data. return: string UTF-8 encoded data |
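The shared property can be illustrated with a minimal sketch. It assumes only that bytes below 0x80 are plain ASCII and that bytes 0x80 to 0xFF map to the two-byte UTF-8 range; the function name `sketch_latin1_to_utf8` is hypothetical and this is not necessarily how phpBB implements it.

```php
<?php
// Hypothetical helper for illustration; not phpBB's actual implementation.
function sketch_latin1_to_utf8($str)
{
    $out = '';
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++)
    {
        $byte = ord($str[$i]);
        if ($byte < 0x80)
        {
            // ASCII bytes are identical in ISO-8859-1 and UTF-8
            $out .= $str[$i];
        }
        else
        {
            // U+0080..U+00FF: lead byte 0xC2 or 0xC3, then one continuation byte
            $out .= chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
        }
    }
    return $out;
}
```

For example, the ISO-8859-1 byte 0xE9 (é) becomes the two bytes 0xC3 0xA9.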
utf8_decode($str) X-Ref |
Implementation of PHP's native utf8_decode for people without XML support. param: string $str UTF-8 encoded data. return: string ISO-8859-1 encoded data |
utf8_strrpos($str, $needle, $offset = null) X-Ref |
UTF-8 aware alternative to strrpos |
utf8_strrpos($str, $needle, $offset = null) X-Ref |
UTF-8 aware alternative to strrpos |
utf8_strpos($str, $needle, $offset = null) X-Ref |
UTF-8 aware alternative to strpos |
utf8_strtolower($str) X-Ref |
UTF-8 aware alternative to strtolower |
utf8_strtoupper($str) X-Ref |
UTF-8 aware alternative to strtoupper |
utf8_substr($str, $offset, $length = null) X-Ref |
UTF-8 aware alternative to substr |
utf8_strlen($text) X-Ref |
Return the length (in characters) of a UTF-8 string |
utf8_strrpos($str, $needle, $offset = null) X-Ref |
UTF-8 aware alternative to strrpos. Finds the position of the last occurrence of a char in a string. author: Harry Fuecks. param: string haystack; param: string needle; param: integer (optional) offset (from left). return: mixed integer position or FALSE on failure |
utf8_strpos($str, $needle, $offset = null) X-Ref |
UTF-8 aware alternative to strpos. Finds the position of the first occurrence of a string. author: Harry Fuecks. param: string haystack; param: string needle; param: integer offset in characters (from left). return: mixed integer position or FALSE on failure |
utf8_strtolower($string) X-Ref |
UTF-8 aware alternative to strtolower. Makes a string lowercase. Note: The concept of a character's "case" only exists in some alphabets such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in Chinese script, for example. See Unicode Standard Annex #21: Case Mappings. param: string. return: string string in lowercase |
utf8_strtoupper($string) X-Ref |
UTF-8 aware alternative to strtoupper. Makes a string uppercase. Note: The concept of a character's "case" only exists in some alphabets such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in Chinese script, for example. See Unicode Standard Annex #21: Case Mappings. param: string. return: string string in uppercase |
utf8_substr($str, $offset, $length = NULL) X-Ref |
UTF-8 aware alternative to substr. Returns part of a string given a character offset (and optionally a length). Note on arguments: compared to substr, if offset or length are not integers, this version will not complain but rather casts them to integers. Note on returned values: the substr documentation states that false can be returned in some cases (e.g. offset > string length); mb_substr never returns false and returns an empty string instead. This function adopts the mb_substr approach. Note on implementation: PCRE only supports repetitions of less than 65536, so in order to accept up to MAXINT values for offset and length, a group of 65535 characters is repeated when needed. Note on implementation: calculating the number of characters in the string is a relatively expensive operation, so it is only carried out when necessary. It is not necessary for positive offsets with no specified length. author: Chris Smith <chris@jalakai.co.uk>. param: string; param: integer offset in UTF-8 characters (from left); param: integer (optional) length in UTF-8 characters from offset. return: mixed string or FALSE if failure |
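The PCRE repetition limit mentioned in the implementation note can be worked around roughly as follows. This is a simplified sketch that handles only a non-negative offset and no length; the helper names `sketch_skip_pattern` and `sketch_utf8_tail` are hypothetical and the real function covers more cases.

```php
<?php
// Hypothetical helpers for illustration; not phpBB's actual code.

// Build a pattern matching exactly $n characters. Groups of 65535 are
// repeated so that no single quantifier reaches the PCRE limit of 65536.
// Assumption: the group count itself also stays below that limit.
function sketch_skip_pattern($n)
{
    $groups = (int) ($n / 65535);
    $rest = $n % 65535;
    $pattern = '';
    if ($groups > 0)
    {
        $pattern .= '(?:.{65535}){' . $groups . '}';
    }
    return $pattern . '.{' . $rest . '}';
}

// Return everything after the first $offset characters of a valid UTF-8 string.
function sketch_utf8_tail($str, $offset)
{
    $offset = max(0, (int) $offset);
    if (preg_match('/^' . sketch_skip_pattern($offset) . '(.*)/us', $str, $m))
    {
        return $m[1];
    }
    // Like mb_substr, fall back to an empty string rather than false
    return '';
}
```

An offset of 70000, for instance, is expressed as one group of 65535 characters followed by 4465 more.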
utf8_strlen($text) X-Ref |
Return the length (in characters) of a UTF-8 string param: string $text UTF-8 string return: integer Length (in chars) of given string |
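One common way to count characters without mbstring is to count every byte that is not a UTF-8 continuation byte. The sketch below assumes valid UTF-8 input; `sketch_utf8_strlen` is a hypothetical name and may differ from the approach used in this file.

```php
<?php
// Hypothetical helper for illustration; assumes $text is valid UTF-8.
function sketch_utf8_strlen($text)
{
    // Continuation bytes have the form 10xxxxxx (0x80..0xBF).
    // Removing them leaves exactly one byte per character.
    return strlen(preg_replace('/[\x80-\xBF]/', '', $text));
}
```

With this approach "h\xC3\xA9llo" (six bytes) counts as five characters.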
utf8_str_split($str, $split_len = 1) X-Ref |
UTF-8 aware alternative to str_split. Converts a string to an array. author: Harry Fuecks. param: string UTF-8 encoded; param: int number of characters to split the string by. return: array of chunks of the string, each $split_len characters long |
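A character-aware split can be sketched with a single regular expression in UTF-8 mode. The name `sketch_utf8_str_split` is hypothetical, the input is assumed to be valid UTF-8, and the chunk length is assumed to stay below the PCRE repetition limit.

```php
<?php
// Hypothetical helper for illustration; assumes valid UTF-8 input
// and a chunk length below 65536.
function sketch_utf8_str_split($str, $split_len = 1)
{
    $split_len = max(1, (int) $split_len);
    // In /u mode the dot matches one whole character, not one byte
    preg_match_all('/.{1,' . $split_len . '}/us', $str, $matches);
    return $matches[0];
}
```

Note that, in this sketch, an empty input yields an empty array.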
utf8_strspn($str, $mask, $start = null, $length = null) X-Ref |
UTF-8 aware alternative to strspn. Finds the length of the initial segment matching the mask. author: Harry Fuecks. param: string. return: int |
utf8_ucfirst($str) X-Ref |
UTF-8 aware alternative to ucfirst. Makes a string's first character uppercase. author: Harry Fuecks. param: string. return: string with first character as upper case (if applicable) |
utf8_recode($string, $encoding) X-Ref |
Recode a string to UTF-8. If the encoding is not supported, the string is returned as-is. param: string $string Original string; param: string $encoding Original encoding (lowercased). return: string The string, encoded in UTF-8 |
utf8_encode_ncr($text) X-Ref |
Replace all UTF-8 chars that are not in ASCII with their NCR param: string $text UTF-8 string in NFC return: string ASCII string using NCRs for non-ASCII chars |
utf8_encode_ncr_callback($m) X-Ref |
Callback used in utf8_encode_ncr(). Takes a UTF-8 char and replaces it with its NCR. Attention, $m is an array. param: array $m 0-based numerically indexed array passed by preg_replace_callback(). return: string An HTML NCR if the character is valid, or the original string otherwise |
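The function and callback pair can be sketched as below. All three names (`sketch_ncr_ord`, `sketch_encode_ncr_callback`, `sketch_encode_ncr`) are hypothetical, the input is assumed to be valid UTF-8, and unlike the documented callback this sketch does not validate the character.

```php
<?php
// Hypothetical helpers for illustration; assume valid UTF-8 input.

// Decode one multi-byte UTF-8 character into its code point.
function sketch_ncr_ord($chr)
{
    $b0 = ord($chr[0]);
    if ($b0 < 0xE0)
    {
        return (($b0 & 0x1F) << 6) | (ord($chr[1]) & 0x3F);
    }
    if ($b0 < 0xF0)
    {
        return (($b0 & 0x0F) << 12) | ((ord($chr[1]) & 0x3F) << 6) | (ord($chr[2]) & 0x3F);
    }
    return (($b0 & 0x07) << 18) | ((ord($chr[1]) & 0x3F) << 12)
        | ((ord($chr[2]) & 0x3F) << 6) | (ord($chr[3]) & 0x3F);
}

// $m is the match array from preg_replace_callback(); $m[0] is the character.
function sketch_encode_ncr_callback($m)
{
    return '&#' . sketch_ncr_ord($m[0]) . ';';
}

function sketch_encode_ncr($text)
{
    // Match every non-ASCII character; /u fails on invalid UTF-8
    return preg_replace_callback('/[^\x00-\x7F]/u', 'sketch_encode_ncr_callback', $text);
}
```

ASCII text passes through unchanged, while "é" becomes `&#233;`.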
utf8_ord($chr) X-Ref |
Converts a UTF-8 char to its Unicode code point. param: string $chr UTF-8 char. return: integer Unicode code point |
utf8_chr($cp) X-Ref |
Converts a Unicode code point (the numeric value of an NCR) to a UTF-8 char. param: integer $cp Unicode code point. return: string UTF-8 char |
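The two conversions between a code point and its UTF-8 byte sequence follow directly from the UTF-8 bit layout. The sketch below uses the hypothetical names `sketch_utf8_chr` and `sketch_utf8_ord`, assumes well-formed input, and performs no range or validity checks.

```php
<?php
// Hypothetical helpers for illustration; no validation is performed.

// Code point -> UTF-8 bytes
function sketch_utf8_chr($cp)
{
    if ($cp < 0x80)
    {
        return chr($cp);
    }
    if ($cp < 0x800)
    {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000)
    {
        return chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
}

// UTF-8 bytes of one character -> code point
function sketch_utf8_ord($chr)
{
    $b0 = ord($chr[0]);
    if ($b0 < 0x80)
    {
        return $b0;
    }
    if ($b0 < 0xE0)
    {
        return (($b0 & 0x1F) << 6) | (ord($chr[1]) & 0x3F);
    }
    if ($b0 < 0xF0)
    {
        return (($b0 & 0x0F) << 12) | ((ord($chr[1]) & 0x3F) << 6) | (ord($chr[2]) & 0x3F);
    }
    return (($b0 & 0x07) << 18) | ((ord($chr[1]) & 0x3F) << 12)
        | ((ord($chr[2]) & 0x3F) << 6) | (ord($chr[3]) & 0x3F);
}
```

The two functions are inverses of each other for valid code points, so a round trip returns the original value.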
utf8_decode_ncr($text) X-Ref |
Convert Numeric Character References to UTF-8 chars. Notes: - NCRs are not converted recursively; if you pass &#38;#38; it will return &#38; - the existence of the Unicode characters is NOT checked, therefore an entity may be converted to a nonexistent code point. param: string $text String to convert, encoded in UTF-8 (no normal form required). return: string UTF-8 string where NCRs have been replaced with the actual chars |
utf8_decode_ncr_callback($m) X-Ref |
Callback used in utf8_decode_ncr(). Takes an NCR (in decimal or hexadecimal) and returns a UTF-8 char. Attention, $m is an array. It will ignore most invalid NCRs, but not all! param: array $m 0-based numerically indexed array passed by preg_replace_callback(). return: string UTF-8 char |
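The non-recursive behaviour follows naturally from a single pass of preg_replace_callback(), because replaced text is never rescanned. The sketch below uses hypothetical names (`sketch_ncr_chr`, `sketch_decode_ncr_callback`, `sketch_decode_ncr`) and, as an assumption, does not reject any invalid code points.

```php
<?php
// Hypothetical helpers for illustration; no code point validation.

// Code point -> UTF-8 bytes
function sketch_ncr_chr($cp)
{
    if ($cp < 0x80)
    {
        return chr($cp);
    }
    if ($cp < 0x800)
    {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000)
    {
        return chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
}

// $m[1] holds hexadecimal digits (or ''), $m[2] holds decimal digits.
function sketch_decode_ncr_callback($m)
{
    $cp = ($m[1] !== '') ? hexdec($m[1]) : (int) $m[2];
    return sketch_ncr_chr($cp);
}

function sketch_decode_ncr($text)
{
    // One pass only: the output of a replacement is not matched again
    return preg_replace_callback('/&#(?:x([0-9a-f]+)|([0-9]+));/i', 'sketch_decode_ncr_callback', $text);
}
```

Because of the single pass, a doubly encoded ampersand is decoded by one level only.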
utf8_case_fold($text, $option = 'full') X-Ref |
Case folds a UTF-8 string. param: string $text text to be case folded; param: string $option determines how the cases are folded (defaults to 'full'). return: string case folded text |
utf8_normalize_nfc($strings) X-Ref |
A wrapper function for the normalizer which takes care of including the class if required and modifies the passed strings to be in NFC (Normalization Form Composition). param: mixed $strings Either an array of references to strings, a reference to an array of strings or a reference to a single string |
utf8_clean_string($text) X-Ref |
This function is used to generate a "clean" version of a string. Clean means that it is in a case-insensitive form (case folding) and that it is normalized (NFC). Additionally, homographs of one character are transformed into one specific character (preferably ASCII if it is an ASCII character). Please be aware that if you change something within this function or within functions used here, you need to rebuild/update the username_clean column in the users table and all other columns that store a clean string; otherwise you will break this functionality. param: $text An unclean string, maybe user input (has to be valid UTF-8!). return: Cleaned up version of the input string |
Generated: Wed Nov 22 00:35:05 2006 | Cross-referenced by PHPXref 0.6 |