Defines string APIs.
Copyright: (C) 2001-3001 Eloy Lafuente (stronk7) {@link http://contiento.com}
License: http://www.gnu.org/copyleft/gpl.html GNU GPL v3 or later
File Size: 679 lines (25 kb)
Included or required: 0 times
Referenced: 0 times
Includes or requires: 0 files
core_text:: (24 methods):
is_charset_supported()
reset_caches()
parse_charset()
convert()
substr()
str_max_bytes()
strrchr()
strlen()
strtolower()
strtoupper()
strpos()
strrpos()
strrev()
specialtoascii()
encode_mimeheader()
get_entities_table()
entities_to_utf8()
utf8_to_entities()
trim_utf8_bom()
remove_unicode_non_characters()
get_encodings()
code2utf8()
utf8ord()
strtotitle()
is_charset_supported(string $charset)
Check whether the charset is supported by mbstring.
param: string $charset Normalised charset
return: bool
reset_caches()
Reset internal textlib caches.
parse_charset($charset)
Standardise the charset name. Note that this does not mean the returned charset is actually supported.
param: string $charset raw charset name
return: string normalised lowercase charset name
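As an illustration of what normalising a charset name can involve, here is a minimal sketch. The function name `sketch_parse_charset` and both alias rules are assumptions made for this example; they are not taken from the core_text source.

```php
<?php
// Hypothetical sketch of charset-name normalisation; not Moodle's implementation.
function sketch_parse_charset(string $charset): string {
    // Lowercase and trim so that " UTF-8 " and "utf-8" compare equal.
    $charset = strtolower(trim($charset));

    // Assumed alias rule: treat "utf8" as "utf-8".
    if ($charset === 'utf8') {
        return 'utf-8';
    }

    // Assumed alias rule: "cp1250", "win-1250", "windows1250" become "windows-1250".
    if (preg_match('/^(?:cp|win|windows)-?(12[0-9]{2})$/', $charset, $m)) {
        return 'windows-' . $m[1];
    }

    // Anything else is returned lowercased, whether or not it is supported.
    return $charset;
}
```

As the description says, a normalised name is only a cleaned-up label; a separate support check is still needed before converting.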
convert($text, $fromCS, $toCS='utf-8')
Converts the text between different encodings. It uses the iconv extension with the //TRANSLIT parameter. If both source and target are utf-8, it only tries to fix invalid characters.
param: string $text
param: string $fromCS source encoding
param: string $toCS result encoding
return: string|bool converted string or false on error
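The iconv call with //TRANSLIT described above can be sketched as follows. `sketch_convert` is a hypothetical standalone helper, and the retry without //TRANSLIT is an assumption added for iconv builds that reject the suffix; it is not part of the documented behaviour.

```php
<?php
// Hypothetical sketch of an iconv-based conversion; not Moodle's implementation.
function sketch_convert(string $text, string $from, string $to = 'utf-8') {
    // //TRANSLIT asks iconv to approximate characters missing from the target charset.
    $result = @iconv($from, $to . '//TRANSLIT', $text);
    if ($result === false) {
        // Assumed fallback: retry without the suffix on builds that do not accept it.
        $result = @iconv($from, $to, $text);
    }
    return $result; // string on success, false on error
}

// "café" in ISO-8859-1 uses the single byte 0xE9 for "é".
$utf8 = sketch_convert("caf\xE9", 'iso-8859-1');
```

In UTF-8 the same character takes two bytes (0xC3 0xA9), so the converted string is one byte longer than the input.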
substr($text, $start, $len=null, $charset='utf-8')
Multibyte-safe substr() function; uses mbstring or iconv.
param: string $text string to truncate
param: int $start negative value means from end
param: int $len maximum length in characters, counted from $start
param: string $charset encoding of the text
return: string portion of the string specified by $start and $len
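To show why a multibyte-safe substr() matters, the sketch below compares PHP's `mb_substr()` with the byte-based `substr()`. It calls mbstring directly rather than core_text, so it runs without Moodle; the sample string is arbitrary.

```php
<?php
$text = 'žluťoučký'; // 9 characters, 13 bytes in UTF-8

// Character-based: offsets and lengths count characters.
$head = mb_substr($text, 0, 4, 'UTF-8');     // "žluť"
$tail = mb_substr($text, -3, null, 'UTF-8'); // "čký" (negative start counts from the end)

// Byte-based: 5 bytes ends in the middle of "ť", leaving invalid UTF-8.
$broken = substr($text, 0, 5);
$isValid = mb_check_encoding($broken, 'UTF-8'); // false
```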
str_max_bytes($string, $bytes)
Truncates a string to no more than a certain number of bytes in a multibyte-safe manner. UTF-8 only!
param: string $string String to truncate
param: int $bytes Maximum length of bytes in the result
return: string Portion of string specified by $bytes
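One way to truncate by bytes without splitting a character is PHP's `mb_strcut()`. The sketch below assumes that approach; `sketch_str_max_bytes` is a hypothetical name, and whether core_text uses the same call is not stated here.

```php
<?php
// Hypothetical sketch: byte-limited truncation that never splits a UTF-8 character.
function sketch_str_max_bytes(string $string, int $bytes): string {
    // mb_strcut() works in bytes but moves the cut back to a character boundary.
    return mb_strcut($string, 0, $bytes, 'UTF-8');
}

$word = 'žlutý'; // 5 characters, 7 bytes ("ž" and "ý" take 2 bytes each)
$six = sketch_str_max_bytes($word, 6);   // "ý" does not fit in the last byte, so it is dropped
$seven = sketch_str_max_bytes($word, 7); // the whole string fits
```

This is useful when the limit comes from storage measured in bytes, such as a database column, rather than from a character count.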
strrchr($haystack, $needle, $part = false)
Finds the last occurrence of a character within a string. UTF-8 ONLY safe mb_strrchr().
param: string $haystack The string from which to get the last occurrence of needle.
param: string $needle The string to find in haystack.
param: boolean $part If true, returns the portion before needle; otherwise returns the portion after it (including needle).
return: string|false False when not found.
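The effect of the $part flag is easiest to see on a path-like string. This sketch calls PHP's `mb_strrchr()` directly, on the assumption that the method behaves like it, as the description suggests; the sample path is made up.

```php
<?php
$path = 'složka/podsložka/soubor.txt';

// Default ($part = false): from the last "/" to the end, needle included.
$after = mb_strrchr($path, '/', false, 'UTF-8');

// $part = true: everything before the last "/".
$before = mb_strrchr($path, '/', true, 'UTF-8');

// A needle that does not occur yields false.
$missing = mb_strrchr($path, '#', false, 'UTF-8');
```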
strlen($text, $charset='utf-8')
Multibyte-safe strlen() function; uses mbstring or iconv.
param: string $text input string
param: string $charset encoding of the text
return: int number of characters
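The difference between counting characters and counting bytes can be checked with a short sketch. It uses PHP's `mb_strlen()` and `strlen()` directly, as a stand-in for the method documented above.

```php
<?php
$text = 'žluťoučký';

$chars = mb_strlen($text, 'UTF-8'); // counts characters: 9
$bytes = strlen($text);             // counts bytes: 13, since five letters take 2 bytes each
```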
strtolower($text, $charset='utf-8')
Multibyte-safe strtolower() function; uses mbstring.
param: string $text input string
param: string $charset encoding of the text (may not work for all encodings)
return: string lower case text
strtoupper($text, $charset='utf-8')
Multibyte-safe strtoupper() function; uses mbstring.
param: string $text input string
param: string $charset encoding of the text (may not work for all encodings)
return: string upper case text
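Since both case-conversion methods are described as mbstring wrappers, a sketch with `mb_strtoupper()` and `mb_strtolower()` shows the expected round trip for accented letters. The sample text is arbitrary and the calls are plain PHP, not core_text.

```php
<?php
$text = 'žluťoučký kůň';

$upper = mb_strtoupper($text, 'UTF-8');  // accented letters are converted too
$lower = mb_strtolower($upper, 'UTF-8'); // and convert back to the original
```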
strpos($haystack, $needle, $offset=0)
Find the position of the first occurrence of a substring in a string. UTF-8 ONLY safe strpos(); uses mbstring.
param: string $haystack the string to search in
param: string $needle one or more characters to search for
param: int $offset offset from the beginning of the string
return: int the numeric position of the first occurrence of needle in haystack.
strrpos($haystack, $needle)
Find the position of the last occurrence of a substring in a string. UTF-8 ONLY safe strrpos(); uses mbstring.
param: string $haystack the string to search in
param: string $needle one or more characters to search for
return: int the numeric position of the last occurrence of needle in haystack
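The positions returned by the two search methods are character offsets, not byte offsets. The sketch below illustrates this with PHP's `mb_strpos()` and `mb_strrpos()`, assumed here to match the documented behaviour.

```php
<?php
$text = 'žluťoučký'; // characters: ž(0) l(1) u(2) ť(3) o(4) u(5) č(6) k(7) ý(8)

$first = mb_strpos($text, 'u', 0, 'UTF-8');  // 2
$next  = mb_strpos($text, 'u', 3, 'UTF-8');  // 5, the search starts at character 3
$last  = mb_strrpos($text, 'u', 0, 'UTF-8'); // 5

// The byte-based strpos() reports 3, because "ž" occupies two bytes.
$byteOffset = strpos($text, 'u');
```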
strrev($str)
Reverses a UTF-8 multibyte string (used for RTL languages). We only do this because there is no mb_strrev or iconv_strrev.
param: string $str the multibyte string to reverse
return: string the reversed multibyte string
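A common way to reverse a UTF-8 string is to split it into characters with a Unicode-aware regular expression and reverse the resulting array. The sketch below takes that approach; `sketch_strrev` is a hypothetical name, and it reverses code points, so combining marks would end up detached from their base letters.

```php
<?php
// Hypothetical sketch: reverse a UTF-8 string character by character.
function sketch_strrev(string $str): string {
    // The /u modifier makes "." match whole UTF-8 characters; /s lets it match newlines.
    preg_match_all('/./us', $str, $matches);
    return implode('', array_reverse($matches[0]));
}

$reversed = sketch_strrev('čký'); // "ýkč"
```

PHP's built-in strrev() would reverse the individual bytes instead and produce invalid UTF-8 for this input.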
specialtoascii($text, $charset='utf-8')
Tries to convert upper Unicode characters to plain ASCII; the returned string may contain unconverted Unicode characters. With the removal of Typo3, iconv conversion was found to be the best alternative to Typo3's function. However, the standard iconv call iconv($charset, 'ASCII//TRANSLIT//IGNORE', (string) $text); produced invalid strings for special characters from Russian and Japanese. To solve this, the transliterator was used, but it produced empty strings for certain strings in our tests. A combination of the two was therefore adopted to cover all cases. Refer to MDL-53544 for further information.
param: string $text input string
param: string $charset encoding of the text
return: string converted ascii string
encode_mimeheader($text, $charset='utf-8')
Generate a correct base64 encoded header to be used in MIME mail messages. This function seems to be 100% compliant with RFC1342. Credits go to: paravoid (http://www.php.net/manual/en/function.mb-encode-mimeheader.php#60283).
param: string $text input string
param: string $charset encoding of the text
return: string base64 encoded header
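For orientation, a base64 "encoded word" in a mail header has the shape `=?charset?B?base64-data?=`. The sketch below builds a single encoded word and nothing more. `sketch_encode_mimeheader` is a hypothetical name, and the sketch deliberately omits the splitting of long headers into several encoded words, which a compliant implementation has to handle.

```php
<?php
// Hypothetical sketch: wrap text in one base64 encoded word.
// It does not split long input and does not skip pure-ASCII input.
function sketch_encode_mimeheader(string $text, string $charset = 'utf-8'): string {
    return '=?' . strtoupper($charset) . '?B?' . base64_encode($text) . '?=';
}

$header = sketch_encode_mimeheader('é'); // "=?UTF-8?B?w6k=?="
```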
get_entities_table()
Returns HTML entity transliteration table.
return: array with (html entity => utf-8) elements
entities_to_utf8($str, $htmlent=true)
Converts all the numeric entities &#nnnn; or &#xnnn; to UTF-8. Original from laurynas dot butkus at gmail, at http://php.net/manual/en/function.html-entity-decode.php#75153, with some custom modifications to provide more functionality.
param: string $str input string
param: boolean $htmlent convert also html entities (defaults to true)
return: string encoded UTF-8 string
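The role of the $htmlent flag can be sketched as follows: with the flag set, named entities such as `&eacute;` are decoded as well; without it, only numeric entities are touched. `sketch_entities_to_utf8` is a hypothetical helper built on `html_entity_decode()` and `mb_chr()`; it is not the code credited above.

```php
<?php
// Hypothetical sketch of numeric (and optionally named) entity decoding.
function sketch_entities_to_utf8(string $str, bool $htmlent = true): string {
    if ($htmlent) {
        // html_entity_decode() handles named, decimal and hexadecimal entities.
        return html_entity_decode($str, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    }
    // Numeric entities only: &#nnnn; (decimal) or &#xnnn; (hexadecimal).
    $decoded = preg_replace_callback(
        '/&#(?:x([0-9a-f]+)|([0-9]+));/i',
        function (array $m): string {
            $code = $m[1] !== '' ? (int) hexdec($m[1]) : (int) $m[2];
            $char = mb_chr($code, 'UTF-8');
            // Leave the entity untouched when the code point is invalid.
            return $char === false ? $m[0] : $char;
        },
        $str
    );
    return $decoded ?? $str;
}

$all = sketch_entities_to_utf8('&eacute;&#233;&#xE9;');      // all three become "é"
$numericOnly = sketch_entities_to_utf8('&eacute;&#233;', false); // "&eacute;" is kept
```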
utf8_to_entities($str, $dec=false, $nonnum=false)
Converts all Unicode characters > 127 to numeric entities &#nnnn; or &#xnnn;.
param: string $str input string
param: boolean $dec output decimal-only numeric entities
param: boolean $nonnum remove all non-numeric entities
return: string converted string
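The conversion in the other direction can be sketched with a regular expression that matches every non-ASCII character and replaces it with its code point. `sketch_utf8_to_entities` is a hypothetical helper; the lowercase hexadecimal format is an assumption, and the $nonnum option is left out.

```php
<?php
// Hypothetical sketch: replace every character above 127 with a numeric entity.
function sketch_utf8_to_entities(string $str, bool $dec = false): string {
    $encoded = preg_replace_callback(
        '/[^\x00-\x7F]/u',
        function (array $m) use ($dec): string {
            $code = mb_ord($m[0], 'UTF-8');
            // Decimal form &#nnnn; or hexadecimal form &#xnnn;.
            return $dec ? '&#' . $code . ';' : '&#x' . dechex($code) . ';';
        },
        $str
    );
    // preg_replace_callback() returns null on invalid UTF-8; keep the input then.
    return $encoded ?? $str;
}

$hex = sketch_utf8_to_entities('café');           // "caf&#xe9;"
$decimal = sketch_utf8_to_entities('café', true); // "caf&#233;"
```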
trim_utf8_bom($str)
Removes the BOM from a Unicode string. {@link http://unicode.org/faq/utf_bom.html}
param: string $str input string
return: string
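In UTF-8 the byte-order mark is the three-byte sequence EF BB BF at the start of the data. A minimal sketch of stripping it, using the hypothetical name `sketch_trim_utf8_bom`, could look like this:

```php
<?php
// Hypothetical sketch: strip a leading UTF-8 byte-order mark, if present.
function sketch_trim_utf8_bom(string $str): string {
    $bom = "\xEF\xBB\xBF";
    if (strncmp($str, $bom, strlen($bom)) === 0) {
        return (string) substr($str, strlen($bom));
    }
    return $str;
}

$clean = sketch_trim_utf8_bom("\xEF\xBB\xBFid,name"); // "id,name"
```

This typically matters for files exported by spreadsheet tools, where an unnoticed BOM ends up glued to the first column name.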
remove_unicode_non_characters($value)
There are a number of Unicode non-characters, including the byte-order mark (which may appear multiple times in a string) and also other ranges. These can cause problems for some processing. This function removes the characters using string replace, so that the rest of the string remains unchanged.
param: string $value Input string
return: string Cleaned string value
get_encodings()
Returns encoding options for select boxes, with utf-8 and the platform encoding first.
return: array encodings
code2utf8($num)
Returns the UTF-8 string corresponding to the Unicode value (from php.net, courtesy of romans@void.lv).
param: int $num one unicode value
return: string the UTF-8 char corresponding to the unicode value
utf8ord($utf8char)
Returns the code of the given UTF-8 character.
param: string $utf8char one UTF-8 character
return: int the code of the given character
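The mapping between a Unicode value and its UTF-8 bytes, in both directions, can be written out by hand. The sketch below is a generic illustration of the UTF-8 bit layout, with the hypothetical names `sketch_code2utf8` and `sketch_utf8ord`; it assumes well-formed input and does not validate continuation bytes.

```php
<?php
// Hypothetical sketch: encode one Unicode code point as UTF-8 bytes.
function sketch_code2utf8(int $num): string {
    if ($num < 0x80) {            // 1 byte:  0xxxxxxx
        return chr($num);
    }
    if ($num < 0x800) {           // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($num >> 6)) . chr(0x80 | ($num & 0x3F));
    }
    if ($num < 0x10000) {         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($num >> 12)) . chr(0x80 | (($num >> 6) & 0x3F))
            . chr(0x80 | ($num & 0x3F));
    }
    if ($num < 0x110000) {        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return chr(0xF0 | ($num >> 18)) . chr(0x80 | (($num >> 12) & 0x3F))
            . chr(0x80 | (($num >> 6) & 0x3F)) . chr(0x80 | ($num & 0x3F));
    }
    return '';                    // outside the Unicode range
}

// Hypothetical sketch: decode the first UTF-8 character of a well-formed string.
function sketch_utf8ord(string $c): int {
    if ($c === '') {
        return 0;
    }
    $b0 = ord($c[0]);
    if ($b0 < 0x80) {
        return $b0;
    }
    if (($b0 & 0xE0) === 0xC0) {
        return (($b0 & 0x1F) << 6) | (ord($c[1]) & 0x3F);
    }
    if (($b0 & 0xF0) === 0xE0) {
        return (($b0 & 0x0F) << 12) | ((ord($c[1]) & 0x3F) << 6) | (ord($c[2]) & 0x3F);
    }
    return (($b0 & 0x07) << 18) | ((ord($c[1]) & 0x3F) << 12)
        | ((ord($c[2]) & 0x3F) << 6) | (ord($c[3]) & 0x3F);
}
```

On PHP 7.2 and later, `mb_chr()` and `mb_ord()` provide the same conversions and also validate their input.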
strtotitle($text)
Capitalises the first letter of each word; words must be separated by spaces. Use with care: this function does not work properly in many locales.
param: string $text input string
return: string
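Title-casing of multibyte text can be sketched with PHP's `mb_convert_case()` and the `MB_CASE_TITLE` mode. Whether core_text relies on this exact call is an assumption; note that this mode also lowercases the remaining letters of each word.

```php
<?php
$title = mb_convert_case('žlutý kůň', MB_CASE_TITLE, 'UTF-8'); // "Žlutý Kůň"

// The rest of each word is lowercased as a side effect.
$mixed = mb_convert_case('hELLO wORLD', MB_CASE_TITLE, 'UTF-8'); // "Hello World"
```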