  • Bug fixes for general core bugs in 3.10.x will end 8 November 2021 (12 months).
  • Bug fixes for security issues in 3.10.x will end 9 May 2022 (18 months).
  • PHP version: minimum PHP 7.2.0. Note: the minimum PHP version has increased since Moodle 3.8. PHP 7.3.x and 7.4.x are supported too.

Defines string APIs.

Copyright: (C) 2001-3001 Eloy Lafuente (stronk7) {@link http://contiento.com}
License: http://www.gnu.org/copyleft/gpl.html GNU GPL v3 or later
File Size: 780 lines (28 kb)
Included or required: 0 times
Referenced: 0 times
Includes or requires: 0 files

Defines 1 class

core_text:: (24 methods):
  typo3()
  reset_caches()
  parse_charset()
  convert()
  substr()
  str_max_bytes()
  strrchr()
  strlen()
  strtolower()
  strtoupper()
  strpos()
  strrpos()
  strrev()
  specialtoascii()
  encode_mimeheader()
  get_entities_table()
  entities_to_utf8()
  utf8_to_entities()
  trim_utf8_bom()
  remove_unicode_non_characters()
  get_encodings()
  code2utf8()
  utf8ord()
  strtotitle()


Class: core_text

Defines string APIs for manipulating strings.

This class is used to manipulate strings under Moodle 1.6 and later. As
UTF-8 text became mandatory, a pool of safe functions for this encoding
became necessary. The method names are exactly the
same as their PHP originals.

A big part of this class acts as a wrapper over the Typo3 charset library,
really a cool group of utilities to handle texts and encoding conversion.

See its own copyright and license details.

IMPORTANT Note: Typo3 libraries always expect lowercase charset names in order
to use 100% of their capabilities, so don't forget to make the conversion
in every wrapper function!

typo3($reset = false)
Return t3lib helper class, which is used for conversion between charsets

param: bool $reset
return: t3lib_cs

reset_caches()
Reset internal textlib caches.


parse_charset($charset)
Standardises a charset name.

Please note that this does not mean the returned charset is actually supported.

param: string $charset raw charset name
return: string normalised lowercase charset name
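
The normalisation idea can be sketched in plain PHP. This is not the core_text implementation: the function name sketch_parse_charset and the two alias rules it applies are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper, not part of core_text: lowercases a raw charset
// name and folds a few common aliases into one canonical spelling.
function sketch_parse_charset(string $charset): string {
    $charset = strtolower(trim($charset));
    if ($charset === 'utf8') {
        return 'utf-8';
    }
    // Assumed alias rule: cp1250, win-1250, windows1250 -> windows-1250.
    if (preg_match('/^(?:cp|win|windows)-?(12[0-9]{2})$/', $charset, $m)) {
        return 'windows-' . $m[1];
    }
    // Anything else is returned lowercased, whether or not it is supported.
    return $charset;
}

echo sketch_parse_charset('UTF8'), "\n";    // utf-8
echo sketch_parse_charset('CP1250'), "\n";  // windows-1250
```

As with the documented method, a normalised name says nothing about whether a converter for that charset is available.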

convert($text, $fromCS, $toCS='utf-8')
Converts the text between different encodings. It uses the iconv extension with the //TRANSLIT parameter
and falls back to typo3. If both source and target are utf-8, it only tries to fix invalid characters.

param: string $text
param: string $fromCS source encoding
param: string $toCS result encoding
return: string|bool converted string or false on error
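
A minimal sketch of the iconv step alone, assuming the iconv extension is available and that its build supports //TRANSLIT. The helper name sketch_convert is hypothetical, and the typo3 fallback and the utf-8 to utf-8 repair path are not reproduced here.

```php
<?php
// Hypothetical stand-in for the iconv step only (no typo3 fallback).
function sketch_convert(string $text, string $from, string $to = 'utf-8') {
    // //TRANSLIT asks iconv to approximate characters the target charset lacks.
    // The @ hides the notice iconv raises on failure; false is returned instead.
    return @iconv($from, $to . '//TRANSLIT', $text);
}

// "\xE9" is e-acute in ISO-8859-1; in UTF-8 it becomes the two bytes C3 A9.
$utf8 = sketch_convert("\xE9", 'iso-8859-1');
echo bin2hex($utf8), "\n";
```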

substr($text, $start, $len=null, $charset='utf-8')
Multibyte safe substr() function, uses mbstring or iconv for UTF-8, falls back to typo3.

param: string $text string to truncate
param: int $start negative value means from end
param: int $len maximum number of characters, counted from start
param: string $charset encoding of the text
return: string portion of string specified by the $start and $len
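
To show why a multibyte-safe substr() matters, the following sketch compares PHP's byte-based substr() with mb_substr() on a UTF-8 string. It uses mbstring directly rather than core_text, so it only illustrates the mbstring path mentioned above.

```php
<?php
$text = 'žluťoučký';   // 9 characters, 13 bytes in UTF-8

// Byte-based substr() counts bytes: 5 bytes ends in the middle of "ť".
$broken = substr($text, 0, 5);
var_dump(mb_check_encoding($broken, 'UTF-8'));   // bool(false)

// mb_substr() counts characters, so the result is always valid UTF-8.
$safe = mb_substr($text, 0, 4, 'UTF-8');         // "žluť"

// A negative start counts from the end; null length means "to the end".
$tail = mb_substr($text, -2, null, 'UTF-8');     // "ký"
```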

str_max_bytes($string, $bytes)
Truncates a string to no more than a certain number of bytes in a multi-byte safe manner.
UTF-8 only!

Many of the other charsets we test for (like ISO-2022-JP and EUC-JP) are not supported
by typo3, and will give invalid results, so we are supporting UTF-8 only.

param: string $string String to truncate
param: int $bytes Maximum length of bytes in the result
return: string Portion of string specified by $bytes
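
One way to truncate by bytes without splitting a character is mbstring's mb_strcut(). The sketch below assumes that approach; the helper name sketch_str_max_bytes is hypothetical and the code is not taken from core_text.

```php
<?php
// Hypothetical helper, UTF-8 only: keep at most $bytes bytes without
// splitting a multibyte character. mb_strcut() backs off to the previous
// character boundary when the byte limit lands inside a character.
function sketch_str_max_bytes(string $string, int $bytes): string {
    return mb_strcut($string, 0, $bytes, 'UTF-8');
}

// "žluť" is 6 bytes: ž (2), l (1), u (1), ť (2).
echo sketch_str_max_bytes('žluť', 3), "\n";  // žl
echo sketch_str_max_bytes('žluť', 5), "\n";  // žlu
```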

strrchr($haystack, $needle, $part = false)
Finds the last occurrence of a character in a string within another.
UTF-8 ONLY safe mb_strrchr().

param: string $haystack The string from which to get the last occurrence of needle.
param: string $needle The string to find in haystack.
param: boolean $part If true, returns the portion before needle, else returns the portion after (including needle).
return: string|false False when not found.
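
The effect of $part is easiest to see on a concrete string. This sketch calls mb_strrchr() directly, which the description above names as the underlying function; the sample path is made up.

```php
<?php
$path = 'kurz/žáci/úkol.txt';

// $part = false: everything from the last "/" onwards, needle included.
$after = mb_strrchr($path, '/', false, 'UTF-8');   // "/úkol.txt"

// $part = true: everything before the last "/".
$before = mb_strrchr($path, '/', true, 'UTF-8');   // "kurz/žáci"

// No match yields false rather than an empty string.
$none = mb_strrchr($path, '#', false, 'UTF-8');    // false
```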

strlen($text, $charset='utf-8')
Multibyte safe strlen() function, uses mbstring or iconv for UTF-8, falls back to typo3.

param: string $text input string
param: string $charset encoding of the text
return: int number of characters
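
The difference between counting bytes and counting characters can be checked with plain PHP. The sketch uses mb_strlen() as a stand-in for the mbstring path described above.

```php
<?php
$text = 'žluťoučký';

echo strlen($text), "\n";              // 13 (bytes)
echo mb_strlen($text, 'UTF-8'), "\n";  // 9 (characters)
```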

strtolower($text, $charset='utf-8')
Multibyte safe strtolower() function, uses mbstring, falls back to typo3.

param: string $text input string
param: string $charset encoding of the text (may not work for all encodings)
return: string lower case text

strtoupper($text, $charset='utf-8')
Multibyte safe strtoupper() function, uses mbstring, falls back to typo3.

param: string $text input string
param: string $charset encoding of the text (may not work for all encodings)
return: string upper case text
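
For strtolower() and strtoupper(), the mbstring path amounts to the calls sketched below. This is plain mbstring usage, not the core_text code, and it does not cover the typo3 fallback.

```php
<?php
// Case mapping that understands accented UTF-8 letters.
$lower = mb_strtolower('ŽLUŤOUČKÝ', 'UTF-8');   // "žluťoučký"
$upper = mb_strtoupper('žluťoučký', 'UTF-8');   // "ŽLUŤOUČKÝ"

echo $lower, "\n", $upper, "\n";
```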

strpos($haystack, $needle, $offset=0)
Find the position of the first occurrence of a substring in a string.
UTF-8 ONLY safe strpos(), uses mbstring, falls back to iconv.

param: string $haystack the string to search in
param: string $needle one or more characters to search for
param: int $offset offset from beginning of string
return: int the numeric position of the first occurrence of needle in haystack.

strrpos($haystack, $needle)
Find the position of the last occurrence of a substring in a string.
UTF-8 ONLY safe strrpos(), uses mbstring, falls back to iconv.

param: string $haystack the string to search in
param: string $needle one or more characters to search for
return: int the numeric position of the last occurrence of needle in haystack
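
Positions returned by the UTF-8 safe functions are character offsets, not byte offsets. The sketch below contrasts the two using mb_strpos() and mb_strrpos() directly, as an illustration of the mbstring path.

```php
<?php
$text = 'žluťoučký';

// Character offsets: ž=0, l=1, u=2, ť=3, o=4, u=5, ...
$first = mb_strpos($text, 'u', 0, 'UTF-8');    // 2
$next  = mb_strpos($text, 'u', 3, 'UTF-8');    // 5 (search starts at character 3)
$last  = mb_strrpos($text, 'u', 0, 'UTF-8');   // 5

// Byte-based strpos() reports 3, because "ž" occupies two bytes.
$byteFirst = strpos($text, 'u');               // 3
```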

strrev($str)
Reverses a UTF-8 multibyte string (used for RTL languages).
(We only do this because there is no mb_strrev or iconv_strrev.)

param: string $str the multibyte string to reverse
return: string the reversed multi byte string
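
Since PHP offers no built-in multibyte reverse, one common approach is to split the string into characters with a Unicode-aware regular expression and reverse the pieces. The helper name sketch_strrev is hypothetical, and this is a sketch of the technique rather than the documented code.

```php
<?php
// Hypothetical helper: reverse a UTF-8 string character by character.
function sketch_strrev(string $str): string {
    // /u makes "." match whole UTF-8 characters; /s lets it match newlines too.
    preg_match_all('/./us', $str, $matches);
    return implode('', array_reverse($matches[0]));
}

echo sketch_strrev('žluť'), "\n";   // ťulž
```

PHP's built-in strrev() reverses bytes, so it would scramble the two-byte characters in this input.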

specialtoascii($text, $charset='utf-8')
Tries to convert upper Unicode characters to plain ASCII;
the returned string may still contain unconverted Unicode characters.

param: string $text input string
param: string $charset encoding of the text
return: string converted ascii string

encode_mimeheader($text, $charset='utf-8')
Generate a correct base64 encoded header to be used in MIME mail messages.
This function seems to be 100% compliant with RFC1342. Credits go to:
paravoid (http://www.php.net/manual/en/function.mb-encode-mimeheader.php#60283).

param: string $text input string
param: string $charset encoding of the text
return: string base64 encoded header

get_entities_table()
Returns HTML entity transliteration table.

return: array with (html entity => utf-8) elements

entities_to_utf8($str, $htmlent=true)
Converts all the numeric entities &#nnnn; or &#xnnn; to UTF-8.
Original from laurynas dot butkus at gmail at:
http://php.net/manual/en/function.html-entity-decode.php#75153
with some custom mods to provide more functionality.

param: string $str input string
param: boolean $htmlent convert also html entities (defaults to true)
return: string encoded UTF-8 string

utf8_to_entities($str, $dec=false, $nonnum=false)
Converts all Unicode chars > 127 to numeric entities &#nnnn; or &#xnnn;.

param: string $str input string
param: boolean $dec output only decimal numeric entities
param: boolean $nonnum remove all non-numeric entities
return: string converted string
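
The two entity conversions can be approximated with standard PHP functions. The sketch below uses html_entity_decode() and mb_encode_numericentity() as stand-ins; it does not reproduce the $htmlent, $dec or $nonnum options of the documented methods.

```php
<?php
// Decode decimal and hexadecimal numeric entities (and named ones) to UTF-8.
$decoded = html_entity_decode('&#269;aj &#x10D;aj &amp;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
// "čaj čaj &"

// Encode every code point above 127 as a decimal numeric entity.
// Convmap format: [first code point, last code point, offset, mask].
$encoded = mb_encode_numericentity('čaj', [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
// "&#269;aj"

echo $decoded, "\n", $encoded, "\n";
```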

trim_utf8_bom($str)
Removes the BOM from a Unicode string {@link http://unicode.org/faq/utf_bom.html}

param: string $str input string
return: string
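
In UTF-8 the byte-order mark is the three-byte sequence EF BB BF. A minimal sketch of stripping it from the start of a string follows; the helper name sketch_trim_utf8_bom is hypothetical.

```php
<?php
// Hypothetical helper: drop a leading UTF-8 BOM if one is present.
function sketch_trim_utf8_bom(string $str): string {
    $bom = "\xEF\xBB\xBF";   // UTF-8 encoding of U+FEFF
    if (strncmp($str, $bom, 3) === 0) {
        return substr($str, 3);
    }
    return $str;
}

echo sketch_trim_utf8_bom("\xEF\xBB\xBFid,name"), "\n";   // id,name
```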

remove_unicode_non_characters($value)
There are a number of Unicode non-characters including the byte-order mark (which may appear
multiple times in a string) and also other ranges. These can cause problems for some
processing.

This function removes the characters using string replace, so that the rest of the string
remains unchanged.

param: string $value Input string
return: string Cleaned string value
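
The string-replace approach described above can be sketched as follows. This sketch covers only part of the non-character set (U+FDD0 to U+FDEF, U+FFFE and U+FFFF in the Basic Multilingual Plane); the helper name and the choice of subset are assumptions made for illustration.

```php
<?php
// Hypothetical helper covering a subset of Unicode non-characters.
function sketch_remove_non_characters(string $value): string {
    $targets = ["\xEF\xBF\xBE", "\xEF\xBF\xBF"];   // U+FFFE, U+FFFF
    for ($last = 0x90; $last <= 0xAF; $last++) {
        $targets[] = "\xEF\xB7" . chr($last);      // U+FDD0..U+FDEF
    }
    // Plain string replacement leaves every other byte untouched.
    return str_replace($targets, '', $value);
}

echo sketch_remove_non_characters("a\xEF\xBF\xBEb\xEF\xB7\x90c"), "\n";   // abc
```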

get_encodings()
Returns encoding options for select boxes, utf-8 and platform encoding first.

return: array encodings

code2utf8($num)
Returns the utf8 string corresponding to the unicode value
(from php.net, courtesy - romans@void.lv)

param: int $num one unicode value
return: string the UTF-8 char corresponding to the unicode value

utf8ord($utf8char)
Returns the code of the given UTF-8 character

param: string $utf8char one UTF-8 character
return: int the code of the given character
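
code2utf8() and utf8ord() are inverses of each other. On PHP 7.2 and later the same round trip can be illustrated with mb_chr() and mb_ord(); these are used here as stand-ins, not as the documented implementation.

```php
<?php
// Code point -> UTF-8 character.
$char = mb_chr(0x10D, 'UTF-8');   // "č" (U+010D)

// UTF-8 character -> code point.
$code = mb_ord('č', 'UTF-8');     // 269

echo $char, ' ', $code, "\n";     // č 269
```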

strtotitle($text)
Makes the first letter of each word capital; words must be separated by spaces.
Use with care, this function does not work properly in many locales!

param: string $text input string
return: string
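
Title casing of UTF-8 text can be sketched with mb_convert_case() and the MB_CASE_TITLE mode. This is an illustration using mbstring directly, and the locale caveat above applies to it as well.

```php
<?php
// Capitalise the first letter of each space-separated word.
$title = mb_convert_case('žlutý kůň', MB_CASE_TITLE, 'UTF-8');

echo $title, "\n";   // Žlutý Kůň
```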