VLC  2.1.0-git
unicode.c File Reference

Functions

int utf8_vfprintf (FILE *stream, const char *fmt, va_list ap)
 Formats a UTF-8 string as vfprintf() does, then prints it, converting it to the local encoding as needed.
int utf8_fprintf (FILE *stream, const char *fmt,...)
 Formats a UTF-8 string as fprintf() does, then prints it, converting it to the local encoding as needed.
size_t vlc_towc (const char *str, uint32_t *restrict pwc)
 Converts the first character from a UTF-8 sequence into a code point.
char * vlc_strcasestr (const char *haystack, const char *needle)
 Looks for a UTF-8 string within another one, ignoring case.
char * EnsureUTF8 (char *str)
 Replaces invalid/overlong UTF-8 sequences with question marks.
const char * IsUTF8 (const char *str)
 Checks whether a string is a valid UTF-8 byte sequence.
char * FromCharset (const char *charset, const void *data, size_t data_size)
 Converts a string from the given character encoding to UTF-8.
void * ToCharset (const char *charset, const char *in, size_t *outsize)
 Converts a nul-terminated UTF-8 string to a given character encoding.

Function Documentation

char* EnsureUTF8 ( char *  str)

Replaces invalid/overlong UTF-8 sequences with question marks.

Note that it is not possible to convert from Latin-1 to UTF-8 on the fly, so we don't try that, even though it would be less disruptive.

Returns
str if it was already valid UTF-8; NULL if at least one sequence had to be replaced. In both cases the string is sanitized in place.

References likely, and vlc_towc().

Referenced by filename_sanitize(), input_item_SetURI(), InputMetaUser(), and test().
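
The replace-in-place behaviour described above can be illustrated with a small standalone sketch. This is not the VLC implementation: sketch_seq_len() and sketch_ensure_utf8() are hypothetical names, and the validator is a simplified stand-in for vlc_towc(). It only shows the contract: each offending byte becomes '?', and the return value is NULL as soon as anything was replaced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte length (1 to 4) of the valid UTF-8 sequence
 * starting at p, or 0 if the bytes are invalid, overlong, a surrogate,
 * or beyond U+10FFFF. p must not point at the terminating NUL. */
static size_t sketch_seq_len(const unsigned char *p)
{
    unsigned char c = p[0];
    unsigned long cp;
    size_t len;

    if (c < 0x80)
        return 1;
    if (c >= 0xC2 && c <= 0xDF)      { len = 2; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0)     { len = 3; cp = c & 0x0F; }
    else if (c >= 0xF0 && c <= 0xF4) { len = 4; cp = c & 0x07; }
    else
        return 0; /* stray continuation byte, 0xC0/0xC1, or above 0xF4 */

    for (size_t i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0; /* not a continuation byte (also catches NUL) */
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    if ((len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000))
        return 0; /* overlong encoding */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    return len;
}

/* Hypothetical stand-in for EnsureUTF8(): sanitizes str in place. */
static char *sketch_ensure_utf8(char *str)
{
    assert(str != NULL);

    char *ret = str;
    unsigned char *p = (unsigned char *)str;

    while (*p != '\0') {
        size_t n = sketch_seq_len(p);
        if (n == 0) {
            *p++ = '?';  /* replace one offending byte, then resume */
            ret = NULL;  /* remember that the input was not clean */
        } else {
            p += n;
        }
    }
    return ret;
}
```

Because the string is modified in place, a caller that only needs a safe string can ignore the return value; a caller that wants to know whether the input was clean checks it against NULL.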

char* FromCharset ( const char *  charset,
const void *  data,
size_t  data_size 
)

Converts a string from the given character encoding to UTF-8.

Returns
A nul-terminated UTF-8 string, or NULL in case of error. The result must be freed using free().

References vlc_iconv(), vlc_iconv_close(), and vlc_iconv_open().
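
As a rough illustration of the conversion contract, the sketch below does the same job with plain POSIX iconv() instead of the vlc_iconv wrappers. It is an assumption-laden stand-in, not the VLC code: sketch_from_charset() is a hypothetical name, and the grow-and-retry buffer strategy is just one simple way to size the output.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical stand-in for FromCharset(), built on POSIX iconv.
 * Returns a heap-allocated, nul-terminated UTF-8 string, or NULL. */
static char *sketch_from_charset(const char *charset,
                                 const void *data, size_t data_size)
{
    assert(charset != NULL && data != NULL);

    iconv_t cd = iconv_open("UTF-8", charset);
    if (cd == (iconv_t)-1)
        return NULL; /* unknown or unsupported source charset */

    char *out = NULL;
    for (size_t cap = data_size + 16; ; cap *= 2) {
        char *tmp = realloc(out, cap);
        if (tmp == NULL)
            break;
        out = tmp;

        char *in = (char *)data; /* iconv() does not modify the input */
        size_t inleft = data_size;
        char *dst = out;
        size_t outleft = cap - 1; /* keep one byte for the terminator */

        iconv(cd, NULL, NULL, NULL, NULL); /* reset the conversion state */
        if (iconv(cd, &in, &inleft, &dst, &outleft) != (size_t)-1) {
            *dst = '\0';
            iconv_close(cd);
            return out;
        }
        if (errno != E2BIG)
            break; /* invalid or truncated input: give up */
        /* E2BIG: output buffer too small, retry with a larger one */
    }

    free(out);
    iconv_close(cd);
    return NULL;
}
```

For example, passing the four Latin-1 bytes of "caf\xE9" with charset "ISO-8859-1" should yield the five-byte UTF-8 string "caf\xC3\xA9", which the caller then releases with free().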

const char* IsUTF8 ( const char *  str)

Checks whether a string is a valid UTF-8 byte sequence.

Parameters
  str  nul-terminated string to be checked
Returns
str if it was valid UTF-8, NULL if not.

References likely, and vlc_towc().

Referenced by IsSDPString(), and test().

void* ToCharset ( const char *  charset,
const char *  in,
size_t *  outsize 
)

Converts a nul-terminated UTF-8 string to a given character encoding.

Parameters
  charset  iconv name of the character set
  in       nul-terminated UTF-8 string
  outsize  pointer to hold the byte size of the result
Returns
A pointer to the result, which must be released using free(). The nul terminator is included in the conversion if the target character encoding supports it, but it is not counted in the returned byte size. In case of error, NULL is returned and the byte size is undefined.

References unlikely, vlc_iconv(), vlc_iconv_close(), and vlc_iconv_open().
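
The terminator rule above (converted, but not counted) is easy to get wrong, so here is a minimal sketch of it using plain POSIX iconv(). sketch_to_charset() is a hypothetical name, and the fixed buffer sizing assumes that no input byte expands to more than four output bytes, which holds for UTF-16 and UTF-32 targets but is not guaranteed for every encoding.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ToCharset(), built on POSIX iconv. */
static void *sketch_to_charset(const char *charset, const char *in,
                               size_t *outsize)
{
    assert(charset != NULL && in != NULL && outsize != NULL);

    iconv_t cd = iconv_open(charset, "UTF-8");
    if (cd == (iconv_t)-1)
        return NULL;

    size_t inlen = strlen(in);
    /* Assumption: 4 bytes per input byte, plus room for a byte order
     * mark and for the converted terminator (4 bytes each). */
    size_t cap = 4 * (inlen + 2);
    char *res = malloc(cap);

    if (res != NULL) {
        char *inp = (char *)in; /* iconv() does not modify the input */
        char *outp = res;
        size_t inb = inlen;
        size_t outb = cap - 4;  /* hold back space for the terminator */

        if (iconv(cd, &inp, &inb, &outp, &outb) == (size_t)-1) {
            free(res);
            res = NULL;
        } else {
            /* Size is recorded before the terminator is converted. */
            *outsize = (size_t)(outp - res);
            inb = 1;            /* now convert the NUL byte itself */
            outb += 4;
            (void)iconv(cd, &inp, &inb, &outp, &outb); /* best effort */
        }
    }

    iconv_close(cd);
    return res;
}
```

Converting "h\xC3\xA9" to "UTF-16LE" this way should produce the four bytes 68 00 E9 00 with *outsize set to 4, followed in memory by a two-byte terminator that is not counted.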

int utf8_fprintf ( FILE *  stream,
const char *  fmt,
  ... 
)

Formats a UTF-8 string as fprintf() does, then prints it, converting it to the local encoding as needed.

References utf8_vfprintf().

Referenced by Help(), libvlc_InternalInit(), ListModules(), print_help_on_full_help(), print_help_section(), PrintColorMsg(), PrintMsg(), Usage(), and Version().

int utf8_vfprintf ( FILE *  stream,
const char *  fmt,
va_list  ap 
)

Formats a UTF-8 string as vfprintf() does, then prints it, converting it to the local encoding as needed.

References likely, unlikely, and vasprintf().

Referenced by PrintColorMsg(), PrintMsg(), and utf8_fprintf().
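
The relationship between the variadic function and its va_list counterpart follows the usual C pattern: format into a temporary buffer, transform it, write it out. The sketch below shows that shape only. sketch_vfprintf() and sketch_fprintf() are hypothetical names, the formatting uses standard vsnprintf() rather than vasprintf(), and the encoding conversion step is left as a comment because its details are platform specific.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for utf8_vfprintf(): format to a heap buffer,
 * then write the buffer to the stream. */
static int sketch_vfprintf(FILE *stream, const char *fmt, va_list ap)
{
    assert(stream != NULL && fmt != NULL);

    /* First pass on a copy of ap: measure the formatted length. */
    va_list ap2;
    va_copy(ap2, ap);
    int len = vsnprintf(NULL, 0, fmt, ap2);
    va_end(ap2);
    if (len < 0)
        return -1;

    char *str = malloc((size_t)len + 1);
    if (str == NULL)
        return -1;
    vsnprintf(str, (size_t)len + 1, fmt, ap); /* second pass: format */

    /* A real implementation would convert str from UTF-8 to the local
     * encoding here; this sketch writes the bytes unchanged. */
    int res = (fputs(str, stream) == EOF) ? -1 : len;
    free(str);
    return res;
}

/* Hypothetical stand-in for utf8_fprintf(): thin variadic wrapper. */
static int sketch_fprintf(FILE *stream, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int res = sketch_vfprintf(stream, fmt, ap);
    va_end(ap);
    return res;
}
```

Keeping all the work in the va_list variant means other variadic front ends, such as message printers, can forward their arguments to it without duplicating the formatting logic.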

char* vlc_strcasestr ( const char *  haystack,
const char *  needle 
)

Looks for a UTF-8 string within another one, ignoring case.

Beware that this is quite slow. Unlike strcasestr(), this function works regardless of the system character encoding, and handles multibyte code points correctly.

Parameters
  haystack  string to look into
  needle    string to look for
Returns
A pointer to the first occurrence of the needle within the haystack, or NULL if no occurrence was found.

References unlikely, and vlc_towc().

Referenced by playlist_LiveSearchUpdateInternal(), and test_strcasestr().

size_t vlc_towc ( const char *  str,
uint32_t *restrict  pwc 
)

Converts the first character from a UTF-8 sequence into a code point.

Parameters
  str  a UTF-8 byte sequence
  pwc  pointer that receives the decoded code point
Returns
0 if str points to an empty string (the first byte is NUL); otherwise the number of bytes that the first character occupies (1 to 4); (size_t)-1 if the bytes do not form a valid UTF-8 sequence.

References clz8, and unlikely.

Referenced by convert_xml_special_chars(), EnsureUTF8(), IsUTF8(), vlc_str2keycode(), and vlc_strcasestr().
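
To make the return-value contract concrete, here is a standalone decoder sketch that follows the same rules. It is not the VLC implementation (which uses clz8 to find the sequence length): sketch_towc() is a hypothetical name, and the rejection of overlong forms, surrogates and values above U+10FFFF reflects standard UTF-8 validity rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for vlc_towc(): decodes one code point. */
static size_t sketch_towc(const char *str, uint32_t *restrict pwc)
{
    assert(str != NULL && pwc != NULL);

    const unsigned char *p = (const unsigned char *)str;
    uint32_t c = p[0];
    size_t len;

    if (c == 0) { *pwc = 0; return 0; }   /* empty string */
    if (c < 0x80) { *pwc = c; return 1; } /* 7-bit ASCII shortcut */

    if (c >= 0xC2 && c <= 0xDF)      { len = 2; c &= 0x1F; }
    else if ((c & 0xF0) == 0xE0)     { len = 3; c &= 0x0F; }
    else if (c >= 0xF0 && c <= 0xF4) { len = 4; c &= 0x07; }
    else
        return (size_t)-1; /* stray continuation, 0xC0/0xC1, or > 0xF4 */

    for (size_t i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return (size_t)-1; /* missing continuation (also catches NUL) */
        c = (c << 6) | (p[i] & 0x3F);
    }

    /* Smallest code point that legitimately needs len bytes. */
    static const uint32_t min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    if (c < min_cp[len] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return (size_t)-1; /* overlong, out of range, or surrogate */

    *pwc = c;
    return len;
}
```

A typical caller loops while the return value is nonzero, advancing the pointer by that many bytes and treating (size_t)-1 as an error, which is the pattern that validation and sanitizing helpers can be built on.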