VLC  3.0.15
vlc_charset.h File Reference

Functions

double us_strtod (const char *, char **)
 us_strtod() has the same prototype as ANSI C strtod() but it uses the POSIX/C decimal format, regardless of the current numeric locale.
 
float us_strtof (const char *, char **)
 us_strtof() has the same prototype as ANSI C strtof() but it uses the POSIX/C decimal format, regardless of the current numeric locale.
 
double us_atof (const char *)
 us_atof() has the same prototype as ANSI C atof() but it expects a dot as decimal separator, regardless of the system locale.
 
int us_vasprintf (char **, const char *, va_list)
 us_vasprintf() has the same prototype as vasprintf(), but doesn't use the system locale.
 
int us_asprintf (char **, const char *,...)
 us_asprintf() has the same prototype as asprintf(), but doesn't use the system locale.
 
#define VLC_ICONV_ERR   ((size_t) -1)
 
#define FromLocale(l)   (l)
 
#define ToLocale(u)   (u)
 
#define LocaleFree(s)   ((void)(s))
 
#define FromLocaleDup   strdup
 
#define ToLocaleDup   strdup
 
typedef void * vlc_iconv_t
 
size_t vlc_towc (const char *str, uint32_t *restrict pwc)
 Decodes a code point from UTF-8.
 
static const char * IsUTF8 (const char *str)
 Checks UTF-8 validity.
 
static char * EnsureUTF8 (char *str)
 Removes non-UTF-8 sequences.
 
vlc_iconv_t vlc_iconv_open (const char *, const char *)
 
size_t vlc_iconv (vlc_iconv_t, const char **, size_t *, char **, size_t *)
 
int vlc_iconv_close (vlc_iconv_t)
 
int utf8_vfprintf (FILE *stream, const char *fmt, va_list ap)
 Formats a UTF-8 string as vfprintf() does, then prints it, converting to the local encoding as appropriate.
 
int utf8_fprintf (FILE *, const char *,...)
 Formats a UTF-8 string as fprintf() does, then prints it, converting to the local encoding as appropriate.
 
char * vlc_strcasestr (const char *, const char *)
 Looks for a UTF-8 string within another one, case-insensitively.
 
char * FromCharset (const char *charset, const void *data, size_t data_size)
 Converts a string from the given character encoding to UTF-8.
 
void * ToCharset (const char *charset, const char *in, size_t *outsize)
 Converts a nul-terminated UTF-8 string to a given character encoding.
 
static char * FromLatin1 (const char *latin)
 Converts a nul-terminated string from ISO-8859-1 to UTF-8.
 

Detailed Description

Character set handling.

Macro Definition Documentation

◆ FromLocale

#define FromLocale(l)   (l)

◆ FromLocaleDup

#define FromLocaleDup   strdup

◆ LocaleFree

#define LocaleFree(s)   ((void)(s))

◆ ToLocale

#define ToLocale(u)   (u)

◆ ToLocaleDup

#define ToLocaleDup   strdup

◆ VLC_ICONV_ERR

#define VLC_ICONV_ERR   ((size_t) -1)

Typedef Documentation

◆ vlc_iconv_t

typedef void* vlc_iconv_t

Function Documentation

◆ EnsureUTF8()

static inline char *EnsureUTF8(char *str)

Removes non-UTF-8 sequences.

Replaces invalid or over-long UTF-8 byte sequences within a null-terminated string with question marks, so that the string can be printed at least partially.

Warning
Do not use this where correctness is critical; use IsUTF8() and handle the error case instead. This function is mainly for display or debugging.
Note
Converting from Latin-1 to UTF-8 in place is not possible, because the string would grow. It is therefore not attempted, even though it would otherwise be less disruptive.
Return values
str: the string is a valid null-terminated UTF-8 sequence (i.e. no changes were made)
NULL: the string is not a valid UTF-8 sequence (invalid bytes were replaced)
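
The replacement loop described above can be sketched roughly as follows. This is an illustration under assumptions, not VLC's code: `sketch_replace_invalid()`, `decode_fn`, and `ascii_only_decode()` are hypothetical names. The stand-in decoder accepts ASCII only, to keep the block short. A real caller would pass a full UTF-8 decoder that follows the vlc_towc() return convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder type following the vlc_towc() return convention:
 * 0 at the terminator, (size_t)-1 on an invalid sequence, otherwise the
 * number of bytes consumed. */
typedef size_t (*decode_fn)(const char *str, uint32_t *pwc);

/* Stand-in decoder (assumption): treats every non-ASCII byte as invalid. */
static size_t ascii_only_decode(const char *str, uint32_t *pwc)
{
    unsigned char c = (unsigned char)*str;

    if (c >= 0x80)
        return (size_t)-1;
    *pwc = c;
    return c != '\0';
}

/* Overwrites each undecodable byte with '?' in place.
 * Returns str if nothing was changed, NULL otherwise. */
static char *sketch_replace_invalid(char *str, decode_fn decode)
{
    char *ret = str;
    uint32_t cp;
    size_t n;

    assert(str != NULL && decode != NULL);
    while ((n = decode(str, &cp)) != 0) {
        if (n != (size_t)-1) {
            str += n;          /* valid code point: skip over it */
        } else {
            *str++ = '?';      /* invalid byte: replace it, then resume */
            ret = NULL;
        }
    }
    return ret;
}
```

Because each bad byte is overwritten by exactly one `?`, the string never grows, which is why the operation can be done in place.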

References likely and vlc_towc().

Referenced by AppendAttachment(), filename_sanitize(), and input_item_SetURI().

◆ FromCharset()

char *FromCharset(const char *charset, const void *data, size_t data_size)

Converts a string from the given character encoding to UTF-8.

Returns
a nul-terminated UTF-8 string, or NULL in case of error. The result must be freed using free().
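
For illustration, a conversion of this shape can be written directly against the POSIX iconv() interface. The sketch below is not VLC's implementation: `sketch_from_charset()` is a hypothetical name, the buffer-growing policy is an assumption, and the block assumes a platform whose iconv() takes a `char **` input argument (as glibc does).

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Converts data_size bytes in the given charset to a nul-terminated UTF-8
 * string. Returns NULL on error; the caller frees the result with free(). */
static char *sketch_from_charset(const char *charset, const void *data,
                                 size_t data_size)
{
    assert(charset != NULL && (data != NULL || data_size == 0));

    iconv_t cd = iconv_open("UTF-8", charset);
    if (cd == (iconv_t)-1)
        return NULL;                        /* unsupported conversion */

    size_t cap = data_size * 2 + 1;         /* initial guess (assumption) */
    char *out = malloc(cap);
    char *in = (char *)data;                /* iconv() wants non-const */
    size_t in_left = data_size, used = 0;

    while (out != NULL) {
        char *dst = out + used;
        size_t out_left = cap - used - 1;   /* keep room for the terminator */

        size_t r = iconv(cd, &in, &in_left, &dst, &out_left);
        used = (size_t)(dst - out);
        if (r != (size_t)-1)
            break;                          /* all input consumed */
        if (errno != E2BIG) {               /* invalid or truncated input */
            free(out);
            out = NULL;
            break;
        }
        cap *= 2;                           /* output full: grow and retry */
        char *bigger = realloc(out, cap);
        if (bigger == NULL)
            free(out);
        out = bigger;
    }

    iconv_close(cd);
    if (out != NULL)
        out[used] = '\0';
    return out;
}
```

As with FromCharset(), the caller owns the returned buffer and releases it with free().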

References vlc_iconv(), vlc_iconv_close(), and vlc_iconv_open().

Referenced by vlc_readdir().

◆ FromLatin1()

static inline char *FromLatin1(const char *latin)

Converts a nul-terminated string from ISO-8859-1 to UTF-8.
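
A minimal sketch of such a conversion, assuming nothing beyond the C standard library: every ISO-8859-1 byte below 0x80 is copied unchanged, and every byte from 0x80 up becomes a two-byte UTF-8 sequence. The name `sketch_from_latin1()` is hypothetical and the code is an illustration, not VLC's source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated UTF-8 copy of a nul-terminated ISO-8859-1
 * string, or NULL if allocation fails. The caller frees it with free(). */
static char *sketch_from_latin1(const char *latin)
{
    assert(latin != NULL);

    /* Worst case: every input byte becomes two output bytes. */
    char *utf8 = malloc(2 * strlen(latin) + 1);
    if (utf8 == NULL)
        return NULL;

    char *out = utf8;
    unsigned char c;
    while ((c = (unsigned char)*latin++) != '\0') {
        if (c < 0x80) {
            *out++ = (char)c;                    /* ASCII: copied as is */
        } else {
            *out++ = (char)(0xC0 | (c >> 6));    /* lead byte: 110000xx */
            *out++ = (char)(0x80 | (c & 0x3F));  /* continuation: 10xxxxxx */
        }
    }
    *out = '\0';
    return utf8;
}
```

The output can be up to twice as long as the input, which is why the result is a fresh allocation and not an in-place rewrite.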

◆ IsUTF8()

static inline const char *IsUTF8(const char *str)

Checks UTF-8 validity.

Checks whether a null-terminated string is a valid UTF-8 byte sequence.

Parameters
str: string to check
Return values
str: the string is a valid null-terminated UTF-8 sequence
NULL: the string is not a valid UTF-8 sequence

References likely and vlc_towc().

Referenced by IsSDPString() and vlc_meta_Set().

◆ ToCharset()

void *ToCharset(const char *charset, const char *in, size_t *outsize)

Converts a nul-terminated UTF-8 string to a given character encoding.

Parameters
charset: iconv name of the character set
in: nul-terminated UTF-8 string
outsize: pointer to hold the byte size of the result
Returns
A pointer to the result, which must be released using free(). The UTF-8 nul terminator is included in the conversion if the target character encoding supports it. However it is not included in the returned byte size. In case of error, NULL is returned and the byte size is undefined.

References unlikely, vlc_iconv(), vlc_iconv_close(), and vlc_iconv_open().

◆ us_asprintf()

int us_asprintf(char **, const char *, ...)

us_asprintf() has the same prototype as asprintf(), but doesn't use the system locale.

References us_vasprintf().

◆ us_atof()

double us_atof(const char *)

us_atof() has the same prototype as ANSI C atof() but it expects a dot as decimal separator, regardless of the system locale.

References us_strtod().

Referenced by config_ChainParse(), config_LoadCmdLine(), var_OptionParse(), and vlc_audio_replay_gain_MergeFromMeta().

◆ us_strtod()

double us_strtod(const char *, char **)

us_strtod() has the same prototype as ANSI C strtod() but it uses the POSIX/C decimal format, regardless of the current numeric locale.

References freelocale(), LC_NUMERIC_MASK, newlocale(), and uselocale().

Referenced by us_atof().
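
The References line above names newlocale(), uselocale(), and freelocale(). One way these POSIX.1-2008 calls can be combined to parse a number independently of the current locale is sketched below. `sketch_c_strtod()` is a hypothetical name. Falling back to the current locale when newlocale() fails is an assumption of this sketch, and some platforms declare these functions in a header other than `<locale.h>`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Parses a double with the "C" numeric locale (dot as decimal separator),
 * whatever locale the calling thread currently uses. */
static double sketch_c_strtod(const char *str, char **end)
{
    assert(str != NULL);

    /* Build a locale whose numeric category is "C". */
    locale_t loc = newlocale(LC_NUMERIC_MASK, "C", (locale_t)0);
    /* Install it for this thread only, remembering the previous one. */
    locale_t old = (loc != (locale_t)0) ? uselocale(loc) : (locale_t)0;

    double value = strtod(str, end);

    if (loc != (locale_t)0) {
        uselocale(old);     /* restore the previous thread locale */
        freelocale(loc);
    }
    return value;
}
```

Because uselocale() affects only the calling thread, this approach avoids the process-wide side effects that a temporary setlocale() call would have.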

◆ us_strtof()

float us_strtof(const char *, char **)

us_strtof() has the same prototype as ANSI C strtof() but it uses the POSIX/C decimal format, regardless of the current numeric locale.

References freelocale(), LC_NUMERIC_MASK, newlocale(), strtof(), and uselocale().

◆ us_vasprintf()

int us_vasprintf(char **, const char *, va_list)

us_vasprintf() has the same prototype as vasprintf(), but doesn't use the system locale.

References freelocale(), LC_NUMERIC_MASK, newlocale(), uselocale(), and vasprintf().

Referenced by us_asprintf().

◆ utf8_fprintf()

int utf8_fprintf(FILE *, const char *, ...)

Formats a UTF-8 string as fprintf() does, then prints it, converting to the local encoding as appropriate.

References vlc_memstream::stream and utf8_vfprintf().

◆ utf8_vfprintf()

int utf8_vfprintf(FILE *stream, const char *fmt, va_list ap)

Formats a UTF-8 string as vfprintf() does, then prints it, converting to the local encoding as appropriate.

References likely, vlc_memstream::stream, unlikely, and vasprintf().

Referenced by utf8_fprintf().

◆ vlc_iconv()

size_t vlc_iconv(vlc_iconv_t, const char **, size_t *, char **, size_t *)

◆ vlc_iconv_close()

int vlc_iconv_close(vlc_iconv_t)

◆ vlc_iconv_open()

vlc_iconv_t vlc_iconv_open(const char *, const char *)

◆ vlc_strcasestr()

char *vlc_strcasestr(const char *haystack, const char *needle)

Looks for a UTF-8 string within another one, case-insensitively.

Beware that this is quite slow. Unlike strcasestr(), this function works regardless of the system character encoding and handles multibyte code points correctly.

Parameters
haystack: string to look into
needle: string to look for
Returns
a pointer to the first occurrence of the needle within the haystack, or NULL if no occurrence was found.

References unlikely and vlc_towc().

Referenced by playlist_LiveSearchUpdateInternal().

◆ vlc_towc()

size_t vlc_towc(const char *str, uint32_t *restrict pwc)

Decodes a code point from UTF-8.

Converts the first character in a UTF-8 sequence into a Unicode code point.

Parameters
str: a UTF-8 byte sequence [IN]
pwc: address of a location to store the code point [OUT]
Returns
the number of bytes occupied by the decoded code point
Return values
(size_t)-1: not a valid UTF-8 sequence
0: null character (i.e. str points to an empty string)
1: (non-null) ASCII character
2-4: non-ASCII character
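
To make the return convention concrete, here is a hedged sketch of a decoder that follows it, together with a validation loop in the style of IsUTF8(). Both `sketch_towc()` and `sketch_is_utf8()` are hypothetical names. The checks (over-long forms, surrogates, values above U+10FFFF) follow the usual UTF-8 well-formedness rules and are not taken from VLC's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one code point. Returns 0 for the terminator, (size_t)-1 if the
 * sequence is invalid, otherwise the sequence length in bytes (1 to 4). */
static size_t sketch_towc(const char *str, uint32_t *pwc)
{
    assert(str != NULL && pwc != NULL);
    const unsigned char *p = (const unsigned char *)str;
    uint32_t cp, min;
    size_t len;

    if (p[0] < 0x80) {                 /* ASCII, including the terminator */
        *pwc = p[0];
        return p[0] != 0;
    } else if ((p[0] & 0xE0) == 0xC0) {
        len = 2; cp = p[0] & 0x1F; min = 0x80;
    } else if ((p[0] & 0xF0) == 0xE0) {
        len = 3; cp = p[0] & 0x0F; min = 0x800;
    } else if ((p[0] & 0xF8) == 0xF0) {
        len = 4; cp = p[0] & 0x07; min = 0x10000;
    } else {
        return (size_t)-1;             /* stray continuation or bad lead */
    }

    for (size_t i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)     /* also stops at an early terminator */
            return (size_t)-1;
        cp = (cp << 6) | (p[i] & 0x3F);
    }

    if (cp < min                        /* over-long encoding */
     || cp > 0x10FFFF                   /* beyond the Unicode range */
     || (cp >= 0xD800 && cp <= 0xDFFF)) /* UTF-16 surrogate */
        return (size_t)-1;

    *pwc = cp;
    return len;
}

/* Returns str if every sequence decodes, NULL otherwise. */
static const char *sketch_is_utf8(const char *str)
{
    const char *p = str;
    uint32_t cp;
    size_t n;

    while ((n = sketch_towc(p, &cp)) != 0) {
        if (n == (size_t)-1)
            return NULL;
        p += n;
    }
    return str;
}
```

A caller advances through a string by adding the returned length to the pointer until the function returns 0, and treats `(size_t)-1` as the error case.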

Referenced by EnsureUTF8(), IsUTF8(), vlc_str2keycode(), and vlc_strcasestr().