stri_opts_collator: Generate a List with Collator Settings¶
Description¶
A convenience function to tune the ICU Collator’s behavior, e.g., in stri_compare, stri_order, stri_unique, stri_duplicated, as well as stri_detect_coll and other stringi-search-coll functions.
Usage¶
stri_opts_collator(
    locale = NULL,
    strength = 3L,
    alternate_shifted = FALSE,
    french = FALSE,
    uppercase_first = NA,
    case_level = FALSE,
    normalization = FALSE,
    normalisation = normalization,
    numeric = FALSE
)

stri_coll(
    locale = NULL,
    strength = 3L,
    alternate_shifted = FALSE,
    french = FALSE,
    uppercase_first = NA,
    case_level = FALSE,
    normalization = FALSE,
    normalisation = normalization,
    numeric = FALSE
)
Arguments¶
| Argument | Description |
|---|---|
| locale | single string |
| strength | single integer in {1,2,3,4}, which defines collation strength |
| alternate_shifted | single logical value |
| french | single logical value; used in Canadian French |
| uppercase_first | single logical value |
| case_level | single logical value; controls whether an extra case level (positioned before the third level) is generated or not |
| normalization | single logical value |
| normalisation | alias of normalization |
| numeric | single logical value; when turned on, this attribute generates a collation key for the numeric value of substrings of digits; this is a way to get ‘100’ to sort AFTER ‘2’; note that negative or non-integer numbers will not be ordered properly |
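As a rough illustration of how some of these settings behave, the sketch below compares and sorts a few strings under different options. It assumes the stringi package is installed, pins the locale to 'en_US' so that results do not depend on the session's default locale, and uses made-up sample strings; the variable names are hypothetical.

```r
library(stringi)

# Strength 1 compares base letters only, so accents are ignored.
s1 <- stri_opts_collator(locale = "en_US", strength = 1L)
acc_equal_s1 <- stri_cmp_equiv("resume", "r\u00e9sum\u00e9", opts_collator = s1)

# Strength 2 also takes accents into account, but still ignores case.
s2 <- stri_opts_collator(locale = "en_US", strength = 2L)
acc_equal_s2 <- stri_cmp_equiv("resume", "r\u00e9sum\u00e9", opts_collator = s2)
case_equal_s2 <- stri_cmp_equiv("resume", "RESUME", opts_collator = s2)

# uppercase_first = TRUE sorts upper case letters before their lower case counterparts.
upper <- stri_opts_collator(locale = "en_US", uppercase_first = TRUE)
sorted_upper <- stri_sort(c("a", "A", "b", "B"), opts_collator = upper)

# numeric = TRUE compares runs of digits by value; the minus sign is not
# treated as part of the number, so negative numbers end up in the wrong order.
num <- stri_opts_collator(locale = "en_US", numeric = TRUE)
sorted_neg <- stri_sort(c("-10", "-2"), opts_collator = num)
```

Under these assumptions, the accented and unaccented strings should compare as equivalent only at strength 1, the case variants should still be equivalent at strength 2, and '-2' should sort before '-10' even though -10 is the smaller number.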
Details¶
ICU’s collator performs a locale-aware, natural-language-like string comparison. This is a more reliable way of establishing relationships between strings than the one provided by base R, and it is both more complex and more appropriate for natural-language text than ordinary bytewise comparison.
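To make the contrast with bytewise comparison concrete, here is a minimal sketch. It assumes the stringi package is installed, uses base R's radix sort as a stand-in for bytewise (C-locale) ordering, and pins the collator to 'en_US'; the input vector is made up for illustration.

```r
library(stringi)

x <- c("b", "A", "a", "B")

# Bytewise (C-locale) ordering: all upper case ASCII letters precede lower case ones.
bytewise <- sort(x, method = "radix")

# ICU collation for English: strings are ordered by letter first,
# and case only acts as a tie-breaker.
collated <- stri_sort(x, opts_collator = stri_opts_collator(locale = "en_US"))
```

The bytewise result groups 'A' and 'B' ahead of 'a' and 'b', whereas the collated result keeps each letter's case variants next to each other.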
Value¶
Returns a named list object; missing settings are left with default values.
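Because the result is an ordinary list, it can be built once and reused across several calls. The following sketch assumes the stringi package is installed; the variable names and sample strings are hypothetical.

```r
library(stringi)

# Build the settings once; only the arguments given here are set explicitly.
opts <- stri_opts_collator(strength = 2L, numeric = TRUE)
str(opts)  # inspect the settings stored in the list

# Reuse the same settings in different collation-aware functions.
reused_sort <- stri_sort(c("number100", "number2", "Number10"), opts_collator = opts)
reused_cmp <- stri_cmp("number100", "number2", opts_collator = opts)
```

Keeping the options in one object avoids repeating the same arguments in every call and makes it easier to keep comparisons and sorts consistent with each other.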
References¶
Collation – ICU User Guide, https://unicode-org.github.io/icu/userguide/collation/
ICU Collation Service Architecture – ICU User Guide, https://unicode-org.github.io/icu/userguide/collation/architecture.html
icu::Collator Class Reference – ICU4C API Documentation, https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/classicu_1_1Collator.html
See Also¶
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other locale_sensitive: %s<%(), about_locale, about_search_boundaries, about_search_coll, stri_compare(), stri_count_boundaries(), stri_duplicated(), stri_enc_detect2(), stri_extract_all_boundaries(), stri_locate_all_boundaries(), stri_order(), stri_rank(), stri_sort(), stri_sort_key(), stri_split_boundaries(), stri_trans_tolower(), stri_unique(), stri_wrap()
Other search_coll: about_search, about_search_coll
Examples¶
stri_cmp('number100', 'number2')
## [1] -1
stri_cmp('number100', 'number2', opts_collator=stri_opts_collator(numeric=TRUE))
## [1] 1
stri_cmp('number100', 'number2', numeric=TRUE) # equivalent
## [1] 1
stri_cmp('above mentioned', 'above-mentioned')
## [1] -1
stri_cmp('above mentioned', 'above-mentioned', alternate_shifted=TRUE)
## [1] 0