stri_subset: Select Elements that Match a Given Pattern
Description
These functions return or modify a sub-vector where there is a match to a given pattern. In other words, they are roughly equivalent (but faster and easier to use) to a call to str[stri_detect(str, ...)] or str[stri_detect(str, ...)] <- value.
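The equivalence described above can be sketched as follows. This is a minimal illustration, assuming the stringi package is installed; the vector x and the variable names are made up for the example.

```r
library(stringi)

# Hypothetical input, for illustration only
x <- c("apple", "banana", "cherry", "mango")

# Two-step form: build a logical mask, then index with it
via_detect <- x[stri_detect_fixed(x, "an")]

# One-step form: stri_subset_fixed selects the matching elements directly
via_subset <- stri_subset_fixed(x, "an")

print(via_detect)
print(via_subset)
```

Both calls should select the same two elements ("banana" and "mango"); the one-step form simply avoids repeating the name of the vector.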
Usage
stri_subset(str, ..., regex, fixed, coll, charclass)
stri_subset(str, ..., regex, fixed, coll, charclass) <- value
stri_subset_fixed(
str,
pattern,
omit_na = FALSE,
negate = FALSE,
...,
opts_fixed = NULL
)
stri_subset_fixed(str, pattern, negate=FALSE, ..., opts_fixed=NULL) <- value
stri_subset_charclass(str, pattern, omit_na = FALSE, negate = FALSE)
stri_subset_charclass(str, pattern, negate=FALSE) <- value
stri_subset_coll(
str,
pattern,
omit_na = FALSE,
negate = FALSE,
...,
opts_collator = NULL
)
stri_subset_coll(str, pattern, negate=FALSE, ..., opts_collator=NULL) <- value
stri_subset_regex(
str,
pattern,
omit_na = FALSE,
negate = FALSE,
...,
opts_regex = NULL
)
stri_subset_regex(str, pattern, negate=FALSE, ..., opts_regex=NULL) <- value
Arguments
| str | character vector; strings to search within |
| ... | supplementary arguments passed to the underlying functions, including additional settings for opts_collator, opts_regex, opts_fixed, and so on |
| value | non-empty character vector of replacement strings; replacement function only |
| pattern, regex, fixed, coll, charclass | character vector; search patterns (no more than the length of str) |
| omit_na | single logical value; should missing values be excluded from the result? |
| negate | single logical value; whether a no-match is rather of interest |
| opts_collator, opts_fixed, opts_regex | a named list used to tune up the search engine’s settings; see stri_opts_collator, stri_opts_fixed, and stri_opts_regex, respectively |
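A short sketch of how omit_na, negate, and the engine options interact may help here. It assumes stringi is installed; the input vector and variable names are hypothetical, and case_insensitive is one of the settings accepted by stri_opts_regex.

```r
library(stringi)

# Hypothetical input with one missing value
x <- c("stringi R", "123", NA, "ID456")

# Default (omit_na = FALSE): a missing input stays in the result as NA
kept_na <- stri_subset_regex(x, "[0-9]+$")

# omit_na = TRUE drops missing values from the result
no_na <- stri_subset_regex(x, "[0-9]+$", omit_na = TRUE)

# negate = TRUE selects the elements that do NOT match
non_matching <- stri_subset_regex(x, "[0-9]+$", negate = TRUE, omit_na = TRUE)

# Engine settings can be given as a list via opts_regex ...
ci_opts <- stri_subset_regex(
  x, "^id", omit_na = TRUE,
  opts_regex = stri_opts_regex(case_insensitive = TRUE)
)

# ... or passed directly through `...`
ci_dots <- stri_subset_regex(x, "^id", omit_na = TRUE, case_insensitive = TRUE)
```

Passing settings through `...` is shorter for one-off calls, while a reusable opts_regex list is handier when the same settings apply to several calls.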
Details
Vectorized over str as well as partially over pattern and value, with recycling of the elements in the shorter vector if necessary. As the aim here is to subset str, pattern cannot be longer than the former. Moreover, if the number of items to replace is not a multiple of the length of value, a warning is emitted and the unused elements are ignored. Hence, in the replacement versions, the length of the output is the same as the length of str.
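The recycling rules above can be illustrated with a small sketch. It assumes stringi is installed; the vectors str and y are invented for the example, and the sizes are chosen so that no recycling warning is triggered.

```r
library(stringi)

# pattern is shorter than str, so it is recycled: "a", "d", "a", "d"
str <- c("abc", "bcd", "cde", "def")
picked <- stri_subset_fixed(str, c("a", "d"))
print(picked)

# Replacement form: four elements of y contain a digit and value has
# two elements, so the number of items to replace is a multiple of
# length(value) and no warning is expected
y <- c("a1", "b", "c2", "d3", "e4")
stri_subset_regex(y, "[0-9]") <- c("X", "Y")
print(y)

# The modified vector keeps the length of the original one,
# and the non-matching element "b" is left untouched
length(y)
```

Here "cde" is dropped from picked because it is paired with the recycled pattern "a", not with "d".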
stri_subset and stri_subset<- are convenience functions. They call either stri_subset_regex, stri_subset_fixed, stri_subset_coll, or stri_subset_charclass, depending on the argument used.
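As a rough illustration of this dispatch, the same call to stri_subset can behave differently depending on which named argument carries the pattern. The sketch assumes stringi is installed; the input vector is hypothetical.

```r
library(stringi)

# Hypothetical input, for illustration only
x <- c("a.b", "ab", "a-b")

# regex = ... dispatches to stri_subset_regex: "." matches any character
by_regex <- stri_subset(x, regex = "a.b")

# fixed = ... dispatches to stri_subset_fixed: "." is taken literally
by_fixed <- stri_subset(x, fixed = "a.b")

# charclass = ... dispatches to stri_subset_charclass:
# keep the strings that contain at least one punctuation character
by_class <- stri_subset(x, charclass = "\\p{P}")

print(by_regex)
print(by_fixed)
print(by_class)
```

Naming the argument makes the intended search engine explicit at the call site, which is the main reason to prefer the wrapper over the engine-specific functions in short scripts.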
Value
The stri_subset_* functions return a character vector. As usual, the output encoding is UTF-8.
The stri_subset_*<- functions modify str ‘in-place’.
See Also
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other search_subset: about_search
Examples
stri_subset_regex(c('stringi R', '123', 'ID456', ''), '^[0-9]+$')
## [1] "123"
x <- c('stringi R', '123', 'ID456', '')
`stri_subset_regex<-`(x, '[0-9]+$', negate=TRUE, value=NA) # returns a copy
## [1] NA "123" "ID456" NA
stri_subset_regex(x, '[0-9]+$') <- NA # modifies `x` in-place
print(x)
## [1] "stringi R" NA NA ""