stri_sub: Extract a Substring From or Replace a Substring In a Character Vector
Description
stri_sub extracts substrings at the given code point-based index ranges. Its replacement version substitutes (in place) parts of a string with the given replacement strings. stri_sub_replace is a variant that works well with magrittr's pipe operator and returns a modified copy of the input vector.
For extracting/replacing multiple substrings from/within each string, see stri_sub_all.
Usage
stri_sub(str, from = 1L, to = -1L, length)
stri_sub(str, from=1L, to=-1L, length, omit_na=FALSE) <- value
stri_sub_replace(..., replacement, value = replacement)
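Arguments
| Argument | Description |
|---|---|
| str | a character vector |
| from | an integer vector giving the start indexes, or a two-column matrix of type cbind(from, to) |
| to | an integer vector giving the end indexes; mutually exclusive with length and with from being a matrix |
| length | an integer vector giving the substring lengths; mutually exclusive with to and with from being a matrix |
| omit_na | a single logical value; indicates whether missing values in any of the indexes or in value leave the corresponding part of str unchanged [replacement function only] |
| value | a character vector defining the replacement strings [replacement function only] |
| ... | arguments to be passed to stri_sub<- |
| replacement | alias of value |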
Details
Vectorized over str, [value], from, and (to or length). Parameters to and length are mutually exclusive.
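The vectorization rule above can be illustrated with a short sketch. The variable names (windows, tails, heads) are made up for this illustration; only stri_sub and its documented arguments are assumed.

```r
library(stringi)

# One string, several index pairs: the string is recycled.
windows <- stri_sub("abcdef", from = 1:3, to = 4:6)
print(windows)   # "abcd" "bcde" "cdef"

# Several strings, one start index: the index is recycled.
tails <- stri_sub(c("abc", "defg"), from = 2)
print(tails)     # "bc" "efg"

# length may be given instead of to and is vectorized in the same way.
heads <- stri_sub("abcdef", from = 1, length = 1:3)
print(heads)     # "a" "ab" "abc"
```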
Indexes are 1-based, i.e., the start of a string is at index 1. For negative indexes in from or to, counting starts at the end of the string. For instance, index -1 denotes the last code point in the string. Non-positive length gives an empty string.
Argument from gives the start of a substring to extract. Argument to defines the last index of a substring, inclusive. Alternatively, its length may be provided.
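A minimal sketch of the indexing conventions described above, using an ASCII string so that code points and characters coincide. The variable names are illustrative only.

```r
library(stringi)

x <- "abcdefgh"

by_range   <- stri_sub(x, 3, 5)           # code points 3 to 5, inclusive
by_length  <- stri_sub(x, 3, length = 3)  # same substring, given by its length
from_end   <- stri_sub(x, -3, -1)         # negative indexes count from the end
default_to <- stri_sub(x, -3)             # to defaults to -1, the last code point
empty      <- stri_sub(x, 3, length = 0)  # zero length gives an empty string

print(c(by_range, by_length, from_end, default_to, empty))
# "cde" "cde" "fgh" "fgh" ""
```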
If from is a two-column matrix, then these two columns are used as from and to, respectively, and anything passed explicitly as to or length is ignored. Such index matrices are generated by stri_locate_first and stri_locate_last. If extraction based on stri_locate_all is needed, see stri_sub_all.
In stri_sub, out-of-bound indexes are silently corrected. If from > to, then an empty string is returned.
In stri_sub<-, some configurations of indexes may work as substring ‘injection’ at the front, at the back, or in the middle of a string.
If both to and length are provided, length has priority over to.
Note that for some Unicode strings, the extracted substrings might not be well-formed, especially if the input strings are not NFC-normalized (see stri_trans_nfc) or include byte order marks, bidirectional text marks, and so on. Handle with care.
Value
stri_sub and stri_sub_replace return a character vector. stri_sub<- changes the str object in place.
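The difference between the two replacement forms can be sketched as follows. The variable names (original, copy, modified) are chosen for the illustration.

```r
library(stringi)

original <- "hello world"

# stri_sub_replace returns a modified copy; original is left untouched.
copy <- stri_sub_replace(original, 1, 5, replacement = "HELLO")

# The replacement function modifies the object it is applied to.
modified <- original
stri_sub(modified, 1, 5) <- "HELLO"

print(original)  # "hello world"
print(copy)      # "HELLO world"
print(modified)  # "HELLO world"
```

Because stri_sub_replace takes the string as its first argument and returns the result, it can be placed in a pipeline, whereas the replacement form needs a named object to assign into.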
See Also
Other indexing: stri_locate_all_boundaries(), stri_locate_all(), stri_sub_all()
Examples
library(stringi)

# extraction, vectorized over from, to, or length
s <- 'Lorem ipsum dolor sit amet, consectetur adipisicing elit.'
stri_sub(s, from=1:3*6, to=21)
stri_sub(s, from=c(1,7,13), length=5)
stri_sub(s, from=1, length=1:3)
stri_sub(s, -17, -7)
stri_sub(s, -5, length=4)

# replacement: each call modifies s in place
(stri_sub(s, 1, 5) <- 'stringi')
(stri_sub(s, -6, length=5) <- '.')
(stri_sub(s, 1, 1:3) <- 1:2)

# index matrices returned by the stri_locate_first_* and stri_locate_last_* functions
x <- c('12 3456 789', 'abc', '', NA, '667')
stri_sub(x, stri_locate_first_regex(x, '[0-9]+')) # see stri_extract_first
stri_sub(x, stri_locate_last_regex(x, '[0-9]+')) # see stri_extract_last
stri_sub_replace(x, stri_locate_first_regex(x, '[0-9]+'),
    omit_na=TRUE, replacement='***') # see stri_replace_first
stri_sub_replace(x, stri_locate_last_regex(x, '[0-9]+'),
    omit_na=TRUE, replacement='***') # see stri_replace_last
stri_sub(x, stri_locate_first_regex(x, '[0-9]+'), omit_na=TRUE) <- '***'
print(x)

# pipe-friendly form (requires magrittr)
## Not run: x %>% stri_sub_replace(1, 5, replacement='new_substring')