about_search: String Searching

Description

This man page explains how to perform string search-based operations in stringi.

Details

The following independent string searching engines are available in stringi.

  • stri_*_regex – ICU’s regular expressions (regexes), see about_search_regex,

  • stri_*_fixed – locale-independent byte-wise pattern matching, see about_search_fixed,

  • stri_*_coll – ICU’s StringSearch, locale-sensitive, Collator-based pattern search, useful for natural language processing tasks, see about_search_coll,

  • stri_*_charclass – character classes search, e.g., Unicode General Categories or Binary Properties, see about_search_charclass,

  • stri_*_boundaries – text boundary analysis, see about_search_boundaries.
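
As a rough illustration of how the engines listed above differ, the following sketch runs each of them on a small made-up input. The vector x and the patterns are hypothetical, chosen only for demonstration; strength and skip_word_none are options forwarded to stri_opts_collator and stri_opts_brkiter, respectively.

```r
library(stringi)

x <- c("spam, eggs", "SPAM", "bacon")  # hypothetical sample data

# regex engine: "^" anchors the match at the start of the string
r_regex <- stri_detect_regex(x, "^spam")

# fixed engine: exact, byte-wise comparison, so letter case matters
r_fixed <- stri_detect_fixed(x, "spam")

# collator engine: at primary strength, case differences are ignored
r_coll <- stri_detect_coll(x, "spam", strength = 1)

# charclass engine: does the string contain any uppercase letter?
r_class <- stri_detect_charclass(x, "\\p{Lu}")

# boundaries engine: split at word boundaries, dropping non-word chunks
r_words <- stri_split_boundaries(x[1], type = "word", skip_word_none = TRUE)
```

Here only the collator-based search is expected to report a match for "SPAM", because the fixed and regex engines compare letters case-sensitively by default.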

Each search engine supports a range of search-based operations. These may include:

  • stri_detect_* - detect if a pattern occurs in a string, see, e.g., stri_detect,

  • stri_count_* - count the number of pattern occurrences, see, e.g., stri_count,

  • stri_locate_* - locate all, first, or last occurrences of a pattern, see, e.g., stri_locate,

  • stri_extract_* - extract all, first, or last occurrences of a pattern, see, e.g., stri_extract and, in the case of regexes, stri_match,

  • stri_replace_* - replace all, first, or last occurrences of a pattern, see, e.g., stri_replace and also stri_trim,

  • stri_split_* - split a string into chunks indicated by occurrences of a pattern, see, e.g., stri_split,

  • stri_startswith_* and stri_endswith_* - detect if a string starts or ends with a pattern match, see, e.g., stri_startswith,

  • stri_subset_* - return a subset of a character vector with strings that match a given pattern, see, e.g., stri_subset.
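
The operations above can be sketched on a small example, here mostly with the fixed engine. The input vector and patterns are made up for illustration; the other engines can be substituted wherever they provide the same operation.

```r
library(stringi)

x <- c("apple pie", "banana", "cherry")  # hypothetical sample data

det  <- stri_detect_fixed(x, "an")                # does "an" occur?
cnt  <- stri_count_fixed(x, "a")                  # number of "a"s per string
loc  <- stri_locate_first_fixed(x, "a")           # matrix with start and end columns
ext  <- stri_extract_first_regex(x, "[aeiou]+")   # first run of vowels
rep  <- stri_replace_all_fixed(x, "a", "A")       # replace every "a"
spl  <- stri_split_fixed("a,b,c", ",")            # list of character vectors
sta  <- stri_startswith_fixed(x, "ba")            # prefix test
end  <- stri_endswith_fixed(x, "y")               # suffix test
sub  <- stri_subset_fixed(x, "rr")                # keep only matching strings
```

Each call is vectorized over x, so the results (apart from stri_subset_fixed, which filters) have one entry, row, or list element per input string.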