stri_stats_general: General Statistics for a Character Vector
Description
This function gives general statistics for a character vector, e.g., obtained by loading a text file with the readLines or stri_read_lines function, where each text ‘line’ is represented by a separate string.
Usage
stri_stats_general(str)
Arguments
| str | character vector to be aggregated |
Details
None of the strings may contain \r or \n characters; otherwise you will get an error.
Below, by ‘white space’ we mean the Unicode binary property WHITE_SPACE; see stringi-search-charclass.
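The newline restriction can be illustrated with a minimal sketch. It assumes the stringi package is installed, and the sample text is made up for illustration. A string with embedded line breaks is rejected, so it is first split with stri_split_lines1 so that each line becomes a separate string:

```r
library(stringi)

# Hypothetical input: a single string holding three lines of text.
txt <- "first line\nsecond line\n\tthird"

# Passing it directly signals an error, because it contains \n.
res <- tryCatch(stri_stats_general(txt), error = function(e) "error")

# Split at the line breaks first, then aggregate.
lines <- stri_split_lines1(txt)
stats <- stri_stats_general(lines)

stats[["Lines"]]        # 3
stats[["Chars"]]        # 27: line breaks are gone, the tab is counted
stats[["CharsNWhite"]]  # 24: two spaces and one tab are WHITE_SPACE
```

The tab character counts towards Chars but not CharsNWhite, since it carries the WHITE_SPACE property.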
Value
Returns an integer vector with the following named elements:
- Lines: number of lines (number of non-missing strings in the vector);
- LinesNEmpty: number of lines with at least one non-WHITE_SPACE character;
- Chars: total number of Unicode code points detected;
- CharsNWhite: number of Unicode code points that are not WHITE_SPACEs;
- … (other elements that may appear in future releases of stringi).
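Because the result is a named integer vector, individual statistics can be extracted by name. The following is a small sketch, assuming stringi is installed and using a made-up input vector, which also shows that missing strings are skipped:

```r
library(stringi)

# Hypothetical input: an ordinary line, a missing value,
# a whitespace-only line, and an empty line.
x <- c("a b", NA, "  ", "")
st <- stri_stats_general(x)

st[["Lines"]]        # 3: the NA is not counted
st[["LinesNEmpty"]]  # 1: only "a b" has a non-white-space character
st[["Chars"]]        # 5: three code points in "a b" plus two spaces
st[["CharsNWhite"]]  # 2: "a" and "b"
```

Accessing elements by name rather than by position should keep such code working if further elements are added in a later release.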
See Also
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other stats: stri_stats_latex()
Examples
library(stringi)
s <- c('Lorem ipsum dolor sit amet, consectetur adipisicing elit.',
       'nibh augue, suscipit a, scelerisque sed, lacinia in, mi.',
       'Cras vel lorem. Etiam pellentesque aliquet tellus.',
       '')
stri_stats_general(s)
##       Lines LinesNEmpty       Chars CharsNWhite
##           4           3         163         142