Installing stringi
In most cases, installing stringi is as simple as calling:
install.packages("stringi")
However, due to the overwhelming complexity of the ICU4C library, upon which stringi is based, and the colourful diversity of operating systems, their flavours, and particular setups, some users may still experience a few issues that hopefully can be resolved with the help of this short manual.
Also, some build tweaks are possible.
ICU4C
The stringi package depends on the ICU4C >= 55 library.
If we install the package from sources and either:
this requirement is not met (check out https://icu.unicode.org/download, the libicu-devel rpm on Fedora/CentOS/OpenSUSE, libicu-dev on Ubuntu/Debian, etc.),
pkg-config fails to find appropriate build settings for ICU-based projects, or
R CMD INSTALL is called with the --configure-args='--disable-pkg-config' argument, or the environment variable STRINGI_DISABLE_PKG_CONFIG is set to a non-zero value, or install.packages("stringi", configure.args="--disable-pkg-config") is executed,
then ICU will be built together with stringi. A custom subset of ICU4C 69.1 is shipped with the package. We also include ICU4C 55.1 which can be used as a fallback version (e.g., on older Solaris boxes).
To get the most out of stringi, you are strongly encouraged to rely on our ICU4C package bundle. This ensures maximum portability across all platforms (Windows and macOS users by default fetch the pre-compiled binaries from CRAN built precisely this way).
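To see which of the cases above is likely to apply on a given machine, you can ask pkg-config directly. The following is a minimal sketch, not a copy of what ./configure does: it assumes that the system ICU registers itself under the pkg-config module name icu-i18n (the name ICU's own .pc files normally use), and icu_version_ok is a hypothetical helper introduced only for this illustration.

```shell
#!/bin/sh
# Sketch: check whether pkg-config reports a system ICU4C >= 55.
# Assumption: the ICU pkg-config module is called "icu-i18n".

# Hypothetical helper: succeeds if the major part of version string $1
# is at least $2 (default: 55, the minimum stated in this manual).
icu_version_ok() {
    major="${1%%.*}"
    min="${2:-55}"
    [ "$major" -ge "$min" ] 2>/dev/null
}

if command -v pkg-config >/dev/null 2>&1 &&
   ver="$(pkg-config --modversion icu-i18n 2>/dev/null)"; then
    if icu_version_ok "$ver"; then
        echo "System ICU $ver found; it may be used unless pkg-config is disabled."
    else
        echo "System ICU $ver is older than 55; the bundled ICU would be built."
    fi
else
    echo "pkg-config did not report ICU; the bundled ICU would be built."
fi
```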
ICU Data Library and No Internet Access
Note that if you choose to use our ICU4C bundle, then – by default – the ICU data library will be downloaded from one of our mirror servers. However, if you have already downloaded a version of icudt*.zip suitable for your platform (big/little-endian), you may wish to install the package by calling:
install.packages("stringi", configure.vars="ICUDT_DIR=<icudt_dir>")
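Before passing ICUDT_DIR, it may be worth confirming that the directory really contains a data archive. The sketch below only checks for a file matching icudt*.zip and prints the R call to use; the default location $HOME/icudt and the helper has_icudt are hypothetical and serve purely as an illustration.

```shell
#!/bin/sh
# Sketch: verify that a directory holds an icudt*.zip archive before
# using it as ICUDT_DIR. The default path below is a made-up example.

# Hypothetical helper: succeeds if directory $1 contains an icudt*.zip file.
has_icudt() {
    for f in "$1"/icudt*.zip; do
        [ -f "$f" ] && return 0
    done
    return 1
}

icudt_dir="${1:-$HOME/icudt}"

if has_icudt "$icudt_dir"; then
    echo "Found an ICU data archive. From within R, call:"
    printf 'install.packages("stringi", configure.vars="ICUDT_DIR=%s")\n' "$icudt_dir"
else
    echo "No icudt*.zip in $icudt_dir; the archive would be downloaded instead."
fi
```

Note that this does not check whether the archive matches the endianness of your platform.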
Moreover, if you have no internet access on the machines you try to install stringi on, try fetching the latest development version of the package, as it is shipped with the ICU data archives.
You can build a distributable source package that includes all the required ICU data files (for off-line use) by omitting some relevant lines in the .Rbuildignore file. The following command sequence should do the trick:
wget https://github.com/gagolews/stringi/archive/master.zip -O stringi.zip
unzip stringi.zip
sed -i '/\/icu..\/data/d' stringi-master/.Rbuildignore
R CMD build stringi-master
Assuming the most recent development version of the package is numbered x.y.z, a file named stringi_x.y.z.tar.gz is created in the current working directory. The package can now be installed (the source bundle may be propagated via scp etc.) by executing:
R CMD INSTALL stringi_x.y.z.tar.gz
Alternatively, call from within an R session:
install.packages("stringi_x.y.z.tar.gz", repos=NULL)
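Before copying the source bundle to an off-line machine, you may want to confirm that the ICU data archives actually ended up inside it. This is a minimal sketch that lists the tarball and looks for entries ending in icudt*.zip; the helper tarball_has_icudt is hypothetical, and stringi_x.y.z.tar.gz stands for whatever file R CMD build produced.

```shell
#!/bin/sh
# Sketch: check that a built source tarball includes ICU data archives.

# Hypothetical helper: succeeds if tarball $1 lists at least one
# entry whose name ends in icudt<something>.zip.
tarball_has_icudt() {
    tar -tzf "$1" 2>/dev/null | grep -q 'icudt.*\.zip$'
}

tarball="${1:-stringi_x.y.z.tar.gz}"

if [ ! -f "$tarball" ]; then
    echo "$tarball not found; pass the path to the built package."
elif tarball_has_icudt "$tarball"; then
    echo "$tarball contains ICU data archives (suitable for off-line use)."
else
    echo "$tarball has no icudt*.zip; check the .Rbuildignore edit."
fi
```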
C++11 Issues
A decent C++11 compiler is required to build ICU4C 69.1 from sources. Note that GCC versions before 4.9.0 have a bug where ::max_align_t has been defined, but not std::max_align_t.
If our built-in workaround does not work, you may try calling:
install.packages("stringi", configure.args="--with-extra-cxxflags='-std=c++11'")
More generally, your build chain may be misconfigured. Check out, amongst others, <R_inst_dir>/etc/Makeconf (e.g., are you using -std=gnu++11 instead of -std=c++11?). Refer to https://cran.r-project.org/doc/manuals/r-release/R-admin.html for more details.
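To find out whether a particular compiler is affected by the std::max_align_t issue, you can try compiling a tiny test program with it. The sketch below is only an illustration: check_max_align_t is a hypothetical helper, the compiler defaults to $CXX or c++ (which need not be the one R uses), and the final grep assumes that your Makeconf contains variables named CXX11 and CXX11STD.

```shell
#!/bin/sh
# Sketch: test whether a compiler accepts std::max_align_t in C++11 mode.

# Hypothetical helper: succeeds if compiler $1 compiles a file that
# uses std::max_align_t with -std=c++11.
check_max_align_t() {
    cxx="$1"
    tmpdir="$(mktemp -d)" || return 2
    cat > "$tmpdir/test.cpp" <<'EOF'
#include <cstddef>
int main() { return sizeof(std::max_align_t) > 0 ? 0 : 1; }
EOF
    "$cxx" -std=c++11 -c "$tmpdir/test.cpp" -o "$tmpdir/test.o" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

cxx_to_test="${CXX:-c++}"
if check_max_align_t "$cxx_to_test"; then
    echo "$cxx_to_test compiles std::max_align_t with -std=c++11."
else
    echo "$cxx_to_test failed (compiler missing or std::max_align_t unavailable)."
fi

# Show the C++11 settings recorded by R, if R is on the PATH.
if command -v R >/dev/null 2>&1; then
    grep -E '^CXX11(STD)? *=' "$(R RHOME)/etc/Makeconf"
fi
```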
There is also the option of using the fallback version, ICU4C 55.1. However, it requires support for the long long type in a few functions (this is not part of the C++98 standard; works on Solaris, though). Try:
install.packages("stringi", configure.args="--disable-cxx11")
Customising the Build Process
Additional features and options of the ./configure script:
--disable-cxx11: Disable C++11; if you build ICU4C from sources, make sure your C++ compiler supports the long long type.
--disable-icu-bundle: Enforce system ICU.
--disable-pkg-config: Disable pkg-config; ICU4C will be compiled from sources.
--with-extra-cflags=FLAGS: Additional C compiler flags.
--with-extra-cppflags=FLAGS: Additional C/C++ preprocessor flags.
--with-extra-cxxflags=FLAGS: Additional C++ compiler flags.
--with-extra-ldflags=FLAGS: Additional linker flags.
--with-extra-libs=FLAGS: Additional libraries to link against.
Some influential environment variables:
ICUDT_DIR: Optional directory with an already downloaded ICU data archive (icudt*.zip); either an absolute path or a path relative to <package source dir>/src; defaults to icuXX/data.
PKG_CONFIG_PATH: An optional list of directories to search for pkg-config's .pc files.
R_HOME: Override the R directory, e.g., /usr/lib64/R. Note that $R_HOME/bin/R should point to the R executable.
CAT: The cat command used to generate the list of source files to compile.
PKG_CONFIG: The pkg-config command used to fetch the necessary compiler flags to link to the existing libicu installation.
STRINGI_DISABLE_CXX11: Disable C++11; see also --disable-cxx11.
STRINGI_DISABLE_PKG_CONFIG: Compile ICU from sources; see also --disable-pkg-config.
STRINGI_DISABLE_ICU_BUNDLE: Enforce system ICU; see also --disable-icu-bundle.
STRINGI_CFLAGS: see --with-extra-cflags.
STRINGI_CPPFLAGS: see --with-extra-cppflags.
STRINGI_CXXFLAGS: see --with-extra-cxxflags.
STRINGI_LDFLAGS: see --with-extra-ldflags.
STRINGI_LIBS: see --with-extra-libs.
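Several ./configure options can be combined in a single --configure-args string. The sketch below merely assembles and prints such a command line so that you can inspect it before running it; build_install_cmd is a hypothetical helper, stringi_x.y.z.tar.gz is a placeholder file name, and the -O2 flag is an arbitrary example value.

```shell
#!/bin/sh
# Sketch: assemble an R CMD INSTALL call from a list of configure options.

# Hypothetical helper: $1 is the source tarball, the remaining
# arguments are ./configure options joined into --configure-args.
build_install_cmd() {
    tarball="$1"
    shift
    printf "R CMD INSTALL --configure-args='%s' %s\n" "$*" "$tarball"
}

# Example: build the bundled ICU and pass an extra C++ flag.
build_install_cmd stringi_x.y.z.tar.gz \
    --disable-pkg-config --with-extra-cxxflags=-O2
```

The same choices can be expressed through the environment variables listed above, e.g., by setting STRINGI_DISABLE_PKG_CONFIG to a non-zero value and STRINGI_CXXFLAGS to the extra flags before calling R CMD INSTALL.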
Conclusion
We expect that with a correctly configured C++11 compiler and a properly installed system ICU4C distribution, you should face no problems installing the package, especially if you use our ICU4C bundle and have working internet access.
If you do not manage to set up a successful stringi build, do not hesitate to file a bug report. However, please check the list of archived (closed) issues first – it is very likely that a solution to your problem has already been posted.
To help diagnose your error further, please run (from the terminal):
cd /tmp
wget https://github.com/gagolews/stringi/archive/master.zip
unzip master.zip
cd stringi-master
./configure
And submit the output from ./configure as well as the contents of config.log.
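If you prefer to attach both pieces of information as a single file, the steps can be wrapped as follows. This is a minimal sketch only: collect_diagnostics and the file names configure-output.txt and stringi-diagnostics.tar.gz are hypothetical, and the archive path should be absolute because the helper changes directory before writing it.

```shell
#!/bin/sh
# Sketch: capture the ./configure output and config.log in one archive.

# Hypothetical helper: $1 is the unpacked source directory,
# $2 is the (absolute) path of the archive to create.
collect_diagnostics() {
    srcdir="$1"
    out="$2"
    (
        cd "$srcdir" || exit 1
        # Keep both stdout and stderr, and record the exit status.
        ./configure > configure-output.txt 2>&1
        echo "configure exit status: $?" >> configure-output.txt
        tar -czf "$out" configure-output.txt config.log
    )
}

# Run only from within an unpacked source tree such as stringi-master.
if [ -x ./configure ]; then
    collect_diagnostics . "$PWD/stringi-diagnostics.tar.gz" &&
        echo "Wrote $PWD/stringi-diagnostics.tar.gz"
else
    echo "No ./configure here; cd into the unpacked source directory first."
fi
```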