21.9 Simple Transforms
We start with some manual special transforms we may want to do. For example, we might want to replace "/", sometimes used to separate alternative words, with a space. This avoids the two words being run together into a single string of characters by later transformations. We might also replace "@" and "|" with a space, for the same reason.
To create a custom transformation we use tm::content_transformer() to build a function that performs the replacement, and then apply it to the corpus using tm::tm_map():
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
docs <- tm_map(docs, toSpace, "/")
docs <- tm_map(docs, toSpace, "@")
docs <- tm_map(docs, toSpace, "\\|")
This can be done with a single call:
docs <- tm_map(docs, toSpace, "/|@|\\|")
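To see why the combined pattern gives the same result as the three separate calls, here is a minimal sketch using only base R's gsub(), which is the function the custom transformer wraps. The sample strings and the variable names (x, stepwise, combined) are made up for illustration and are not part of the corpus used in this chapter.

```r
# Hypothetical sample text, not taken from the corpus in this chapter.
x <- c("red/green apples", "user@example.com", "yes|no")

# Three separate replacements, one per character.
stepwise <- gsub("/", " ", x)
stepwise <- gsub("@", " ", stepwise)
stepwise <- gsub("\\|", " ", stepwise)  # "|" must be escaped to match literally

# One replacement using regular expression alternation.
# The unescaped "|" separates alternatives; the escaped "\\|" is a literal bar.
combined <- gsub("/|@|\\|", " ", x)

# Both approaches should produce the same character vector.
stopifnot(identical(stepwise, combined))
print(combined)
```

The escaping matters: an unescaped "|" in the pattern means "or", so only the final "\\|" matches an actual vertical bar in the text.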
Check the email address.
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0