TagHelper Tools: Tools for Machine Learning With Text
Spoken Interfaces and Dialogue Processing
By Carolyn Rosé
This project provides a basic resource for researchers who use text-processing technology in their work or who want to learn about text mining at an introductory level. It has been used by researchers in fields as diverse as law, medicine, social sciences, education, architecture, and civil engineering. A specific goal of our research is to develop text classification technology for classifying sentences with coding schemes developed for behavioral research, especially in the area of computer-supported collaborative learning. A particular focus of our work is text classification that performs well on highly skewed data sets, where some categories are far rarer than others. Another important problem is avoiding overfitting to idiosyncratic features on non-IID data sets, in which examples are not independent and identically distributed. TagHelper Tools has been downloaded more than a thousand times in the past 18 months.