Mori, Yuji

Applications of Natural Language Processing for Predicting Self-Harm Risk

2022

Advisor(s): Schoenberg, Frederic

Abstract

Self-harm is a mental health concern considered severe enough to require immediate attention. This research aims to predict individuals’ risk of self-harm from their social media history. The dataset and the broader task were originally developed by the eRisk lab at the Conference and Labs of the Evaluation Forum (CLEF). By analyzing the text corpus, it is possible to identify writing patterns that are highly correlated with self-harm. Various methods rooted in Natural Language Processing (NLP) are explored to this end, including sentiment analysis, random forest classification, and deep learning classification using BERT. The results show that adequate classification is attainable with these methods; additional processing steps and model features that could improve predictive performance are also discussed.

UCLA
