Charlie F. Welch | publications

2024

C3NLP
Do Multilingual Large Language Models Mitigate Stereotype Bias?

Nie, Shangrui, Fromm, Michael, Welch, Charles, Görge, Rebekka, Karimi, Akbar, Plepi, Joan, Mowmita, Nazia Afsan, Flores-Herr, Nicolas, Ali, Mehdi, and Flek, Lucie

In Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP 2024

Abs Bib

Preliminary results in recent research suggest that multilingual LLMs have a tendency to show a less biased behavior compared to monolingual ones. However, a systematic analysis of the effect of multilingual training on bias mitigation and the impact of other factors, such as scaling the method for LLMs or equal distribution of training data, is lacking. To address this gap, we trained and evaluated new large-scale multilingual and monolingual models across five European languages (English, German, French, Italian and Spanish). For a systematic comparison, we use six newly trained LLMs consistent in size (2.6B) and architecture; one monolingual model for each of the five selected European languages and a multilingual one with equally distributed data proportion across those languages, trained on publicly available data only. For evaluation, we automatically translated standard bias benchmarks to these five languages and in each language manually verified the translation quality and the bias preservation by independent annotators. We experimentally show that our bias mitigation hypothesis holds consistently. Furthermore, we find that multilingual models show not only less bias but also higher prediction accuracy with the same amount of training data and model architecture and size. We make both the novel multilingual bias benchmarks and the trained models publicly available.
@inproceedings{nie-2024-do-multilingual, title = {Do Multilingual Large Language Models Mitigate Stereotype Bias?}, author = {Nie, Shangrui and Fromm, Michael and Welch, Charles and Görge, Rebekka and Karimi, Akbar and Plepi, Joan and Mowmita, Nazia Afsan and Flores-Herr, Nicolas and Ali, Mehdi and Flek, Lucie}, booktitle = {Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP}, year = {2024}, }
ACL
Perspective Taking through Generating Responses to Conflict Situations

Plepi, Joan, Welch, Charles, and Flek, Lucie

In Findings of the Association for Computational Linguistics: ACL 2024 2024

Abs Bib

Although language model performance across diverse tasks continues to improve, these models still struggle to understand and explain the beliefs of other people. This skill requires perspective-taking, the process of conceptualizing the point of view of another person. Perspective taking becomes challenging when the text reflects more personal and potentially more controversial beliefs. We explore this task through natural language generation of responses to conflict situations. We evaluate novel modifications to recent architectures for conditioning generation on an individual’s comments and self-disclosure statements. Our work extends the Social-Chem-101 corpus, using 95k judgements written by 6k authors from English Reddit data, for each of whom we obtained 20-500 self-disclosure statements. Our evaluation methodology borrows ideas from both personalized generation and theory of mind literature. Our proposed perspective-taking models outperform recent work, especially the twin encoder model conditioned on self-disclosures with high similarity to the conflict situation.
@inproceedings{plepi-2024-perspective-taking, title = {Perspective Taking through Generating Responses to Conflict Situations}, author = {Plepi, Joan and Welch, Charles and Flek, Lucie}, booktitle = {Findings of the Association for Computational Linguistics: ACL 2024}, year = {2024}, }
WOAH
Harnessing Personalization Methods to Identify and Predict Unreliable Information Spreader Behavior

Ashraf, Shaina, Gruschka, Fabio, Flek, Lucie, and Welch, Charlie

In Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH) at NAACL 2024 2024

Abs Bib

Studies on detecting and understanding the spread of unreliable news on social media have identified key characteristic differences between reliable and unreliable posts. These differences in language use also vary in expression across individuals, making it important to consider personal factors in unreliable news detection. The application of personalization methods for this has been made possible by recent publication of datasets with user histories, though this area is still largely unexplored. In this paper we present approaches to represent social media users in order to improve performance on three tasks: (1) classification of unreliable news posts, (2) classification of unreliable news spreaders, and, (3) prediction of the spread of unreliable news. We compare the User2Vec method from previous work to two other approaches; a learnable user embedding layer trained with the downstream task, and a representation derived from an authorship attribution classifier. We demonstrate that the implemented strategies substantially improve classification performance over state-of-the-art and provide initial results on the task of unreliable news prediction.
@inproceedings{ashraf-2024-harnessing-personalization, title = {Harnessing Personalization Methods to Identify and Predict Unreliable Information Spreader Behavior}, author = {Ashraf, Shaina and Gruschka, Fabio and Flek, Lucie and Welch, Charlie}, booktitle = {Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH) at NAACL 2024}, year = {2024}, }
NAACL
Corpus Considerations for Annotator Modeling and Scaling

Sarumi, Olufunke, Neuendorf, Béla, Plepi, Joan, Flek, Lucie, Schlötterer, Jörg, and Welch, Charles

In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics 2024

Abs Bib

Recent trends in natural language processing research and annotation tasks affirm a paradigm shift from the traditional reliance on a single ground truth to a focus on individual perspectives, particularly in subjective tasks. In scenarios where annotation tasks are meant to encompass diversity, models that solely rely on the majority class labels may inadvertently disregard valuable minority perspectives. This oversight could result in the omission of crucial information and, in a broader context, risk disrupting the balance within larger ecosystems. As the landscape of annotator modeling unfolds with diverse representation techniques, it becomes imperative to investigate their effectiveness with the fine-grained features of the datasets in view. This study systematically explores various annotator modeling techniques and compares their performance across seven corpora. From our findings, we show that the commonly used user token model consistently outperforms more complex models. We introduce a composite embedding approach and show distinct differences in which model performs best as a function of the agreement with a given dataset. Our findings shed light on the relationship between corpus statistics and annotator modeling performance, which informs future work on corpus construction and perspectivist NLP.
@inproceedings{sarumi-2024-corpus-considerations, title = {Corpus Considerations for Annotator Modeling and Scaling}, author = {Sarumi, Olufunke and Neuendorf, Béla and Plepi, Joan and Flek, Lucie and Schlötterer, Jörg and Welch, Charles}, booktitle = {Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics}, year = {2024}, }
NLPerspectives
A Perspectivist Corpus of Numbers in Social Judgements

May, Marlon, Flek, Lucie, and Welch, Charles

In Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) at LREC-COLING 2024 May 2024

Abs Bib

With growing interest in the use of large language models, it is becoming increasingly important to understand whose views they express. These models tend to generate output that conforms to majority opinion and are not representative of diverse views. As a step toward building models that can take differing views into consideration, we build a novel corpus of social judgements. We crowdsourced annotations of a subset of the Commonsense Norm Bank that contained numbers in the situation descriptions and asked annotators to replace the number with a range defined by a start and end value that, in their view, correspond to the given verdict. Our corpus contains unaggregated annotations and annotator demographics. We describe our annotation process for social judgements and will release our dataset to support future work on numerical reasoning and perspectivist approaches to natural language processing.
@inproceedings{may-etal-2024-perspectivist, title = {A Perspectivist Corpus of Numbers in Social Judgements}, author = {May, Marlon and Flek, Lucie and Welch, Charles}, booktitle = {Proceedings of the 3rd Workshop on Perspectivist Approaches to NLP (NLPerspectives) at LREC-COLING 2024}, month = may, year = {2024}, address = {Torino, Italia}, publisher = {ELRA and ICCL}, url = {https://aclanthology.org/2024.nlperspectives-1.4}, pages = {42--48} }
LREC-COLING
Appraisal Framework for Clinical Empathy: A Novel Application to Breaking Bad News Conversations

Lahnala, Allison, Neuendorf, Béla, Thomin, Alexander, Welch, Charles, Stibane, Tina, and Flek, Lucie

In Proceedings of The 2024 Joint International Conference, on Computational Linguistics, Language Resources and Evaluation May 2024

Abs Bib

We introduce an innovative annotation approach that draws on well-established frameworks for clinical empathy and breaking bad news (BBN) conversations. Empathy is essential in healthcare communication and requires considering the interactive dynamics of discourse relations. We constructed Empathy in BBNs, a dataset of simulated BBN conversations in German, annotated with our novel annotation scheme in collaboration with a large medical school to support research on educational tools for medical didactics. The dataset contains conversations between medical students and standardized patients and fine-grained annotations of the components of empathic interactions. We provide a detailed description of our span and relation labeling annotation procedure, where two trained annotators obtained Krippendorff’s alpha agreement of ≥ 0.85. The annotation is based on 1) Pounds (2011)’s appraisal framework (AF) for clinical empathy, which is grounded in systemic functional linguistics, and 2) SPIKES protocol for breaking bad news (Baile et al., 2000) commonly taught in medical didactics training. This approach presents novel opportunities to study clinical empathic behavior and enables the training of models to detect causal relations involving empathy, a highly desirable feature of systems that can provide feedback to medical professionals in training. We present illustrative examples and discuss applications of annotation scheme and insights we can draw from the framework
@inproceedings{lahnala-2024-lead-empathy, title = {Appraisal Framework for Clinical Empathy: A Novel Application to Breaking Bad News Conversations}, author = {Lahnala, Allison and Neuendorf, Béla and Thomin, Alexander and Welch, Charles and Stibane, Tina and Flek, Lucie}, booktitle = {Proceedings of The 2024 Joint International Conference, on Computational Linguistics, Language Resources and Evaluation}, year = {2024}, }
CLPsych
Archetypes and Entropy: Theory-Driven Extraction of Evidence for Suicide Risk

Varadarajan, Vasudha, Lahnala, Allison, Ganesan, Adithya V., Dey, Gourab, Mangalik, Siddharth, Bucur, Ana-Maria, Soni, Nikita, Rao, Rajath, Lanning, Kevin, Vallejo, Isabella, Flek, Lucie, Schwartz, H. Andrew, Welch, Charles, and Boyd, Ryan L.

In Proceedings of the Tenth Workshop on Computational Linguistics and Clinical Psychology May 2024

Abs Bib PDF

Psychological risk factors for suicide have been extensively studied for decades. However, combining explainable theory with modern data-driven language modeling approaches is non-trivial. Here, we propose and evaluate methods for identifying language patterns indicative of suicide risk by combining theory-driven suicidal archetypes with language model-based and relative entropy-based approaches. Archetypes are based on prototypical statements that evince risk of suicidality while relative entropy considers the difference between how probable the risk-familiar and risk-unfamiliar models find user language. Each approach performed well individually; combining the two strikingly improved performance, yielding our combined system submission with a BERTScore Recall of 0.906. Further, we find diagnostic language is distributed unevenly in posts, with titles containing substantial risk evidence. We conclude that a union between theory- and data-driven methods is beneficial, outperforming more modern prompt-based methods.
@inproceedings{varadarajan_archetypes_2024, title = {Archetypes and {Entropy}: {Theory}-{Driven} {Extraction} of {Evidence} for {Suicide} {Risk}}, author = {Varadarajan, Vasudha and Lahnala, Allison and Ganesan, Adithya V. and Dey, Gourab and Mangalik, Siddharth and Bucur, Ana-Maria and Soni, Nikita and Rao, Rajath and Lanning, Kevin and Vallejo, Isabella and Flek, Lucie and Schwartz, H. Andrew and Welch, Charles and Boyd, Ryan L.}, booktitle = {Proceedings of the {Tenth} {Workshop} on {Computational} {Linguistics} and {Clinical} {Psychology}}, year = {2024}, }

2023

JMIR
Expressive Interviewing Agents to Support Health-Related Behavior Change: A Study of COVID-19 Behaviors

Stewart, Ian, Welch, Charles, An, Lawrence, Resnicow, Kenneth, Pennebaker, James, and Mihalcea, Rada

Journal of Medical Internet Research May 2023

Abs Bib PDF

Expressive writing and motivational interviewing are well-known approaches to help patients cope with stressful life events. While these methods are often applied by human counselors, it is less well understood if an automated AI approach can benefit patients. Providing an automated method would help expose a wider range of people to the possible benefits of motivational interviewing, with lower cost and more adaptability to sudden events like the COVID-19 pandemic. This study presents an automated writing system and evaluates possible outcomes among participants with respect to behavior related to the COVID-19 pandemic. Participants exhibited short term positive changes in mental health, but not long-term, and some linguistic metrics of writing style were correlated with positive change in behavior. While there were no significant long-term effects observed, the positive short term effect suggests that the Expressive Interviewing intervention could be used in cases where a patient lacks access to traditional therapy and needs a short-term solution.
@article{stewart-2023-expressive, title = {Expressive Interviewing Agents to Support Health-Related Behavior Change: A Study of COVID-19 Behaviors}, author = {Stewart, Ian and Welch, Charles and An, Lawrence and Resnicow, Kenneth and Pennebaker, James and Mihalcea, Rada}, journal = {Journal of Medical Internet Research}, year = {2023}, publisher = {JMIR Publications Inc., Toronto, Canada}, }
WASSA
CAISA at WASSA 2023 Shared Task: Domain Transfer for Empathy, Distress, and Personality Prediction

Gruschka, Fabio, Lahnala, Allison, Welch, Charles, and Flek, Lucie

In Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis May 2023

Abs Bib PDF

This research contributes to the task of predicting empathy and personality traits within dialogue, an important aspect of natural language processing, as part of our experimental work for the WASSA 2023 Empathy and Emotion Shared Task. For predicting empathy, emotion polarity, and emotion intensity on turns within a dialogue, we employ adapters trained on social media interactions labeled with empathy ratings in a stacked composition with the target task adapters. Furthermore, we embed demographic information to predict Interpersonal Reactivity Index (IRI) subscales and Big Five Personality Traits utilizing BERT-based models. The results from our study provide valuable insights, contributing to advancements in understanding human behavior and interaction through text. Our team ranked 2nd on the personality and empathy prediction tasks, 4th on the interpersonal reactivity index, and 6th on the conversational task.
@inproceedings{gruschka-etal-2023-caisa, title = {CAISA at WASSA 2023 Shared Task: Domain Transfer for Empathy, Distress, and Personality Prediction}, author = {Gruschka, Fabio and Lahnala, Allison and Welch, Charles and Flek, Lucie}, booktitle = {Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis}, year = {2023}, publisher = {Association for Computational Linguistics}, }
RANLP
Challenges of GPT-3-based Conversational Agents for Healthcare

Lechner, Fabian, Lahnala, Allison, Flek, Lucie, and Welch, Charles

In Proceedings of the International Conference on Recent Advances in Natural Language Processing May 2023

Abs Bib PDF

The potential to provide patients with faster information access while allowing medical specialists to focus on urgent tasks makes medical domain dialog agents appealing. However, there can be dire consequences due to the limitations of large-language models (LLMs) built into such agents. This paper investigates the challenges and risks of using GPT-3-based models for medical question-answering (MedQA). We perform several evaluations contextualized in terms of standard medical principles. We provide a procedure for manually designing patient queries to stress-test high-risk limitations of LLMs in MedQA systems. Our analysis shows that the LLMs fail to respond safely to these queries, producing invalid medical information, dangerous recommendations, and offensive content.
@inproceedings{lechner-etal-2023-challenges, title = {Challenges of GPT-3-based Conversational Agents for Healthcare}, author = {Lechner, Fabian and Lahnala, Allison and Flek, Lucie and Welch, Charles}, booktitle = {Proceedings of the International Conference on Recent Advances in Natural Language Processing}, year = {2023}, }
TLLM
Style Locality for Controllable Generation with kNN Language Models

Nawezi, Gilles, Flek, Lucie, and Welch, Charles

In Proceedings of the First Workshop on Taming Large Language Models May 2023

Abs Bib PDF

Recent language models have been improved by the addition of external memory. Nearest neighbor language models retrieve similar contexts to assist in word prediction. The addition of locality levels allows a model to learn how to weight neighbors based on their relative location to the current text in source documents, and have been shown to further improve model performance. Nearest neighbor models have been explored for controllable generation but have not examined the use of locality levels. We present a novel approach for this purpose and evaluate it using automatic and human evaluation on politeness, formality, supportiveness, and toxicity textual data. We find that our model is successfully able to control style and provides a better fluency-style trade-off than previous work.
@inproceedings{nawezi-etal-2023-style, title = {Style Locality for Controllable Generation with kNN Language Models}, author = {Nawezi, Gilles and Flek, Lucie and Welch, Charles}, booktitle = {Proceedings of the First Workshop on Taming Large Language Models}, year = {2023}, }

2022

EMNLP
A Critical Reflection and Forward Perspective on Empathy and Natural Language Processing

Lahnala, Allison, Welch, Charles, Jurgens, David, and Flek, Lucie

In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP) May 2022

Abs Bib PDF

We review the state of research on empathy in our field and identify the following issues: (1) empathy definitions are absent or abstract, which (2) leads to low construct validity and reproducibility. Moreover, (3) emotional empathy is overemphasized, skewing our focus to a narrow subset of simplified tasks. We believe these issues hinder research progress and argue that current directions will benefit from a clear conceptualization that includes operationalizing cognitive empathy components. Our main objectives are to provide insight and guidance on empathy conceptualization for NLP research objectives and to encourage researchers to pursue the overlooked opportunities in this area, highly relevant, e.g., for the clinical and educational sector.
@inproceedings{lahnala-etal-2022-acritical, title = {A Critical Reflection and Forward Perspective on Empathy and Natural Language Processing}, author = {Lahnala, Allison and Welch, Charles and Jurgens, David and Flek, Lucie}, booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2022}, publisher = {Association for Computational Linguistics}, }
EMNLP
Unifying Data Perspectivism and Personalization: An Application to Social Norms

Plepi, Joan, Neuendorf, Béla, Flek, Lucie, and Welch, Charles

In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP) May 2022

Abs Bib PDF

Instead of using a single ground truth for language processing tasks, several recent studies have examined how to represent and predict the labels of the set of annotators. However, often little or no information about annotators is known, or the set of annotators is small. In this work, we examine a corpus of social media posts about conflict from a set of 13k annotators and 210k judgements of social norms. We provide a novel experimental setup that applies personalization methods to the modeling of annotators and compare their effectiveness for predicting the perception of social norms. We further provide an analysis of performance across subsets of social situations that vary by the closeness of the relationship between parties in conflict, and assess where personalization helps the most.
@inproceedings{plepi-etal-2022-unifying, title = {Unifying Data Perspectivism and Personalization: An Application to Social Norms}, author = {Plepi, Joan and Neuendorf, B{\'e}la and Flek, Lucie and Welch, Charles}, booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2022}, publisher = {Association for Computational Linguistics}, }
NLP+CSS
Understanding Interpersonal Conflict Types and their Impact on Perception Classification

Welch, Charles, Plepi, Joan, Neuendorf, Béla, and Flek, Lucie

In Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science May 2022

Abs Bib PDF

Studies on interpersonal conflict have a long history and contain many suggestions for conflict typology. We use this as the basis of a novel annotation scheme and release a new dataset of situations and conflict aspect annotations. We then build a classifier to predict whether someone will perceive the actions of one individual as right or wrong in a given situation, outperforming previous work on this task. Our analyses include conflict aspects, but also generated clusters, which are human validated, and show differences in conflict content based on the relationship of participants to the author. Our findings have important implications for understanding conflict and social norms.
@inproceedings{welch2022understanding, title = {Understanding Interpersonal Conflict Types and their Impact on Perception Classification}, author = {Welch, Charles and Plepi, Joan and Neuendorf, B{\'e}la and Flek, Lucie}, booktitle = {Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science}, year = {2022}, }
GEM
Nearest Neighbor Language Models for Stylistic Controllable Generation

Trotta, Severino, Flek, Lucie, and Welch, Charles

In Proceedings of the Second Workshop on Generation, Evaluation & Metrics (GEM) May 2022

Abs Bib PDF

Recent language modeling performance has been greatly improved by the use of external memory. This memory encodes the context so that similar contexts can be recalled during decoding. This similarity depends on how the model learns to encode context, which can be altered to include other attributes, such as style. We construct and evaluate an architecture for this purpose, using corpora annotated for politeness, formality, and toxicity. Through extensive experiments and human evaluation we demonstrate the potential of our method to generate text while controlling style. We find that style-specific datastores improve generation performance, though results vary greatly across styles, and the effect of pretraining data and specific styles should be explored in future work.
@inproceedings{trotta2022nearest, title = {Nearest Neighbor Language Models for Stylistic Controllable Generation}, author = {Trotta, Severino and Flek, Lucie and Welch, Charles}, booktitle = {Proceedings of the Second Workshop on Generation, Evaluation & Metrics (GEM)}, year = {2022}, }
NAACL
Mitigating Toxic Degeneration with Empathetic Data: Exploring the Relationship Between Toxicity and Empathy

Lahnala, Allison, Welch, Charles, Neuendorf, Béla, and Flek, Lucie

In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies May 2022

Abs Bib PDF

Large pre-trained neural language models have supported the effectiveness of many NLP tasks, yet are still prone to generating toxic language hindering the safety of their use. Using empathetic data, we improve over recent work on controllable text generation that aims to reduce the toxicity of generated text. We find we are able to dramatically reduce the size of fine-tuning data to 7.5-30k samples while at the same time making significant improvements over state-of-the-art toxicity mitigation of up to 3.4% absolute reduction (26% relative) from the original work on 2.3m samples, by strategically sampling data based on empathy scores. We observe that the degree of improvements is subject to specific communication components of empathy. In particular, the more cognitive components of empathy significantly beat the original dataset in almost all experiments, while emotional empathy was tied to less improvement and even underperforming random samples of the original data. This is a particularly implicative insight for NLP work concerning empathy as until recently the research and resources built for it have exclusively considered empathy as an emotional concept.
@inproceedings{lahnala-etal-2022-mitigating, title = {Mitigating Toxic Degeneration with Empathetic Data: {E}xploring the Relationship Between Toxicity and Empathy}, author = {Lahnala, Allison and Welch, Charles and Neuendorf, Béla and Flek, Lucie}, booktitle = {Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies}, year = {2022}, publisher = {Association for Computational Linguistics}, }
WASSA
CAISA at WASSA 2022: Adapter-Tuning for Empathy Prediction

Lahnala, Allison, Welch, Charles, and Flek, Lucie

In Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis May 2022

Abs Bib PDF

We build a system that leverages adapters, a light weight and efficient method for leveraging large language models to perform the task Empathy and Distress prediction tasks for WASSA 2022. In our experiments, we find that stacking our empathy and distress adapters on a pre-trained emotion classification adapter performs best compared to full fine-tuning approaches and emotion feature concatenation. We make our experimental code publicly available.
@inproceedings{lahnala-etal-2022-caisa, title = {CAISA at WASSA 2022: Adapter-Tuning for Empathy Prediction}, author = {Lahnala, Allison and Welch, Charles and Flek, Lucie}, booktitle = {Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis}, year = {2022}, publisher = {Association for Computational Linguistics}, }
ACL
Leveraging Similar Users for Personalized Language Modeling with Limited Data

Welch, Charles, Gu, Chenxi, Kummerfeld, Jonathan K., Pérez-Rosas, Verónica, and Mihalcea, Rada

In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics May 2022

Abs Bib PDF

Personalized language models are designed and trained to capture language patterns specific to individual users. This makes them more accurate at predicting what a user will write. However, when a new user joins a platform and not enough text is available, it is harder to build effective personalized language models. We propose a solution for this problem, using a model trained on users that are similar to a new user. In this paper, we explore strategies for finding the similarity between new users and existing ones and methods for using the data from existing users who are a good match. We further explore the trade-off between available data for new users and how well their language can be modeled.
@inproceedings{welch-etal-2022-leveraging, title = {{L}everaging {S}imilar {U}sers for {P}ersonalized {L}anguage {M}odeling with {L}imited {D}ata}, author = {Welch, Charles and Gu, Chenxi and Kummerfeld, Jonathan K. and P{\'e}rez-Rosas, Ver{\'o}nica and Mihalcea, Rada}, booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics}, year = {2022}, publisher = {Association for Computational Linguistics}, }
ACL
Knowledge Enhanced Reflection Generation for Counseling Dialogues

Shen, Siqi, Pérez-Rosas, Verónica, Welch, Charles, Poria, Soujanya, and Mihalcea, Rada

In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics May 2022

Abs Bib PDF

In this paper, we study the effect of commonsense and domain knowledge while generating responses in counseling conversations using retrieval and generative methods for knowledge integration. We propose a pipeline that collects domain knowledge through web mining, and show that retrieval from both domain-specific and commonsense knowledge bases improves the quality of generated responses. We also present a model that incorporates knowledge generated by COMET using soft positional encoding and masked self-attention. We show that both retrieved and COMET-generated knowledge improve the system’s performance as measured by automatic metrics and also by human evaluation. Lastly, we present a comparative study on the types of knowledge encoded by our system showing that \textitcausal and \textitintentional relationships benefit the generation task more than other types of commonsense relations.
@inproceedings{shen-etal-2022-leveraging, title = {Knowledge Enhanced Reflection Generation for Counseling Dialogues}, author = {Shen, Siqi and P{\'e}rez-Rosas, Ver{\'o}nica and Welch, Charles and Poria, Soujanya and Mihalcea, Rada}, booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics}, year = {2022}, publisher = {Association for Computational Linguistics}, }

2021

Thesis

Leveraging Longitudinal Data for Personalized Prediction and Word Representations

Welch, Charles

May 2021

Bib PDF

ACL
Exploring Self-Identified Counseling Expertise in Online Support Forums

Lahnala, Allison, Zhao, Yuntian, Welch, Charles, Kummerfeld, Jonathan K., An, Lawrence C, Resnicow, Kenneth, Mihalcea, Rada, and Pérez-Rosas, Verónica

In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 May 2021

Abs Bib PDF

A growing number of people engage in online health forums, making it important to understand the quality of the advice they receive. In this paper, we explore the role of expertise in responses provided to help-seeking posts regarding mental health. We study the differences between (1) interactions with peers; and (2) interactions with self-identified mental health professionals. First, we show that a classifier can distinguish between these two groups, indicating that their language use does in fact differ. To understand this difference, we perform several analyses addressing engagement aspects, including whether their comments engage the support-seeker further as well as linguistic aspects, such as dominant language and linguistic style matching. Our work contributes toward the developing efforts of understanding how health experts engage with health information- and support-seekers in social networks. More broadly, it is a step toward a deeper understanding of the styles of interactions that cultivate supportive engagement in online communities.
@inproceedings{lahnala-etal-2021-exploring, title = {Exploring Self-Identified Counseling Expertise in Online Support Forums}, author = {Lahnala, Allison and Zhao, Yuntian and Welch, Charles and Kummerfeld, Jonathan K. and An, Lawrence C and Resnicow, Kenneth and Mihalcea, Rada and P{\'e}rez-Rosas, Ver{\'o}nica}, booktitle = {Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021}, year = {2021}, address = {Online}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2021.findings-acl.392}, doi = {10.18653/v1/2021.findings-acl.392}, pages = {4467--4480}, }
arXiv
Modeling Proficiency with Implicit User Representations

Breitwieser, Kim, Lahnala, Allison, Welch, Charles, Flek, Lucie, and Potthast, Martin

arXiv preprint arXiv:2110.08011 May 2021

Abs Bib PDF

We introduce the problem of proficiency modeling: Given a user’s posts on a social media platform, the task is to identify the subset of posts or topics for which the user has some level of proficiency. This enables the filtering and ranking of social media posts on a given topic as per user proficiency. Unlike experts on a given topic, proficient users may not have received formal training and possess years of practical experience, but may be autodidacts, hobbyists, and people with sustained interest, enabling them to make genuine and original contributions to discourse. While predicting whether a user is an expert on a given topic imposes strong constraints on who is a true positive, proficiency modeling implies a graded scoring, relaxing these constraints. Put another way, many active social media users can be assumed to possess, or eventually acquire, some level of proficiency on topics relevant to their community. We tackle proficiency modeling in an unsupervised manner by utilizing user embeddings to model engagement with a given topic, as indicated by a user’s preference for authoring related content. We investigate five alternative approaches to model proficiency, ranging from basic ones to an advanced, tailored user modeling approach, applied within two real-world benchmarks for evaluation.
@article{breitwieser2021modeling, title = {Modeling Proficiency with Implicit User Representations}, author = {Breitwieser, Kim and Lahnala, Allison and Welch, Charles and Flek, Lucie and Potthast, Martin}, journal = {arXiv preprint arXiv:2110.08011}, year = {2021}, }

2020

NLP COVID-19
Expressive Interviewing: A Conversational System for Coping with COVID-19

Welch, Charles, Lahnala, Allison, Pérez-Rosas, Verónica, Shen, Siqi, Seraj, Sarah, An, Larry, Resnicow, Kenneth, Pennebaker, James, and Mihalcea, Rada

In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020 May 2020

Abs Bib PDF Website

The ongoing COVID-19 pandemic has raised concerns for many regarding personal and public health implications, financial security and economic stability. Alongside many other unprecedented challenges, there are increasing concerns over social isolation and mental health. We introduce Expressive Interviewing – an interview-style conversational system that draws on ideas from motivational interviewing and expressive writing. Expressive Interviewing seeks to encourage users to express their thoughts and feelings through writing by asking them questions about how COVID-19 has impacted their lives. We present relevant aspects of the system’s design and implementation as well as quantitative and qualitative analyses of user interactions with the system. In addition, we conduct a comparative evaluation with a general purpose dialogue system for mental health that shows our system potential in helping users to cope with COVID-19 issues.
@inproceedings{welch-etal-2020-expressive, title = {Expressive Interviewing: A Conversational System for Coping with {COVID}-19}, author = {Welch, Charles and Lahnala, Allison and P{\'e}rez-Rosas, Ver{\'o}nica and Shen, Siqi and Seraj, Sarah and An, Larry and Resnicow, Kenneth and Pennebaker, James and Mihalcea, Rada}, booktitle = {Proceedings of the 1st Workshop on {NLP} for {COVID}-19 (Part 2) at {EMNLP} 2020}, year = {2020}, address = {Online}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2020.nlpcovid19-2.6}, doi = {10.18653/v1/2020.nlpcovid19-2.6}, }
SIGDIAL
Counseling-Style Reflection Generation Using Generative Pretrained Transformers with Augmented Context

Shen, Siqi, Welch, Charles, Mihalcea, Rada, and Pérez-Rosas, Verónica

In Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue May 2020

Abs Bib PDF

We introduce a counseling dialogue system that seeks to assist counselors while they are learning and refining their counseling skills. The system generates counselors’reflections – i.e., responses that reflect back on what the client has said given the dialogue history. Our method builds upon the new generative pretrained transformer architecture and enhances it with context augmentation techniques inspired by traditional strategies used during counselor training. Through a set of comparative experiments, we show that the system that incorporates these strategies performs better in the reflection generation task than a system that is just fine-tuned with counseling conversations. To confirm our findings, we present a human evaluation study that shows that our system generates naturally-looking reflections that are also stylistically and grammatically correct.
@inproceedings{shen-etal-2020-counseling, title = {Counseling-Style Reflection Generation Using Generative Pretrained Transformers with Augmented Context}, author = {Shen, Siqi and Welch, Charles and Mihalcea, Rada and P{\'e}rez-Rosas, Ver{\'o}nica}, booktitle = {Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue}, year = {2020}, address = {1st virtual meeting}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2020.sigdial-1.2}, pages = {10--20}, }
EMNLP
Improving Low Compute Language Modeling with In-Domain Embedding Initialisation

Welch, Charles, Mihalcea, Rada, and Kummerfeld, Jonathan K.

In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) May 2020

Abs Bib PDF Blog Code

Many NLP applications, such as biomedical data and technical support, have 10-100 million tokens of in-domain data and limited computational resources for learning from it. How should we train a language model in this scenario? Most language modeling research considers either a small dataset with a closed vocabulary (like the standard 1 million token Penn Treebank), or the whole web with byte-pair encoding. We show that for our target setting in English, initialising and freezing input embeddings using in-domain data can improve language model performance by providing a useful representation of rare words, and this pattern holds across several different domains. In the process, we show that the standard convention of tying input and output embeddings does not improve perplexity when initializing with embeddings trained on in-domain data.
@inproceedings{welch-etal-2020-improving, title = {Improving Low Compute Language Modeling with In-Domain Embedding Initialisation}, author = {Welch, Charles and Mihalcea, Rada and Kummerfeld, Jonathan K.}, booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2020}, address = {Online}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2020.emnlp-main.696}, doi = {10.18653/v1/2020.emnlp-main.696}, pages = {8625--8634}, }
EMNLP
Compositional Demographic Word Embeddings

Welch, Charles, Kummerfeld, Jonathan K., Pérez-Rosas, Verónica, and Mihalcea, Rada

In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) May 2020

Abs Bib PDF Code

Word embeddings are usually derived from corpora containing text from many individuals, thus leading to general purpose representations rather than individually personalized representations. While personalized embeddings can be useful to improve language model performance and other language processing tasks, they can only be computed for people with a large amount of longitudinal data, which is not the case for new users. We propose a new form of personalized word embeddings that use demographic-specific word representations derived compositionally from full or partial demographic information for a user (i.e., gender, age, location, religion). We show that the resulting demographic-aware word representations outperform generic word representations on two tasks for English: language modeling and word associations. We further explore the trade-off between the number of available attributes and their relative effectiveness and discuss the ethical implications of using them.
@inproceedings{welch-etal-2020-compositional, title = {Compositional Demographic Word Embeddings}, author = {Welch, Charles and Kummerfeld, Jonathan K. and P{\'e}rez-Rosas, Ver{\'o}nica and Mihalcea, Rada}, booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2020}, address = {Online}, publisher = {Association for Computational Linguistics}, url = {https://aclanthology.org/2020.emnlp-main.334}, doi = {10.18653/v1/2020.emnlp-main.334}, pages = {4076--4089}, }
COLING
Exploring the Value of Personalized Word Embeddings

Welch, Charles, Kummerfeld, Jonathan K., Pérez-Rosas, Verónica, and Mihalcea, Rada

In Proceedings of the 28th International Conference on Computational Linguistics May 2020

Abs Bib PDF Code

In this paper, we introduce personalized word embeddings, and examine their value for language modeling. We compare the performance of our proposed prediction model when using personalized versus generic word representations, and study how these representations can be leveraged for improved performance. We provide insight into what types of words can be more accurately predicted when building personalized models. Our results show that a subset of words belonging to specific psycholinguistic categories tend to vary more in their representations across users and that combining generic and personalized word embeddings yields the best performance, with a 4.7% relative reduction in perplexity. Additionally, we show that a language model using personalized word embeddings can be effectively used for authorship attribution.
@inproceedings{welch-etal-2020-exploring, title = {Exploring the Value of Personalized Word Embeddings}, author = {Welch, Charles and Kummerfeld, Jonathan K. and P{\'e}rez-Rosas, Ver{\'o}nica and Mihalcea, Rada}, booktitle = {Proceedings of the 28th International Conference on Computational Linguistics}, year = {2020}, address = {Barcelona, Spain (Online)}, publisher = {International Committee on Computational Linguistics}, url = {https://aclanthology.org/2020.coling-main.604}, doi = {10.18653/v1/2020.coling-main.604}, pages = {6856--6862}, }

2019

CICLing
Look Who’s Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog

Welch, Charles, Pérez-Rosas, Verónica, Kummerfeld, Jonathan K, and Mihalcea, Rada

May 2019

Abs Bib PDF

We examine a large dialog corpus obtained from the conversation history of a single individual with 104 conversation partners. The corpus consists of half a million instant messages, across several messaging platforms. We focus our analyses on seven speaker attributes, each of which partitions the set of speakers, namely: gender; relative age; family member; romantic partner; classmate; co-worker; and native to the same country. In addition to the content of the messages, we examine conversational aspects such as the time messages are sent, messaging frequency, psycholinguistic word categories, linguistic mirroring, and graph-based features reflecting how people in the corpus mention each other. We present two sets of experiments predicting each attribute using (1) short context windows; and (2) a larger set of messages. We find that using all features leads to gains of 9-14% over using message text only.
@article{welch2019look, title = {Look Who's Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog}, author = {Welch, Charles and P{\'e}rez-Rosas, Ver{\'o}nica and Kummerfeld, Jonathan K and Mihalcea, Rada}, publisher = {Conference on Computational Linguistics and Intelligent Text Processing}, booktitle = {Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing}, year = {2019}, }
IEEE
Learning from Personal Longitudinal Dialog Data

Welch, Charles, Pérez-Rosas, Verónica, Kummerfeld, Jonathan K, and Mihalcea, Rada

IEEE Intelligent Systems May 2019

Abs Bib PDF Code

We explore the use of longitudinal dialog data for two dialog prediction tasks: next message prediction and response time prediction. We show that a neural model using personal data that leverages a combination of message content, style matching, time features, and speaker attributes leads to the best results for both tasks, with error rate reductions of up to 15% compared to a classifier that relies exclusively on message content and to a classifier that does not use personal data.
@article{welch2019learning, title = {Learning from Personal Longitudinal Dialog Data}, author = {Welch, Charles and P{\'e}rez-Rosas, Ver{\'o}nica and Kummerfeld, Jonathan K and Mihalcea, Rada}, journal = {IEEE Intelligent Systems}, volume = {34}, number = {4}, pages = {16--23}, year = {2019}, publisher = {IEEE}, }

2018

LREC
World Knowledge for Abstract Meaning Representation Parsing

Welch, Charles, Kummerfeld, Jonathan K., Feng, Song, and Mihalcea, Rada

In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) May 2018

Abs Bib PDF

In this paper we explore the role played by world knowledge in semantic parsing. We look at the types of errors that currently exist in a state-of-the-art Abstract Meaning Representation (AMR) parser, and explore the problem of how to integrate world knowledge to reduce these errors. We look at three types of knowledge from (1) WordNet hypernyms and super senses,(2) Wikipedia entity links, and (3) retraining a named entity recognizer to identify concepts in AMR. The retrained entity recognizer is not perfect and cannot recognize all concepts in AMR and we examine the limitations of the named entity features using a set of oracles. The oracles show how performance increases if it can recognize different subsets of AMR concepts. These results show improvement on multiple fine-grained metrics, including a 6% increase in named entity F-score, and provide insight into the potential of world knowledge for future work in Abstract Meaning Representation parsing.
@inproceedings{welch-etal-2018-world, title = {World Knowledge for {A}bstract {M}eaning {R}epresentation Parsing}, author = {Welch, Charles and Kummerfeld, Jonathan K. and Feng, Song and Mihalcea, Rada}, booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation ({LREC} 2018)}, year = {2018}, address = {Miyazaki, Japan}, publisher = {European Language Resources Association (ELRA)}, url = {https://aclanthology.org/L18-1492}, }
TAC
Entity and Event Extraction from Scratch Using Minimal Training Data

Wendlandt, Laura, Wilson, Steve R, Ignat, Oana, Welch, Charles, Zhang, Li, Wang, Mingzhe, Deng, Jia, and Mihalcea, Rada

May 2018

Abs Bib PDF

Understanding current world events in real-time involves sifting through news articles, tweets, photos, and videos from many different perspectives. The goal of the DARPA-funded AIDA project1 is to automate much of this process, building a knowledge base that can be queried to strategically generate hypotheses about different aspects of an event. We are participating in this project as a TA1 team, and we are building the first step of the overall system. Given raw multimodal input (eg, text, images, video), our goal is to generate a knowledge graph with entities, events, and relations.
@article{wendlandt2018entity, title = {Entity and Event Extraction from Scratch Using Minimal Training Data}, author = {Wendlandt, Laura and Wilson, Steve R and Ignat, Oana and Welch, Charles and Zhang, Li and Wang, Mingzhe and Deng, Jia and Mihalcea, Rada}, year = {2018}, publisher = {Text Analysis Conference}, booktitle = {Proceedings of the Text Analysis Conference 2018}, }

2016

COLING
Targeted Sentiment to Understand Student Comments

Welch, Charles, and Mihalcea, Rada

In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers May 2016

Abs Bib PDF Code

We address the task of targeted sentiment as a means of understanding the sentiment that students hold toward courses and instructors, as expressed by students in their comments. We introduce a new dataset consisting of student comments annotated for targeted sentiment and describe a system that can both identify the courses and instructors mentioned in student comments, as well as label the students’ sentiment toward those entities. Through several comparative evaluations, we show that our system outperforms previous work on a similar task.
@inproceedings{welch-mihalcea-2016-targeted, title = {Targeted Sentiment to Understand Student Comments}, author = {Welch, Charles and Mihalcea, Rada}, booktitle = {Proceedings of {COLING} 2016, the 26th International Conference on Computational Linguistics: Technical Papers}, year = {2016}, address = {Osaka, Japan}, publisher = {The COLING 2016 Organizing Committee}, url = {https://aclanthology.org/C16-1233}, pages = {2471--2481}, }
IVA
Targeted Sentiment Analysis: Identifying Student Sentiment Toward Courses and Instructors

Welch, Charles, and Mihalcea, Rada

In Proceedings of Intelligent Virtual Agents Conference May 2016

Abs Bib PDF Code

We examine targeted sentiment analysis for the purpose of building a dialog system for academic advising. For dialog tasks such as course selection it is important to recognize the entities (ie, courses and instructors) and the sentiment that students express toward them. We examine the effect of domain specific features, and show performance improvements for both the entity recognition and sentiment analysis tasks over baseline methods. A discussion of errors provides insight into the limitations of our method and directions for future work.
@inproceedings{welch2016targeted, title = {Targeted Sentiment Analysis: Identifying Student Sentiment Toward Courses and Instructors}, author = {Welch, Charles and Mihalcea, Rada}, publisher = {Intelligent Virtual Agents Conference}, booktitle = {Proceedings of Intelligent Virtual Agents Conference}, year = {2016}, }

2015

AAAI
What Women Want: Analyzing Research Publications to Understand Gender Preferences in Computer Science

Mihalcea, Rada, and Welch, Charles

In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence May 2015

Abs Bib

While the number of women who choose to pursue computer science and engineering careers is growing, men continue to largely outnumber them. In this paper, we describe a data mining approach that relies on a large collection of scientific articles to identify differences in gender interests in this field. Our hope is that through a better understanding of the differences between male and female preferences, we can enable more effective outreach and retention, and consequently contribute to the growth of the number of women who choose to pursue careers in this field.
@inproceedings{mihalcea2015women, title = {What Women Want: Analyzing Research Publications to Understand Gender Preferences in Computer Science}, author = {Mihalcea, Rada and Welch, Charles}, booktitle = {Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence}, year = {2015}, }

2014

Serdica
Classification of Paintings by Artist, Movement, and Indoor Setting Using MPEG-7 Descriptor Features

Welch, Charles

Serdica Journal of Computing May 2014

Abs Bib

Image classification is an essential problem for content based image retrieval and image processing. Visual properties can be extracted from images in the form of MPEG-7 descriptors. Statistical methods can use these properties as features and be used to derive an effective method of classifying images by evaluating a minimal number of properties used in the MPEG-7 descriptor. Classification by artist, artistic movement, and indoor/outdoor setting is examined using J48, J48 graft, best first, functional, and least absolute deviation tree algorithms. An improved accuracy of 11% in classification of artist and 17% in classification of artistic movement over previous work is achieved using functional trees. In addition classification by indoor/outdoor setting shows that the method can be applied to new categories. We present an analysis of generated decision trees that shows edge histogram information is most prominent in classification of artists and artistic movements, while scalable color information is most useful for classification of indoor/outdoor setting.
@article{welch2014classification, title = {Classification of Paintings by Artist, Movement, and Indoor Setting Using MPEG-7 Descriptor Features}, author = {Welch, Charles}, journal = {Serdica Journal of Computing}, volume = {8}, number = {1}, pages = {15--28}, year = {2014}, }

2013

ANT
An Efficient Algorithm for Finding Large Localizable Regions in Wireless Sensor Networks

Kanchi, Saroja, and Welch, Charles

Procedia Computer Science May 2013

Abs Bib

In this paper, we find a novel algorithm that can identify large localizable regions within a wireless sensor network. This algorithm is based on graph rigidity. We find a new technique to annex localizable subgraphs to other localizable subgraphs, thus expanding the localization. We simulate the results on wireless sensor graphs of various topologies. Our results indicate that we are able to localize almost all nodes in wireless sensor networks of various radii. Using MDS-MAP techniques in conjuction with this algorithm, wireless sensor networks can be localized with very small number of anchor nodes.
@article{kanchi2013efficient, title = {An Efficient Algorithm for Finding Large Localizable Regions in Wireless Sensor Networks}, author = {Kanchi, Saroja and Welch, Charles}, journal = {Procedia Computer Science}, volume = {19}, pages = {1081--1087}, year = {2013}, publisher = {Elsevier}, }