Kenneth is a postdoctoral associate in the Department of Psychology and the Center for Integrative Developmental Science at Cornell University. He received his Ph.D. in Quantitative Psychology from the University of Notre Dame. His research currently focuses on two streams: integrative data analysis and text mining. More broadly, he is interested in Bayesian statistics, multilevel modeling, topic modeling, statistical software development, and their application to advancing psychological and health research. He works with Dr. Anthony Ong in the Center for Integrative Developmental Science, studying affective mechanisms to improve health outcomes across the lifespan.
PhD Quantitative Psychology, 2022
University of Notre Dame
MS Applied Statistics, 2017
Rochester Institute of Technology
BS Psychology, 2013
Rochester Institute of Technology
Integrative data analysis (IDA) is an alternative to meta-analysis that combines participant-level data from multiple studies. Two approaches, fixed effects models (FEM) and multilevel models (MLM), have been used in psychological applications of IDA, but have not been fully evaluated. Because IDA combines data from multiple studies, two different kinds of fixed effects can be studied in IDA: study-level and participant-level effects. Furthermore, between-study differences need to be modeled carefully. For IDA with cross-sectional data, we reviewed three FEMs and two MLMs. We focused on (a) whether and how they can estimate and test participant-level and study-level fixed effects; and (b) whether and how they model between-study differences in study means and participant-level effects. Because IDA is typically conducted with fewer than 30 studies, we evaluated the performance of these models and different MLM estimation methods in a simulation study under realistic IDA conditions. While two of the FEMs accurately estimate the fixed effects, they do not model between-study differences in participant-level effects, leading to incorrect inferences. Only a random-slopes MLM that accounts for differences in both study means and participant-level effects provided accurate inferences and estimates of the fixed effects and between-study differences. We found that MLMs can be feasible for IDA with as few as three to six studies using appropriate estimation methods. We illustrated the application of the five models and how they can provide different estimates and inferences in an empirical example. We conclude with recommendations to guide researchers when planning an IDA.
Objective: Text-based responses may provide significant contributions to suicide risk prediction, yet research including text data is limited. This may be due to a lack of exposure to and familiarity with statistical analyses for this data structure. Method: The current study provides an overview of data processing and statistical algorithms for text data, guided by an empirical example of 947 online participants who completed both open-ended items and traditional self-report measures. We give an introduction to a number of text-based statistical approaches, including dictionary-based methods, topic modeling, word embeddings, and deep learning. Results: We analyze responses from the open-ended question “How do you feel today?”, detailing characteristics of the responses, as well as predicting past-year suicidal ideation. Conclusions: We see the analysis of text from social media, open-ended questions, and other text sources (e.g., medical records) as an important form of complementary assessment to traditional scales, shedding light on what we are missing in our current set of questionnaires, which may ultimately serve to improve both our understanding and prediction of suicide.
Text is a burgeoning data source for psychological researchers, but little methodological research has focused on adapting popular modeling approaches for text to the context of psychological research. One popular measurement model for text, topic modeling, uses a latent mixture model to represent topics underlying a body of documents. Recently, psychologists have studied relationships between these topics and other psychological measures by using estimates of the topics as regression predictors along with other manifest variables. While similar two-stage approaches involving estimated latent variables are known to yield biased estimates and incorrect standard errors, two-stage topic modeling approaches have received limited statistical study and, as we show, are subject to the same problems. To address these problems, we proposed a novel statistical model, supervised latent Dirichlet allocation with covariates (SLDAX), that jointly incorporates a latent variable measurement model of text and a structural regression model to allow the latent topics and other manifest variables to serve as predictors of an outcome. Using a simulation study with data characteristics consistent with psychological text data, we found that SLDAX estimates were generally more accurate and more efficient than those from the two-stage approach. To illustrate both approaches, we provide an empirical clinical application that compares the two-stage and SLDAX approaches. Finally, we implemented the SLDAX model in an open-source R package to facilitate its use and further study.
Objective: Despite nonsuicidal self-injury (NSSI) being a prevalent and problematic behavior, only approximately half of those who engage in NSSI disclose their behavior. Yet, limited research has explored the choice to disclose. This study sought to identify whether NSSI characteristics, emotional distress, and perceived interpersonal obstacles discriminated between those who had and had not disclosed their NSSI. Exploratory aims also investigated reasons for one’s disclosure decision and disclosure contextual factors. Method: Participants included 977 undergraduate students (83% female) with a lifetime history of NSSI. Results: Greater NSSI intrapersonal functions, suicide risk, and significant other support, and lower depression symptoms were associated with NSSI disclosure. Exploratory results highlight the role of perceptions of one’s NSSI severity and the desire to receive support in disclosure choice; intrapersonal functions and peer support were associated with timing of disclosure. Conclusions: Findings underscore the potential importance of individual attitudes toward NSSI, in addition to traditionally measured risk factors, as potential drivers in NSSI disclosure.