https://socictopen.socict.org/files/original/04753a44f87df91dde134908efdb4dfd.pdf f97cd05f13d8c0a0da019848224c377e Dublin Core The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/. Title A name given to the resource Coronavirus Description An account of the resource Dominio científico: Coronavirus Text A resource consisting primarily of words for reading. Examples include books, letters, dissertations, poems, newspapers, articles, archives of mailing lists. Note that facsimiles or images of texts are still of the genre Text. Dublin Core The Dublin Core metadata element set is common to all Omeka records, including items, files, and collections. For more information see, http://dublincore.org/documents/dces/. Title A name given to the resource Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study Creator An entity primarily responsible for making the resource Low, Daniel M, Rumker, Laurie, Talkar, Tanya, Torous, John, Cecchi, Guillermo, Ghosh, Satrajit S Description An account of the resource BackgroundThe COVID-19 pandemic is impacting mental health, but it is not clear how people with different types of mental health problems were differentially impacted as the initial wave of cases hit. ObjectiveThe aim of this study is to leverage natural language processing (NLP) with the goal of characterizing changes in 15 of the world’s largest mental health support groups (eg, r/schizophrenia, r/SuicideWatch, r/Depression) found on the website Reddit, along with 11 non–mental health groups (eg, r/PersonalFinance, r/conspiracy) during the initial stage of the pandemic. MethodsWe created and released the Reddit Mental Health Dataset including posts from 826,961 unique users from 2018 to 2020. Using regression, we analyzed trends from 90 text-derived features such as sentiment analysis, personal pronouns, and semantic categories. Using supervised machine learning, we classified posts into their respective support groups and interpreted important features to understand how different problems manifest in language. We applied unsupervised methods such as topic modeling and unsupervised clustering to uncover concerns throughout Reddit before and during the pandemic. ResultsWe found that the r/HealthAnxiety forum showed spikes in posts about COVID-19 early on in January, approximately 2 months before other support groups started posting about the pandemic. There were many features that significantly increased during COVID-19 for specific groups including the categories “economic stress,” “isolation,” and “home,” while others such as “motion” significantly decreased. We found that support groups related to attention-deficit/hyperactivity disorder, eating disorders, and anxiety showed the most negative semantic change during the pandemic out of all mental health groups. Health anxiety emerged as a general theme across Reddit through independent supervised and unsupervised machine learning analyses. For instance, we provide evidence that the concerns of a diverse set of individuals are converging in this unique moment of history; we discovered that the more users posted about COVID-19, the more linguistically similar (less distant) the mental health support groups became to r/HealthAnxiety (ρ=–0.96, P Date A point or period of time associated with an event in the lifecycle of the resource 2020 Identifier An unambiguous reference to the resource within a given context 10.2196/22635 Source A related resource from which the described resource is derived Epidemiology and Health Publisher An entity responsible for making the resource available Korean Society of Epidemiology Coverage The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant Public aspects of medicine, Computer applications to medicine. Medical informatics