Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts

Ioannis Korkontzelos, Azadeh Nikfarjam, Matthew Shardlow, Abeed Sarker, Sophia Ananiadou, Graciela H. Gonzalez

    Research output: Contribution to journal › Article

    45 Citations (Scopus)

    Abstract

    Objective: The abundance of text available in social media and health-related forums, along with the rich expression of public opinion, has recently attracted the interest of the public health community in using these sources for pharmacovigilance. Based on the intuition that patients who post about adverse drug reactions (ADRs) express negative sentiments, we investigate the effect of sentiment analysis features on locating ADR mentions.

    Methods: We enrich the feature space of a state-of-the-art ADR identification method with sentiment analysis features. Using a corpus of posts from the DailyStrength forum and tweets annotated for ADR and indication mentions, we evaluate the extent to which sentiment analysis features help in locating ADR mentions and distinguishing them from indication mentions.

    Results: Evaluation shows that sentiment analysis features marginally improve ADR identification in tweets and health-related forum posts. Adding sentiment analysis features achieved a statistically significant F-measure increase from 72.14% to 73.22% in the Twitter part of an existing corpus using its original train/test split. Using stratified 10 × 10-fold cross-validation, statistically significant F-measure increases were shown in the DailyStrength part of the corpus, from 79.57% to 80.14%, and in the Twitter part of the corpus, from 66.91% to 69.16%. Moreover, sentiment analysis features are shown to reduce the number of ADRs being recognized as indications.

    Conclusion: This study shows that adding sentiment analysis features can marginally improve the performance of even a state-of-the-art ADR identification method. This improvement can be of use to pharmacovigilance practice, given the rapidly increasing popularity of social media and health forums.
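
The evaluation protocol named in the abstract, stratified 10 × 10-fold cross-validation scored by F-measure, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `f_measure` and `repeated_stratified_folds`, the toy labels, and the choice to stratify per labelled instance are assumptions made for illustration. The abstract does not state how the corpus was stratified or how matching mentions were counted.

```python
import random
from collections import defaultdict


def f_measure(tp, fp, fn):
    """Balanced F-measure (F1) from true-positive, false-positive
    and false-negative counts. Returns 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def repeated_stratified_folds(labels, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) tuples.

    Each repeat reshuffles the data and deals the members of every
    class round-robin into k folds, so each fold keeps roughly the
    class proportions of the whole data set (hypothetical helper).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for rep in range(repeats):
        folds = [[] for _ in range(k)]
        offset = 0  # continue dealing where the previous class stopped
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            for pos, idx in enumerate(shuffled):
                folds[(offset + pos) % k].append(idx)
            offset += len(shuffled)

        for fold in range(k):
            test_idx = sorted(folds[fold])
            train_idx = sorted(
                i for other in range(k) if other != fold for i in folds[other]
            )
            yield rep, fold, train_idx, test_idx


if __name__ == "__main__":
    # Toy data: 30 instances labelled "ADR" and 70 labelled "other".
    toy_labels = ["ADR"] * 30 + ["other"] * 70
    splits = list(repeated_stratified_folds(toy_labels, k=10, repeats=10))
    print(len(splits), "train/test splits generated")
    print("F-measure for tp=8, fp=2, fn=2:", f_measure(8, 2, 2))
```

With 10 repeats of 10 folds, a model is trained and scored 100 times, and the resulting F-measures can be averaged and compared between the baseline and the sentiment-enriched feature set. Which significance test the authors applied to those scores is not given in the abstract, so none is sketched here.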

    Original language: English (US)
    Pages (from-to): 148-158
    Number of pages: 11
    Journal: Journal of Biomedical Informatics
    Volume: 62
    DOI: 10.1016/j.jbi.2016.06.007
    State: Published - Aug 1 2016

    Fingerprint

    Drug-Related Side Effects and Adverse Reactions
    Health
    Public health
    Social Media
    Pharmacovigilance
    Intuition
    Public Opinion

    Keywords

    • Adverse drug reactions
    • Sentiment analysis
    • Social media
    • Text mining

    ASJC Scopus subject areas

    • Computer Science Applications
    • Health Informatics

    Cite this

    Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. / Korkontzelos, Ioannis; Nikfarjam, Azadeh; Shardlow, Matthew; Sarker, Abeed; Ananiadou, Sophia; Gonzalez, Graciela H.

    In: Journal of Biomedical Informatics, Vol. 62, 01.08.2016, p. 148-158.

    @article{a7dd7b50f72c4e199f54fe41defa3dc1,
    title = "Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts",
    keywords = "Adverse drug reactions, Sentiment analysis, Social media, Text mining",
    author = "Ioannis Korkontzelos and Azadeh Nikfarjam and Matthew Shardlow and Abeed Sarker and Sophia Ananiadou and Gonzalez, {Graciela H.}",
    year = "2016",
    month = "8",
    day = "1",
    doi = "10.1016/j.jbi.2016.06.007",
    language = "English (US)",
    volume = "62",
    pages = "148--158",
    journal = "Journal of Biomedical Informatics",
    issn = "1532-0464",
    publisher = "Academic Press Inc.",

    }

    TY - JOUR

    T1 - Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts

    AU - Korkontzelos, Ioannis

    AU - Nikfarjam, Azadeh

    AU - Shardlow, Matthew

    AU - Sarker, Abeed

    AU - Ananiadou, Sophia

    AU - Gonzalez, Graciela H.

    PY - 2016/8/1

    Y1 - 2016/8/1

    KW - Adverse drug reactions

    KW - Sentiment analysis

    KW - Social media

    KW - Text mining

    UR - http://www.scopus.com/inward/record.url?scp=84978034203&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84978034203&partnerID=8YFLogxK

    U2 - 10.1016/j.jbi.2016.06.007

    DO - 10.1016/j.jbi.2016.06.007

    M3 - Article

    AN - SCOPUS:84978034203

    VL - 62

    SP - 148

    EP - 158

    JO - Journal of Biomedical Informatics

    JF - Journal of Biomedical Informatics

    SN - 1532-0464

    ER -