Utilizing social media data for pharmacovigilance: A review

Abeed Sarker, Rachel Ginn, Azadeh Nikfarjam, Karen O'Connor, Karen Smith, Swetha Jayaraman, Tejaswi Upadhaya, Graciela Gonzalez

    Research output: Contribution to journal > Article

    189 Citations (Scopus)

    Abstract

    Objective: Automatic monitoring of Adverse Drug Reactions (ADRs), defined as adverse patient outcomes caused by medications, is a challenging research problem that is currently receiving significant attention from the medical informatics community. In recent years, user-posted data on social media, primarily due to its sheer volume, has become a useful resource for ADR monitoring. Research using social media data has progressed using various data sources and techniques, making it difficult to compare distinct systems and their performances. In this paper, we perform a methodical review to characterize the different approaches to ADR detection/extraction from social media, and their applicability to pharmacovigilance. In addition, we present a potential systematic pathway to ADR monitoring from social media.

    Methods: We identified studies describing approaches for ADR detection from social media from the Medline, Embase, Scopus and Web of Science databases, and the Google Scholar search engine. Studies that met our inclusion criteria were those that attempted to extract ADR information posted by users on any publicly available social media platform. We categorized the studies according to different characteristics such as primary ADR detection approach, size of corpus, data source(s), availability, and evaluation criteria.

    Results: Twenty-two studies met our inclusion criteria, with fifteen (68%) published within the last two years. However, publicly available annotated data is still scarce, and we found only six studies that made the annotations used publicly available, making system performance comparisons difficult. In terms of algorithms, supervised classification techniques to detect posts containing ADR mentions, and lexicon-based approaches for extraction of ADR mentions from texts have been the most popular.

    Conclusion: Our review suggests that interest in the utilization of the vast amounts of available social media data for ADR monitoring is increasing. In terms of sources, both health-related and general social media data have been used for ADR detection; while health-related sources tend to contain higher proportions of relevant data, the volume of data from general social media websites is significantly higher. There is still a very limited amount of annotated data publicly available, and, as indicated by the promising results obtained by recent supervised learning approaches, there is a strong need to make such data available to the research community.

    Original language: English (US)
    Pages (from-to): 202-212
    Number of pages: 11
    Journal: Journal of Biomedical Informatics
    Volume: 54
    DOI: https://doi.org/10.1016/j.jbi.2015.02.004
    State: Published - Apr 1 2015

    Fingerprint

    Social Media
    Pharmacovigilance
    Drug-Related Side Effects and Adverse Reactions
    Monitoring
    Health
    Drug Monitoring
    Supervised learning
    Search engines
    World Wide Web
    Websites
    Information Storage and Retrieval
    Availability
    Research
    Search Engine
    Medical Informatics

    Keywords

    • Adverse drug reaction
    • Pharmacovigilance
    • Social media

    ASJC Scopus subject areas

    • Computer Science Applications
    • Health Informatics

    Cite this

    Sarker, A., Ginn, R., Nikfarjam, A., O'Connor, K., Smith, K., Jayaraman, S., ... Gonzalez, G. (2015). Utilizing social media data for pharmacovigilance: A review. Journal of Biomedical Informatics, 54, 202-212. https://doi.org/10.1016/j.jbi.2015.02.004

    @article{7f240cafab7c45bf9e80bf5a08b2a915,
    title = "Utilizing social media data for pharmacovigilance: A review",
    keywords = "Adverse drug reaction, Pharmacovigilance, Social media",
    author = "Abeed Sarker and Rachel Ginn and Azadeh Nikfarjam and Karen O'Connor and Karen Smith and Swetha Jayaraman and Tejaswi Upadhaya and Graciela Gonzalez",
    year = "2015",
    month = "4",
    day = "1",
    doi = "10.1016/j.jbi.2015.02.004",
    language = "English (US)",
    volume = "54",
    pages = "202--212",
    journal = "Journal of Biomedical Informatics",
    issn = "1532-0464",
    publisher = "Academic Press Inc.",

    }

    TY - JOUR

    T1 - Utilizing social media data for pharmacovigilance

    T2 - A review

    AU - Sarker, Abeed

    AU - Ginn, Rachel

    AU - Nikfarjam, Azadeh

    AU - O'Connor, Karen

    AU - Smith, Karen

    AU - Jayaraman, Swetha

    AU - Upadhaya, Tejaswi

    AU - Gonzalez, Graciela

    PY - 2015/4/1

    Y1 - 2015/4/1

    KW - Adverse drug reaction

    KW - Pharmacovigilance

    KW - Social media

    UR - http://www.scopus.com/inward/record.url?scp=84927917741&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84927917741&partnerID=8YFLogxK

    U2 - 10.1016/j.jbi.2015.02.004

    DO - 10.1016/j.jbi.2015.02.004

    M3 - Article

    AN - SCOPUS:84927917741

    VL - 54

    SP - 202

    EP - 212

    JO - Journal of Biomedical Informatics

    JF - Journal of Biomedical Informatics

    SN - 1532-0464

    ER -