Quality of information and appropriateness of ChatGPT outputs for urology patients

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

Quality of information and appropriateness of ChatGPT outputs for urology patients. / Cocci, Andrea; Pezzoli, Marta; Lo Re, Mattia; Russo, Giorgio Ivan; Asmundo, Maria Giovanna; Fode, Mikkel; Cacciamani, Giovanni; Cimino, Sebastiano; Minervini, Andrea; Durukan, Emil.

In: Prostate Cancer and Prostatic Diseases, Vol. 27, 2024, p. 103–108.

Harvard

Cocci, A, Pezzoli, M, Lo Re, M, Russo, GI, Asmundo, MG, Fode, M, Cacciamani, G, Cimino, S, Minervini, A & Durukan, E 2024, 'Quality of information and appropriateness of ChatGPT outputs for urology patients', Prostate Cancer and Prostatic Diseases, vol. 27, pp. 103–108. https://doi.org/10.1038/s41391-023-00705-y

APA

Cocci, A., Pezzoli, M., Lo Re, M., Russo, G. I., Asmundo, M. G., Fode, M., Cacciamani, G., Cimino, S., Minervini, A., & Durukan, E. (2024). Quality of information and appropriateness of ChatGPT outputs for urology patients. Prostate Cancer and Prostatic Diseases, 27, 103–108. https://doi.org/10.1038/s41391-023-00705-y

Vancouver

Cocci A, Pezzoli M, Lo Re M, Russo GI, Asmundo MG, Fode M et al. Quality of information and appropriateness of ChatGPT outputs for urology patients. Prostate Cancer and Prostatic Diseases. 2024;27:103–108. https://doi.org/10.1038/s41391-023-00705-y

Author

Cocci, Andrea ; Pezzoli, Marta ; Lo Re, Mattia ; Russo, Giorgio Ivan ; Asmundo, Maria Giovanna ; Fode, Mikkel ; Cacciamani, Giovanni ; Cimino, Sebastiano ; Minervini, Andrea ; Durukan, Emil. / Quality of information and appropriateness of ChatGPT outputs for urology patients. In: Prostate Cancer and Prostatic Diseases. 2024 ; Vol. 27. pp. 103–108.

BibTeX

@article{93c8898a842b4592b15733bb17f59d41,
title = "Quality of information and appropriateness of ChatGPT outputs for urology patients",
abstract = "Background: The proportion of health-related searches on the internet is continuously growing. ChatGPT, a natural language processing (NLP) tool created by OpenAI, has been gaining increasing user attention and can potentially be used as a source for obtaining information related to health concerns. This study aims to analyze the quality and appropriateness of ChatGPT{\textquoteright}s responses to Urology case studies compared to those of a urologist. Methods: Data from 100 patient case studies, comprising patient demographics, medical history, and urologic complaints, were sequentially inputted into ChatGPT, one by one. A question was posed to determine the most likely diagnosis, suggested examinations, and treatment options. The responses generated by ChatGPT were then compared to those provided by a board-certified urologist who was blinded to ChatGPT{\textquoteright}s responses and graded on a 5-point Likert scale based on accuracy, comprehensiveness, and clarity as criterias for appropriateness. The quality of information was graded based on the section 2 of the DISCERN tool and readability assessments were performed using the Flesch Reading Ease (FRE) and Flesch-Kincaid Reading Grade Level (FKGL) formulas. Results: 52% of all responses were deemed appropriate. ChatGPT provided more appropriate responses for non-oncology conditions (58.5%) compared to oncology (52.6%) and emergency urology cases (11.1%) (p = 0.03). The median score of the DISCERN tool was 15 (IQR = 5.3) corresponding to a quality score of poor. The ChatGPT responses demonstrated a college graduate reading level, as indicated by the median FRE score of 18 (IQR = 21) and the median FKGL score of 15.8 (IQR = 3). Conclusions: ChatGPT serves as an interactive tool for providing medical information online, offering the possibility of enhancing health outcomes and patient satisfaction. Nevertheless, the insufficient appropriateness and poor quality of the responses on Urology cases emphasizes the importance of thorough evaluation and use of NLP-generated outputs when addressing health-related concerns.",
author = "Andrea Cocci and Marta Pezzoli and {Lo Re}, Mattia and Russo, {Giorgio Ivan} and Asmundo, {Maria Giovanna} and Mikkel Fode and Giovanni Cacciamani and Sebastiano Cimino and Andrea Minervini and Emil Durukan",
note = "Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive licence to Springer Nature Limited.",
year = "2024",
doi = "10.1038/s41391-023-00705-y",
language = "English",
volume = "27",
pages = "103–108",
journal = "Prostate Cancer and Prostatic Diseases",
issn = "1365-7852",
publisher = "Nature Publishing Group",

}

RIS

TY - JOUR

T1 - Quality of information and appropriateness of ChatGPT outputs for urology patients

AU - Cocci, Andrea

AU - Pezzoli, Marta

AU - Lo Re, Mattia

AU - Russo, Giorgio Ivan

AU - Asmundo, Maria Giovanna

AU - Fode, Mikkel

AU - Cacciamani, Giovanni

AU - Cimino, Sebastiano

AU - Minervini, Andrea

AU - Durukan, Emil

N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Nature Limited.

PY - 2024

Y1 - 2024

N2 - Background: The proportion of health-related searches on the internet is continuously growing. ChatGPT, a natural language processing (NLP) tool created by OpenAI, has been gaining increasing user attention and can potentially be used as a source of information on health concerns. This study aims to analyze the quality and appropriateness of ChatGPT’s responses to urology case studies compared to those of a urologist. Methods: Data from 100 patient case studies, comprising patient demographics, medical history, and urologic complaints, were input into ChatGPT one by one. A question was posed to determine the most likely diagnosis, suggested examinations, and treatment options. The responses generated by ChatGPT were then compared to those provided by a board-certified urologist, who was blinded to ChatGPT’s responses, and were graded on a 5-point Likert scale using accuracy, comprehensiveness, and clarity as criteria for appropriateness. The quality of information was graded based on section 2 of the DISCERN tool, and readability was assessed using the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) formulas. Results: 52% of all responses were deemed appropriate. ChatGPT provided more appropriate responses for non-oncology conditions (58.5%) than for oncology (52.6%) and emergency urology cases (11.1%) (p = 0.03). The median DISCERN score was 15 (IQR = 5.3), corresponding to a quality rating of poor. The ChatGPT responses demonstrated a college-graduate reading level, as indicated by a median FRE score of 18 (IQR = 21) and a median FKGL score of 15.8 (IQR = 3). Conclusions: ChatGPT serves as an interactive tool for providing medical information online, offering the possibility of enhancing health outcomes and patient satisfaction. Nevertheless, the insufficient appropriateness and poor quality of its responses to urology cases emphasize the importance of thoroughly evaluating NLP-generated outputs when they are used to address health-related concerns.

AB - Background: The proportion of health-related searches on the internet is continuously growing. ChatGPT, a natural language processing (NLP) tool created by OpenAI, has been gaining increasing user attention and can potentially be used as a source of information on health concerns. This study aims to analyze the quality and appropriateness of ChatGPT’s responses to urology case studies compared to those of a urologist. Methods: Data from 100 patient case studies, comprising patient demographics, medical history, and urologic complaints, were input into ChatGPT one by one. A question was posed to determine the most likely diagnosis, suggested examinations, and treatment options. The responses generated by ChatGPT were then compared to those provided by a board-certified urologist, who was blinded to ChatGPT’s responses, and were graded on a 5-point Likert scale using accuracy, comprehensiveness, and clarity as criteria for appropriateness. The quality of information was graded based on section 2 of the DISCERN tool, and readability was assessed using the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) formulas. Results: 52% of all responses were deemed appropriate. ChatGPT provided more appropriate responses for non-oncology conditions (58.5%) than for oncology (52.6%) and emergency urology cases (11.1%) (p = 0.03). The median DISCERN score was 15 (IQR = 5.3), corresponding to a quality rating of poor. The ChatGPT responses demonstrated a college-graduate reading level, as indicated by a median FRE score of 18 (IQR = 21) and a median FKGL score of 15.8 (IQR = 3). Conclusions: ChatGPT serves as an interactive tool for providing medical information online, offering the possibility of enhancing health outcomes and patient satisfaction. Nevertheless, the insufficient appropriateness and poor quality of its responses to urology cases emphasize the importance of thoroughly evaluating NLP-generated outputs when they are used to address health-related concerns.

U2 - 10.1038/s41391-023-00705-y

DO - 10.1038/s41391-023-00705-y

M3 - Journal article

C2 - 37516804

AN - SCOPUS:85169813220

VL - 27

SP - 103

EP - 108

JO - Prostate Cancer and Prostatic Diseases

JF - Prostate Cancer and Prostatic Diseases

SN - 1365-7852

ER -

ID: 372961618
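
A note on the metrics reported in the abstract: DISCERN section 2 comprises questions 9–15, each scored 1–5 (range 7–35), so a median of 15 falls in the poor band, while FRE and FKGL are simple functions of average sentence length and average syllables per word. The Python sketch below shows the two standard readability formulas; the vowel-group syllable counter is a rough stand-in assumed here for illustration (real readability tools use pronunciation dictionaries), and this is not the tooling used in the study.

import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count runs of consecutive vowels.
    # Real readability tools use pronunciation dictionaries.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> tuple[float, float]:
    # Average sentence length (ASL) and average syllables per word (ASW)
    # are the only inputs to both formulas.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    n_words = max(1, len(words))
    asl = n_words / sentences
    asw = sum(count_syllables(w) for w in words) / n_words
    fre = 206.835 - 1.015 * asl - 84.6 * asw    # Flesch Reading Ease
    fkgl = 0.39 * asl + 11.8 * asw - 15.59      # Flesch-Kincaid Grade Level
    return fre, fkgl

fre, fkgl = readability("Benign prostatic hyperplasia is a common condition in aging men.")
print(f"FRE = {fre:.1f}, FKGL = {fkgl:.1f}")

On these scales, the paper's median FRE of 18 falls in the "very difficult" band (roughly 0–30), and an FKGL of 15.8 corresponds to about sixteen years of schooling, which is why the authors describe the responses as written at a college-graduate reading level.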