Please use this identifier to cite or link to this item:
http://dx.doi.org/10.25673/117452
Title: | Proof-of-concept study of a small language model chatbot for breast cancer decision support - a transparent, source-controlled, explainable and data-secure approach |
Author(s): | Griewing, Sebastian; Lechner, Fabian; Gremke, Niklas; Lukáč, Štefan; Janni, Wolfgang; Wallwiener, Markus; Wagner, Uwe; Hirsch, Martin; Kuhn, Sebastian |
Issue Date: | 2024 |
Type: | Article |
Language: | English |
Abstract: | Purpose: Large language models (LLMs) show potential for decision support in breast cancer care. Their use in clinical care is currently prohibited by the lack of control over the sources used for decision-making, the limited explainability of the decision-making process, and health data security issues. Recently developed small language models (SLMs) are discussed as a way to address these challenges. This preclinical proof-of-concept study tailors an open-source SLM to the German breast cancer guideline (BC-SLM) and evaluates its initial clinical accuracy and technical functionality in a preclinical simulation. Methods: A multidisciplinary tumor board (MTB) serves as the gold standard for assessing initial clinical accuracy, measured as the concordance of the BC-SLM with the MTB and compared with that of two publicly available LLMs, ChatGPT3.5 and ChatGPT4. The study includes 20 fictional patient profiles and recommendations for 5 treatment modalities, resulting in 100 binary treatment recommendations (recommended or not recommended). Statistical evaluation includes concordance with the MTB in % and Cohen’s kappa statistic (κ). Technical functionality is assessed qualitatively in terms of local hosting, adherence to the guideline, and information retrieval. Results: Overall concordance is 86% for BC-SLM (κ = 0.721, p < 0.001), 90% for ChatGPT4 (κ = 0.820, p < 0.001), and 83% for ChatGPT3.5 (κ = 0.661, p < 0.001). Concordance for the individual treatment modalities ranges from 65–100% for BC-SLM, 85–100% for ChatGPT4, and 55–95% for ChatGPT3.5. The BC-SLM runs locally, adheres to the standards of the German breast cancer guideline, and provides the referenced guideline sections for its decision-making. Conclusion: The tailored BC-SLM shows initial clinical accuracy and technical functionality, with concordance with the MTB comparable to that of publicly available LLMs such as ChatGPT4 and ChatGPT3.5. This serves as a proof of concept for adapting an SLM to an oncological disease and its guideline in order to address prevailing issues with LLMs by ensuring decision transparency, explainability, source control, and data security, which represents a necessary step towards clinical validation and safe use of language models in clinical oncology. |
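The two agreement measures named in the abstract, percent concordance and Cohen's kappa (κ), can be illustrated with a minimal sketch. The function names and the ten toy labels below are assumptions made for illustration only; they are not the study's data or code, and the record does not say which software the authors used.

```python
from collections import Counter


def concordance(model, reference):
    """Share of cases in which the model matches the reference decision."""
    if len(model) != len(reference) or not model:
        raise ValueError("inputs must be non-empty and of equal length")
    agreements = sum(m == r for m, r in zip(model, reference))
    return agreements / len(model)


def cohens_kappa(model, reference):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(model)
    p_observed = concordance(model, reference)
    model_counts = Counter(model)
    reference_counts = Counter(reference)
    labels = set(model_counts) | set(reference_counts)
    # Chance agreement: product of the two raters' marginal label frequencies.
    p_expected = sum(model_counts[l] * reference_counts[l] for l in labels) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy labels (1 = recommended, 0 = not recommended).
mtb_recs = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]    # stand-in for the tumor board
model_recs = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]  # stand-in for a language model

print(f"concordance = {concordance(model_recs, mtb_recs):.2f}")
print(f"kappa = {cohens_kappa(model_recs, mtb_recs):.2f}")
```

With these toy labels, 8 of 10 decisions agree and chance agreement is 0.5, so κ is lower than the raw concordance. This is why the abstract reports both figures side by side.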
URI: | https://opendata.uni-halle.de//handle/1981185920/119411 http://dx.doi.org/10.25673/117452 |
Open Access: | Open access publication |
License: | (CC BY 4.0) Creative Commons Attribution 4.0 |
Journal Title: | Journal of cancer research and clinical oncology |
Publisher: | Springer |
Publisher Place: | Berlin |
Volume: | 150 |
Original Publication: | 10.1007/s00432-024-05964-3 |
Appears in Collections: | Open Access Publikationen der MLU |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
s00432-024-05964-3.pdf | | 1.13 MB | Adobe PDF |