Curating Dataset Pipelines to Train Medical Chatbots on Early Sepsis Detection

Authors

  • Arup Das
  • Miriam Abecasis
  • Thomas Farrell
  • Isaac Sasson
  • Sophia Ramirez Velandia (Monmouth University)
  • Brooke Tortorelli
  • Jiacun Wang

Keywords:

Sepsis, Large Language Model, Medical Chatbot, Lexical Analysis, Semantic Analysis, LLM as a Judge

Abstract

Sepsis is a critical medical condition that arises when the body’s response to infection causes life-threatening organ dysfunction. Despite increasing awareness and the use of protocol-driven management strategies, early diagnosis remains a persistent challenge in clinical practice, especially in high-pressure settings such as emergency departments and ICUs. Nurses, as first responders, are crucial in identifying early signs, but they often work under cognitive overload and with ambiguous protocols. Large language models (LLMs) are neural networks trained with self-supervised learning to process and understand human language. This work focuses on building a robust data-gathering pipeline, with the ultimate goal of creating an interactive clinical chatbot fine-tuned on sepsis-specific knowledge. The pipeline uses artificial intelligence to collect training data in three steps: lexical analysis, semantic analysis, and Q&A quality evaluation. It offers a feasible framework for LLM-based chatbot design and development.
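
The three-step process named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword list, the thresholds, the bag-of-words cosine similarity (standing in for an embedding model), and the `judge_stub` heuristic (standing in for an LLM judge) are all assumptions made here for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list; the source does not specify one.
SEPSIS_TERMS = {"sepsis", "septic", "infection", "lactate", "qsofa", "sofa"}


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_pass(passage, min_hits=1):
    """Stage 1 (assumed form): keep passages that mention enough domain terms."""
    hits = sum(1 for tok in tokenize(passage) if tok in SEPSIS_TERMS)
    return hits >= min_hits


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_pass(passage, reference, threshold=0.2):
    """Stage 2 (assumed form): keep passages close to a reference description.
    Bag-of-words cosine stands in for an embedding model here."""
    score = cosine(Counter(tokenize(passage)), Counter(tokenize(reference)))
    return score >= threshold


def judge_stub(question, answer):
    """Stage 3 placeholder: a real pipeline would call an LLM judge.
    This heuristic only checks that the Q&A pair is non-trivial."""
    return question.strip().endswith("?") and len(tokenize(answer)) >= 5


def curate(candidates, reference):
    """Run (passage, question, answer) triples through all three stages."""
    kept = []
    for passage, question, answer in candidates:
        if not lexical_pass(passage):
            continue
        if not semantic_pass(passage, reference):
            continue
        if judge_stub(question, answer):
            kept.append((question, answer))
    return kept
```

Ordering the stages from cheapest to most expensive means that only passages surviving the keyword and similarity filters would reach the judge, which in a real system is likely the costliest call.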

Published

2026-01-04

Section

IJAIGM2025

How to Cite

Curating Dataset Pipelines to Train Medical Chatbots on Early Sepsis Detection. (2026). International Journal of Artificial Intelligence and Green Manufacturing, 1(4). https://hopeembark.org/index.php/IJGMAI/article/view/69