Rotenstein, L. S., Landman, A. & Bates, D. W. The electronic inbox-benefits, questions, and solutions for the road ahead. JAMA 330, 1735–1736 (2023).
Google Scholar
Nath, B. et al. Trends in electronic health record inbox messaging during the COVID-19 pandemic in an ambulatory practice network in New England. JAMA Netw. Open 4, e2131490 (2021).
Google Scholar
Burden, M. & Dyrbye, L. Evidence-Based Work Design – Bridging the Divide. N. Engl. J. Med 392, 1044–1046 (2025).
Google Scholar
Mandal, S. et al. Quantifying the impact of telemedicine and patient medical advice request messages on physicians’ work-outside-work. NPJ Digit Med 7, 35 (2024).
Google Scholar
Holmgren, A. J., Byron, M. E., Grouse, C. K. & Adler-Milstein, J. Association Between Billing Patient Portal Messages as e-Visits and Patient Messaging Volume. JAMA 329, 339–342 (2023).
Google Scholar
Baxter, S. L. et al. Association of Electronic Health Record Inbasket Message Characteristics With Physician Burnout. JAMA Netw. Open 5, e2244363 (2022).
Google Scholar
Paul Testa, A. S. How We’re Improving Physicians’ Messaging Experience Through Digital Tools, (2022).
Fogg, J. F. & Sinsky, C. A. In-Basket Reduction: A Multiyear Pragmatic Approach to Lessen the Work Burden of Primary Care Physicians. NEJM catalyst innovations in care delivery 4. (2023).
Si, S. et al. In Machine Learning for Healthcare Conference. 436-456 (PMLR).
Ouyang, L. et al. Vol. 35 (eds Koyejo S. et al.) 27730-27744 (2022).
Garcia, P. et al. Artificial Intelligence-Generated Draft Replies to Patient Inbox Messages. JAMA Netw. Open 7, e243201 (2024).
Google Scholar
Tai-Seale, M. et al. AI-Generated Draft Replies Integrated Into Health Records and Physicians’ Electronic Communication. JAMA Netw. Open 7, e246565 (2024).
Google Scholar
Baxter, S. L., Longhurst, C. A., Millen, M., Sitapati, A. M. & Tai-Seale, M. Generative artificial intelligence responses to patient messages in the electronic health record: early lessons learned. JAMIA Open 7, ooae028 (2024).
Google Scholar
English, E., Laughlin, J., Sippel, J., DeCamp, M. & Lin, C. T. Utility of Artificial Intelligence-Generative Draft Replies to Patient Messages. JAMA Netw. Open 7, e2438573 (2024).
Google Scholar
Chen, S. et al. The effect of using a large language model to respond to patient messages. Lancet Digit Health 6, e379–e381 (2024).
Google Scholar
Walker, A., Baxter, S., Tai-Seale, M., Sitapati, A. & Longhurst, C. The bot will answer you now: Using AI to assist patient-physician communication and implications for physician inbox workload. Annals of Family Medicine 21 (2023).
Cavalier, J. S. et al. Ethics in Patient Preferences for Artificial Intelligence-Drafted Responses to Electronic Messages. JAMA Netw. Open 8, e250449 (2025).
Google Scholar
Ayers, J. W., Desai, N. & Smith, D. M. Regulate Artificial Intelligence in Health Care by Prioritizing Patient Outcomes. JAMA 331, 639–640 (2024).
Google Scholar
Shelmerdine, S. C. et al. Can artificial intelligence pass the Fellowship of the Royal College of Radiologists examination? Multi-reader diagnostic accuracy study. BMJ 379, e072826 (2022).
Google Scholar
Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
Google Scholar
Zhao, W. X. et al. A survey of large language models. arXiv preprint arXiv:2303.18223 (2023).
Bubeck, S. et al. Sparks of artificial general intelligence: Early experiments with gpt-4. arXiv preprint arXiv:2303.12712 (2023).
Johnson, D. et al. Assessing the Accuracy and Reliability of AI-Generated Medical Responses: An Evaluation of the Chat-GPT Model. Res Sq. (2023).
Small, W. R. et al. Large Language Model-Based Responses to Patients’ In-Basket Messages. JAMA Netw. Open 7, e2422399 (2024).
Google Scholar
Nov, O., Singh, N. & Mann, D. Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study. JMIR Med Educ. 9, e46939 (2023).
Google Scholar
Sezgin, E., Sirrianni, J. & Linwood, S. L. Operationalizing and Implementing Pretrained, Large Artificial Intelligence Linguistic Models in the US Health Care System: Outlook of Generative Pretrained Transformer 3 (GPT-3) as a Service Model. JMIR Med Inf. 10, e32875 (2022).
Google Scholar
Webster, P. Six ways large language models are changing healthcare. Nat. Med. 29, 2969–2971 (2023).
Google Scholar
Yan, S. et al. Prompt engineering on leveraging large language models in generating response to InBasket messages. J. Am. Med Inf. Assoc. 31, 2263–2270 (2024).
Google Scholar
Baddeley, A. Working memory. Science 255, 556–559 (1992).
Google Scholar
Budd, J. Burnout Related to Electronic Health Record Use in Primary Care. J Prim Care Community Health 14, 21501319231166921. (2023).
Murphy, D. R., Giardina, T. D., Satterly, T., Sittig, D. F. & Singh, H. An Exploration of Barriers, Facilitators, and Suggestions for Improving Electronic Health Record Inbox-Related Usability: A Qualitative Analysis. JAMA Netw. Open 2, e1912638 (2019).
Google Scholar
Hersh, W. Search still matters: information retrieval in the era of generative AI. J. Am. Med Inf. Assoc. 31, 2159–2161 (2024).
Google Scholar
Moy, A. J., Cato, K. D., Kim, E. Y., Withall, J. & Rossetti, S. C. A Computational Framework to Evaluate Emergency Department Clinician Task Switching in the Electronic Health Record Using Event Logs. AMIA Annu Symp. Proc. 2023, 1183–1192 (2023).
Google Scholar
Alkhalaf, M., Yu, P., Yin, M. & Deng, C. Applying generative AI with retrieval augmented generation to summarize and extract key clinical information from electronic health records. J. Biomed. Inf. 156, 104662 (2024).
Google Scholar
Wachter, R. M. & Brynjolfsson, E. Will Generative Artificial Intelligence Deliver on Its Promise in Health Care?. JAMA 331, 65–69 (2024).
Google Scholar
Reinhard, P. In Mensch und Computer 2024-Workshopband. (Gesellschaft für Informatik eV).
Ko, D. G., Tachinardi, U. & Warm, E. J. Secure messaging telehealth billing in the digital age: moving beyond time-based metrics. J. Am. Med Inf. Assoc. 32, 230–234 (2025).
Google Scholar
Idnay, B. et al. Environment scan of generative AI infrastructure for clinical and translational science. Npj Health Syst. 2, 4 (2025).
Google Scholar
Borges do Nascimento, I. J. et al. Barriers and facilitators to utilizing digital health technologies by healthcare professionals. npj Digital Med. 6, 161 (2023).
Google Scholar
Bang, Y. et al. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 675-718.
Ji, Z. et al. Survey of hallucination in natural language generation. ACM Comput. Surv. 55, 1–38 (2023).
Google Scholar
Akbar, F. et al. Physicians’ electronic inbox work patterns and factors associated with high inbox work duration. J. Am. Med. Inform. Assoc. 28, 923–930 (2020).
Google Scholar
Kannampallil, T. & Adler-Milstein, J. Using electronic health record audit log data for research: insights from early efforts. J. Am. Med Inf. Assoc. 30, 167–171 (2022).
Google Scholar
Ahmed, A., Chandra, S., Herasevich, V., Gajic, O. & Pickering, B. W. The effect of two different electronic health record user interfaces on intensive care provider task load, errors of cognition, and performance. Crit. Care Med 39, 1626–1634 (2011).
Google Scholar
Ratanawongsa, N., Matta, G. Y., Bohsali, F. B. & Chisolm, M. S. Reducing misses and near misses related to multitasking on the electronic health record: observational study and qualitative analysis. JMIR Hum. Factors 5, e4 (2018).
Google Scholar
Rule, A., Chiang, M. F. & Hribar, M. R. Using electronic health record audit logs to study clinical activity: a systematic review of aims, measures, and methods. J. Am. Med Inf. Assoc. 27, 480–490 (2020).
Google Scholar
Sinsky, C. A. et al. Metrics for assessing physician activity using electronic health record log data. J. Am. Med Inf. Assoc. 27, 639–643 (2020).
Google Scholar
Amroze, A. et al. Use of electronic health record access and audit logs to identify physician actions following noninterruptive alert opening: descriptive study. JMIR Med. Inf. 7, e12650 (2019).
Google Scholar
Brysbaert, M. How many words do we read per minute? A review and meta-analysis of reading rate. J. Mem. Lang. 109, 104047 (2019).
Google Scholar
Abbasian, M. et al. Foundation metrics for evaluating effectiveness of healthcare conversations powered by generative AI. NPJ Digit Med. 7, 82 (2024).
Google Scholar
Rydzewski, N. R. et al. Comparative Evaluation of LLMs in Clinical Oncology. NEJM AI 1, (2024).
Brynjolfsson, E., Li, D. & Raymond, L. R. Generative AI at work. (National Bureau of Economic Research, 2023).
Huang, M. et al. In 2023 IEEE 11th International Conference on Healthcare Informatics (ICHI). 381-387 (IEEE).
Derksen, F., Bensing, J. & Lagro-Janssen, A. Effectiveness of empathy in general practice: a systematic review. Br. J. Gen. Pr. 63, e76–e84 (2013).
Google Scholar
Smith, L., Kirk, W., Bennett, M. M., Youens, K. & Ramm, J. From Headache to Handled: Advanced In-Basket Management System in Primary Care Clinics Reduces Provider Workload Burden and Self-Reported Burnout. Appl Clin. Inf. 15, 869–876 (2024).
Google Scholar
Sinsky, C. et al. Allocation of Physician Time in Ambulatory Practice: A Time and Motion Study in 4 Specialties. Ann. Intern Med 165, 753–760 (2016).
Google Scholar
Singh, N., Lawrence, K., Sinsky, C. & Mann, D. M. Digital Minimalism – An Rx for Clinician Burnout. N. Engl. J. Med 388, 1158–1159 (2023).
Google Scholar
Newport, C. Digital minimalism: Choosing a focused life in a noisy world. (Penguin, 2019).
Gawande, A. Why doctors hate their computers. The New Yorker 12 (2018).
Greenhalgh, T. et al. Beyond Adoption: A New Framework for Theorizing and Evaluating Nonadoption, Abandonment, and Challenges to the Scale-Up, Spread, and Sustainability of Health and Care Technologies. J. Med Internet Res 19, e367 (2017).
Google Scholar
Shneiderman, B. Human-centered artificial intelligence: Reliable, safe & trustworthy. Int. J. Human–Computer Interact. 36, 495–504 (2020).
Google Scholar
Hong, C. et al. Application of unified health large language model evaluation framework to In-Basket message replies: bridging qualitative and quantitative assessments. J. Am. Med Inf. Assoc. 32, 626–637 (2025).
Google Scholar
Robinson, E. J. et al. Physician vs. AI-generated messages in urology: evaluation of accuracy, completeness, and preference by patients and physicians. World J. Urol. 43, 48 (2024).
Google Scholar
Shannon, C. E. A mathematical theory of communication. ACM SIGMOBILE Mob. Comput. Commun. Rev. 5, 3–55 (2001).
Google Scholar
Pennebaker J. W., B R., Booth R. J., Ashokkumar A., Francis M. E. Linguistic Inquiry and Word Count: LIWC-22, (2022).
He, L. & Zheng, K. How Do General-Purpose Sentiment Analyzers Perform when Applied to Health-Related Online Social Media Data?. Stud. Health Technol. Inf. 264, 1208–1212 (2019).
RamyaSri, V., Niharika, C., Maneesh, K. & Ismail, M. Sentiment Analysis of Patients’ Opinions in Healthcare using Lexicon-based Method. Int. J. Eng. Adv. Technol. 9, 6977–6981 (2019).
Google Scholar
Reimers, N. & Gurevych, I. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (p. 3982). Association for Computational Linguistics (2019).
Yang, X. et al. Measurement of Semantic Textual Similarity in Clinical Texts: Comparison of Transformer-Based Models. JMIR Med Inf. 8, e19735 (2020).
Google Scholar
Dirkson, A., Verberne, S., Kraaij, W., van Oortmerssen, G. & Gelderblom, H. Automated gathering of real-world data from online patient forums can complement pharmacovigilance for rare cancers. Sci. Rep. 12, 10317 (2022).
Google Scholar
Lopez, I. et al. Clinical entity augmented retrieval for clinical information extraction. NPJ Digit Med 8, 45 (2025).
Google Scholar
Kummervold, P. E. & Johnsen, J. A. Physician response time when communicating with patients over the Internet. J. Med Internet Res. 13, e79 (2011).
Google Scholar
Lanham, H. J., Leykum, L. K. & Pugh, J. A. Examining the complexity of patient-outpatient care team secure message communication: qualitative analysis. J. Med. Internet Res. 20, e218 (2018).
Google Scholar
Klare, G. R. Assessing readability. Reading research quarterly, 62–102 (1974).
Huang, Y. Q. et al. Charlson comorbidity index helps predict the risk of mortality for patients with type 2 diabetic nephropathy. J. Zhejiang Univ. Sci. B 15, 58–66 (2014).
Google Scholar
Lawrence, K. et al. The Impact of Telemedicine on Physicians’ After-hours Electronic Health Record “Work Outside Work” During the COVID-19 Pandemic: Retrospective Cohort Study. JMIR Med Inf. 10, e34826 (2022).
Google Scholar
Rule, A. et al. Guidance for reporting analyses of metadata on electronic health record use. J. Am. Med Inf. Assoc. 31, 784–789 (2024).
Google Scholar
link
