Research Area

AI in Medical Education

We are interested in what artificial intelligence actually changes in training: how learners find evidence, how they practice, how feedback is generated, how assessment becomes more continuous, and how clinicians learn to supervise delegated work.

At Grover Lab, this area sits at the intersection of curriculum design, performance analytics, simulation, and build-oriented educational infrastructure. It is not just about putting a chatbot into a course. It is about redesigning how medical training works when models, agents, retrieval systems, and multimodal assessment become part of the educational environment.

What this work is really about

There are at least four different questions inside the phrase AI in medical education. The first is informational: can AI help learners find, summarize, and interrogate the right knowledge? The second is pedagogic: does it improve learning, or merely make tasks feel easier? The third is assessment-related: can it generate better signals about competence than episodic exams and occasional supervision? The fourth is professional: what happens when trainees are no longer judged only by what they execute, but by how they use, supervise, and challenge machine-generated work?

Our view is that the interesting work is not just asking whether AI is useful. It is determining where AI should be used aggressively, where it should be constrained, and where educationally valuable friction still needs to be protected.

How Grover Lab approaches the field

Our approach is less about isolated tools and more about educational infrastructure. We are interested in source-grounded systems, not black boxes; in deliberate practice, not one-off novelty; in continuous performance data, not only retrospective judgment; and in faculty development, because no AI-enabled curriculum will work if the educators themselves cannot model good use.

This is why the current build layer matters. SharpenEDU is the clearest example of that direction. It links curriculum intelligence, repeated learner practice, and multimodal performance assessment into one loop. In practical terms, that means a school can move closer to an environment where curated curricular sources drive explanations, learner practice becomes abundant rather than scarce, and feedback is generated from actual performance rather than only from occasional faculty impressions.

The intellectual logic for that work is visible in Samir Grover’s essays on retrieval-augmented generation in medical education, on curriculum as infrastructure, and on the changing role of the physician in an agentic era in The AI agent-augmented physician.

Current lines of inquiry

Agents and supervision. As multi-step systems become capable of execution, medical education has to decide what remains a protected rep for the learner and what can be delegated. That argument is developed more explicitly in Agentic AI in medical education, part 2, and I became a good doctor by doing it badly first.

Source-grounded curriculum systems. We are skeptical of generic AI outputs untethered from approved curricular material. Our interest is in systems that sit on top of curated sources, create traceable drafts, and make curricular change easier to operationalize. That is one reason SharpenEDU matters: it treats the curriculum as a living substrate rather than a static PDF archive.

Assessment and feedback. AI becomes more educationally meaningful when it produces better signals about actual performance. This includes natural language processing, video-based review, structured feedback, and the possibility of more ambient assessment models. It also includes the hard question of calibration: how to make learners and clinicians better at judging both themselves and the machine.

Faculty development. AI-literate learners need AI-literate supervisors. The bottleneck is not only student skill. It is whether educators know when to use these systems, how to question them, and how to redesign teaching around them. That is part of the argument in Who trains the trainers?.

Selected Grover Lab work in this area

Integrating artificial intelligence into endoscopy training: opportunities, challenges, and strategies. Grover SC, Walsh CM. Lancet Gastroenterology & Hepatology. 2024. A concise statement of how AI changes training, supervision, and educational design in a procedural domain.

Comparison of the Impact of Wikipedia, UpToDate, and a Digital Textbook on Short-Term Knowledge Acquisition Among Medical Students: Randomized Controlled Trial of Three Web-Based Resources. Scaffidi MA, Khan R, Wang C, et al. JMIR Medical Education. 2017. An earlier Grover Lab study on digital knowledge sources and learner performance, relevant to how trainees use external cognitive tools.

Video-based interventions to improve self-assessment accuracy among physicians: A systematic review. Pattni C, Scaffidi M, Li J, et al. PLOS One. 2023. Relevant to AI-enabled feedback and the question of how technology can improve calibration rather than only efficiency.

Performance of natural language processing in identifying adenomas from colonoscopy reports: a systematic review and meta-analysis. Sabrie N, et al. IGIE. 2023. Illustrates Grover Lab interest in extracting actionable signals from routine clinical text, a key ingredient in future assessment infrastructure.

The potential utility of CHATGPT4.0 as an AI assistant in the education and management of patients with Barrett's esophagus. Dang F, Kwon J, Lin A, et al. Diseases of the Esophagus. 2025. Included here as a contemporary example of LLM-assisted patient education and clinician support in a content domain adjacent to training.

Related essays and build work

Agentic AI in medical education Why agents matter because they perceive, reason, act, and persist rather than simply answer.

The AI agent-augmented physician A framework for medicine’s shift from pure execution to the supervision of delegated work.

Tying RAGs to medical education An argument for connecting model outputs to curated institutional sources rather than free-floating generation.

Curriculum as infrastructure: how to keep medical education evergreen A design argument for continuous curricular updating rather than periodic refresh cycles.

SharpenEDU Current educational infrastructure work spanning curriculum intelligence, deliberate practice, and multimodal performance assessment.