Journal of Nursing Education

Major Article 

Kirkpatrick's Evaluation of Simulation and Debriefing in Health Care Education: A Systematic Review

Sandra Johnston, RN; Fiona Maree Coyer, PhD, RN; Robyn Nash, PhD, RN



Simulation is an integral component of health care education. Research suggests a positive relationship between simulation and learning outcomes. Kirkpatrick's framework is a four-level model based on the premise that learning resulting from training programs can be classified into four levels: reaction, learning, behavior, and results. Evaluation of educational impact provides valuable feedback to educators that may assist with development and improvement of teaching methods.


This review is based on the PRISMA guidelines for conducting a systematic review. Inclusion criteria included articles (a) written in the English language, (b) published between 2000 and 2016, (c) describing a debriefing intervention after high-fidelity patient simulation, and (d) based in health care.


Thirteen studies met criteria for inclusion in the review.


Results indicated a paucity of studies at the highest levels of evaluation, indicating an area where future research is needed to assist with the development and improvement of simulation education. [J Nurs Educ. 2018;57(7):393–398.]




In health care education, it is not uncommon for students to be unable to apply theoretical knowledge to solve clinical problems despite having demonstrated mastery of theory (Norman, 2009). As an educational strategy supporting students in the application of theory to practice, high-fidelity simulation (HFS) has become increasingly popular, to the extent that it now comprises a significant component of health care education (Boet et al., 2014; Decker, Sportsman, Puetz, & Billings, 2008; Kirkman, 2013). In terms of supporting the pedagogical intent of transferring learning from the simulation to future practice, HFS blends academic learning and authentic real-world connections by means of the representation of real-life scenarios, emphasizing the relevance for real-world learners (DiLullo, McGee, & Kriebel, 2011).

A typical simulation comprises prebriefing, enactment of the clinical scenario, and debriefing. Although all components work in unison, the debriefing is postulated to be a critical component and has been described as the cornerstone of simulation (Issenberg, McGaghie, Petrusa, Lee Gordon, & Scalese, 2005). Literature suggests that debriefing is simulation's most effective feature (Issenberg et al., 2005; McGaghie, Issenberg, Petrusa, & Scalese, 2010). Debriefing is a critical process for fostering deep learning, in which the simulated experience is reexamined with the aim of assimilation and accommodation of learning (Bailey, Johnson-Russell, & Lupien, 2010; National League for Nursing Simulation Innovation Resource, n.d.). Knowledge gains have been found to be greatest not after the practical component of simulation alone, but after the debriefing component of the simulation (Shinnick, Woo, Horwich, & Steadman, 2011; Tosterud, Petzäll, Hedelin, & Hall-Lord, 2014).

Research suggests a positive relationship between simulation and learning outcomes (Weller, Nestel, Marshall, Brooks, & Conn, 2012). Simulation is a learning and teaching methodology that is time and resource intensive. With large cohorts of students enrolled in nursing and other health care programs, allocated time for simulated learning may be limited. As such, nurse educators are constantly striving for the most effective methods of delivery of simulated learning experiences (Fey & Jenkins, 2015). Evaluation of educational impact therefore provides valuable feedback to educators that may assist with development and improvement of teaching methods (Thistlethwaite, Kumar, Moran, Saunders, & Carr, 2015). Kirkpatrick (1967) developed an organizational tool that has been used as a method of evaluating and categorizing outcome criteria of educational training. Kirkpatrick's (1967) framework is a four-level model based on the premise that learning resulting from training programs can be classified into four levels: reaction, learning, behavior, and results (impact on patient outcomes; Figure 1). Level 1, reaction, relates to participant perceptions of, or satisfaction with, training programs. Level 2, learning, is suggested to have occurred when attitudes change, knowledge increases, or skill acquisition improves. Level 3, behavior or application of the learning, indicates the extent to which on-the-job behavior has changed as a result of training. Level 4, results, determines the impact of training on organizational benefits and the final results that occur (Kirkpatrick, 1967). In the health care arena, level 4 evaluates whether the learning transfers to the clinical setting and improves patient outcomes (Abdulghani et al., 2014; Boet et al., 2014; Hammick, Freeth, Koppel, Reeves, & Barr, 2007; Issenberg et al., 2005).

Figure 1. Kirkpatrick's (1967) four-level framework.

Aim and Review Questions

This review was conducted with the initial premise that simulated learning is synergized and strengthened by the debriefing element of the simulation experience. The aim of this systematic review was to search, extract, appraise, and synthesize research in health care education relating to HFS studies that compared debriefing strategies, to answer the following question: Of the HFS studies that compare debriefing strategies, what levels of Kirkpatrick's (1967) training evaluation model are evaluated?


Search Strategy

This review is based on the PRISMA checklist (Moher, Liberati, Tetzlaff, Altman, & PRISMA Group, 2009) and guidelines for conducting a systematic review (Khan, Kunz, Kleijnen, & Antes, 2003). A systematic search focusing on high-fidelity patient simulation debriefing in health care was conducted in December 2014 and again in July 2016 in the following databases: Cumulative Index to Nursing and Allied Health Literature (CINAHL®), Medline® and PubMed®, Scopus®, ScienceDirect®, and PsycInfo®. Searches were limited to English-language publications and used the keywords high fidelity patient simulation, patient simulation, HFS, HFPS, debriefing, critical reflection, reflection, post simulation analysis, post simulation evaluation, and feedback, together with the truncation symbol (*). Electronic searches were supplemented by a hand search of the journal Simulation in Healthcare, a review of individual reference lists, and an Internet search using the Google Scholar search engine. For the purposes of this review, HFS was characterized by use of human patient simulators that were computer-based manikins providing physiological responses and a high level of interactivity and realism for the learner (INACSL Standards Committee, 2016).

Inclusion and Exclusion Criteria

Articles were assessed for inclusion based on the following criteria: (a) English language, (b) published between 2000 and 2016, (c) described a debriefing intervention after high-fidelity patient simulation, and (d) based in health care. The exclusion criteria were (a) discussion or review papers, (b) descriptive studies, (c) case reports, or (d) papers reporting the development of debriefing tools.
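As an illustration only, the screening rules above can be expressed as a simple filter. The following Python sketch is not the procedure used in this review; the record fields and the study-type labels are hypothetical and were chosen solely to mirror inclusion criteria (a) to (d) and exclusion criteria (a) to (d).

```python
from dataclasses import dataclass

# Hypothetical labels standing in for exclusion criteria (a) to (d).
EXCLUDED_TYPES = {
    "discussion",
    "review",
    "descriptive",
    "case report",
    "tool development",
}


@dataclass
class Article:
    # Hypothetical record layout; the review does not describe its data format.
    language: str
    year: int
    debriefing_after_hfs: bool  # inclusion criterion (c)
    health_care_based: bool     # inclusion criterion (d)
    study_type: str


def meets_criteria(article: Article) -> bool:
    """Return True if the article passes all inclusion and exclusion rules."""
    included = (
        article.language == "English"          # criterion (a)
        and 2000 <= article.year <= 2016       # criterion (b)
        and article.debriefing_after_hfs
        and article.health_care_based
    )
    return included and article.study_type not in EXCLUDED_TYPES
```

Under these assumptions, a 2012 English-language randomized study of debriefing after high-fidelity simulation in nursing would pass the filter, whereas a review paper on the same topic would not.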

Data Extraction and Assessment of Quality of Evidence

Inclusion criteria were assessed by the primary reviewer (S.J.) and overseen by the review team (F.M.C., R.N.). Extraction of data from the identified studies was independently performed by the primary reviewer and validated by the review team. Quality assessment on each study was undertaken using the Cochrane Risk of Bias Tool (Higgins & Green, 2011). Using this tool, each study was assessed for sources of potential bias, including selection, performance, detection, attrition, reporting, and other biases that did not fit into these categories (Higgins & Green, 2011).

Data Analysis

Narrative summary of findings frequently occurs in systematic reviews where a lack of randomized studies exists or where study methodologies are heterogeneous; in such cases, the combination of evidence from existing studies can be used to provide a more general context (Khan et al., 2003). As quantitative data from the reviewed studies could not be statistically combined for meta-analysis, extracted data were synthesized into a narrative form.


The broad search generated 1,096 articles, from which the relevant papers were selected for review. Details of the selection process are presented in the PRISMA flowchart (Figure 2). After the removal of duplicates, 936 article titles were scanned for relevance, resulting in 87 articles identified as potentially relevant. Following examination of the abstracts, a further 74 articles were excluded for not meeting the inclusion criteria. Thirteen articles that met all inclusion criteria were identified for data extraction and analysis of results. Data and outcome measures were collected and discussed with relation to Kirkpatrick's (1967) levels of evaluation.
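The selection counts reported above can be checked with simple arithmetic. The short Python sketch below uses only the four figures stated in the text; the number of duplicates removed and the number of records excluded at title screening are derived by subtraction and are not stated in the review, and the variable names are illustrative.

```python
# Counts stated in the text.
records_identified = 1096
titles_screened = 936        # records remaining after duplicate removal
abstracts_examined = 87      # titles judged potentially relevant
excluded_at_abstract = 74

# Derived values (not reported directly in the review).
duplicates_removed = records_identified - titles_screened
excluded_at_title = titles_screened - abstracts_examined
studies_included = abstracts_examined - excluded_at_abstract

print("Duplicates removed:", duplicates_removed)
print("Excluded at title screening:", excluded_at_title)
print("Studies included:", studies_included)
```

The final figure agrees with the 13 studies reported as meeting all inclusion criteria.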

Figure 2. PRISMA 2009 flow diagram.

Study Characteristics

The included study characteristics, participants, interventions, outcomes, level of evaluation and results are presented in Table A (available in the online version of this article).

Table A: Study Characteristics

Description and Methodological Quality of Studies

The included studies contained sources of bias in the randomization process. The method of randomization was reported in only six studies and included computer-generated lists of random numbers (Van Heukelom et al., 2010; Welke et al., 2009), drawing names from a hat (Reed et al., 2013; Reed, 2015), and drawing opaque envelopes from a hat (Chronister & Brown, 2012; Weaver, 2015). None of the participants in any of the included studies were blinded in terms of the debriefing they received; however, they may not have known whether they were in the intervention or control groups. Facilitators conducting the debriefing were also not blinded; furthermore, some researchers also acted as facilitators (Dreifuerst, 2012).

Types of Participants

All of the studies used convenience sampling, with sample sizes ranging from 30 (Welke et al., 2009) to 238 (Dreifuerst, 2012). The range of professions included qualified health professionals and health professional students, including undergraduate medical students (Bond et al., 2006; Cicero et al., 2012; Van Heukelom et al., 2010), anesthetists and anesthetic residents (Morgan et al., 2009; Savoldelli et al., 2006; Welke et al., 2009), and undergraduate nursing students (Chronister & Brown, 2012; Dreifuerst, 2012; Grant et al., 2014; Grant et al., 2010; Mariani et al., 2013; Reed, 2015; Reed et al., 2013; Shinnick, Woo, & Mentes, 2011; Weaver, 2015).

Interventions: Debriefing Methods

A variety of debriefing strategies were used, with an instructor video-assisted debrief being the most common debriefing intervention among the reviewed studies (Boet et al., 2011; Chronister & Brown, 2012; Grant et al., 2010; Grant et al., 2014; Reed et al., 2013; Savoldelli et al., 2006; Weaver, 2015; Welke et al., 2009). The structured debriefing called “Debriefing for Meaningful Learning” was used in two studies (Dreifuerst, 2012; Mariani et al., 2013). Written components of journaling and blogging were added to verbal debriefing by Reed (2015). In one study, participants did not receive any debriefing (Shinnick, Woo, Horwich, et al., 2011), and a change in the usual postsimulation timing of the debriefing delivery was used by Van Heukelom et al. (2010). The length of the debriefings varied, with no time limit for debriefing given in seven studies (Grant et al., 2010; Grant et al., 2014; Mariani et al., 2013; Savoldelli et al., 2006; Shinnick, Woo, Horwich, et al., 2011; Weaver, 2015; Welke et al., 2009). In the remaining studies, debriefing ranged from 20 minutes (Boet et al., 2011; Reed et al., 2013; Reed, 2015; Van Heukelom et al., 2010) to 30 minutes (Chronister & Brown, 2012; Dreifuerst, 2012).

Types of Outcomes

Reported outcomes related to student perceptions and satisfaction with the debriefing experience, cognitive skills, clinical skills, knowledge, and behaviors. Student perceptions of the debriefing were the primary outcomes of two studies, measured using the Debriefing Experience Scale (Reed et al., 2013; Reed, 2015). Three studies evaluated nontechnical skills of situation awareness, teamwork, decision making, and task management using the Anaesthesia Non-Technical Skills scale (Boet et al., 2011; Savoldelli et al., 2006; Welke et al., 2009). Clinical judgment was measured using the Lasater Clinical Judgment Rubric in two studies (Mariani et al., 2013; Weaver, 2015), and clinical reasoning skills were evaluated using the Health Sciences Reasoning Test (Dreifuerst, 2012). Chronister and Brown (2012) evaluated clinical psychomotor and assessment skills and used the Emergency Response Performance Tool to ascertain change in the time taken to respond during simulated patient care. Clinical knowledge of heart failure was evaluated with an investigator-developed Clinical Knowledge Questionnaire in one study (Shinnick, Woo, Horwich, et al., 2011), whereas another study used a 7-point Likert scale questionnaire to determine self-reported knowledge to perform resuscitation skills (Van Heukelom et al., 2010). Safety behaviors, which included observed behaviors of identifying patients, communication among team members, assessment, and applying appropriate interventions, were recorded using the Clinical Simulation Evaluation Tool (Grant et al., 2010; Grant et al., 2014).

Level of Evaluation According to Kirkpatrick's (1967) Four-Level Model

The outcomes of the HFS and debriefing have been categorized according to the four levels of evaluation:

  • Level 1 = participant reactions.
  • Level 2 = learning.
  • Level 3 = behavior.
  • Level 4 = results.

Level 1: Participant Reactions

Studies investigating participant experiences of the simulation debriefing were classified into level 1, evaluating participant reactions. In both studies included in this category (Reed et al., 2013; Reed, 2015), participants' perceptions of debriefing were ascertained using the Debriefing Experience Scale. This validated 20-item scale allowed participants to rate the debriefing in the areas of experience and importance to the student. Debriefing with video was compared with debriefing alone (Reed et al., 2013), and three debriefing methods were compared by Reed (2015): discussion-only debriefing, discussion debriefing followed by blogging, and discussion debriefing followed by journaling. Overall, perceptions of the debriefing experience were minimally different between video debriefing and debriefing alone, and discussion was preferred over written forms of debriefing.

Level 2: Learning

Level 2 evaluation pertained to outcomes such as the acquisition of knowledge and skills that occurred following the simulation experience. Nine studies in this review evaluated changes in the areas of nontechnical skill performance (Boet et al., 2011; Savoldelli et al., 2006; Welke et al., 2009), clinical reasoning skills (Dreifuerst, 2012), clinical judgment skills (Mariani et al., 2013; Weaver, 2015), knowledge (Van Heukelom et al., 2010; Shinnick, Woo, Horwich, et al., 2011), and clinical skills (Chronister & Brown, 2012). Improvement in anesthetic residents' nontechnical skills was measured using the Anaesthesia Non-Technical Skills scale, which comprises four main skill areas: situational awareness, team working, decision making, and task management (Boet et al., 2011; Savoldelli et al., 2006; Welke et al., 2009). These studies all compared video-assisted debriefing with other debriefing methods. Video-facilitated debriefing did not result in improved outcomes (Boet et al., 2011), and Welke et al. (2009) found that both simulation groups' skills scores improved from pretest to posttest regardless of whether debriefing was used. Savoldelli et al. (2006) found that the group debriefed with the video-assisted facilitator method showed less improvement in skills than the instructor-facilitated debriefing group.

Gains in clinical judgment, as measured by the Lasater Clinical Judgment Rubric, were investigated by Mariani, Cantrell, Meakim, Prieto, and Dreifuerst (2013). Nursing students debriefed with the structured intervention, Debriefing for Meaningful Learning, were compared with students who received the usual unstructured debriefing. Clinical judgment scores, which improved over time from the first simulation to a simulation 5 weeks later, were higher in the intervention group; however, the differences were not statistically significant.

Weaver (2015) sought to identify whether nursing students' clinical judgment improved after participating in simulation with debriefing that included a videotaped model demonstration as part of the plus/delta method (intervention), compared with the usual structured plus/delta debriefing without the demonstration. After an initial simulation, a second simulation occurred 1 week later, followed by the usual debriefing. Clinical judgment was evaluated by raters using the Lasater Clinical Judgment Rubric during both simulations, with the intervention group showing greater change in clinical judgment from simulation 1 to simulation 2.

Dreifuerst (2012) explored the relationship between a structured debriefing and the development of clinical reasoning skills in undergraduate nursing students, compared with customary debriefing based on the work of Jeffries (2007). A significant difference in the change in pretest to posttest clinical reasoning scores was found.

Chronister and Brown (2012) explored the effects of two different debriefing styles on the quality of skills (assessment and psychomotor), skills response time, and knowledge retention in senior-level undergraduate critical care students engaged in a cardiopulmonary arrest simulation. The control group was debriefed verbally, and the intervention group received video-assisted verbal debriefing. The quality of skill improvement was found to be higher and response times were faster for students in the video-assisted group; however, knowledge retention from pretest to posttest was greater in the verbal-only group.

One hundred sixty-one medical students were randomly assigned to receive either debriefing following simulation or immediate feedback during simulation (Van Heukelom et al., 2010), and a retrospective pre–post assessment was made through a survey using Likert-scale questions assessing students' self-reported confidence and knowledge. Medical students' self-reported knowledge of their ability to perform medical resuscitation skills increased in both groups. Shinnick, Woo, Horwich, et al. (2011) examined the impact of simulation components (hands-on alone and hands-on plus debriefing) on heart failure clinical knowledge in prelicensure nursing students. Mean knowledge scores for both groups decreased from pretest to the first posttest but improved after a combination of simulation experience and debriefing.

Level 3: Behavior

According to Kirkpatrick (1967), level 3 evaluation relates to the degree to which learners changed their behavior outside the learning environment. In the context of health care, this would imply behavior change that has occurred in the clinical setting. None of the studies included in this review assessed participants' behaviors in clinical settings. However, in the simulated setting, two studies investigated target behaviors, specifically patient identification, team communication, and vital signs, among nursing and nurse anesthetist students (Grant et al., 2010) and nursing students (Grant et al., 2014). Both studies compared video-assisted debriefing with oral debriefing, and behaviors were measured through facilitator observation using the Clinical Simulation Evaluation Tool (Radhakrishnan et al., 2007). No statistically significant difference was found between the control and experimental groups in their total performance score in either of the studies.

Level 4: Results

No studies in the review tested level 4 in Kirkpatrick's (1967) framework to measure the effect of learners' actions on patient outcomes.
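The distribution of the included studies across the four levels can be tallied directly from the classifications reported above. The Python sketch below is illustrative only: the mapping simply restates the level under which each study is discussed in this review, and the function and variable names are hypothetical. The two studies listed at level 3 measured behaviors in the simulated setting only, not in clinical practice.

```python
from collections import Counter

LEVEL_NAMES = {1: "reaction", 2: "learning", 3: "behavior", 4: "results"}

# Level under which each included study is discussed in this review.
LEVEL_BY_STUDY = {
    "Reed et al., 2013": 1,
    "Reed, 2015": 1,
    "Boet et al., 2011": 2,
    "Savoldelli et al., 2006": 2,
    "Welke et al., 2009": 2,
    "Dreifuerst, 2012": 2,
    "Mariani et al., 2013": 2,
    "Weaver, 2015": 2,
    "Van Heukelom et al., 2010": 2,
    "Shinnick, Woo, Horwich, et al., 2011": 2,
    "Chronister & Brown, 2012": 2,
    "Grant et al., 2010": 3,  # behaviors observed in simulation only
    "Grant et al., 2014": 3,  # behaviors observed in simulation only
}


def tally_levels(level_by_study):
    """Count studies per Kirkpatrick level, including levels with no studies."""
    counts = Counter(level_by_study.values())
    return {level: counts.get(level, 0) for level in LEVEL_NAMES}


counts = tally_levels(LEVEL_BY_STUDY)
for level, name in LEVEL_NAMES.items():
    print(f"Level {level} ({name}): {counts[level]} studies")
```

The tally gives two studies at level 1, nine at level 2, two at level 3, and none at level 4, which accounts for all 13 included studies.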


Studies that explored participants' perceptions of experience and satisfaction with the debriefing were categorized into level 1, the lowest level of evaluation. It could be perceived that reaction criteria bear no relation to the other levels of evaluation: learning, behavior, and results. Although studies evaluating learner reactions specific to debriefing may not provide evidence to suggest learning occurred following debriefing, this research is not redundant. Valuable feedback on perceptions and satisfaction will influence the design and implementation of future debriefing learning experiences, affecting the learning that does occur as a result of debriefing (Cioffi, 2001; Fey & Jenkins, 2015). Overall, there was little difference in the perception of debriefing methods that were identified in this review.

The acquisition of skills and knowledge, classified as level 2 evaluation, was the most studied outcome of all HFS and debriefing experiences. This may be due to the nature of debriefing being a discussion based on reflection, which has been shown to foster critical thinking and clinical judgment skills and possibly influence either self-reports or the actual acquisition of skills and knowledge (Mariani et al., 2013). Furthermore, regardless of whether the skills and knowledge gains are self-reported or otherwise, strategies for behavior improvement are routinely incorporated into debriefing discussions that lend themselves to supporting a change in skills and knowledge.

Of interest within this level of evaluation was the number of studies that utilized video-assisted technology within debriefing. Among some of the reviewed studies, using video assistance offered no statistically significant educational advantages over instructor debriefing (Boet et al., 2011; Savoldelli et al., 2006; Welke et al., 2009). Although increasing in popularity (Cheng et al., 2014), video-assisted debriefing requires further research to measure the benefits of the learning outcomes achieved against the considerable costs associated with this particular technology.

In this review, level 3 evaluation relating to changes in behaviors was measured only in simulated settings. In the studies in this category, video debriefing was the method used. As suggested by Boet et al. (2014), the findings of these studies that demonstrate an improvement in behaviors may be subject to criticism because participants who demonstrated an improvement in behaviors may have been taught adequately in the simulated setting but did not necessarily transfer learning to real practice. The absence of research demonstrating behavior changes in the clinical area is not surprising. Measuring behavioral change after debriefing would rely on the ability to create, manipulate, and control real-life conditions so that the person being tested could demonstrate behaviors. This approach raises ethical and patient safety issues (Adamson, Kardong-Edgren, & Willhaus, 2013). The element of workplace culture adds further complexity to evaluating research in real health care environments. Kardong-Edgren (2010) postulated that although educators may teach well, workplace practices may hinder good practice and are therefore confounding factors to be considered when evaluating outcomes.

The absence of any studies that could be categorized into level 4, evaluation measuring the effect of learners' actions on patient outcomes, was noted to be a significant gap in the literature. The lack of research in this area may be due to factors such as the nature of this research in evaluating change in outcomes generally being long term, potentially expensive, and subject to various extraneous confounding variables, including natural maturation (Kardong-Edgren, 2010). The influence of confounding variables such as age, gender, race, and socioeconomic status must be considered in addition to the primary comparison variable of interest. For example, outcomes research does not have the benefit of randomization, which creates an equal mix of all possible confounders in both comparison groups. The findings of outcomes research therefore depend on how many covariates can be identified and adjusted for (Chang & Talamini, 2011).


There were several limitations to this review. The search strategy was limited to English language studies and did not include unpublished abstracts from conference proceedings or grey literature. Only studies published between 2000 and 2016 were included. Our review included only 13 studies with different study designs and outcomes, thus preventing us from conducting a meta-analysis. The quality of evidence is variable, with many potential sources of bias identified, such as the lack of details regarding randomization procedures, the inability to blind participants, and the researcher acting as a facilitator in the study, potentially influencing study outcomes.


Although lower levels of evaluation are not redundant, how simulation and debriefing affect learning, behaviors, and ultimately patient outcomes is of great importance. Few studies have examined the true impact of simulation and debriefing as evidenced by Kirkpatrick's (1967) level 3 (changes in behavior) and level 4 (results, or impact on patient outcomes). Researchers can assist the continued maturation of the simulation pedagogy by aspiring to higher levels of Kirkpatrick's (1967) evaluation.


  • Abdulghani, H.M., Shaik, S.A., Khamis, N., Al-Drees, A.A., Irshad, M., Khalil, M.S., Alhagwi, A.L. & Isnani, A. (2014). Research methodology workshops evaluation using the Kirkpatrick's model: Translating theory into practice. Medical Teacher, 36(Suppl.), S24–S29. doi:10.3109/0142159X.2014.886012 [CrossRef]
  • Adamson, K.A., Kardong-Edgren, S. & Willhaus, J. (2013). An updated review of published simulation evaluation instruments. Clinical Simulation in Nursing, 9, e393–e400. doi:10.1016/j.ecns.2012.09.004 [CrossRef]
  • Bailey, C., Johnson-Russell, J. & Lupien, A. (2010). High-fidelity patient simulation. In Bradshaw, M.J. & Lowenstein, A.J. (Eds.), Innovative teaching strategies in nursing and related health professions (5th ed., pp. 212–226). Sudbury, MA: Jones and Bartlett.
  • Boet, S., Bould, M., Fung, L., Qosa, H., Perrier, L., Tavares, W. & Tricco, A.C. (2014). Transfer of learning and patient outcome in simulated crisis resource management: A systematic review. Canadian Journal of Anesthesia [Journal Canadien d'Anesthésie], 61, 571–582. doi:10.1007/s12630-014-0143-8 [CrossRef]
  • Boet, S., Bould, M.D., Bruppacher, H.R., Desjardins, F., Chandra, D.B. & Naik, V.N. (2011). Looking in the mirror: Self-debriefing versus instructor debriefing for simulated crises. Critical Care Medicine, 39, 1377–1381. doi:10.1097/CCM.0b013e31820eb8be [CrossRef]
  • Bond, W.F., Deitrick, L.M., Eberhardt, M., Barr, G.C., Kane, B.G., Worrilow, C.C. & Croskerry, P. (2006). Cognitive versus technical debriefing after simulation training. Academic Emergency Medicine, 13, 276–283. doi:10.1197/j.aem.2005.10.013 [CrossRef]
  • Chang, D.C. & Talamini, M.A. (2011). A review for clinical outcomes research: Hypothesis generation, data strategy, and hypothesis-driven statistical analysis. Surgical Endoscopy, 25, 2254–2260. doi:10.1007/s00464-010-1543-7 [CrossRef]
  • Chronister, C. & Brown, D. (2012). Comparison of simulation debriefing methods. Clinical Simulation in Nursing, 8, e281–e288. doi:10.1016/j.ecns.2010.12.005 [CrossRef]
  • Cicero, M.X., Auerbach, M.A., Zigmont, J., Riera, A., Ching, K. & Baum, C.R. (2012). Simulation training with structured debriefing improves residents' pediatric disaster triage performance. Prehospital and Disaster Medicine, 27, 239–244. doi:10.1017/S1049023X12000775 [CrossRef]
  • Decker, S., Sportsman, S., Puetz, L. & Billings, L. (2008). The evolution of simulation and its contribution to competency. The Journal of Continuing Education in Nursing, 39, 74–80. doi:10.3928/00220124-20080201-06 [CrossRef]
  • DiLullo, C., McGee, P. & Kriebel, R.M. (2011). Demystifying the millennial student: A reassessment in measures of character and engagement in professional education. Anatomical Sciences Education, 4, 214–226. doi:10.1002/ase.240 [CrossRef]
  • Dreifuerst, K.T. (2012). Using debriefing for meaningful learning to foster development of clinical reasoning in simulation. Journal of Nursing Education, 51, 326–333. doi:10.3928/01484834-20120409-02 [CrossRef]
  • Fey, M.K. & Jenkins, L.S. (2015). Debriefing practices in nursing education programs: Results from a national study. Nursing Education Perspectives, 36, 361–366. doi:10.5480/14-1520 [CrossRef]
  • Grant, J.S., Dawkins, D., Molhook, L., Keltner, N.L. & Vance, D.E. (2014). Comparing the effectiveness of video-assisted oral debriefing and oral debriefing alone on behaviors by undergraduate nursing students during high-fidelity simulation. Nurse Education in Practice, 14, 479–484. doi:10.1016/j.nepr.2014.05.003 [CrossRef]
  • Grant, J.S., Moss, J., Epps, C. & Watts, P. (2010). Using video-facilitated feedback to improve student performance following high-fidelity simulation. Clinical Simulation in Nursing, 6, e177–e184. doi:10.1016/j.ecns.2009.09.001 [CrossRef]
  • Hammick, M., Freeth, D., Koppel, I., Reeves, S. & Barr, H. (2007). A best evidence systematic review of interprofessional education: BEME guide no. 9. Medical Teacher, 29, 735–751. doi:10.1080/01421590701682576 [CrossRef]
  • INACSL Standards Committee. (2016). INACSL standards of best practice: Simulation℠ debriefing. Clinical Simulation in Nursing, 12(S), S21–S25. doi:10.1016/j.ecns.2016.09.008 [CrossRef]
  • Issenberg, B., McGaghie, W.C., Petrusa, E.R., Lee Gordon, D. & Scalese, R.J. (2005). Features and uses of high-fidelity medical simulations that lead to effective learning: A BEME systematic review. Medical Teacher, 27, 10–28. doi:10.1080/01421590500046924 [CrossRef]
  • Jeffries, P.R. (Ed.). (2007). Simulation in nursing education: From conceptualization to evaluation. New York, NY: National League for Nursing.
  • Kardong-Edgren, S. (2010). Striving for higher levels of evaluation in simulation. Clinical Simulation in Nursing, 6, e203–e204. doi:10.1016/j.ecns.2010.07.001 [CrossRef]
  • Khan, K.S., Kunz, R., Kleijnen, J. & Antes, G. (2003). Five steps to conducting a systematic review. Journal of the Royal Society of Medicine, 96, 118–121. doi:10.1177/014107680309600304 [CrossRef]
  • Kirkman, T.R. (2013). High-fidelity simulation effectiveness in nursing students' transfer of learning. International Journal of Nursing Education Scholarship, 10, 171–176. doi:10.1515/ijnes-2012-0009 [CrossRef]
  • Kirkpatrick, D.L. (1967). Evaluation of training. In Craig, R.L. & Bittel, L.R. (Eds.), Training and development handbook (pp. 87–112). New York, NY: McGraw Hill.
  • Mariani, B., Cantrell, M.A., Meakim, C., Prieto, P. & Dreifuerst, K.T. (2013). Structured debriefing and students' clinical judgment abilities in simulation. Clinical Simulation in Nursing, 9, e147–e155. doi:10.1016/j.ecns.2011.11.009 [CrossRef]
  • McGaghie, W.C., Issenberg, S.B., Petrusa, E.R. & Scalese, R.J. (2010). A critical review of simulation-based medical education research: 2003–2009. Medical Education, 44, 50–63. doi:10.1111/j.1365-2923.2009.03547.x [CrossRef]
  • Moher, D., Liberati, A., Tetzlaff, J., Altman, D.G. & PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. BMJ, 339, b2535. doi:10.1136/bmj.b2535 [CrossRef]
  • Morgan, P.J., Tarshis, J., LeBlanc, V., Cleave-Hogg, D., DeSousa, S., Haley, M.F. & Law, J. (2009). Efficacy of high-fidelity simulation debriefing on the performance of practicing anesthetists in simulated scenarios. British Journal of Anaesthesia, 103, 531–537. doi:10.1093/bja/aep222 [CrossRef]
  • National League for Nursing Simulation Innovation Resource. (n.d.). SIRC glossary. Retrieved from
  • Norman, G. (2009). Teaching basic science to optimize transfer. Medical Teacher, 31, 807–811. doi:10.1080/01421590903049814 [CrossRef]
  • Reed, S.J. (2015). Written debriefing: Evaluating the impact of the addition of a written component when debriefing simulations. Nurse Education in Practice, 15, 543–548. doi:10.1016/j.nepr.2015.07.011 [CrossRef]
  • Reed, S.J., Andrews, C.M. & Ravert, P. (2013). Debriefing simulations: Comparison of debriefing with video and debriefing alone. Clinical Simulation in Nursing, 9, e585–e591. doi:10.1016/j.ecns.2013.05.007 [CrossRef]
  • Savoldelli, G.L., Naik, V.N., Park, J., Joo, H.S., Chow, R. & Hamstra, S.J. (2006). Value of debriefing during simulated crisis management: Oral versus video-assisted oral feedback. Anesthesiology, 105, 279–285. doi:10.1097/00000542-200608000-00010 [CrossRef]
  • Shinnick, M.A., Woo, M., Horwich, T.B. & Steadman, R. (2011). Debriefing: The most important component in simulation? Clinical Simulation in Nursing, 7, e105–e111. doi:10.1016/j.ecns.2010.11.005 [CrossRef]
  • Shinnick, M.A., Woo, M.A. & Mentes, J.C. (2011). Human patient simulation: State of the science in prelicensure nursing education. Journal of Nursing Education, 50, 65–72. doi:10.3928/01484834-20101230-01 [CrossRef]
  • Thistlethwaite, J., Kumar, K., Moran, M., Saunders, R. & Carr, S. (2015). An exploratory review of pre-qualification interprofessional education evaluations. Journal of Interprofessional Care, 29, 292–297. doi:10.3109/13561820.2014.985292 [CrossRef]
  • Tosterud, R., Petzäll, K., Hedelin, B. & Hall-Lord, M.L. (2014). Psychometric testing of the Norwegian version of the questionnaire, student satisfaction and self-confidence in learning, used in simulation. Nurse Education in Practice, 14, 704–708. doi:10.1016/j.nepr.2014.10.004 [CrossRef]
  • Van Heukelom, J.N., Begaz, T. & Treat, R. (2010). Comparison of postsimulation debriefing versus in-simulation debriefing in medical simulation. Simulation in Healthcare, 5, 91–97. doi:10.1097/SIH.0b013e3181be0d17 [CrossRef]
  • Weaver, A. (2015). The effect of a model demonstration during debriefing on students' clinical judgment, self-confidence, and satisfaction during a simulated learning experience. Clinical Simulation in Nursing, 11, 20–26. doi:10.1016/j.ecns.2014.10.009 [CrossRef]
  • Welke, T.M., LeBlanc, V.R., Savoldelli, G.L., Joo, H.S., Chandra, D.B., Crabtree, N.A. & Naik, V.N. (2009). Personalized oral debriefing versus standardized multimedia instruction after patient crisis simulation. Anesthesia and Analgesia, 109, 183–189. doi:10.1213/ane.0b013e3181a324ab [CrossRef]
  • Weller, J.M., Nestel, D., Marshall, S.D., Brooks, P.M. & Conn, J.J. (2012). Simulation in clinical teaching and learning. Medical Journal of Australia, 196, 594. doi:10.5694/mja10.11474 [CrossRef]

Study Characteristics

| Primary Author (Year) and Country | Context and Participants | Study Design | Interventions: Debriefing Methods | Results | Learning Outcome | Kirkpatrick's Level of Evaluation |
|---|---|---|---|---|---|---|
| Boet (2011), Canada | Anaesthetic residents (N = 50); Control group (n = 25); Intervention group (n = 25) | Randomized repeated measure design | Self-debrief (C); Instructor video-assisted debrief (I) | Statistically significant increases in Anaesthesia Non-Technical Skills scale (ANTS) and in four components for both groups, no difference between briefing types. No statistical difference in improvement of nontechnical skills performance between groups (p = .58). | Improvement in nontechnical skills | 2 |
| Chronister (2012), USA | Undergraduate nursing students critical (N = 37); Control group (n = not given); Intervention group (n = not given) | Comparative crossover design | Instructor verbal debriefing (C); Video assisted plus verbal debriefing (I) | Knowledge retention greater in control group (p < .008). Intervention group significantly faster times for pulse assessment prior to CPR (p = .094), initial defibrillation shock (p = .042), and total time to resuscitation (p = .028). | Improvement in skill and knowledge | 2 |
| Dreifuerst (2012), USA | Undergraduate nursing students (N = 238); Control group (n = 118); Intervention group (n = 122) | Quasi-experimental pretest–posttest study | Standard debriefing (C); Debriefing for Meaningful Learning (DML) method (I) | Significant difference between participants' test effect of DML and total HRST score (p ⩽ .05). Perceived difference in quality of debriefing when DML used (p ⩽ .001). | Improvement in clinical reasoning | 2 |
| Grant (2010), USA | Nursing and nurse anaesthetist students (N = 40); Control group (n = 20); Intervention group (n = 20) | Quasi-experimental | Instructor-facilitated debriefing verbal debriefing (C); Video-facilitated instructor debriefing (I) | No significant difference between groups on total performance scores. Intervention group more likely to perform patient identification (p < .01), team communication (p = .013), vital signs (p = .047). | Change in target behavior | 3 |
| Grant (2014), USA | Undergraduate nursing students (N = 48); Control group (n = 24); Intervention group (n = 24) | Pretest–posttest | Oral debrief (C); Oral + video (I) | No significant difference between the two groups. | Change in target behavior | 3 |
| Mariani (2013), USA | Undergraduate nursing students (N = 86); Control (n = 42); Intervention (n = 42) | Quasi-experimental | Unstructured debriefing (C), no specific format; Intervention group DML (I) | Higher mean clinical judgement scores of intervention group and improved more over time but differences not statistically different (p = .09). No statistically significant overall scale scores and subscales on Lasater Clinical Judgment Rubric. | Improvement in clinical judgement | 2 |
| Reed (2013), USA | Undergraduate nursing students (N = 64); Control group (n = 32); Intervention group (n = 32) | Quasi-experimental | Debriefing without video (C) | Minimally different experiences. | Perception of experience | 1 |
| Reed (2015), USA | Undergraduate nursing students (N = 58); Control group (n = 15); Intervention (n = 20) (I); (n = 13) (I) | Experimental (randomized controlled trial [RCT]); Replacing before next draw | Discussion only debriefing (C); Discussion with journaling (I); Discussion with blogging (I) | Statistical significance found in three individual items: “the debriefing environment was physically comfortable” p = .020; “debriefing provided me with a learning opportunity” p = .031; and “debriefing helped me to clarify problems” p = .008. | Perception of experience | 1 |
| Savoldelli (2006), Canada | Anaesthetic residents (n = 42); Control group (n = 14); Intervention group 2 (n = 14); Intervention group 3 (n = 14) | RCT | Group 1: No feedback (C); Group 2: Facilitator oral feedback (I); Group 3: Facilitator (oral) + video debriefing (I) | Significant improvement (p = .005) was reported in facilitator oral feedback and facilitator oral + video debriefing. No significant difference between oral and video assisted oral feedback groups. | Improvement in nontechnical skills | 2 |
| Shinnick (2011), USA | Undergraduate nursing students (N = 162); Control group (n = 72); Intervention group (n = 90) | 2 group repeated measure design experimental | Hands-on practice with no debriefing (C); Practice plus debriefing (I) | Mean heart failure knowledge scores for both groups decreased from the pretest to the first posttest (p < .001) but improved after the combination of simulation experience and debriefing sessions (p < .001). | Improvement in knowledge | 2 |
| Van Heukelom (2010), USA | Undergraduate medical students (N = 161); Control group (n = 84); Intervention group (n = 77) | RCT | Immediate feedback during simulation experience or in simulation debriefing (C), at any point in simulation when an error is made; Instructor-facilitated debriefing session after the simulation experience, postsimulation debriefing (I) | Statistically significant improvement for both individual items and overall measures related to students' self-reported confidence and knowledge for both groups (p ⩽ .001). | Improvement in knowledge | 2 |
| Weaver (2015), USA | Undergraduate nursing students (N = 96); Control (n = not given); Intervention (n = not given) | Quasi-experimental | Standard structured debriefing using plus/delta method (C); Standard plus/delta debriefing + video model demonstration (I) | Statistically significant difference (p < .001) in level of improvement in clinical judgement from Time 1 to Time 2 between students who received a model demonstration of scenario. Intervention group greater improvement (change) from Time 1 to Time 2. Change in self-confidence between control and intervention groups approached statistical significance (p = .061). Intervention group indicated a statistically significant difference (p < .05) in satisfaction from Time 1 to Time 2. | Improvement in clinical judgement | 2 |
| Welke (2009), Canada | Anaesthesia residents (N = 30); Control group (n = 15); Intervention (n = 15) | RCT | Computer-based multimedia tutorial (C); Personal debriefing of instructor with videotape after the simulation (I) | Improvements in total ANTS score from simulation 1 to 2 (p = .97), simulation 1 to simulation 3 (p = .94), and/or simulation 2 to 3 (p = .84) were similar for both groups. | Improvement in nontechnical skills | 2 |

Dr. Johnston is Director of Clinical Partnerships, Dr. Coyer is Professor of Nursing, and Dr. Nash is Professor of Nursing, School of Nursing, Queensland University of Technology, Brisbane, Queensland, Australia.

The authors have disclosed no potential conflicts of interest, financial or otherwise.

Address correspondence to Sandra Johnston, RN, Director of Clinical Partnerships, School of Nursing, Queensland University of Technology, Victoria Park Road, Kelvin Grove, Queensland 4059, Australia; e-mail:

Received: August 01, 2017
Accepted: February 09, 2018

