Original Research Paper

Evaluating the Impact of Virtual Reality on Medical Students’ Skills Performance in High-Fidelity Simulation: A Randomised Controlled Trial Pilot

Matthew Whallett 1, Sindoora Mahesh 2, Joshua Whittaker 3, Alexander Crichton 4, and Usman Ahmed 5

1BMBS, BSc, Digital Clinical Fellow, the Dudley Group NHS Foundation Trust, Dudley, United Kingdom

2MBChB, Clinical Teaching Fellow, the Dudley Group NHS Foundation Trust, Dudley, United Kingdom

3MBChB, MEd, Chair, West Midlands Simulation Network, Birmingham, United Kingdom

4MBChB, MMedSci, NHS England Midlands Clinical Fellow, NHS England West Midlands, Birmingham, United Kingdom

5MBBS, PhD, Head of Virtual Learning, NHS England West Midlands, Birmingham, United Kingdom


ABSTRACT

Background: There is currently limited evidence on the comparative educational value of virtual reality (VR) in high-fidelity simulation for managing acutely unwell patients (AUPs). This study aimed to evaluate the effects of VR sessions on performance outcomes in technical skills and non-technical skills during high-fidelity simulation for AUPs. Methods: This randomized controlled trial was conducted among fifth-year medical students in England. The control arm followed the standard curriculum, whereas the intervention arm completed a VR session prior to their high-fidelity simulation. In both arms, non-technical skills were evaluated using a validated behavioral marker system (BMS), whilst technical skills were assessed by calculating the percentage of critical actions completed. Results: Non-technical skills performance did not differ significantly between the control and intervention arms, nor did the percentage of critical actions completed. Participants provided predominantly positive feedback on their experience with the VR intervention. Conclusions: Whilst previous evidence suggests that VR sessions may offer transferable skills and cost-effectiveness, this study did not show measurable improvements in performance outcomes, likely due to the small sample size. The findings of this pilot study emphasize the importance of conducting further research to explore the direct impact of VR sessions upon clinical outcomes, and their suitability as an adjunct to high-fidelity simulation.

Keywords: virtual reality, simulation training, non-technical skills, technical skills, immersive learning

Date submitted: 13-September-2024

Email: Matthew Whallett (matthew.whallett@doctors.org.uk)

This is an open access journal, and articles are distributed

Citation: Whallett M, Mahesh S, Whittaker J, Crichton A, Ahmed U. Virtual reality for the improvement of simulation performance and education resources. Educ Health 2024;37:367-376

Online access: www.educationforhealthjournal.org
DOI: 10.62694/efh.2024.181

Published by The Network: Towards Unity for Health


Background

VR is a technology that creates an immersive simulated environment.1 Recent interest and investment in VR as an educational tool have been driven by advancements in hardware and software, which have enhanced the realism and immersiveness of the simulated experience.1

Compared to traditional teaching methods using real patients, VR offers learners an opportunity to explore surgical procedures, emergency situations and anatomy in a safe and controlled environment.1,2 VR allows for repeated practice, theoretically enhancing proficiency levels, broadening experiences and improving decision-making and problem-solving skills.1

When conducted in isolation, VR simulations lack social interaction, potentially hindering team-working and communication.2 Research into multi-professional virtual simulations is ongoing, but the field remains nascent. Some students may experience cybersickness, characterized by dizziness and nausea that can last up to four hours, which poses a significant barrier to the effectiveness of VR.3 Cybersickness is particularly pronounced when content is viewed in the first person, in those with prior motion sickness or limited gaming experience, and in women.3

High-fidelity simulation utilizes realistic patient mannequins in simulated clinical scenarios to enhance realism.2 It supports a wide range of learning objectives in both technical skills and non-technical skills.4,5 Its role in teaching the management of acutely unwell patients (AUPs) is well-established, with widespread use throughout British medical education.2,4,6 With running costs less than 10% of those for high-fidelity scenarios,1 VR technology offers a flexible, cost-effective, and immersive alternative to high-fidelity simulation for teaching the management of AUPs.1,2,4,6

The potential drawbacks of VR technology must be considered when allocating time and resources that could be spent on high-fidelity simulation or elsewhere in medical training. Firstly, the upfront costs associated with procuring equipment and establishing the necessary infrastructure may exceed the financial capacities of many institutions.1

Also, the current state of VR technology, including both software and hardware options, may not be sufficiently mature for widespread implementation. Consequently, students may not receive an experience that effectively translates to real-world clinical practice.

We aimed to determine whether VR technology has value as an adjunct in teaching medical students the management of AUPs and to examine the potential value of combining VR with simulation to enhance performance. We hypothesize that using VR scenarios for pre-learning technical skills and scenario-specific knowledge allows learners to focus more upon non-technical skills in high-fidelity simulation, thus improving their performance.

In this study, participants’ performance in both non-technical skills and technical skills was evaluated during a high-fidelity simulation after a preparatory VR session. In the context of high-fidelity simulation, where students face clinical uncertainty and intricate behavioral expectations, there is a risk of cognitive overload, which can hinder learning and performance.7 Additionally, medical students’ relatively limited experience may further compound this challenge as they have less prior knowledge to draw upon.

To combat this effect, we introduced the VR session to enhance participants’ familiarity with the necessary technical skills and knowledge. When participating in the subsequent high-fidelity scenario, participants would arguably require fewer cognitive resources to handle the technical aspects effectively. This reduction in cognitive load would enable participants to allocate more attention to the non-technical components of the scenario, resulting in improved performance.

Methods

A single-center randomized controlled trial was conducted to investigate whether participation in a VR session before completing a high-fidelity simulation would increase the technical skills and non-technical skills displayed during the simulation. The study design diagram is presented in Figure 1.



Figure 1 Study design diagram

The established teaching schedule for fifth (final)-year medical students at a hospital in England is based upon the local medical school’s learning objectives for the “acutely ill patient” module, which covers 14 key presenting complaints for an acutely unwell patient, such as chest pain and seizure. Each key presentation is taught using a high-fidelity simulation session paired with an introductory lecture (Table 1). The simulation session involves two students managing a patient who has presented as an emergency admission with a key presentation, followed by a structured debrief. Students also receive an hour-long lecture on theory relevant to the key presentation.

Table 1 Linked key presentations, Oxford Medical Simulation scenarios and high-fidelity simulation diagnoses

From January to March 2023, all fifth-year medical students based at the hospital were recruited for the study, and written consent was obtained. Students were divided into two groups, referred to as Group A and Group B, based upon alphabetical sorting of surnames.

In the control arm, participants followed the standard practice by attending the lecture and subsequent high-fidelity simulation. In the intervention arm, participants completed an additional VR session. The Oxford Medical Simulation (OMS) software (London, UK) was accessed through Meta Quest 2 VR headsets (Irvine, USA). Prior to starting their first VR scenario, participants familiarized themselves with the equipment and software using the OMS orientation tutorial. Students then individually completed an observed 20-minute OMS scenario aligned with the key presentation in the week between the relevant lecture and the high-fidelity simulation. After the scenario, students were encouraged to review the feedback generated by the software and were given the opportunity to debrief and discuss any relevant topics.

We collected control and intervention data for eight high-fidelity simulation sessions with a corresponding OMS scenario (Table 1).

Each group was randomly allocated, using Microsoft Excel (Washington, USA), to the intervention arm for four scenarios and the control arm for the other four scenarios (Table 1). Groups A and B completed high-fidelity simulation sessions separately. Participants were blinded to the scenario. During the high-fidelity simulation session, two students participated and were assessed, whilst the remaining students in the group observed. At the beginning of each simulation session, students volunteered for the roles of lead and assistant, with every student required to lead at least one simulation session. Any student who did not attend the lecture or VR session related to that key presentation was excluded from high-fidelity simulation participation.
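The crossover allocation described above can be sketched in code. The study used Microsoft Excel; the Python below is only an illustrative stand-in. The function name, the seed, the scenario labels, and the assumption that Group B receives the complementary allocation to Group A are ours and are not stated by the authors.

```python
import random


def allocate_scenarios(scenario_ids, seed=None):
    """Randomly assign half of the scenarios to Group A's intervention arm.

    Assumption (not stated in the paper): Group B is in the intervention arm
    for exactly those scenarios in which Group A is in the control arm.
    """
    scenario_ids = list(scenario_ids)
    if len(scenario_ids) % 2 != 0:
        raise ValueError("an even number of scenarios is needed for an equal split")
    rng = random.Random(seed)
    group_a_intervention = set(rng.sample(scenario_ids, len(scenario_ids) // 2))
    allocation = {}
    for scenario in scenario_ids:
        if scenario in group_a_intervention:
            allocation[scenario] = {"A": "intervention", "B": "control"}
        else:
            allocation[scenario] = {"A": "control", "B": "intervention"}
    return allocation


# Eight scenarios, labelled 1 to 8 here purely for illustration.
allocation = allocate_scenarios(range(1, 9), seed=2023)
```

Under this sketch, every scenario contributes one group to each arm, and each group is in the intervention arm for four of the eight scenarios.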

The primary outcome measure was the Medical Students’ Non-Technical Skills (Medi-StuNTS) behavioral marker system (BMS): a validated tool for assessing non-technical skills displayed during high-fidelity simulation sessions for medical students.8,9 The BMS consists of 16 domains, with scores ranging from 1 (excellent) to 5 (poor). A total BMS score was calculated by summing the scores across all domains, resulting in a range of 16 (best) to 80 (worst). An independent blinded examiner, with 16 years of consultant anaesthesia experience, conducted the BMS scoring. As a simulation unit lead, the examiner is experienced in facilitating high-fidelity simulation and is well versed in using the BMS to assess medical students’ performance.
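As a minimal sketch of the total score described above, the following Python sums 16 domain scores and checks that each lies between 1 and 5. The function and constant names are illustrative and do not come from the Medi-StuNTS tool itself.

```python
N_DOMAINS = 16                 # number of BMS domains, as described in the text
MIN_SCORE, MAX_SCORE = 1, 5    # 1 = excellent, 5 = poor


def total_bms_score(domain_scores):
    """Sum the domain scores into a total between 16 (best) and 80 (worst)."""
    if len(domain_scores) != N_DOMAINS:
        raise ValueError(f"expected {N_DOMAINS} domain scores, got {len(domain_scores)}")
    for score in domain_scores:
        if not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"domain score {score} is outside {MIN_SCORE} to {MAX_SCORE}")
    return sum(domain_scores)
```

Because lower scores are better, a lower total indicates stronger non-technical performance.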

Technical performance was assessed using the percentage of critical actions completed during the high-fidelity simulation scenario. Each scenario had 9 to 17 critical actions, which were devised by the examiner during the development of the high-fidelity simulation scenarios, which occurred prior to this study design. The number of critical actions completed by the students was divided by the total critical actions for the scenario to give a percentage completed.
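The percentage calculation above is simple arithmetic; a short sketch follows. The function name is hypothetical, and the example counts in the comment are invented for illustration rather than taken from the study data.

```python
def percent_critical_actions(completed, total):
    """Return the percentage of a scenario's critical actions that were completed."""
    if total <= 0:
        raise ValueError("a scenario must have at least one critical action")
    if not 0 <= completed <= total:
        raise ValueError("completed actions must be between 0 and the scenario total")
    return 100.0 * completed / total


# Illustrative only: 9 of 12 critical actions completed gives 75.0%.
example = percent_critical_actions(9, 12)
```

Expressing the result as a percentage allows scenarios with different numbers of critical actions (9 to 17 in this study) to be compared on a common scale.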

Upon completion, the OMS software gives participants a performance score for each scenario. The methods used to calculate this score are proprietary information of OMS. We used this score to investigate any potential predictive validity between the OMS score and subsequent performance in both technical skills and non-technical skills during the high-fidelity simulation.

To determine sample sizes, a preliminary study was conducted six months earlier using the same design on a different cohort of final-year medical students.10 The study included eight participants in high-fidelity simulation, with four participants in each arm. With alpha=0.05, power=0.9, and 1:1 enrolment ratio, a sample size of 12 (6 per arm) would be sufficiently powered to detect a difference in BMS score.10
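For readers unfamiliar with this type of calculation, the sketch below uses the standard normal-approximation formula for comparing two means with 1:1 allocation, n per arm = 2(z_alpha + z_beta)^2 / d^2, where d is the standardized effect size. This is not necessarily the method used by the authors. The effect size observed in the preliminary study is not reported here, so any value of d passed to the function is a hypothetical input.

```python
import math
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.9, two_sided=True):
    """Approximate participants needed per arm to detect a standardized mean difference."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    tail_alpha = alpha / 2 if two_sided else alpha
    z_alpha = standard_normal.inv_cdf(1 - tail_alpha)
    z_beta = standard_normal.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)
```

With alpha=0.05 and power=0.9, a hypothetical standardized effect size of 2.0 gives 6 per arm under this approximation, whereas an effect size of 1.0 gives 22 per arm. This illustrates how strongly the required sample size depends on the assumed effect.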

After completing the program, all participants completed a questionnaire to assess the face and content validity of using VR in this study. The questionnaire covered participants’ experience across five key themes: content quality, user experience, educational value, technical issues and experience of cybersickness. Eight questions utilized a 7-point Likert scale ranging from “strongly disagree” to “strongly agree”. Six further questions allowed free-text answers. The questionnaire is displayed in the first column of Table 2.

Table 2 Participant questionnaire data

We applied a Bonferroni correction, giving an adjusted alpha of 0.00313 (0.05/16) for identifying significant differences in BMS scores across the 16 individual domains. This correction was used to offset the effect of the multiple comparisons made upon the same data and to reduce the risk of a type I error. Otherwise, a p-value <0.05 was used to determine statistical significance.
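The Bonferroni threshold follows directly from dividing the family-wise alpha by the number of comparisons. A minimal sketch, with an illustrative function name:

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Return the per-comparison significance threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return family_alpha / n_comparisons


# 16 BMS domains compared at a family-wise alpha of 0.05:
# 0.05 / 16 = 0.003125, which the text reports rounded as 0.00313.
domain_alpha = bonferroni_alpha(0.05, 16)
```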

Normality of continuous data was assessed using the Shapiro-Wilk test. For univariate analysis, a one-tailed unpaired t-test was used if data were parametric, and a Mann-Whitney U test for non-parametric data. Pearson’s correlation coefficient was applied for parametric continuous data, while Spearman’s rank correlation coefficient was used for non-parametric continuous data. We chose these statistical tests due to their robustness with small datasets and commonplace usage. Descriptive statistics were used to analyze qualitative portions of questionnaire data. All statistical analyses were completed using Statistics Kingdom’s online calculators (Melbourne, Australia).
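To illustrate what the Mann-Whitney U test computes for ordinal scores such as the 1 to 5 BMS domain ratings, the sketch below derives the U statistic from pooled ranks, giving tied values their average rank. It is a teaching aid only: the study's analyses were run on Statistics Kingdom's online calculators, the function names are ours, and no p-value is calculated here.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    ranks = midranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b
```

A U value near zero for one sample means its scores sit almost entirely below those of the other sample. When the two samples overlap evenly, both U values are close to half the product of the sample sizes.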

Results

We recruited and randomized 20 students, yielding 34 datasets. Two datasets were excluded due to incomplete attendance, leaving 32 datasets for analysis (control arm n=16, intervention arm n=16). The preliminary study data were incorporated to increase the sample size, as the randomization and intervention procedures were identical.

Data from both arms across the 16 BMS domains were non-parametric (Shapiro-Wilk test; p<0.05 in all cases). The total BMS scores were parametric (Shapiro-Wilk test; p>0.05 in both arms).

There was no significant difference in median BMS scores between the arms in any domain (Mann-Whitney U test, p>0.05 in all cases), nor in the total BMS scores (one-tailed unpaired t-test, p=0.526) (Figure 2). As students progressed through the scenarios and became more familiar with the general session structure, no significant improvement was observed in any BMS domain or in the total BMS score.



Figure 2 Median BMS scores for control and intervention arms

A column graph demonstrating the median BMS score across each BMS domain (using left y-axis) and the total BMS score (using right y-axis). The dark-grey column represents the control arm and the light-grey column represents the intervention arm. Error bars represent the interquartile range. There were no significant differences observed between control and intervention arms.

The percentage of critical actions completed in the control (n=9) and intervention (n=9) arms followed a normal distribution (Shapiro-Wilk test, control arm p=0.44, intervention arm p=0.29). No significant difference was observed in the percentage of critical actions completed in the intervention arm compared to the control arm (one-tailed unpaired t-test, p=0.66).

The cohort’s OMS scores were parametric (Shapiro-Wilk test, p=0.87). The total OMS score was not significantly correlated with the percentage of critical actions completed during high-fidelity simulation (Pearson’s correlation coefficient=0.32, p=0.20).

There was no significant correlation between each BMS domain score and OMS score, nor total BMS score and OMS score (Spearman’s rank correlation coefficient, p=0.29).

Table 2 contains results of the questionnaire.

Participants generally had positive responses regarding the quality of OMS content. The majority found the VR hardware and OMS software user-friendly and easy to operate (95%, n=18), and felt fully immersed and engaged in the VR experience (84%, n=16). All participants found the scenarios easy to navigate. However, three participants identified limitations of the software, including difficulty in task progression in subsequent scenarios, initial challenges in locating actions within the interface, and unsuitability of the user interface for attention deficit hyperactivity disorder (ADHD).

Most participants (84%, n=16) felt fully immersed and engaged with the VR experience and enjoyed it. 94% (n=17) found the hardware and software easy to operate. However, three participants felt that their immersion and enjoyment were somewhat limited. These participants found the experience amusing and realistic but noted the software did not effectively educate them.

In the free-text questionnaire responses, participants highlighted the value they found in OMS for their education and their interest in continued integration into the curriculum. Participants stated the VR experience complemented their high-fidelity simulation curriculum well and helped them become familiar with common presentations before encountering them in the simulation environment.

Cybersickness was reported by 47% of participants (n=10). The most common symptoms experienced were dizziness (26%, n=5), eye strain (21%, n=4), headache (21%, n=4), and nausea (16%, n=3).

Discussion

Despite the lack of significant differences in the performance metrics in our study, participants found the VR sessions to be high-quality, immersive, user-friendly, and subjectively valuable. Macnamara et al2 reported mixed participant feedback in a similar study with fifth-year medical students using OMS scenarios paired with high-fidelity simulation scenarios. Their study showed that high-fidelity simulation and OMS improved confidence equally; however, high-fidelity simulation was perceived as more useful and realistic for learning than VR. They also concluded that both formats were equally immersive, but that high-fidelity simulation provided better opportunities for demonstrating team-working skills.

Our results align with those of Mallik et al,11 who found improved confidence in managing diabetes emergencies after an OMS VR scenario. Singleton et al12 showed that nursing students felt more confident and knowledgeable about AUP management after a VR session. Their participants also found the VR content engaging and immersive.13 Mestre et al13 reported similar findings in a large multi-center study, demonstrating a significant enhancement in perception of learning using virtual patient simulation. Most evidence14 from diverse cohorts supports our study’s findings, indicating positive reception of VR amongst students.

Cybersickness affected 47% of participants in this study, which could be concerning, although it must be noted that the sample size was small.

Liaw et al15 conducted a randomized controlled trial comparing desktop VR sessions to high-fidelity simulation scenarios for medical students. The study measured critical action completion during subsequent simulation scenarios using the RAPIDS tool. In the intervention arm, which had a similar teaching program to ours, there was no significant difference in RAPIDS tool scores compared to the control arm. This finding supports our results, specifically the lack of improvement in technical skills following VR sessions, further suggesting that VR, regardless of format, does not significantly enhance performance in subsequent high-fidelity simulation.

Our study’s findings, which differ from published data, challenge the belief that VR instruction benefits technical skills outcomes. Previous research has shown positive effects of VR upon assessments and isolated clinical procedures.14,17,18 However, our data suggest that the educational outcomes achieved through VR may not easily translate to the complexities of the clinical environment. The VR session may have failed to provide useful knowledge, which would limit its potential to alleviate cognitive load.

We found no correlation between BMS and OMS scores, nor between OMS scores and the proportion of critical actions completed. At the time of writing, there is no evidence available identifying correlations between OMS score and performance in either standardized assessment or clinical practice.

Study limitations

Several limitations should be acknowledged. Firstly, despite our best efforts to closely match available VR scenarios to key presentations, there was inherent variability in how closely they aligned. This effect has been limited by not presenting data by specific presentation/scenarios, but instead averaging outcomes across the whole program.

Secondly, due to the limited pool of participants, some individuals completed multiple scenarios during the study period, potentially introducing sampling bias.

Thirdly, self-selection bias was present in the study as students predominantly volunteered for participation in each high-fidelity simulation. However, this bias was beyond our control due to the established curriculum, and its impact upon each trial arm is expected to be similar. We minimized this bias by blinding participants to the topic of the high-fidelity simulation scenario they were volunteering for, and assessors remained completely blinded to the study group assignment.

Fourthly, all participants received exposure to VR simulation during the study. While the VR scenario may not have been directly relevant to the assessed presentation, there could have been general learning from the experience that influenced their non-technical skills or technical skills performance. This could be accounted for by maintaining consistent control and intervention groups throughout the study period. The lack of intervention arm for every key presentation, which was due to limited availability of OMS scenarios, must also be acknowledged.

Finally, due to the limited availability of adequately trained faculty who could remain blinded to the participants’ arm, all marking was performed by a single individual. This ensured consistency in BMS scoring but prevented assessment of rater quality and control of potential biases. It should be noted that the subjective nature of the BMS marking criteria introduces inherent bias, which may affect the validity of this assessment method.

Evidence suggests that VR sessions can provide transferable skills14,16 and offer greater cost-effectiveness, repeatability, and standardization compared to high-fidelity simulation.1,13 We therefore hypothesize that the lack of significant results in this pilot study is likely secondary to the small sample size, and studies with higher power may yield different results. Consequently, questions remain about whether this format of VR session could directly impact measurable outcomes in managing AUPs in clinical practice and provide an effective alternative to high-fidelity simulation. The small sample size also limits the generalizability of these findings to real-world clinical settings. Future research should focus upon investigating larger populations and exploring the potential relationship between implementing virtual reality sessions and performance in standardized assessment.

References

1. Pottle J. Virtual reality and the transformation of medical education. Future Healthcare Journal. 2019;6(3):181–185. https://doi.org/10.7861/fhj.2019-0036

2. Macnamara AF, Bird K, Rigby A, Sathyapalan T, Hepburn D. High-fidelity simulation and virtual reality: an evaluation of medical students’ experiences. BMJ Simulation Technology Enhanced Learning. 2021;7(6):528–535. https://doi.org/10.1136/bmjstel-2020-000625

3. Tian N, Lopes P, Boulic R. A review of cybersickness in head-mounted displays: raising attention to individual susceptibility. Virtual Reality. 2022;26(4):1409–1441. https://doi.org/10.1007/s10055-022-00638-2

4. Beal MD, Kinnear J, Anderson CR, Martin TD, Wamboldt R, Hooper L. The effectiveness of medical simulation in teaching medical students critical care medicine: A systematic review and meta-analysis. Simulation in Healthcare. 2017;12(2):104–116. https://doi.org/10.1097/sih.0000000000000189

5. Nicolaides M, Cardillo L, Theodoulou I, et al. Developing a novel framework for non-technical skills learning strategies for undergraduates: A systematic review. Annals of Medicine Surgery (London). 2018;36:29–40. https://doi.org/10.1016/j.amsu.2018.10.005

6. McInerney N, Nally D, Khan MF, Heneghan H, Cahill RA. Performance effects of simulation training for medical students – a systematic review. GMS Journal for Medical Education. 2022. https://doi.org/10.3205/ZMA001572

7. Van Merriënboer JJG, Sweller J. Cognitive load theory in health professional education: design principles and strategies: Cognitive load theory. Medical Education. 2010;44(1):85–93. https://doi.org/10.1111/j.1365-2923.2009.03498.x

8. Hamilton AL, Kerins J, MacCrossan MA, Tallentire VR. Medical Students’ Non-Technical Skills (Medi-StuNTS): preliminary work developing a behavioural marker system for the non-technical skills of medical students in acute care. BMJ Simulation Technology Enhanced Learning. 2019;5(3):130–139. https://doi.org/10.1136/bmjstel-2018-000310

9. Phillips EC, Smith SE, Clarke B, et al. Validity of the Medi-StuNTS behavioural marker system: assessing the non-technical skills of medical students during immersive simulation. BMJ Simulation Technology Enhanced Learning. 2021;7(1):3–10. https://doi.org/10.1136/bmjstel-2019-000506

10. Jayaprakash S, Whallett M, Crichton A, et al. Virtual reality for the improvement of simulation performance and education resources - VISPER trial. Future Healthcare Journal. 2023;10(Suppl 3). https://doi.org/10.7861/fhj.10-3-s70

11. Mallik R, Patel M, Atkinson B, Kar P. Exploring the role of virtual reality to support clinical diabetes training—A pilot study. Journal of Diabetes Science and Technology. 2022;16(4):844–851. https://doi.org/10.1177/19322968211027847

12. Singleton H, James J, Penfold S, et al. Deteriorating patient training using nonimmersive virtual reality: A descriptive qualitative study. Computers, Informatics, Nursing. 2021;39(11):675–681. https://doi.org/10.1097/cin.0000000000000787

13. Mestre A, Muster M, El Adib AR, et al. The impact of small-group virtual patient simulator training on perceptions of individual learning process and curricular integration: a multicentre cohort study of nursing and medical students. BMC Medical Education. 2022;22(1). https://doi.org/10.1186/s12909-022-03426-3

14. Dhar E, Upadhyay U, Huang Y, et al. A scoping review to assess the effects of virtual reality in medical education and clinical care. Digital Health. 2023;9:205520762311580. https://doi.org/10.1177/20552076231158022

15. Liaw SY, Sutini, Chua WL, et al. Desktop virtual reality versus face-to-face simulation for team-training on stress levels and performance in clinical deterioration: A randomised controlled trial. Journal of General Internal Medicine. 2023;38(1):67–73. https://doi.org/10.1007/s11606-022-07557-7

16. Church HR, Murdoch-Eaton D, Sandars J. Under- and post-graduate training to manage the acutely unwell patient: a scoping review. BMC Medical Education. 2023;23(1). https://doi.org/10.1186/s12909-023-04119-1

17. Kyaw BM, Saxena N, Posadzki P, et al. Virtual reality for health professions education: Systematic review and meta-analysis by the digital health education collaboration. Journal of Medical Internet Research. 2019;21(1):e12959. https://doi.org/10.2196/12959

18. Zhao G, Fan M, Yuan Y, Zhao F, Huang H. The comparison of teaching efficiency between virtual reality and traditional education in medical education: a systematic review and meta-analysis. Annals of Translational Medicine. 2021;9(3):252–252. https://doi.org/10.21037/atm-20-2785


© Education for Health.


Education for Health | Volume 37, No. 4, October-December 2024
