The study was approved by the local institutional review board (AU14E0935) and conducted as a single-center, prospective, single-cohort questionnaire study at the Sydney Clinical Skills and Simulation Centre (SCSSC), within the Northern Sydney Local Health District (NSLHD), in Sydney, Australia. Learners completed the SBT-QA10 at the end of scenarios in a standardized SBT course.
Because the intended application of the tool was relatively limited, namely personal reflection and programmatic-level quality assurance, we adopted classical notions of criterion validity and construct validity after referring to Messick’s argument-based validity framework, where these correspond closely to Messick’s dimension of relationships with other variables. Consequently, construct validity evidence was assessed by measuring the internal consistency of the questionnaire items, their correlation with satisfaction scores, and the comparative impact of assignment to active versus observer roles on learners’ perceptions.
We hypothesized that the SBT-QA10’s questionnaire items would be internally consistent and that positive perceptions would be associated with positive overall satisfaction scores.
As further confirmation of the construct validity, we also sought to appraise the tool against a modifiable determinant of learners’ experience. Emerging evidence suggests that one factor affecting the learning experience is the learner’s assigned role as either active participant or observer in the action phase of SBT. [18,19,20] The learners’ experience when in observer roles is relevant to the instructional design and facilitation of simulation programs. Consequently, we measured the comparative impact of assignment to active versus observer roles on learners’ perceptions as a secondary measure of the construct validity.
Learners were newly graduated nurses or doctors within their first 12 months of postgraduate work who were enrolled, to fulfil mandatory training requirements, in an SBT-based course entitled Detection, Evaluation, Treatment, Escalation and Communication in Teams (DETECT), which addressed management of the deteriorating patient on a hospital ward. The course was similar in format to one utilized in previous studies. [16, 21] The study was performed in accordance with the Declaration of Helsinki and approved by the appropriate ethics committee. All learners received written and oral information regarding the study and consented voluntarily to having their data used for study purposes. Learners could withdraw their consent at any time.
The SBT-QA10 questionnaire tool was developed through an iterative process among the team members, by adapting definitions of the eight perceptions provided in the original qualitative study into action statements and by incorporating questionnaire items that were used in a previously published evaluation of the DETECT course that addressed learner satisfaction. [16, 21] In the original interview study, the learners described their perceptions in phrasing that suggested two overarching themes: psychosocial emotional responses and cognitive understanding of the situation. Belonging, for example, was a perception commonly expressed with predominantly psychosocial phrasing. In contrast, perceptions of conscious mental effort, task focus and control of attention were predominantly concerned with the learners’ cognition. Other perceptions, including surveillance, responsibility, realism, and contextual understanding, showed a greater degree of inter-individual variation in that they had a psychosocial impact on some learners and an impact on the cognitive processes of other learners. [16, 21] We intentionally avoided classifying the questionnaire items into themes to enable further exploration through statistical factor analysis.
We developed two additional questionnaire items because two of the original perceptions were each represented by two separate, related factors. The perception ‘belonging’ represented both relationships formed within the participant group and relationships between learners and the instructor (Q1-2). Similarly, the perception ‘responsibility’ represented both the amount of autonomy and independence available to the participant and the level of available support from the faculty member (instructor) (Q4-5). In developing the SBT-QA10, we tried to avoid survey fatigue by alternating the use of positive and negative statements in the questionnaire (avoiding “straightlining” of the answers). [22, 23] The final questionnaire contained a total of 10 items for perceptions, of which seven were positively phrased and three were negatively phrased.
Two measures of satisfaction were also included in the final questionnaire. The phrases “I felt comfortable learning this way” and “I now feel more confident managing the clinical case” were taken, without changes in wording, from the previously cited study, where they were shown to discriminate between the modifiable factors of technology format (conventional face-to-face versus videoconference-enabled remote facilitation) and instructor experience with SBT. A five-point Likert scale was used to score the level of agreement with each questionnaire item.
The readability of the questionnaire was also tested as part of an iterative process and was piloted by members of a teaching faculty within the SCSSC prior to deployment. [24, 25]
This course is part of a larger program of emergency response team training and has been described in detail in previous publications. [16, 21] DETECT courses have low complexity scenarios designed to rehearse identifying, communicating, and treating patient deterioration on a hospital ward. The scenarios are designed to be of equal difficulty and matched to the learners’ level of clinical experience and workplace roles. The course is run in a pause-and-discuss format where learners in groups of six to eight engage in one or two phases of action that are interspersed with targeted reflective debriefing conversations.
Following an introductory lecture, learners rotate through three simulation scenarios that employ either patient simulators or simulated patients. To accommodate larger group sizes, a fourth scenario may be conducted in the format of a table-top case scenario, whereby facilitators use a verbal description and visual props to present the scenario. 
During each scenario round, the groups are divided into learners undertaking active and observer roles. The learners self-select the roles, with instructors ensuring that all learners experience both roles across the scenario rounds. Learners with active roles are briefed that they will take part in the action phase of the scenario and in the debriefing conversations. Learners with observer roles are briefed that they will observe the action phase from within the room and contribute their perspective to the debriefing conversations during the pauses.
The course was slightly modified to enable the questionnaires to be completed. The modification included a preliminary briefing about the research project, during which questionnaire forms were provided and completed consent forms allowing analysis of data were collected. Immediately after the in-action phase, and before the post-scenario debriefing, all learners (in both active and observer roles) were asked to fill in the questionnaire. Questionnaire completion was expected to require five minutes. As we did not intend to modify the DETECT course during the study period, the results were not analyzed by the authors until the study period was over.
Each DETECT course is delivered by three to four instructors. Each scenario has one instructor who delivers their scenario three to four times per course, with learners rotating between scenarios. Prior to instructing on the course, every instructor completes the course-specific instructor accreditation program and achieves the course standard for competency in scenario delivery and constructive student enquiry-focused debriefing.  The authors were not involved in teaching and recruiting in the included courses to avoid creating any bias.
As DETECT is mandatory training for the organization, it was important that the course remained unaffected by the study, and therefore no explicit randomization was performed. Learners were divided into equal-sized groups by the course leader during the course introduction without any knowledge of the learners’ demographic attributes, apart from gender, profession, and estimated age. No group of learners had identical rotations.
Construct validity was assessed by correlating the scores for the individual SBT-QA10 items with scores for other perceptions, the total score of the ten items and with scores for the two individual measures of satisfaction.
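The item-to-total correlation described above can be illustrated with a minimal Python sketch of Kendall's tau for tied ordinal data. It assumes the tau-b variant, since the variant used is not stated here, and it omits the significance test. The function name and the example scores are hypothetical and are not study data.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length lists of ordinal scores.

    Assumption: tau-b (the tie-corrected variant) is used for illustration.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from both terms
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    # Pairs untied on x, multiplied by pairs untied on y
    untied_x = concordant + discordant + tied_y_only
    untied_y = concordant + discordant + tied_x_only
    denominator = math.sqrt(untied_x * untied_y)
    if denominator == 0:
        return float("nan")  # one variable is constant
    return (concordant - discordant) / denominator


# Hypothetical scores for six learners: one item versus the ten-item total
item_scores = [4, 5, 3, 4, 2, 5]
total_scores = [38, 45, 30, 40, 25, 44]
print(round(kendall_tau_b(item_scores, total_scores), 3))
```

The same function would apply when correlating an item with either of the two satisfaction measures, because all of these scores are ordinal and contain many ties.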
Learners’ responses to the SBT-QA10 questionnaire were also compared according to their assigned roles as ‘active responders’ or ‘observers’, following our hypothesis that the strength of the perceptions would differ between learners in each group. For instance, in comparison to responses provided by learners in active roles, we predicted that learners in observer roles would report reduced scores for realism and mental workload. We also compared questionnaire responses between the second and the first scenario in a given role. Here, we hypothesized that scores for some perceptions, such as being observed, mental workload and feeling uncomfortable, may change, reflecting greater familiarity with the teaching environment and methods, an effect observed in the previous study.
Questionnaire results were analyzed in Microsoft® Excel™ (version 16.23, build 190309). Responses for the negatively phrased questionnaire items, and for the items predicted to have a negative impact on learners’ satisfaction, were reversed so that a high score for each item would be assumed to predict a positive experience. Missing responses for individual questionnaire items or within the set of twelve questionnaire items were treated as missing data and excluded from analysis.
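A minimal Python sketch of the reverse-scoring and missing-data handling described above follows. It assumes the five-point Likert responses are coded 1 to 5 and that a missing response is stored as `None`; the item labels, the set of reversed items, and the example responses are hypothetical and are not taken from the questionnaire.

```python
LIKERT_MIN, LIKERT_MAX = 1, 5  # assumed coding of the five-point scale


def reverse_score(score):
    """Mirror a response around the scale midpoint (1<->5, 2<->4, 3 stays 3)."""
    if score is None:
        return None  # a missing response stays missing
    return LIKERT_MIN + LIKERT_MAX - score


def prepare_response(response, reversed_items):
    """Return a copy of one learner's responses with flagged items reversed."""
    return {
        item: reverse_score(score) if item in reversed_items else score
        for item, score in response.items()
    }


def valid_scores(responses, item):
    """Collect the non-missing scores for one item across learners."""
    return [r[item] for r in responses if r.get(item) is not None]


# Hypothetical example: Q3 and Q7 stand in for negatively phrased items
reversed_items = {"Q3", "Q7"}
raw = [
    {"Q1": 4, "Q3": 2, "Q7": 1},
    {"Q1": 5, "Q3": None, "Q7": 4},
]
prepared = [prepare_response(r, reversed_items) for r in raw]
print(prepared)
print(valid_scores(prepared, "Q3"))
```

Excluding missing responses item by item, as in `valid_scores`, is one possible reading of the exclusion rule; excluding a learner's whole questionnaire would be an alternative.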
Data were analyzed using IBM SPSS (version 21). Tests requiring two-group comparison of independent and dependent variables used the Kruskal-Wallis non-parametric test for ordinal or categorical data, or where interval data were not normally distributed. Tests requiring correlation employed the two-tailed Kendall’s tau rank-order test of correlation for non-parametric data. A p value < 0.05 was accepted as significant for all tests.
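For readers without access to SPSS, the two-group Kruskal-Wallis comparison can be sketched in plain Python. This is an illustrative sketch rather than the procedure SPSS runs internally: it assumes the standard tie-corrected H statistic and a chi-square approximation with one degree of freedom, and the function names and example scores are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(group_a, group_b):
    """Return (H, p) for two independent groups of ordinal scores."""
    pooled = list(group_a) + list(group_b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sum_a = sum(ranks[: len(group_a)])
    rank_sum_b = sum(ranks[len(group_a):])
    h = 12 / (n * (n + 1)) * (
        rank_sum_a ** 2 / len(group_a) + rank_sum_b ** 2 / len(group_b)
    ) - 3 * (n + 1)
    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0  # every response identical: no difference detectable
    h = max(h / correction, 0.0)
    # Chi-square survival function with 1 degree of freedom
    p = math.erfc(math.sqrt(h / 2))
    return h, p


# Hypothetical item scores for learners in active versus observer roles
active = [4, 5, 4, 3, 5, 4]
observer = [3, 3, 4, 2, 3, 4]
h_stat, p_value = kruskal_wallis_two_groups(active, observer)
print(round(h_stat, 3), round(p_value, 4), p_value < 0.05)
```

With only two groups the degrees of freedom equal one, which is why the p value can be obtained from `math.erfc` without a statistics library.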
An exploratory principal component factor analysis (EPCFA) was conducted to further validate the questionnaire items and explore underlying themes. Generally accepted criteria for EPCFA were applied, including an eigenvalue > 1, a factor loading > 0.4, a Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy > 0.7 and Bartlett’s Test of Sphericity reaching significance at p < 0.05.
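Of the criteria listed above, the eigenvalue > 1 retention rule is the simplest to illustrate. The following Python sketch covers that rule only (not the loading, KMO or Bartlett criteria) and assumes an item correlation matrix is already available. The Jacobi rotation routine is a generic textbook method chosen to keep the example dependency-free, not the algorithm used by SPSS, and the 3 x 3 matrix is hypothetical.

```python
import math


def symmetric_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy
    for _ in range(max_sweeps):
        off_diagonal = sum(
            a[i][j] ** 2 for i in range(n) for j in range(n) if i != j
        )
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) element
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retained_components(correlation_matrix, threshold=1.0):
    """Eigenvalues above the threshold (the eigenvalue > 1 rule)."""
    return [
        ev for ev in symmetric_eigenvalues(correlation_matrix) if ev > threshold
    ]


# Hypothetical correlation matrix for three questionnaire items
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
eigenvalues = symmetric_eigenvalues(corr)
print([round(ev, 3) for ev in eigenvalues])
print(len(retained_components(corr)))
```

Because the eigenvalues of a correlation matrix sum to the number of items, a component with an eigenvalue above 1 explains more variance than any single item does, which is the rationale usually given for this cut-off.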
The eligible populations of these groups average 200 per annum. Based on previous studies, we assumed that 50 learners would provide adequate power (0.8) to detect a 10% difference between the groups at a two-sided significance level of p < 0.05. [8, 28]