Exploring Anesthesia Provider Preferences for Precision Feedback: Preference Elicitation Study

Abstract

Background: Health care professionals must learn continuously as a core part of their work. As the rate of knowledge production in biomedicine increases, better support for health care professionals' continuous learning is needed. In health systems, feedback is pervasive and is widely considered to be essential for learning that drives improvement. Clinical quality dashboards are one widely deployed approach to delivering feedback, but engagement with these systems is commonly low, reflecting a limited understanding of how to improve the effectiveness of feedback about health care. When coaches and facilitators deliver feedback for improving performance, they aim to be responsive to the recipient's motivations, information needs, and preferences. However, such functionality is largely missing from dashboards and feedback reports. Precision feedback is the delivery of high-value, motivating performance information that is prioritized based on its motivational potential for a specific recipient, including their needs and preferences. Anesthesia care offers a clinical domain with high-quality performance data and an abundance of evidence-based quality metrics.

Objective: The objective of this study is to explore anesthesia provider preferences for precision feedback.

Methods: We developed a test set of precision feedback messages with balanced characteristics across 4 performance scenarios. We created an experimental design to expose participants to contrasting message versions. We recruited anesthesia providers and elicited their preferences through analysis of the content of preferred messages. Participants additionally rated the perceived benefit of preferred messages to clinical practice on a 5-point Likert scale.

Results: We elicited preferences and feedback message benefit ratings from 35 participants. Preferences were diverse across participants but largely consistent within participants. Participants' preferences were consistent for message temporality (α=.85) and display format (α=.80). Ratings of the perceived benefit of preferred messages to clinical practice were high (mean rating 4.27, SD 0.77).

Conclusions: Health care professionals exhibited diverse yet internally consistent preferences for precision feedback across a set of performance scenarios, while also giving messages high ratings of perceived benefit. A "one-size-fits-most" approach to performance feedback delivery would not appear to satisfy these preferences. Precision feedback systems may hold potential to improve support for health care professionals' continuous learning by accommodating feedback preferences.



Introduction
Healthcare professionals must learn continuously as a core part of their work. As the rate of knowledge production in biomedicine increases, better support for providers' continuous learning is needed [1]. Feedback about care quality and outcomes is pervasive in health systems, and is widely considered to be essential for learning that drives improvement. Clinical performance feedback is one form of feedback that is commonly delivered to healthcare professionals in clinical quality dashboards and reports. However, engagement with these resources is generally low, and their impact has been less than optimal [2-5], resulting in missed opportunities to improve the quality and safety of care. A large proportion of randomized controlled trials of feedback interventions (also known as audit and feedback) show limited influence on clinical practice [5]. Moreover, what is considered best practice for feedback interventions has not changed meaningfully for decades, even after hundreds of trials and repeated calls for new approaches to feedback interventions [6-8].
To our knowledge, most clinical performance feedback interventions use a one-size-fits-most approach to both the prioritization of performance information and its visual display as feedback, with the same metrics and visualizations being sent to all recipients. One-size-fits-most feedback may not be effective due to a host of individual characteristics such as recipients' knowledge, skills, and motivational orientation to their work [2,3,9-11]. Methods used by coaches, educators, and quality improvement facilitators to deliver feedback suggest that these factors are important [2,12,13].
Furthermore, in the context of routine feedback interventions (e.g., with monthly or quarterly measurement cycles), the value of performance information [14-16] may be reduced when performance is stable, but feedback interventions are not commonly prioritized accordingly. Given the increasing use and digitization of performance measures and clinical quality dashboards [17,18], healthcare systems need to understand how to better accommodate healthcare professionals' feedback preferences and the corresponding value of performance information.
Precision feedback is feedback that has been prioritized based on its motivational potential for a specific recipient [19-22]. Using this approach, high-value feedback messages can be selected to enhance reports and emails, such as "You reached the top performer benchmark" and "Your performance dropped below the peer average". The potential impact of precision feedback increases with greater variability and differences in individuals' knowledge, skills, and motivational orientation, but these differences and their interactions are not well understood, as studies of providers' feedback preferences appear to be scarce. Qualitative studies have explored feedback preferences by asking participants to discuss their experiences with prior feedback, for example, prompted by a published feedback report [23] or a performance report belonging to the participant or their organization [24]. Quantitative preference elicitation methods have been used extensively in health decision-making [25,26], but uncertainty about the measurement properties of preferences contributes to controversy around their use [27]. To our knowledge, no instruments of provider feedback preferences with validity evidence have been developed. To begin to explore and understand these differences, we designed a preference elicitation study for motivating performance information and its display format.
We conducted this study in the context of anesthesia care quality improvement. In this context, data about care processes are produced primarily by anesthesia machines that report the administration of anesthetics and the patient's corresponding state with relatively high accuracy and reliability. Attribution of performance to individual anesthesia providers is feasible due to their authenticated use of an anesthesia machine for each operative case. A national-scale quality improvement consortium, the Multicenter Perioperative Outcomes Group (MPOG) [28,29], has developed approximately 70 performance measures for anesthesia care quality and outcomes. Key terms are defined in Table 1.

Table 1. Glossary

Performance information: Information about measures, levels, time intervals, comparators, and a feedback recipient [20,22].
Feedback: Information about performance that can guide future action [33].
Feedback recipient: A person, team, or organization to whom a feedback intervention is directed [22].
Precision feedback: Feedback that is prioritized according to its motivational potential for a specific recipient.
Motivating performance information: Performance information that has potential to motivate a feedback recipient through a known mechanism of action.
Comparison: Motivating performance information that is about a discrepancy between the performance levels of a feedback recipient and a comparator [22].
Trend: Motivating performance information that is about a change in performance [22].
Achievement: Motivating performance information that is about a change from a negative comparison to a positive comparison [22].
Loss: Motivating performance information that is about a change from a positive comparison to a negative comparison [22].
Comparator: Information that is used to identify a discrepancy with the performance level of a feedback recipient [22].
Benchmark: A comparator with a performance level that is calculated from the performance of other health professionals or peers [22,34].
Explicit target: A comparator with a performance level that is explicitly expected [22,34].
Time point information: Performance information that is about a single time interval.
Time series information: Performance information that is about multiple time intervals.
Causal pathway model: A specification of influential elements in a causal process, including preconditions, mechanisms, moderators, and outcomes [35].

A key type of motivating performance information is a comparison, which represents a discrepancy between the performance level of a feedback recipient and some comparator [22]. There are multiple types of comparators, including benchmarks, whose performance level is determined by a population-based analysis. Benchmarks are commonly calculated as a summary statistic of top performers, such as the performance level occurring at the 90th percentile of a population, or by using the achievable benchmark of care (ABC) method [36]. Another type of comparator is an explicit target, including goals or standards that set expectations for attaining a specific performance level that is not necessarily dependent on peers' or another reference group's performance [34]. The choice of comparator can engage alternate mechanisms of motivation, such as motivation related to social norms vs personal goal-setting. Another key type of motivating information is a trend, which represents change in performance (getting better or worse) [22]. Comparisons and trends may co-occur in performance data to represent an achievement, such as reaching a goal, or a loss, such as losing top-performer status [22].
Comparisons and trends are represented using a wide range of visualizations in clinical quality dashboards and feedback reports [20]. These visualizations vary both in their content, such as the use of measures, comparators, and duration of time intervals, and in their display format, such as bar charts, line charts, and tables to represent performance data. A review of published displays from feedback reports and dashboards identified 6 unique combinations of visualized performance information content [20]. For example, feedback displays vary in the number of performance measures, time intervals, and comparators that they visualize.
The display of feedback is theorized as one of many factors affecting the success of clinical performance feedback in Clinical Performance Feedback Intervention Theory (CP-FIT) [37], a leading theory of audit and feedback. Motivating performance information in clinical performance data concerns configurations of types of feedback display, but it is also closely related to CP-FIT's goal construct, which concerns the importance and relevance of feedback to healthcare professionals. Precision feedback may contribute to additional CP-FIT constructs, including health professional characteristics (knowledge and skills in quality improvement), feedback delivery (function), and implementation process (adaptability and ownership).
To understand anesthesia provider preferences for motivating performance information and feedback display format, we investigated the following 4 research questions:

1. To what extent do anesthesia providers' selected messages reveal an overall preference for:
a. messages containing time series vs time point information (temporality)? (Q1a)
b. messages relative to benchmarks vs explicit performance targets (basis of comparison)? (Q1b)
c. messages formatted as bar charts vs line charts and text only (display format)? (Q1c)
2. How consistent are individual provider preferences? (Q2)
3. To what extent do provider preferences depend on performance level, trend, and their professional background? (Q3)
4. To what extent are preferred feedback messages perceived to hold potential to improve future clinical practice? (Q4)

Methods
To address these questions, we developed a test set of feedback messages that a software application could generate. We formatted these as brief email messages, but designed them as "least common denominator" content that could also be delivered via other channels for feedback, such as clinical quality dashboards.
In the absence of instruments with validity evidence for assessing provider feedback preferences, we created an experimental design to elicit preferences that would expose participants, who were anesthesia providers, to contrasting message versions. To enable measurement validity assessment, we developed performance scenarios in which the same motivating performance information and display characteristics could be repeated in contrasting messages.

Ethical considerations
This study was approved by the University of Michigan Health Sciences and Behavioral Sciences Institutional Review Board (IRB-HSBS HUM00167426). All participants provided consent to participate and were informed about the ability to opt out of the study. No participant identifiers were collected with the research data for this study, preventing the linking of participants' responses with their identities. No incentives for participation were provided. We offered participants an opportunity to receive a copy of the study results upon completion.

Email test set development
We developed the email message test set iteratively in three phases: 1) knowledge modeling, 2) display format development, and 3) message set development (Figure 1).

Phase 1: Knowledge modeling
In the first phase, we modeled knowledge about the elements of performance information, types of motivating information, and the influence of motivating performance information (Figure 1). We iteratively refined a model of the elements of performance information through an analysis of published feedback reports [20], resulting in the identification of 5 key elements: measures, recipients, comparators, performance levels, and time intervals. We developed a model of motivating information that combines the 5 elements of performance information into types of motivating information, including comparisons, trends, achievement, and loss. Each type of motivating information is defined using the elements of performance information. For example, a comparison (a kind of motivating performance information) is defined as a discrepancy between the performance levels of a feedback recipient and a comparator.
Through modeling types of motivating performance information, we recognized that the choice of comparator could affect which type of motivation was employed to influence a recipient. For example, choosing a 90th percentile peer benchmark as a comparator does not necessarily leverage motivation from goal-setting, when recipients do not form an intention to reach the benchmark as their personal goal. By inviting providers to set goals, feedback that shows performance improving towards a goal may leverage motivation arising from a desire for growth and achievement, rather than a desire for safety and avoidance of harm. These sources of motivation can differentially interact with the feedback sign (i.e., valence) to have counterintuitive effects, such as goal abandonment, relaxation, or the delivery of low-value feedback [2,10].
To understand how different types of motivating performance information might relate to theoretical mechanisms of influence, we created causal pathway models [35] for each type of motivating information with benchmark and explicit target comparators (Appendix 1). For example, in one causal pathway we modeled the expected influence of a feedback intervention that combines 3 elements in a recipient's performance: 1) performance below a comparator (low performance level), 2) a benchmark (such as a peer average), and 3) performance getting better (improving trend). This pathway could represent the influence of precision feedback emails that show performance approaching a peer average, which could indicate to recipients that efforts to improve performance appear to be succeeding. Based on the theoretical construct of positive velocity [30] (i.e., showing performance improvement), this causal pathway (which we named social approach because the recipient is reducing a performance gap with a peer benchmark) uses motivation as a mechanism of action, through which a feedback recipient may decide to increase or sustain effort to improve performance. We drafted and refined example messages for each type of motivating information. For the causal pathway social approach, an example message is "Your performance is approaching the benchmark". We implemented the causal pathway models in computer-interpretable form in a knowledge base, to enable automated processing of performance information to identify motivating information in a precision feedback system.

Phase 2: Display format development

In the second phase, we developed display formats for motivating information in the body of an email message. We selected common visualizations (i.e., bar charts and line charts) used in healthcare organizations that would use a familiar format to convey the minimal amount of information necessary for each causal pathway. We developed software to generate visualizations within an email message using the R programming language. We included the absence of a visualization (i.e., text only) to accommodate recipient preferences for concise, text-based communication (Figure 1).
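To illustrate, a computer-interpretable causal pathway model could be encoded along the following lines. This is a hypothetical sketch in Python (the study's actual knowledge base format is not described here, and all class, field, and precondition names are illustrative), showing how the social approach pathway's preconditions might be matched against a recipient's performance facts:

```python
from dataclasses import dataclass


@dataclass
class CausalPathway:
    """Illustrative encoding of a causal pathway model [35]."""
    name: str
    preconditions: list   # motivating performance information required
    mechanism: str        # theorized mechanism of action
    outcome: str          # expected recipient response


# The "social approach" pathway described above: performance below a
# peer benchmark, with an improving trend, motivating sustained effort.
social_approach = CausalPathway(
    name="social approach",
    preconditions=[
        "performance level below comparator",
        "comparator is a peer benchmark",
        "improving trend (positive velocity)",
    ],
    mechanism="motivation from reducing a performance gap with peers",
    outcome="recipient increases or sustains effort to improve",
)


def applies(pathway, performance_facts):
    """True if all of a pathway's preconditions hold for a recipient."""
    return all(p in performance_facts for p in pathway.preconditions)
```

A precision feedback system could then evaluate each pathway against the facts derived from a recipient's performance data to select candidate messages.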

Phase 3: Message set development
In the third and final phase, we created a test set of email messages with balanced characteristics of motivating information and display formats. We began by creating 4 performance scenarios with alternate performance levels (high vs low) and trends (improvement vs worsening vs stable). The resulting scenarios were 1) improvement to a high level, 2) worsening to a low level, 3) consistently high (stable) performance, and 4) consistently low (stable) performance (Table 2). In all scenarios, the recipient's performance could be compared with either the peer average (benchmark comparator) or an organizational goal (explicit target comparator). We set the recipient's performance level to have the same relationship with each comparator (better or worse), enabling either comparator to be displayed while maintaining balance with other elements.
We selected types of motivating information and their example messages across 3 characteristics: 1) performance temporality (time series vs time point), 2) performance comparison basis (benchmark vs explicit target), and 3) performance display format (bar chart vs other). We selected the bar chart as a key display format because of its common use in healthcare organizations. We further divided the other display format into line chart and text only. We composed emails with example messages from each type, based on a single quality measure for anesthesia providers (avoiding postoperative nausea and vomiting, PONV-03). The resulting emails contained information from the same performance scenarios, but not all information from each scenario was provided in each message. For example, of the 4 emails that each participant read in each scenario, 2 messages contained a goal comparator (explicit target), while the other 2 messages showed a peer benchmark comparator instead (Table 2).
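The balance of message characteristics can be illustrated with a small enumeration. This is a hypothetical sketch (the study's actual test set was authored manually as paper email prototypes, and display format was additionally balanced across two test set versions, which is omitted here for brevity):

```python
from itertools import product

scenarios = [
    "improvement to a high level",
    "worsening to a low level",
    "consistently high performance",
    "consistently low performance",
]
temporality = ["time series", "time point"]
comparator = ["peer benchmark", "explicit target (goal)"]

# Four candidate messages per scenario: each temporality value is paired
# once with each comparator, so every characteristic appears equally often.
test_set = {
    s: [{"temporality": t, "comparator": c}
        for t, c in product(temporality, comparator)]
    for s in scenarios
}
```

This yields 16 messages in total, with each scenario's 4 messages split evenly between benchmark and goal comparators, matching the balance described above.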

Study design
We designed a within-subjects, repeated measures study of provider preferences for precision feedback using a test set of paper prototype email messages. We created two versions of the test set with alternate display formats for each message (Group A vs Group B) to enable randomization of the pairing of display format with motivating information (Table 2). We created a document containing all of the email messages in the test set (Appendix 2). We printed paper copies of the messages and organized them into packets in varying order for a paper card selection task. Based on our experience, we estimated that a sample of more than 30 participants would provide adequate power to detect meaningful differences in summary statistics and internal consistency of preferences.

Population and setting
We recruited anesthesia providers from a single academic medical center in the midwestern United States. Anesthesiologists (physicians) and Certified Registered Nurse Anesthetists (CRNAs) were eligible to participate. A member of the study team recruited anesthesia provider participants by email. All participants received monthly provider feedback emails from MPOG.

For each performance scenario, participants selected a preferred message and then rated their agreement with a statement ending "…that would benefit my practice." We adapted this question from an instrument with good validity evidence for assessing the usability of feedback displays [38]. Responses were collected on a 5-point Likert scale ranging from strongly disagree to strongly agree. The survey questions did not ask directly about preference for information content or display format. Instead, participants' preferences were inferred from the types of content and display format that the selected message contained.
After participants completed the questionnaire, we conducted brief interviews and collected qualitative data that was analyzed separately and will be reported elsewhere.

Analysis
To identify preferences, we analyzed two characteristics of selected messages: motivating information (including temporality type and comparator type) and display format. We summed the selected messages with each type of motivating information and display format, and calculated descriptive statistics for these sums (Q1). To investigate the consistency of participants' preferences, we calculated Cronbach's alpha for each preference characteristic in participants' selected messages across the 4 performance scenarios (Q2). We used descriptive statistics to assess relationships between participants' preferences and the characteristics of the 4 performance scenarios, including performance level (high vs low) and trend presence (present vs absent). Similarly, we considered relationships between participants' preferences and their professional background using descriptive statistics (Q3).
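The consistency computation can be sketched as follows. This is a minimal illustration of Cronbach's alpha over coded selections, not the study's actual analysis code (which used R); the coding scheme in the comments is an assumption:

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for internal consistency.

    scores: one row per participant; each row holds the participant's
    coded selections across the repeated performance scenarios
    (e.g., 1 = time series selected, 0 = time point selected).
    """
    k = len(scores[0])                    # number of scenarios (items)
    cols = list(zip(*scores))             # one tuple per scenario
    item_var = sum(variance(c) for c in cols)
    total_var = variance([sum(row) for row in scores])
    # Standard formula: alpha = k/(k-1) * (1 - sum of item variances
    # divided by the variance of the total scores).
    return k / (k - 1) * (1 - item_var / total_var)
```

Values near 1 indicate that a participant's selections were consistent across scenarios; negative values indicate inconsistency or a misspecified measurement model.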
To understand participants' perceptions of the potential benefit of precision feedback to their clinical practice, we analyzed ratings of perceived benefit for selected messages using descriptive statistics (Q4). We conducted analyses using R (The R Foundation for Statistical Computing, Vienna, Austria) and Google Sheets (Google, Mountain View, CA, USA).

Results
We recruited 35 anesthesia providers, including 18 anesthesiologists and 17 CRNAs (Table 3). All participants completed all message selection tasks, resulting in the selection of 140 preferred precision feedback messages.

To what extent do anesthesia providers' selected messages reveal an overall preference for temporality (Q1a), basis of comparison (Q1b), and display format (Q1c)?
An overall preference for multiple time intervals (i.e., time series) was apparent, with 110 out of 140 (79%) messages being selected over those with a single time interval (i.e., time point) (Q1a).
Preferences for display format were highly varied, with selected messages being equally distributed between bar charts and other formats (Table 4 and Figure 2) (Q1c). Preferred messages were also highly varied in their comparators, with 74 out of 140 (53%) preferred cards containing explicit target comparators (i.e., organizational goals not dependent on population performance) (Q1b), but our assessment of consistency suggests that the comparator result was not reliable as a preference characteristic (see Q2 below).
For performance comparators, participants' selected messages were negatively correlated (α=-.40), indicating an absence of consistency, perhaps resulting from an incorrect measurement model [39]. We consider this result to be an artifact of the study design, given that our message test set balanced several characteristics and created opportunities to select them in combination. We suspect that comparators were not salient for participants relative to the visual display and temporality characteristics; therefore, we are unable to draw conclusions about preferences for comparators.
To what extent do provider preferences depend on performance level and trend, and their professional background? (Q3)

Participant preferences for temporality and display format did not appear to depend on messages' performance level, with relatively similar means for the selection of each type of message content. Similarly, these preferences did not appear to vary with the presence or absence of performance trends (Table 5). Preferences for temporality and display format varied within participants' professional background (Table 6). Some professional role-based differences in means were apparent, such as a higher CRNA preference for time point messages than anesthesiologists (mean 1.59 vs 0.56). However, a majority of CRNAs preferred time series messages, and all message characteristics were repeatedly observed in selections by participants from both professional background-based groups.

Discussion

Our findings suggest that anesthesia providers would welcome the enhancement of feedback interventions with precision feedback that prioritizes motivating information. These findings are important because they point to a possible approach for improving audit and feedback that can leverage both high and low performance, as well as increasing or decreasing trends, to prioritize performance feedback.
To our knowledge, this is the first quantitative study of preferences for clinical performance feedback. As an exploratory study, the findings primarily demonstrate the existence of differences in preferences for feedback, rather than speaking to the significance of their role in the success of clinical performance feedback. Our findings relate to CP-FIT, which recognizes that health professional knowledge and skills for engaging with feedback can be important factors for the success of feedback [37]. Differences in feedback preferences could be driven by differences in healthcare professionals' knowledge and skills related to the interpretation of performance data. For example, participants' variable yet consistent selection of messages could be related to their graph literacy skills [40,41]. Precision feedback could accommodate these and other individual differences by enabling health professionals to configure their feedback delivery and display, which further holds potential to increase feelings of ownership of feedback. By prioritizing motivating information according to recipients' preferences, precision feedback could be a strategy for reducing the cognitive load required by health professionals to recognize and assess the priority of learning opportunities. Precision feedback also has potential to improve feedback cycle completion by delivering information that is more likely to be perceived and accepted, resulting in increased formation of intentions to sustain or improve performance. In terms of CP-FIT, precision feedback can be understood as an approach for prioritizing feedback messages that are more likely to result in successful completion of the feedback cycle.
Our findings align with the idea that positive feedback can be effective for learning and improvement [13], as well as for sustainment of high performance. It is noteworthy that participants rated precision feedback messages as beneficial even when performance was high, such as the messages "you are a top performer" or "you reached the goal". This finding points to the possibility that a key function of feedback may be to motivate recipients through appreciation of accomplishments [42], including recognition of high performance, in addition to motivating providers to learn to improve.

Limitations
As an exploratory study of a novel type of feedback intervention, this study has several important limitations. The poor consistency of preferences demonstrated for performance comparators suggests that participants did not meaningfully differentiate between peer-based benchmarks and explicit targets as presented in the message test set. This may be a function of the labels used for these comparators in the message test set; during the study we discovered that some of the printed messages contained the abbreviation "ave" instead of "avg" for the peer average comparator. Competing explanations are 1) that providers equated the value of both comparator types or did not perceive them as fundamentally different, and 2) that this characteristic was less salient than the others, such that its significance was negligible.
Using performance scenarios based on synthetic performance data may have introduced bias in participants' responses. However, the consistency of participant preferences for temporality of motivating information and display format suggests that this bias was not significant. Nevertheless, our study design assessed preferences within types of motivating information (e.g., high and improving performance, low and worsening performance) that were presented with unambiguous motivating information, such as trends showing marked improvement or worsening. As such, our results do not address the appropriateness of using performance scenarios to directly elicit the strength of provider preferences; rather, they primarily demonstrate the existence of individual differences, as an exploration of factors that may moderate the influence of feedback on healthcare professional learning and improvement.
We asked participants to rate the perceived benefit of messages that they had already selected as their preferred message, which may have resulted in positively biased ratings. Furthermore, we used a single performance measure for all messages (avoiding postoperative nausea and vomiting), which may not be representative of other performance measures, both in terms of perceived benefit and preferences for motivating information. We did not evaluate feedback about clinical outcome measures, which may result in a different preference profile across this population. We also did not evaluate participants' skills or knowledge to engage effectively with feedback, a recognized factor [37] that may have yielded further insight into participant preferences.
Additional limitations include the context and nature of the preference elicitation task, which was done in a video call with paper prototypes and thus differs from the context of email use in healthcare organizations. When designing this study, we chose paper-based emails because we could not identify a remote, video-call proctored approach that would allow participants to consider 4 different message types in the same field of view on their personal or work computer, without a risk of technical complications from participants' particular computer monitor and device configurations.
Our model of preferences in this study was linear and static, and assumed that available information was complete, but provider preferences may be nonlinear, dynamic, and dependent on missing information that we did not consider. When designing the test set of messages, we paired the text-only display format consistently with time point information, and line charts with time series information. As such, preferences for line charts and text-only display formats were not independent of temporality. We recruited providers from a single academic institution, whose population is not necessarily representative of other anesthesia provider populations. We did not recruit any providers who identified as Black or Hispanic, increasing the likelihood that our results are racially and ethnically biased towards the perspectives of providers who identify as White and non-Hispanic. In spite of these limitations, the variability that we observed demonstrates that preferences are non-uniform in this small population, which suggests that a one-size-fits-all solution may be inadequate for feedback reporting to anesthesia providers more generally.

Future Studies
We anticipate that preference clusters may exist and may be identifiable in studies that are better powered to detect such differences. Such clusters could be used to develop profiles for precision feedback, such as a profile for providers who prefer text-only messages about low performance, or who prefer visualization of performance changes (i.e., trends) using time series displays in line charts. Future studies may be able to detect preference clusters to better understand the diversity of preferences for performance feedback across a larger provider population that is more racially, ethnically, and geographically diverse. Furthermore, we would welcome studies that aim to better understand the diversity of provider preferences in association with additional provider characteristics, such as duration of professional experience, clinical setting, and organization type.

Conclusions
Clinical performance feedback to healthcare professionals has potential to support continuous learning and influence practice, but this potential is frequently not achieved. By prioritizing motivating performance information based on the preferences and needs identified for a provider population, precision feedback may increase the effectiveness of clinical performance feedback for healthcare professionals' continuous learning and resulting quality improvement. Among a sample of anesthesia providers, preferences for precision feedback were varied, yet consistent within participants. Furthermore, participants' perceived benefit of precision feedback messages was high across a diverse set of performance scenarios. Based on these findings, precision feedback holds potential to improve support for healthcare professionals' continuous learning.
