Teaching practices are ever-evolving in science education. In documented reform efforts during the 1990s, scholars called for science teaching to transition from the simple transmission of discrete content knowledge to a greater focus on science practice (American Association for the Advancement of Science [AAAS], 1994; NGSS Lead States, 2013; Rutherford & Ahlgren, 1991). The authors of these reform documents focused on the importance of science literacy, not only on what students should know, but also on what they should be able to do in science, such as conducting investigations, analyzing and evaluating data, and using models to explain natural phenomena (AAAS, 1995).
More recent work by the National Research Council (2012) built on prior reform documents and science education research to propose a framework for K-12 science education that prioritized science practices, cross-cutting concepts, and disciplinary core ideas. This Framework for K-12 Science Education was used to write the Next Generation Science Standards (NGSS), reform-based standards for K-12 science that, as of 2023, have been adopted by, or have influenced the development of standards in, 49 states (National Science Teaching Association, 2023). Although the Next Generation Science Standards and Framework for K-12 Science Education provided a new direction for the goals of science teaching (Pruitt, 2014), researchers have identified a concern among teachers regarding their ability to effectively implement the standards (Haag & Megowan, 2015; Harris et al., 2017; Ricketts, 2014).
This concern inspired the development of Ambitious Science Teaching (AST), a pedagogical framework that conceptualizes how teachers can meet the teaching demands of the NGSS. Recent research has only begun to explore the potential link between the use of the AST framework in professional development and teachers’ science teaching self-efficacy (Williams & Mourlam, 2022). The purpose of this study was to build on that work, further analyzing the use of AST as part of a remotely delivered professional development (PD) experience to determine any changes in teachers’ beliefs about their ability to implement AST instructional practices.
Literature Review
In the sections that follow, research on the use and effectiveness of remotely delivered PD will be summarized, and the AST framework will be described. Additionally, the literature on the use of AST in science teacher PD will be reviewed.
Remotely Delivered Professional Development
Remotely delivered PD experiences have gained increased attention recently as a result of health and travel concerns associated with the COVID-19 pandemic and widespread access to videoconferencing technology. As with in-person PD experiences, remote PD intends to change teacher beliefs, improve pedagogy, and, ultimately, lead to greater student outcomes. Though remote PD certainly offers greater convenience for participants, understanding its relative effectiveness in achieving these goals is a major priority for those who fund and implement these experiences. Several recent meta-analyses have shed some light on the effectiveness of remote PD in educational contexts.
Kraft et al. (2018) reviewed 60 studies that examined the effectiveness of teacher coaching and PD on student outcomes. Twelve of the studies in the review involved virtual or remote delivery. The researchers found no significant difference in effect sizes of student outcomes between those taught by teachers who participated in the remote and in-person PD experiences.
Analyzing effects of PD specifically, Basma and Savage (2018) conducted a meta-analysis of 17 studies involving in-service teacher PD and student reading abilities. Of the 17 studies reviewed, three featured remote delivery of PD. The results were consistent with Kraft et al.’s (2018) findings, with effect sizes of literacy outcomes for students taught by teachers who participated in the remotely delivered PD experiences comparable to studies that utilized face-to-face PD delivery.
In a meta-analysis of research on science, technology, engineering, and mathematics (STEM) PD experiences, Lynch et al. (2019) examined how various features of a PD experience may have affected student learning outcomes. Of the 95 studies reviewed in the meta-analysis, 19 involved online components. The researchers found that STEM PD experiences with online components led to teacher changes that improved student outcomes, but on average led to smaller improvements for students when compared to PD experiences where teachers participated in-person only.
In general, available research on remotely delivered PD suggests that this approach can affect teachers in ways that improve student outcomes. However, no consensus exists on whether remotely delivered PD leads to changes in teachers that generate different student outcomes when compared to teachers who participate in traditional in-person-only PD experiences.
Ambitious Science Teaching
AST involves “engaging all learners with rigorous science in ways that value the diverse experiences the learners bring to the classroom” (Windschitl & Barton, 2016, p. 1106). Windschitl et al. (2018) identified four core practices of AST, including (a) planning for engagement with important science ideas, (b) eliciting students’ ideas, (c) supporting ongoing changes in thinking, and (d) pressing for evidence-based explanations. Each of these elements of AST is described in the following sections.
Planning for Engagement With Important Science Ideas
In planning for engagement with important science ideas, a central science idea is identified around which a curriculum unit or module can be created. Windschitl et al. (2018) explained that the central science idea is written as a complex description of a natural phenomenon. From the central science idea, an anchoring event is created to provide a context-rich scenario of the natural phenomenon that aligns with the central science idea. Windschitl et al. stated that context-rich events are ones that happen at a specific place, at a specific time, and under specific conditions.
For example, in planning a unit on the nature of waves, pressure, and sound, curriculum planners may select a video showing a singer breaking a wine glass with just their voice. Anchoring events are meant to be puzzling to students and provide motivation for the students to try to explain what is happening (Windschitl et al., 2018). A final component of an anchoring event is to establish the relevance of the phenomenon to students’ lived experiences.
Once the anchoring event has been developed, an essential question can be generated that naturally arises from the anchoring event. Generic examples of ways teachers can engage students in a critical exploration of the anchoring event include questions such as, “Why did that happen?” or “What caused the event we see here?” Through essential questions, students are required to reflect on a variety of science concepts and investigate the issue to develop a high-quality response (Windschitl et al., 2018).
Eliciting Students’ Ideas
The second core practice of the AST framework requires students to call upon prior knowledge to generate preliminary perspectives and explanations regarding the anchoring event. Such student contributions are defined as resources. Once these resources have been elicited, teachers can build on students’ ideas (Stroupe & Windschitl, 2015). Strategies involving purposeful questioning help students generate initial hypotheses and initial models for the anchoring event.
For example, a teacher may use a video of a person throwing boiling water in the air on a subzero day for their anchoring event, showing the students that the boiling water turns to snow in the air. A purposeful line of questioning might include prompts such as, “What happened in this video that was different from what you expected?” and “What factors that are unique to this scenario might account for the unexpected outcome?” In responding to these questions, students would generate initial models to share their ideas and provide the teacher with the resources necessary to inform instruction moving forward (Stroupe, 2017).
Supporting Ongoing Changes in Thinking
The third component of AST is to support student thinking as it changes throughout the lesson. Students perform investigations and participate in activities that provide information needed to understand the anchoring event (Windschitl et al., 2018). Teachers facilitate student investigations and present new information that prompts students to reevaluate and refine their initial hypotheses or models, a learning process often referred to as conceptual change (Lappi, 2013). Also important to this core practice is the communication and evaluation of ideas among students with teacher facilitation. Students are frequently asked to share and clarify their thinking with teacher prompts such as, “Can you tell me more about that?” or “Why do you think that’s true?” The use of these teacher discourse techniques that prompt further explanation from students has been shown to enhance the development of content knowledge and improve understanding of complex processes (Gillies, 2019).
Pressing for Evidence-Based Explanations
In the final phase of the AST framework, students construct evidence-based explanations of the phenomenon (Windschitl et al., 2018). Students make claims, cite evidence, and support their reasoning in a process often referred to as scientific argumentation (Sampson & Clark, 2008). Windschitl et al. (2012) explained that a claim is a statement about some event, process, or relationship in the natural world that is supported by evidence ascertained throughout the unit in activities and investigations. Scientific arguments require supportive reasoning: the connections between the claim, the evidence, and other accepted science that logically point to the correctness of the claim (Driver & Newton, 2000). Teachers scaffold this process, utilizing strategies that help students see connections among the evidence gathered and provide the structure for students’ final explanations and models of the anchoring event (Windschitl et al., 2012).
The Use of Ambitious Science Teaching to Improve Instruction
Because the AST framework is relatively new, there is little in the literature regarding how AST has been used as a guiding conceptual framework for science teacher PD. There are, however, recent studies that have used individual components of AST to create tools to evaluate teaching practices.
AST as a Foundation for Evaluation Tools
The AST framework has recently been used as the principal source for several innovative evaluation tools in science teaching (Grinath & Southerland, 2019; Johnson & Mawyer, 2019; Tekkumru-Kisa et al., 2021). Grinath and Southerland (2019) used the AST framework to create a tool to evaluate the quality of instructor discourse that occurred during an elicitation discussion in an introductory college biology class. The purpose of an elicitation discussion is to reveal a student’s thinking about and prior experiences with a topic to generate engagement and promote curiosity (Michaels & O’Connor, 2012).
Grinath and Southerland (2019) categorized the rigor of discourse during question-response interactions between teaching assistants and undergraduate students, using best-practice discourse techniques defined by AST. These discourse techniques include the use of questions that seek student explanations of reasoning rather than simple recitation of facts (Colley & Windschitl, 2016). The researchers found that teaching assistants who employed AST discourse techniques promoted greater complexity of student talk, involving more observations and explanations.
AST best-practice discourse techniques were also used by Tekkumru-Kisa et al. (2021), informing the development of a science instructional quality tool. This tool was specifically designed to evaluate the “rigor of tasks and talk that shape students’ high-level thinking and sensemaking in the classroom” (p. 170). A similarly designed tool was created by Boston and Candela (2018), using AST best-practice discourse techniques to evaluate the quality and rigor of discussion in ambitious math instruction.
While these recently developed evaluation tools have allowed researchers to assess the quality of student discourse and sensemaking in science instruction settings, the scope of each tool is limited to only parts of the AST framework. Benedict-Chambers and Aram (2017) created a more comprehensive reflection tool, conceptualizing the use of AST in the Engage-Explore-Explain (EEE) Framework. Preservice teachers participated in science teaching rehearsal activities as part of their science methods coursework and reviewed the lesson performances using video and an EEE Framework feedback form. Results indicated that 81% of preservice teachers focused on areas of student science content learning and the use of scientific practices for improvement when engaged in this form of AST-based reflection.
Similarly, Johnson and Mawyer (2019) used the AST framework to support preservice teacher video analysis of instructional practices. The researchers found that preservice teachers were able to identify patterns in student thinking and achieved high levels of sophistication in their written planned responses to student thinking through use of AST-supported video reflections. Taken together, the results of these studies suggest that more comprehensive reflection tools can be employed to support teacher self-evaluation of AST-themed instructional practices.
AST and Professional Development
Studies involving evaluation tools for AST-themed instructional practices have focused on the efficacy of the tool, rather than on interventions to directly influence change in those practices (Grinath & Southerland, 2019; Johnson & Mawyer, 2019; Tekkumru-Kisa et al., 2021). However, components of the AST framework, often under slightly different names, have been used with regularity in science education PD (Hanley et al., 2020; Kloser et al., 2022; Murphy et al., 2018; Rosebery et al., 2016; Tekkumru-Kisa et al., 2018). This work is supported by recent research that shows PD experiences can affect science teaching practices, bringing them into better alignment with reform-based goals (Fishman et al., 2017; Miller & Kastens, 2018; Ogodo, 2019; Osborne et al., 2019; Pringle et al., 2020).
Tekkumru-Kisa et al. (2018) created a PD program where in-service teachers analyzed their own practice and worked with others to improve their science teaching. Researchers focused on improving in-service teachers’ abilities to increase students’ depth of explanations of natural phenomena. In the AST framework, the process of increasing the complexity of student explanations based on academic experiences and prior knowledge is called sensemaking. The process of sensemaking is part of the AST core practice of supporting conceptual change (Windschitl et al., 2018). The researchers found that a focus on sensemaking in science educator instructional practice through video-based PD led to changes in participants’ instructional practices. These changes involved increases in teacher questions that asked students to explain their thinking or justify their claims.
Rosebery et al. (2016) also showed that a focus on sensemaking could lead to instructional improvements for science educators as a result of a PD experience. By focusing on “practices that encourage, make visible, and intentionally build on students’ ideas, experiences, and perspectives on scientific phenomena,” teachers improved their abilities to notice student thought patterns and support their sensemaking processes (p. 1571). With these studies, researchers focused on a small component of AST-related skills (sensemaking), rather than a more comprehensive approach to abilities across the four domains of the AST framework.
Other work in science education PD has centered on a similarly detailed AST-related skill: student discourse. Murphy et al. (2018) studied the effects of Quality Talk Science, “a comprehensive professional development and support model designed to help teachers instruct, foster, and scaffold scientific argumentation and model-based reasoning” (p. 1241), on the scientific argumentation skills of high school students.
Quality Talk Science connects to the domain of “making and justifying claims in a science community” in AST, as AST seeks to foster instructional practices that guide students in “making claims, using multiple forms of evidence, and responding to others who critique your ideas” (Windschitl et al., 2018, p. 199). The use of Quality Talk Science improved critical-analytical thinking and argumentation in students whose teachers participated in the PD. Students in the treatment group also improved their written scientific arguments over the control (Murphy et al., 2018).
A second reform-based PD program focused on improving student discourse yielded similar results. Hanley et al. (2020) studied the effects of the Thinking, Doing, Talking Science (TDTS) program on upper-elementary science achievement. TDTS is designed to enhance in-service teacher skills that promote higher order thinking in students through the use of rigorous discourse and activities that involve science practices. The results of the study showed that use of the program increased both science achievement and attitudes toward science among students in the treatment group.
The available studies in both evaluating and improving AST-related skills have mostly addressed components of the AST framework implicitly, rather than taking an explicit, comprehensive approach that attempts both to assess and to increase AST abilities across all four domains. Therefore, the goal of the study reported here was to determine any changes in teachers’ beliefs about their ability to implement AST practices as a result of their participation in a 3-day remotely delivered PD experience.
The research questions guiding this study were as follows: (a) To what extent and in what direction did teachers’ beliefs about their ability to implement AST practices change before and after a remotely delivered PD experience? and (b) To what extent do teachers’ beliefs in their ability to implement AST practices differ as a function of teaching experience, education level, or certification area? We hypothesized that there would be an increase in teachers’ beliefs about their ability to implement AST practices. We also hypothesized that there would be no differences in teachers’ beliefs as a function of their teaching experience, education level, or certification area.
Materials and Methods
This study used a pre-post survey methodology to quantify changes in teacher beliefs after participation in the remotely delivered PD workshop. In this section, participant demographics, the PD instructional context, instrumentation, procedures, and data analysis are described.
Participants
The study’s sample was drawn from a group of 190 educators and teacher candidates who participated in a STEM-focused PD experience. Of the 190 participants in the PD, 171 were certified teachers. Of this group, 146 certified teachers completed the pre- and postsurveys necessary for data analysis comparisons (a completion rate of 85.4%). The group of teacher participants came from cities of various sizes, with the greatest number coming from cities of 1,000-10,000 people (n = 47, 32.2%), and fewer coming from cities of greater than 50,000 (n = 39, 26.7%), cities of less than 1,000 (n = 35, 24.0%), and cities of 10,000-50,000 (n = 25, 17.1%).
Teachers in the sample represented primary (n = 72, 49.3%) and secondary (n = 74, 50.7%) certification areas nearly equally, and all K-12 grade levels were represented. A majority of teachers in the sample taught only one grade level (n = 84, 57.5%), but some taught two grade levels (n = 12, 8.2%), three grade levels (n = 15, 10.3%), four grade levels (n = 22, 15.1%), or five or more grade levels (n = 13, 8.9%).
In terms of subject area, science was the most commonly reported (n = 137, 93.8%), followed by math (n = 71, 48.6%). Overall teaching experience within the sample was varied. Veteran teachers with more than 20 years of experience represented the plurality of participants (n = 42, 28.8%), with lower numbers in other experience ranges, including 16-20 years of experience (n = 23, 15.8%), 11-15 years of experience (n = 29, 17.8%), 6-10 years of experience (n = 21, 14.4%), and less than 5 years of certified teaching experience (n = 31, 21.2%).
The teachers in the sample were predominantly female (n = 118, 80.8%), with the rest male (n = 26, 17.8%) or preferring not to say (n = 2, 1.4%). The sample included mostly participants who identified as white (n = 130, 89.0%). There were also three participants who identified as American Indian (2.1%), six who identified as Asian (4.1%), and two who identified as Hispanic (1.4%). There were also five who preferred not to say (3.4%) and one participant who identified as “Other” (0.6%). The majority of teachers in the sample had a bachelor’s degree (n = 80, 54.8%), with a smaller subgroup having attained the level of master’s degree (n = 61, 41.8%), and still fewer a specialist degree (n = 5, 3.4%).
Instructional Context
The participants in this study took part in a 3-day, remotely delivered PD workshop over the Zoom videoconference platform. K-12 teachers were recruited to participate in the PD, based on their interest in STEM teaching methods. The workshop was designed to provide a balance between both synchronous and asynchronous learning experiences. Each day included approximately 4 hours of synchronous instructional experiences that primarily included small group discussion-based activities.
Participants had the opportunity to work with multiple other workshop participants through small group discussion activities. In these activities, participants were placed in Zoom breakout rooms, given a task card with discussion prompts and other activities to complete, and assigned specific roles (e.g., timekeeper, reporter, scribe). Small group discussions varied in length depending on the task, but generally became shorter as the workshop progressed and participants developed greater rapport with one another.
After small group discussions ended, each group’s reporter would share, and workshop facilitators would lead a large group synthesis of the discussion topic and deliver a brief lecture about the topic. At the end of each day, participants were asked to asynchronously complete assignments related to the day’s activities in preparation for the next day’s PD session.
We did not design, create, or facilitate the workshop. The structure and teaching methods utilized by the workshop developers and facilitators were informed by the theoretical framework of Loucks-Horsley et al. (2010), who have promoted a constructivist approach to PD that emphasizes (a) learning through inquiry and problem solving, (b) reflection on the use of PD-developed skills in one’s own classroom, and (c) adequate allotment of time for processing new information. Pedagogy was also drawn from a museum-based STEM PD framework, which promotes (a) highly facilitated large and small group discussions, (b) cooperative exercises, (c) disciplined practice of norms, and (d) frequent opportunities for reflection (Chatman et al., 2019).
The content of the PD workshop was organized around the three-dimensional (3D) science teaching framework described in the NGSS (NGSS Lead States, 2013). These standards advocate for the use of disciplinary core ideas (DCI), science and engineering practices (SEP), and cross-cutting concepts (CCC). Throughout the workshop, special emphasis was placed on the use of phenomena to anchor science lessons and units, a concept of central focus in 3D science teaching and AST. AST practices were also modeled for participants throughout the workshop. For example, to model Eliciting Students’ Ideas, participants were shown a hand and an open mouth and asked to compare the two environments using a Venn diagram in a small group breakout room in the Zoom platform.
The workshop took place over 3 days, with each day divided into synchronous and asynchronous learning experiences over 8 hours (see Figure 1). Each PD session began with a 2-hour synchronous meeting where (a) a piece of the 3D framework was defined, (b) time was provided for participants to discuss the concept in context, and (c) participants were challenged to think about how the use of that component of 3D science teaching might affect their own classroom practice.
In the Zoom platform, these activities were accomplished through the use of several small group activities and breakout rooms. Each activity was supported by group roles (e.g., reporter, scribe, timekeeper) and included a clear discussion structure, such as a Whip Around, in which facilitators (a) posed a prompt or question with multiple answers, (b) provided participants time to write down responses after some individual think time, (c) had participants take turns around the small group, with each person sharing one new idea at a time without repeating what others had said, and (d) concluded by having participants discuss the ideas shared and any themes that emerged from their responses.
After an activity concluded, all participants returned to the main room, and the reporter from each small group shared a summary of their discussion. Facilitators then briefly expanded on the topic before beginning the next activity. The first synchronous meeting was then followed by a 2-hour asynchronous block where participants had a working lunch. During this time, participants spent time reflecting on a series of questions about what they had learned in the first synchronous meeting of the day. Participants were instructed to be prepared to share at the start of the second synchronous meeting.
Figure 1
Design of Remotely Delivered PD Workshop

The second 2-hour synchronous meeting focused on the use of the 3D science teaching component in an example middle school life science module (several lessons centered on a common theme, in this case the growth of biofilms). This synchronous session was facilitated similarly to the first, using a mix of small group activities in Zoom breakout rooms and whole group sense-making activities led by the facilitators. In the final asynchronous session, facilitators assigned participants prework for the next day that included videos, readings, and reflective questions. This session took participants approximately 2 hours to complete.
Instrumentation
The questions for the instrument used in this study were developed using sources that outline the domains of AST and the teaching practices necessary to facilitate science learning using AST. Windschitl et al. (2018) provided the most direct support for generation of prompts within the instrument, while an earlier publication by Windschitl et al. (2012) provided secondary support. In total, 18 survey items were created with individual items aligned to one of the four AST domains (see the appendix). Content validity of the instrument was established by a panel of science teaching experts, including science teacher education professors, master teachers, and other educational practitioners with deep knowledge and experience with the AST framework. During the content validity meeting, we facilitated a review of the survey questions with panel members. Panel members were encouraged to ask questions, provide suggestions for improvement, and recommend alternatives where appropriate. During the meeting, we noted recommendations from the panel members that were then incorporated into the survey.
The revised survey was then sent back to panel members for additional feedback. As a component of data analysis, the internal consistencies for each of the AST domains were calculated and found to be acceptable, with Cronbach’s alphas ranging from .77 to .90. Teacher self-reported abilities across the four AST domains are reported in Table 1 in a subsequent section.
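To illustrate the internal consistency computation, Cronbach’s alpha can be calculated directly from an item-response matrix. The study’s analyses were conducted in R; the sketch below uses Python, and the respondent scores shown are invented for demonstration only.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items in the domain
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented responses: 5 respondents x 4 Likert items (1-7 scale)
scores = np.array([
    [5, 6, 5, 6],
    [4, 4, 5, 4],
    [7, 6, 7, 7],
    [3, 4, 3, 4],
    [6, 6, 5, 6],
])
print(round(cronbach_alpha(scores), 2))  # prints 0.95
```

Alphas near or above the commonly cited .70 threshold, as in this toy example, indicate that the items within a domain hang together well enough to justify a composite score.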
Procedure and Data Collection
Institutional Review Board approval was obtained to access existing data on teachers’ AST beliefs collected as part of workshop evaluation requirements for federal grant reporting. Before the workshop began, participants completed demographic and preworkshop surveys. The preworkshop survey included a set of 18 Likert-type items that were designed to measure teachers’ beliefs about their ability to lead classroom instruction aligned with the priorities of AST.
Each item included a statement such as, “I am able to identify naturally occurring phenomena appropriate for science instruction.” Participants were invited to mark their level of agreement on a 7-point scale, from strongly disagree to strongly agree. At the conclusion of the workshop, participants completed a postworkshop survey that included the same 18 items designed to measure teachers’ beliefs about their ability to teach using AST-related instructional practices.
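Domain composite scores of this kind are typically computed as the mean of each domain’s items. The sketch below assumes a hypothetical item-to-domain mapping (the actual assignment of items to domains is given in the appendix) and invented responses from a single participant.

```python
import numpy as np

# Hypothetical item-to-domain mapping; the real assignment of the 18 items
# to the four AST domains is given in the study's appendix.
domains = {
    "Plan":    [0, 1, 2, 3, 4],
    "Elicit":  [5, 6, 7, 8],
    "Support": [9, 10, 11, 12, 13],
    "Explain": [14, 15, 16, 17],
}

# One respondent's invented answers to the 18 items (7-point Likert scale)
responses = np.array([6, 5, 6, 7, 5, 6, 6, 5, 6, 5, 6, 6, 5, 6, 5, 6, 6, 5])

# Domain composite = mean of that domain's items; overall = mean of all 18
composites = {name: responses[idx].mean() for name, idx in domains.items()}
composites["Overall"] = responses.mean()
print(composites)
```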
Data Analysis
Before data were analyzed, only participants who fully completed all items on the pre- and postworkshop survey were retained. Additionally, one extreme outlier was removed from analysis due to a likely reverse anchoring error when completing the postsurvey, leading to a final sample size of N = 145. Data were analyzed using the statistical software R (R Core Team, 2022).
To determine whether pre-post survey differences were statistically significant, a paired samples t-test would normally be used. However, the distributions of item scores did not meet the assumption of normality, making paired samples t-tests inappropriate, as they may have returned misleading results (Sawilowsky & Blair, 1992). Because of pronounced skew in the composite score distributions, two nonparametric analysis techniques were used independently to explore potential differences in the data. A bootstrapped paired-samples t-test was used to calculate point estimates and 95% confidence intervals for mean differences, t-scores, and Cohen’s d, providing robust estimates of the difference between pre- and posttest means for the AST domain composite scores.
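A bootstrapped paired-samples t-test of this kind can be sketched as follows. The resampling settings (10,000 iterations, percentile confidence intervals) and the pre/post scores are illustrative assumptions rather than the study’s actual configuration or data, and the study itself used R rather than Python.

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_paired_t(pre, post, n_boot=10_000):
    """Bootstrap the paired mean difference, its 95% CI, the t statistic,
    and Cohen's d. Settings here are illustrative assumptions."""
    diffs = np.asarray(post, dtype=float) - np.asarray(pre, dtype=float)
    n = len(diffs)
    boot_means, boot_ts, boot_ds = [], [], []
    for _ in range(n_boot):
        sample = rng.choice(diffs, size=n, replace=True)
        m, s = sample.mean(), sample.std(ddof=1)
        if s == 0:  # guard against degenerate resamples with zero spread
            continue
        boot_means.append(m)
        boot_ts.append(m / (s / np.sqrt(n)))  # paired-samples t statistic
        boot_ds.append(m / s)                 # Cohen's d for paired designs
    ci = np.percentile(boot_means, [2.5, 97.5])  # percentile 95% CI
    return np.mean(boot_means), ci, np.mean(boot_ts), np.mean(boot_ds)

# Hypothetical pre/post composite scores for 10 teachers (7-point scale)
pre  = np.array([5.0, 4.5, 5.5, 5.2, 4.0, 6.0, 5.3, 4.8, 5.1, 5.6])
post = np.array([5.6, 5.6, 6.0, 5.5, 5.2, 6.3, 6.1, 5.2, 5.8, 6.4])

mean_diff, ci, t_avg, d_avg = bootstrap_paired_t(pre, post)
print(f"mean difference = {mean_diff:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Because the bootstrap makes no normality assumption, the resulting interval remains trustworthy even with the skewed score distributions described above.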
Separately, the Wilcoxon signed rank test was used to determine whether there was a true difference between pre- and posttest AST domain composite scores. Effect sizes of the Wilcoxon signed rank tests (r) were calculated with

r = Z / √N

and compared against Cohen’s (1988) standards for the effect size r (|r| = .1, small; |r| = .3, moderate; |r| = .5, large). Because we analyzed pre-post differences across the four domains of AST and overall (a total of five pre-post comparisons), the possibility of a Type I error was increased. To account for this increase, a Bonferroni correction was used, resulting in a higher bar for statistical significance (adjusted α < .01).
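The Wilcoxon signed rank test, the effect size r, and the Bonferroni-adjusted significance threshold can be illustrated as follows. This is a Python sketch with invented scores (the study’s analyses were run in R); recovering |Z| from the two-sided p-value is one common way to compute r when the test statistic itself is reported on another scale.

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post composite scores for one AST domain (N = 12)
pre  = np.array([5.0, 4.5, 5.5, 5.2, 4.0, 6.0, 5.3, 4.8, 5.1, 5.6, 4.9, 5.4])
post = np.array([5.6, 5.6, 6.0, 5.5, 5.2, 6.3, 6.1, 5.2, 5.8, 6.4, 5.3, 5.9])

res = stats.wilcoxon(post, pre)       # two-sided signed-rank test
z = stats.norm.isf(res.pvalue / 2)    # recover |Z| from the two-sided p-value
r = z / np.sqrt(len(pre))             # effect size r = Z / sqrt(N)

# Bonferroni correction for the study's five pre-post comparisons
alpha_adjusted = 0.05 / 5             # adjusted alpha = .01
print(f"p = {res.pvalue:.4f}, r = {r:.2f}, "
      f"significant after correction: {res.pvalue < alpha_adjusted}")
```

With every teacher improving in this toy sample, the test returns a small p-value and a large effect size by Cohen’s (1988) standards.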
To answer whether the changes in teachers’ beliefs about their ability to implement AST practices differed as a function of teaching experience, one-way ANOVAs were used. One-way ANOVAs were also used to examine any differences in changes as a function of education level. Because the distributions of pre-post change scores did not show a departure from normality, and because ANOVAs are robust to departures from normality (Blanca et al., 2017), this parametric approach was deemed acceptable for the data being analyzed. Last, independent samples t-tests were used to determine whether there were any statistically significant differences in the extent of change in teachers’ beliefs to implement AST practices as a function of certification area.
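These group comparisons might be sketched as follows. The group labels and change scores below are hypothetical, and the study’s actual analyses were conducted in R.

```python
import numpy as np
from scipy import stats

# Hypothetical pre-to-post change scores, grouped by experience band
change_low  = np.array([0.7, 0.5, 0.9, 0.4, 0.6])   # e.g., 0-5 years
change_mid  = np.array([0.6, 0.8, 0.3, 0.7, 0.5])   # e.g., 6-10 years
change_high = np.array([0.5, 0.6, 0.7, 0.4, 0.8])   # e.g., 11+ years

# One-way ANOVA: do change scores differ by teaching experience?
f_stat, p_anova = stats.f_oneway(change_low, change_mid, change_high)

# Independent-samples t-test: certification area (hypothetical groups)
primary   = np.array([0.6, 0.7, 0.5, 0.8, 0.4, 0.6])
secondary = np.array([0.5, 0.6, 0.7, 0.5, 0.6, 0.7])
t_stat, p_ttest = stats.ttest_ind(primary, secondary)

print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.3f}")
print(f"t test: t = {t_stat:.2f}, p = {p_ttest:.3f}")
```

A nonsignificant result in either test, as in this toy data, would be consistent with the hypothesis that gains did not depend on experience, education level, or certification area.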
Results
On the presurvey, teachers believed they had at least some ability to implement AST practices, with mean scores ranging from 5.18 to 5.36 (see Table 1). After the PD experience, teachers’ beliefs about their AST abilities increased on all AST domains, with mean scores ranging from 5.79 to 5.94.
Table 1
Internal Consistencies and Teachers’ Beliefs About Implementing AST Practices Across Domains
Domain | Presurvey Cronbach's α | Presurvey M (SD) | Postsurvey Cronbach's α | Postsurvey M (SD)
---|---|---|---|---
Plan | 0.89 | 5.34 (0.97) | 0.90 | 5.94 (0.75)
Elicit | 0.86 | 5.36 (0.82) | 0.88 | 5.87 (0.68)
Support | 0.77 | 5.20 (0.82) | 0.88 | 5.84 (0.71)
Explain | 0.89 | 5.18 (0.96) | 0.90 | 5.79 (0.81)
Overall | 0.95 | 5.28 (0.80) | 0.97 | 5.86 (0.69)

Note. N = 145.
Results of the bootstrapped paired-samples t-tests indicated that the increases in teachers' confidence about implementing AST practices were statistically significant after Bonferroni correction (p < .01) for all domains (see Table 2). Effect sizes for these mean differences were medium to large, ranging from d = .65 to .81.
Table 2
Bootstrapped Related-Samples t-Tests for Pre-Post Surveys of Teachers' Beliefs

Domain | Average Mean Difference | 95% CI | Average t | p | Average Cohen's d
---|---|---|---|---|---
Plan | 0.60 | [0.32, 0.88] | 4.33 | <.001 | 0.69
Elicit | 0.51 | [0.25, 0.76] | 4.10 | <.001 | 0.65
Support | 0.63 | [0.38, 0.89] | 5.07 | <.001 | 0.80
Explain | 0.62 | [0.32, 0.91] | 4.24 | <.001 | 0.67
Overall | 0.59 | [0.35, 0.82] | 5.12 | <.001 | 0.81
The Wilcoxon signed-rank test was used in addition to the bootstrapped paired-samples t-test to determine whether scores changed significantly between pre- and postsurveys. In addition to the positive mean differences between the two time points, the medians of the postsurvey AST domain distributions were larger than the medians of the presurvey distributions, with median differences ranging from .60 to .75 (see Table 3). The pre-post median difference was largest in the Support and Explain domains (both .75), while the Plan and Elicit domains both had a median difference of .60. The overall AST self-efficacy composite showed a .55 increase in median score between pre- and postsurveys.
Table 3
Wilcoxon Signed-Rank Test Results for Pre-Post Surveys of Teachers’ Beliefs
Domain | Median Difference | W | p | r
---|---|---|---|---
Plan | 0.60 | 7680 | <.001 | 0.61
Elicit | 0.60 | 6774 | <.001 | 0.59
Support | 0.75 | 7426 | <.001 | 0.64
Explain | 0.75 | 6694 | <.001 | 0.59
Overall | 0.55 | 8476 | <.001 | 0.67
The results of these nonparametric analyses corroborated the findings from the bootstrapped paired-samples t-tests, demonstrating statistically significant increases in scores across all AST domains after Bonferroni adjustment. Effect sizes for these tests were large, ranging from r = .59 to .67.
One-way ANOVAs were used to determine whether teachers' confidence gains varied with teaching experience; the analysis revealed no statistically significant differences in gains across groups. Similarly, t-tests evaluating potential differences by education level (i.e., Bachelor's degree only vs. advanced degree) revealed no significant differences. Finally, t-tests comparing whether elementary and secondary teachers experienced differential gains in confidence in using AST practices also revealed no statistically significant differences between groups.
Discussion
The purpose of this study was to determine (a) the extent to which, and in what direction, teachers' beliefs about their ability to implement Ambitious Science Teaching practices changed from before to after a remotely delivered PD experience, and (b) the extent to which changes in teachers' beliefs in their ability to implement AST practices differed as a function of teaching experience, education level, or certification area. In the following sections, the general effectiveness of the PD workshop is described, along with a comparison of effectiveness across teacher profiles. Implications for practice and future research are also discussed.
Effectiveness of Comprehensive AST Approach
Statistical analysis of the pre/post surveys using a bootstrapped t-test approach showed statistically significant increases in teachers' confidence to implement AST practices across all four AST domains, with medium to large effect sizes. Additionally, the overall composite confidence in implementing AST increased from pre- to postsurvey with a large effect size (d = 0.81). These results are consistent with previous research showing that PD experiences can impact teachers' instructional practices in science (Fishman et al., 2017; Miller & Kastens, 2018; Ogodo, 2019; Osborne et al., 2019; Pringle et al., 2020).
Previous studies on AST PD have included only individual domains (Hanley et al., 2020; Murphy et al., 2018; Rosebery et al., 2016; Tekkumru-Kisa et al., 2018), while the PD workshop in this study was comprehensive in its use of the AST instructional model and strategies related to its implementation. The comprehensive approach to AST is, therefore, seen as a potentially effective means to improve science teaching self-efficacy through a remotely delivered PD experience. Instruction across the four domains of AST may provide an overview that is helpful to in-service teachers as they work to understand the details of the AST instructional model, an overview that would be absent in a more modular approach to learning AST.
Based on these results, organizers and facilitators of science teaching PD experiences should explore the use of a comprehensive approach to AST instruction, as the effectiveness and efficiency of this approach could potentially maximize learning for teachers while minimizing the time needed to achieve these gains. In addition, an important area for future research emerging from these results is how AST can be integrated as part of a more holistic teacher PD framework. The evidence from this and prior studies indicates that teachers' AST knowledge and beliefs can be positively impacted, but in unique ways (i.e., focus on individual AST domains vs. comprehensive AST focus). This raises the question: is one approach more effective than another? If so, which characteristics of the PD are most important for teacher development? These and similar questions should be pursued in future studies.
Despite the apparent preliminary success of the remotely delivered PD experience for increasing teacher confidence in implementing AST practices, there are questions that require further research. First, how durable are the gains in teacher confidence? Would follow-up surveys of teachers 6 or 12 months after the PD workshop show similar scores? Second, what is the relationship between teachers' beliefs about their ability to implement AST practices and the use of AST strategies in the classroom? A link between teacher beliefs and AST strategy use would allow for greater confidence in the results. Finally, to what extent does teacher participation in a PD experience designed to increase teacher confidence and knowledge of AST influence student science learning outcomes? To understand the true benefit of the PD workshop, the learning outcomes of students whose teachers participated in the workshop should be compared with those of students whose teachers did not.
Effectiveness of Remotely Delivered Science PD
Due to COVID-19 pandemic limitations, the PD experience designed to increase teacher confidence and knowledge of AST was delivered completely online through synchronous meetings on the Zoom platform with asynchronous assignments and other tasks for participants. As with previous studies on remotely delivered science PD, the results indicated that teacher confidence in the use of new science teaching strategies increased over the course of the PD experience (Lynch et al., 2019).
Remotely delivered instruction in the AST instructional model appears to be a viable alternative to in-person PD and may represent an opportunity to increase teachers’ access to high quality PD. We recommend that PD organizers, especially those in rural areas, consider offering science PD experiences through both in-person and remote or online mediums. Doing so would likely provide more equitable access for teachers with time, travel, financial, and other barriers that would preclude their ability to participate in face-to-face only PD opportunities.
Although there were positive outcomes in the remotely delivered PD in this study, additional research is needed to understand better the potential advantages or disadvantages of remotely delivered science PD experiences. This includes experiences specifically designed to increase AST abilities and confidence to implement these practices, as well as on other topics. In addition, researchers in future studies should explore the role of both synchronous and asynchronous experiences in remotely or online PD opportunities to better understand the impact these experiences may have on participant learning and workshop satisfaction. Although these were key design elements of the workshop in this study, they were outside the scope of this study.
While the results of this study align with previous studies supporting the effectiveness of remotely delivered PD (Basma & Savage, 2018; Kraft et al., 2018), we recommend that future research seek to provide direct comparisons between remotely delivered and in-person versions of the same science PD experience. Such a comparison could be used to determine whether there is a significant difference between delivery methods. These comparisons could be extended beyond teachers’ beliefs in their ability to the use of AST strategies in the classroom and ultimately evaluate student science learning outcomes.
Comparable Effectiveness Across Groups
A second focus area of this study was to determine if there were any differences in teachers' AST beliefs based on their teaching experience, education level, and certification area (elementary vs. secondary). The results showed no statistically significant differences among participant groups in pre/post survey changes of teachers' beliefs about employing AST strategies for any of the domains. These are important results, because they mean that teachers who were involved in the PD experience reported similar changes in their beliefs regardless of teaching experience, education degree level, or certification area. These results suggest that a remotely delivered PD experience focused on 3D science concepts may be an effective means to increase teacher knowledge and confidence of AST across a wide diversity of in-service teachers. While a large meta-analysis conducted by Lynch and colleagues (2019) on STEM PD investigated the effect size differences of remote vs. in-person PD, there was no comparison of effect sizes between PD experiences that included mixed certification area participant groups and those that had homogeneous participant groups (e.g., elementary teachers only).
These results have implications for organizations that design and facilitate remotely delivered science teaching PD experiences. Based on comparable outcomes in teachers of differing certification areas, PD workshops may be designed to reach a wide variety of teachers, even across certification areas. Organizations may be able to design and facilitate a single PD workshop with the goal of improving science teaching strategies for primary and secondary teachers concurrently. This type of PD experience would increase efficiency and potentially allow for a greater number of PD offerings, since separate primary and secondary-level workshops may not be necessary.
Additional research is needed to determine the efficacy of remotely delivered science teaching PD experiences and their ability to accommodate a wide variety of teachers through this comprehensive approach to AST instruction. Specifically, it will be important to learn which features of a science teaching PD lead to similar benefits across a wide diversity of in-service teachers.
The designers of AST argue that their framework “represents broadly applicable instructional strategies known to foster important kinds of student engagement and learning” (Windschitl et al., 2012, p. 879). By providing an overview of the four domains of AST during the PD, was it the content that proved more accessible to a wide variety of teachers? Or, was it the use of a middle school lesson module example that provided a bridge between primary and secondary certification levels? These questions could be answered with targeted study.
Future research should attempt to compare the single PD workshop approach that accommodates teachers from different certification levels to a split approach in which teachers participate in a PD workshop with only teachers of their own certification area. In the split approach, teachers of different certification areas would receive AST instruction with examples and considerations specific to the grade-range they teach. The results of teachers’ beliefs about their ability to facilitate instruction using AST could be compared between the two approaches to determine whether significant differences exist.
Limitations
Several limitations remain within the current study of the change in teacher beliefs in their ability to use AST practices over the course of a remotely delivered PD experience. First, the instrument used to gather data on teachers’ beliefs involved self-reporting, which may not capture the true extent of teachers’ ability to use AST practices appropriately and effectively. Key phrases and terminology on the instrument may have been unfamiliar to teachers on the presurvey, potentially suppressing teacher beliefs about the use of AST strategies.
Second, the investigation design lacked a comparison group, so it is unclear whether general exposure to science education professionals during a PD experience (absent specific instruction designed to increase AST abilities) may have influenced teachers’ self-reported AST ability postsurvey scores. There was also no comparison between the AST-based PD program and PD programs based on alternative frameworks. Such a comparison could be helpful in determining whether AST-based programs hold an advantage over other conventional science PD frameworks.
Finally, while the internal consistency of the items contained within the survey is within acceptable limits, the factor structure of the survey must be determined to examine construct validity. Therefore, future studies should examine the factor structure of the AST survey instrument used in this study using both exploratory and confirmatory factor analysis.
Conclusion
As teaching practices in science education continue to change and evolve, researchers and facilitators of teacher professional development will be tasked with identifying effective and novel ways to impact both teachers’ knowledge and beliefs. Accomplishing this impact is critical to ensure that contemporary, research-based approaches are adopted and implemented effectively in the science classroom to promote student learning. As a result of the COVID-19 pandemic, PD providers and researchers had to leverage novel approaches to teaching and learning, such as the remote professional development approach used in this study. During this time of discovery, both effective and ineffective distance learning approaches emerged, but were just that, emergent. Further research is needed to better understand these novel approaches and how they can support teacher growth in ways previously unknown. Although the pandemic continues to fade into our memories, the discoveries made during this time need not, as they may still be useful in meeting teachers’ pedagogical needs.
References
American Association for the Advancement of Science. (1994). Benchmarks for science literacy. Oxford University Press.
American Association for the Advancement of Science. (1995). Project 2061: Science Literacy for a Changing Future: A Decade of Reform. https://files.eric.ed.gov/fulltext/ED398051.pdf
Basma, B., & Savage, R. (2018). Teacher professional development and student literacy growth: A systematic review and meta-analysis. Educational Psychology Review, 30(2), 457-481. https://doi.org/10.1007/s10648-017-9416-4
Benedict-Chambers, A., & Aram, R. (2017). Tools for teacher noticing: Helping preservice teachers notice and analyze student thinking and scientific practice use. Journal of Science Teacher Education, 28(3), 294-318. https://doi.org/10.1080/1046560X.2017.1302730
Blanca, M. J., Alarcón, R., Arnau, J., Bono, R., & Bendayan, R. (2017). Non-normal data: Is ANOVA still a valid option? Psicothema, 29(4), 552-557.
Boston, M. D., & Candela, A. G. (2018). The Instructional Quality Assessment as a tool for reflecting on instructional practice. ZDM, 50(3), 427-444. https://doi.org/10.1007/s11858-018-0916-6
Chatman, L., Strauss, E. V., Sandland, T., Haupt, G., & Goeke, M. (2019). Leadership, equity, STEM, and systems change: A museum-based extension service model [Poster presentation]. In Museums and Teacher Professional Development: Illuminating the “Invisible Infrastructure” of Science Institutions in Teacher Learning. American Educational Research Association, Toronto, ON, Canada.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
Colley, C., & Windschitl, M. (2016). Rigor in elementary science students’ discourse: The role of responsiveness and supportive conditions for talk. Science Education, 100(6), 1009-1038. https://doi.org/10.1002/sce.21243
Driver, R., & Newton, P. (2000). Establishing the norms of scientific argumentation in classrooms. Science Education, 84(3), 287. https://doi.org/10.1002/(SICI)1098-237X(200005)84:3
Fishman, E. J., Borko, H., Osborne, J., Gomez, F., Rafanelli, S., Reigh, E., Tseng, A., Million, S., & Berson, E. (2017). A practice-based professional development program to support scientific argumentation from evidence in the elementary classroom. Journal of Science Teacher Education, 28(3), 222-249. https://doi.org/10.1080/1046560X.2017.1302727
Gillies, R. M. (2019). Promoting academically productive student dialogue during collaborative learning. International Journal of Educational Research, 97, 200-209. https://doi.org/10.1016/j.ijer.2017.07.014
Grinath, A. S., & Southerland, S. A. (2019). Applying the ambitious science teaching framework in undergraduate biology: Responsive talk moves that support explanatory rigor. Science Education, 103(1), 92-122. https://doi.org/10.1002/sce.21484
Haag, S., & Megowan, C. (2015). Next Generation Science Standards: A national mixed-methods study on teacher readiness. School Science & Mathematics, 115(8), 416-426. https://doi.org/10.1111/ssm.12145
Hanley, P., Wilson, H., Holligan, B., & Elliott, L. (2020). Thinking, doing, talking science: The effect on attainment and attitudes of a professional development programme to provide cognitively challenging primary science lessons. International Journal of Science Education, 42(15), 2554-2573. https://doi.org/10.1080/09500693.2020.1821931
Harris, K., Sithole, A., & Kibirige, J. (2017). A needs assessment for the adoption of Next Generation Science Standards (NGSS) in K-12 education in the United States. Journal of Education and Training Studies, 5, 54. https://doi.org/10.11114/jets.v5i9.2576
Johnson, H. J., & Mawyer, K. K. N. (2019). Teacher candidate tool-supported video analysis of students’ science thinking. Journal of Science Teacher Education, 30(5), 528-547. https://doi.org/10.1080/1046560X.2019.1588630
Kloser, M., Borko, H., Wilsey, M., & Rafanelli, S. (2022). Leveraging portfolios in professional development for middle school science teachers’ assessment and data-use practice. Science Education, 106(4), 924–955. https://doi.org/10.1002/sce.21712
Kraft, M. A., Blazar, D., & Hogan, D. (2018). The effect of teacher coaching on instruction and achievement: A meta-analysis of the causal evidence. Review of Educational Research, 88(4), 547-588. https://doi.org/10.3102/0034654318759268
Lappi, O. (2013). Qualitative quantitative and experimental concept possession, criteria for identifying conceptual change in science education. Science & Education, 22(6), 1347-1359. https://doi.org/10.1007/s11191-012-9459-3
Loucks-Horsley, S., Stiles, K. E., Mundry, S., Love, N., & Hewson, P. W. (2010). Designing professional development for teachers of science and mathematics. Corwin.
Lynch, K., Hill, H. C., Gonzalez, K. E., & Pollard, C. (2019). Strengthening the research base that informs STEM instructional improvement efforts: A meta-analysis. Educational Evaluation & Policy Analysis, 41(3), 260-293. https://doi.org/10.3102/0162373719849044
Michaels, S., & O’Connor, C. (2012). Talk science primer. TERC.
Miller, A. R., & Kastens, K. A. (2018). Investigating the impacts of targeted professional development around models and modeling on teachers’ instructional practice and student learning. Journal of Research in Science Teaching, 55(5), 641-663. https://doi.org/10.1002/tea.21434
Murphy, P. K., Greene, J. A., Allen, E., Baszczewski, S., Swearingen, A., Wei, L., & Butler, A. M. (2018). Fostering high school students’ conceptual understanding and argumentation performance in science through Quality Talk discussions. Science Education, 102(6), 1239-1264. https://doi.org/10.1002/sce.21471
National Research Council. (2012). A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. The National Academies Press. https://doi.org/10.17226/13165
National Science Teaching Association. (2023). K-12 science standards adoption. https://ngss.nsta.org/About.aspx
NGSS Lead States. (2013). Next generation science standards: For states, by states. https://www.nextgenscience.org/
Ogodo, J. A. (2019). Comparing Advanced Placement physics teachers experiencing physics-focused professional development. Journal of Science Teacher Education, 30(6), 639-665. https://doi.org/10.1080/1046560X.2019.1596720
Osborne, J. F., Borko, H., Fishman, E., Gomez Zaccarelli, F., Berson, E., Busch, K. C., Reigh, E., & Tseng, A. (2019). Impacts of a practice-based professional development program on elementary teachers’ facilitation of and student engagement with scientific argumentation. American Educational Research Journal, 56(4), 1067-1112. https://doi.org/10.3102/0002831218812059
Pringle, R. M., Mesa, J., & Hayes, L. (2020). Meeting the demands of science reforms: A comprehensive professional development for practicing middle school teachers. Research in Science Education, 50(2), 709-737. https://doi.org/10.1007/s11165-018-9708-9
Pruitt, S. (2014). The Next Generation Science Standards: The features and challenges. Journal of Science Teacher Education, 25(2), 145-156. https://doi.org/10.1007/s10972-014-9385-0
R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org/
Ricketts, A. (2014). Preservice elementary teachers’ ideas about scientific practices. Science & Education, 23(10), 2119-2135. https://doi.org/10.1007/s11191-014-9709-7
Rosebery, A. S., Warren, B., & Tucker-Raymond, E. (2016). Developing interpretive power in science teaching. Journal of Research in Science Teaching, 53(10), 1571-1600. https://doi.org/10.1002/tea.21267
Rutherford, F. J., & Ahlgren, A. (1991). Science for all Americans. Oxford University Press.
Sampson, V., & Clark, D. B. (2008). Assessment of the ways students generate arguments in science education: Current perspectives and recommendations for future directions. Science Education, 92(3), 447-472. https://doi.org/10.1002/sce.20276
Sawilowsky, S. S., & Blair, R. C. (1992). A more realistic look at the robustness and type II error properties of the t test to departures from population normality. Psychological Bulletin, 111(2), 352.
Stroupe, D. (2017). Ambitious teachers’ design and use of classrooms as a place of science. Science Education, 101(3), 458-485. https://doi.org/10.1002/sce.21273
Stroupe, D., & Windschitl, M. (2015). Supporting ambitious instruction by beginning teachers with specialized tools and practices. In J. A. Luft & S. L. Dubois (Eds.), Newly hired teachers of science: A better beginning (pp. 181-196). SensePublishers. https://doi.org/10.1007/978-94-6300-283-7_13
Tekkumru-Kisa, M., Preston, C., Kisa, Z., Oz, E., & Morgan, J. (2021). Assessing instructional quality in science in the era of ambitious reforms: A pilot study. Journal of Research in Science Teaching, 58(2), 170-194. https://doi.org/10.1002/tea.21651
Tekkumru-Kisa, M., Stein, M. K., & Coker, R. (2018). Teachers’ learning to facilitate high-level student thinking: Impact of a video-based professional development. Journal of Research in Science Teaching, 55(4), 479-502. https://doi.org/10.1002/tea.21427
Williams, J., & Mourlam, D. (2022). Professional development gone remote: Analysis of in-service teacher beliefs about the implementation of ambitious science teaching practices and science teaching self-efficacy through a workshop delivered on Zoom. In E. Langran (Ed.), Proceedings of Society for Information Technology & Teacher Education International Conference (pp. 272-277). San Diego, CA, United States: Association for the Advancement of Computing in Education. https://www.learntechlib.org/primary/p/220738/.
Windschitl, M., & Barton, A. C. (2016). Rigor and equity by design: Locating a set of core teaching practices for the science education community. In D. H. Gitomer & C. A. Bell (Eds.), Handbook of research on teaching (5th ed., pp. 1099-1158). American Educational Research Association. http://www.jstor.org/stable/j.ctt1s474hg.23
Windschitl, M., Thompson, J., Braaten, M., & Stroupe, D. (2012). Proposing a core set of instructional practices and tools for teachers of science. Science Education, 96(5), 878-903. https://doi.org/10.1002/sce.21027
Windschitl, M., Thompson, J. J., & Braaten, M. L. (2018). Ambitious science teaching. Harvard Education Press.
Appendix
Ambitious Science Teaching Survey
Directions: Please indicate your level of agreement with each of the items below: Strongly Disagree (1) to Strongly Agree (7).
Planning for Engagement with Important Science Ideas
I am able to…
- Identify naturally occurring phenomena appropriate for science instruction.
- Create real-world, context-rich scenarios that allow students to explore natural phenomena related to their lives.
- Create essential questions that guide students’ knowledge of science concepts.
- Create instruction where students use their knowledge of science concepts during investigations.
- Promote my students’ scientific thinking.
Eliciting Students’ Ideas
I am able to…
- Use strategies that elicit students’ prior knowledge.
- Formalize student thinking through whole group discussion.
- Engage students in developing preliminary models that explain natural phenomena.
- Engage students in developing hypotheses that explain natural phenomena.
- Determine student misunderstandings based on evidence shared in class (e.g., conflicting or competing ideas, models, and/or hypotheses).
Supporting On-Going Changes in Thinking
I am able to…
- Create instruction that leads students to develop high-level explanations for natural phenomena.
- Provide direct instruction on non-discoverable science concepts.
- Engage students in sense-making activities.
- Facilitate student reflection to revise and refocus their thinking.
Pressing for Evidence-Based Explanations
I am able to…
- Facilitate instruction requiring students to make evidence-based claims that support their reasoning regarding natural phenomena.
- Facilitate instruction where students explain natural phenomena using their own models.
- Provide support for students as they complete lesson summative assessments.
- Assess students’ explanations of natural phenomena to determine student learning.