Teaching Religion and Philosophy through COPI Quasi-Experiment


This quasi-experiment aims to compare two pedagogical models to see which is better at facilitating learner-led critical discussion. I use two non-randomised groups. The independent variable is whether the communities of philosophical inquiry (COPI) model is applied or not: it is applied to the experimental group, while the control group follows the institution’s premade lesson plan. The quasi-experiment follows a mixed methods research design, using observations and an audio tape to gather qualitative data.

The experimental group consists of fourteen seventeen-year-old learners. They will partake in one COPI session on the content of their normal Religion and Philosophy (RP) lesson. The control group consists of eighteen seventeen-year-old learners. They will undergo the institution’s RP class. This quasi-experiment aims to answer the research question by measuring one dependent variable, speech. I planned to measure speech in different ways to test four hypotheses, and to use those hypotheses to address the research question. To test them, I would have observed and audio-recorded both lessons, then produced transcripts of each lesson.

For the content analysis, I would have calculated the percentage of student talking time (STT) in comparison with teacher talking time (TTT), and the comparative percentages of learner-to-learner discussion. This data would have been displayed using pie charts. Using a tally chart, I would have counted the frequency and variety of critical dialogue. This would have been displayed in a bar chart. Comparing the graphs for the experimental group and the control group would have helped to test the hypotheses and address the research question.

The quasi-experiment was not conducted, though the method has gone through a peer-review process. After peer review, improvements were suggested for the design of my quasi-experiment. It was noted that engagement in critical discussion is no guarantee that learning has taken place. Using algorithms to test for surface and deep learning, providing more COPI sessions and using an exam to assess whether learning took place would increase the quasi-experiment’s effectiveness at answering the research question.


Chris Candlin (cited by White, Holmes and Bhatia, 2019), an international education researcher, announced in an address to the Adult Migrant English Program that systems should be ‘peopled’. By this he meant that analysts could make systems to account for data, but if these systems were not relevant to the lives of everyday people, then such data had no use. He thought data was not an end in itself. Whilst Candlin’s Kantian sentiment is his own, I found the statement logical and interpreted it as a reason to keep my quasi-experiment simple, achievable, and practicable. By doing so I hoped to contribute a feasible improvement to RP. I would have tested whether a change in pedagogical model in RP at my placement could have increased and improved critical discussion.

General religious education, referred to as RP at my institution, was developed by the National Open College Network (NOCN). To provide evidence to NOCN, the college has created a booklet which, once completed, covers all the marking criteria. Not all RP lessons are booklet lessons; the level two Virtue Ethics lesson used in my quasi-experiment is one that is not. NOCN (2019) write ‘The qualifications will also allow them to develop and articulate their own points of view about religion and be able to apply these to everyday events’. I believe that if learners are to develop their own points of view, they should practise critical discussion with each other. Moon (2008) describes critical thinking as ‘the examination of an idea thoroughly and in depth rather than taking it at its face value’. If learner-to-learner critical discussion is increased, the NOCN assessment criteria are better met.

I have observed every RP facilitator teach, and all struggle to generate discussion between learners, as opposed to between facilitator and learner. Many teachers comment that it is hard to generate any discussion at all, and that the learners are often uninterested. I want to try to increase discussion between learners and increase engagement by using Matthew Lipman’s pedagogical model of COPI. As Lipman, Sharp and Oscanyan (1980) write, ‘when people engage in dialogue with one another, they are compelled to reflect, to concentrate, to consider alternatives, to listen closely, to give careful attention to definitions and meanings, to recognise previously unthought of options, and in general to perform a lot of mental activities that they might not have engaged in had the conversation never occurred’. Learner-to-learner dialogue is fundamental to the development of critical thought. By building on Lipman and Bierman’s 1970 COPI experiment in Montclair, New Jersey, I hope to test its effectiveness at a Sixth Form College. I discuss this further in the section entitled Rationale.

The institution’s RP lesson consists of four activities. Activity one consists of the facilitator instructing the learners, using PPT slides 1-7. The following three activities consist of pair work and discussion. A horseshoe seating plan is employed. Matthew Lipman’s COPI model (Lipman, 1991, cited in Ndofirepi & Musengi, 2019) consists of the following elements: ‘1) the communal reading of a text 2) the construction of an agenda, i.e. the identification of questions which the reading of the text has raised and the cooperative decision about where to begin the discussion; 3) solidification, which includes the articulation of positions and counter positions, the definition of terms under discussion, and the search for criteria by which to make sound judgements about the subject; 4) exercises and discussion plans, based on the ideas in the text 5) further responses, which may be in the form of creative writing, dramatization, art, or some other modality’. The activities of the RP COPI lesson follow these components. A circle seating plan is employed. Both lesson plans can be seen in the appendix.

Like Lipman and Bierman, I have a control group and an experimental group, though unlike Lipman and Bierman I only have two lessons rather than eighteen. In the control group I would have taught the lesson as the institution’s fixed lesson plan dictates. In the experimental group, I would have employed Matthew Lipman’s COPI model. I use four hypotheses to better answer the research question. The first of my hypotheses is that teacher talking time (TTT) will be lower in the experimental group than in the control group. The second hypothesis is that learner-to-learner critical discussion will be higher in the experimental group than in the control group. The third hypothesis is that the experimental group will verbally engage in more critical questions than the control group. The fourth is that the experimental group will verbally engage in a wider variety of cognitive functions than the control group.

Engagement in critical discourse is measured in both groups using a tally chart, which uses Edward De Bono’s Six Hats to categorise the data. I explain De Bono’s Hats in the section entitled Method. I use a mixed methods research approach, combining two methods of collecting qualitative data to reduce single-source bias. Gibbs (2007) writes that a strength of qualitative data is its potential to convey meaning in human discourse, providing a means of further analysis into the effectiveness of the lessons. I gather qualitative data from audio recordings, which are turned into transcripts, and display this data in quantitative form by converting the transcript data into percentages and presenting them in pie charts and bar charts. I explain the mixed methods research (MMR) approach further, along with the reliability and validity of my data, in the section entitled Method.

I have chosen this topic as I think the quasi-experiment is both plausible and useful. There is a problem at the Sixth Form College of getting learners to talk. Previous experiments displaying the long-lasting effects of COPI imply that their students must have been practising cognitive faculties related to critical thinking. I believe this was done through increased critical discussion. If this quasi-experiment does show that discussion is increased through COPI, it could lead to further experimentation on this topic in the future, and to the creation of a more effective Religion and Philosophy lesson. A critical discussion of my method can be seen in the sections entitled Method and Peer Review.


Previous research

COPI is a pedagogical model developed by Matthew Lipman, which focuses on ‘doing’ Philosophy in a learner-led environment as opposed to instructing a class on philosophical matters. In this way learners are encouraged to engage in learner-led critical discussion within COPI. By using COPI rather than an instructor-led approach, as seen in the appendix, I think critical discussion between learners will increase. Currently, all facilitators of RP at my institution have commented that critical discussion between learners is challenging to elicit. I chose the research question ‘Is COPI a better pedagogical model for facilitating learner-led critical discussion than the one currently implemented?’ to see if this pedagogical model could help solve the problem. If this quasi-experiment shows favourable results it may inspire further testing, and ultimately an improved RP lesson.

The first experimental test of the COPI model was conducted by Bierman and Lipman in 1970 (Lipman, Sharp and Oscanyan, 1980), and consisted of two classes of 20 fifth graders. The control group was assigned to a social study experiment, whilst the experimental group undertook eighteen 40-minute sessions of COPI over nine weeks. There is no mention of what social study experiment the control group undertook or how this may have influenced later results. Although this information is needed to interpret the differing results of the two groups, scores on the 1962 edition of the revision long form test, aimed at assessing critical thinking skills, displayed significant gains for the experimental group.

Furthermore, the same groups of students were assessed two and a half years later. The gains had stretched even further for the experimental group. Though many other factors unknown to Lipman and Bierman could be the cause of this, other studies reinforced these results (Lipman, Sharp and Oscanyan, 1980), concluding that COPI improved reading and mathematics, reasoning skills and academic readiness. I do not have the opportunity to carry out as long an experiment as Bierman and Lipman, but I can test to see if the students are engaging in faculties that, with practice, would have the potential to achieve such results.

Banks (1987) conducted further experimentation on the COPI model. Using opposing control and experimental groups, with a sample of 272 students, Banks selected groups of three classes from five separate schools. These classes represented grades two, four and five. Banks hypothesised that the total maths, total reading, and total language scores of the experimental groups in the California Achievement Test would be higher than those of the control groups. Banks (1987) reported that all three hypotheses were proved correct. Following Banks, I have used multiple hypotheses in my quasi-experiment, as doing so helps to answer the research question from more than one perspective.

Banks (1987) writes on the concept of transfer: ‘The premise is that skills taught in one subject area of the curriculum should assist students in other subject areas’. The above research is evidence that COPI can benefit learners in reading, mathematics and reasoning, and therefore evidence for the transfer of skills into other areas. While this is removed from my four hypotheses, it does suggest that the learners in these experiments were engaging in critical dialogue with each other, as they would have needed to practise these elements to progress. By capturing the dialogue in a transcript, it will be possible to show whether there is a heightened engagement in these activities, and thus whether further research is needed.

A concern for Banks was that teachers might not teach COPI in the same way, some being worse than others. Banks claims COPI is very teacher sensitive. I have not been trained in facilitating COPI, so I am the main limitation on my own study. The results of my quasi-experiment have the potential to be dramatically different if facilitated by a fully trained COPI facilitator. Another limitation that concerned Banks (1987) was that the Hawthorne effect, individuals acting in an inauthentic manner whilst being observed, could contribute to teachers self-selecting. I do not see this as a problem. While it is true that changes in productivity and behaviour whilst being observed in trials have been a factor (McCarney, 2007), even in randomised groups, the parts that make my research a whole would remain uninfluenced. Though the content of my lesson plan cannot be changed by the Hawthorne effect, the learners may be susceptible to it. Both Lipman and Banks predominantly use fifth graders, who are nine to eleven years old. My participants are seventeen to eighteen, and as such may be more image conscious than children. COPI is used with differing ages worldwide, so I am not concerned about the age of participants making a difference, though in the future this quasi-experiment will need to be repeated on larger scales and for longer durations to determine any impact from the Hawthorne effect or differences in age.

Capturing discourse

Unlike Banks and Lipman, I do not have the resources to expose learners to a course of COPI, or the resources to assess quantifiable progression in learners. As I have one lesson, I can measure how frequently learners were verbally engaging in critical discussion. To determine if learners are engaging in critical discussion, the transcript must be analysed for linguistic features. Bloome and Clark (2006) write ‘capturing discourse-in-use requires description of the linguistic features people in interaction with each other use as they mutually construct an event’. A transcript is only understandable if the context of the dialogue, and the relationship between the participants, is understood. Without context and an understanding of the social dynamic of the community, it is much harder to understand if learners are engaging in philosophical discourse or not. Other meanings are implied by linguistic features.

Bloome and Clark (2006) write ‘By Linguistic features we are referring to the broad range of semantic tools that people have available for communicating their intents and responding to each other’. Due to these linguistic features, making a transcript that accurately represents what members of the community are trying to convey is challenging. A transcript should contain more details than simply what was said at what time (Bloome and Clark, 2006), as potential inference of meaning is lost on someone who does not understand the group’s social history and dynamic.

Nonverbal communication between members of the community is a related feature. Though not strictly linguistic, gesticulations and other forms of conveying meaning through physical movement can greatly impact the context of a partnered verbalisation. Nonverbal communication can be a way in which we draw social boundaries between each other. These social boundaries will also impact understanding of a transcript, as members of the community will convey meaning to each other through the lenses of these boundaries. The relationship between facilitator and learner is one such example. The impact of time and place on conformity in dialogue is a similar but separate issue. The symbolic resonance of a place impacts the behaviour of our learners and indeed ourselves; as Martin (1999) writes, ‘Physical and spatial aspects of a learning environment communicate a symbolic message of what is expected to happen in a particular place’. The way people conduct themselves is commonly different in a church than in a pub, for example. What seems socially acceptable in each differs. This impact on social behaviour will influence what our learners feel they can or cannot convey.

Animated discourse is discourse that acts within a social dynamic as its own agent or person. Bloome and Clark (2006) write ‘Such animation occurs when discourse is viewed as capturing a person or as positioning a person’. As such, animated discourse is often political, through the lens of dividing practices. A dividing practice of an institution can be seen in the word heretic. By labelling individuals as heretics, the Catholic Church created the concept and naturalised it, meaning it is accepted as a common truth. By normalising that heretics exist, others believe that heretics exist (Bloome and Clark, 2006), and thus the animated discourse is taken as truth.

It is important to be able to spot animated discourse not just for inclusive purposes, protecting our learners from abuse, but also to see whether such animated discourse has been used to influence the meaning of STT. All these linguistic features greatly influence how to interpret a transcript. If misunderstood by a reader of the transcript, any of them could distort the reader’s understanding of how effective a COPI session was, and of whether learners were really asking critical questions and engaging in critical discussion. The opposite is also possible: learners may use linguistic features to contribute to critical discussion without it being observed, such as a nonverbal cue to denote a viewpoint related to the discourse. I write on how I increase the reliability and validity of the qualitative data in the section entitled Method.



Shadish (2007) writes that a quasi-experiment is commonly an experiment in which groups of participants have not been randomised. My quasi-experiment follows a non-equivalent control group design, which Shadish (2007) describes as one ‘in which the outcomes of two or more treatment or comparison conditions are studied, but the experimenter does not control assignment to conditions’. Though I have chosen which classes of students I will use, I have not been able to randomise the learners. Due to no funding and limited resources, most of my decisions have had to be based on what is convenient. Quasi-experiments are prone to issues concerning internal validity, that is, whether the difference between the groups is caused by the independent variable, which in this case is whether COPI is employed or not.

Randomisation helps to reduce concerns surrounding internal validity. Shadish (2007) suggests using pre-test and post-test data to reduce the probability of an error occurring. I will use pre-test personal observations of the students’ characteristics, as I am a common factor between the selected classes and teach each class on a weekly basis. I would use this observational data to help validate whether the independent variable was responsible for any difference in results.
Although this is a quasi-experiment, I use a mixed methods approach to gather data. Tashakkori and Teddlie (2003) see research practices as being assignable to three categories. The first is quantitatively focused research practices, influenced by positivist and postpositivist schools of thought. The second is qualitatively focused research practices, influenced by constructivist and naturalist schools of thought. The third, as Tashakkori and Teddlie, cited by Riazi and Candlin (2014), write, is ‘(c) mixed methods research practices, working within multiple research paradigms and interested in both quantitative and qualitative data’. I use a mixed methods approach because, although I only use qualitative data, it is collected through both observation and audio recording. Though I employ a tally chart, which some would see as quantitative data, the tally chart is used to count qualitative data. Thus, I see the tally chart as a different way of representing the qualitative data; as the underlying data is not numerical, it is subjective. I discuss content analysis in the section entitled Research Instruments.

I use two methods of collecting data to follow a triangulation MMR approach. Denzin (1978) and Greene (1989) define triangulation as intentionally using both quantitative and qualitative methods to find a corroboration between the results (Riazi and Candlin, 2014), thus reducing the bias that arises from the use of a single method. Though I only use qualitative methods, I cross-reference the observations with the audio recording to reduce the chance of misinterpreting linguistic features.

Gibbs (2007) writes on qualitative data’s strength in capturing human discourse, which is why I planned to create a transcript from an audio recording. That said, due to a quasi-experiment’s inherent weakness in determining the impact of the independent variable, it is important to be able to determine linguistic features and their implications for the transcript. The first and second hypotheses of this quasi-experiment do not display the effectiveness of the dialogue. To display that effectiveness, the frequency of dialogue engaged with critical thought, and the breadth of critical thought engaged in, must be measured. I display the results for hypotheses one and two in pie charts, and the results for hypotheses three and four in a bar chart. Both decisions were made for ease of viewing, and the charts are displayed in the appendix.

The third and fourth hypotheses, calculating how many critical questions were asked and how wide a variety of critical questions, determine the effectiveness of the dialogue.
Burbules and Warnick (2006) write on how to distinguish between different types of philosophical thinking, to help people to better observe philosophical dialogue. They describe ten methods, which exist as categories of critical modes the interlocutor may engage in. That said, Burbules and Warnick (2006) also write ‘Although we describe these as 10 methods, they rarely appear in pure or separate forms, there are countless hybrids and multi-layered versions possible’. As the variety of philosophical thought available is vast, it does not seem feasible to catalogue them all.

Lipman (1980) similarly offers multiple definitions of what can be included in philosophical thought, a multiplicity of terms dotted across several chapters of Lipman, Sharp and Oscanyan’s (1980) work, and also suggests creating classes based just on syllogistic logic. There is a wide variety of cognitive faculties, and trying to distinguish and isolate them all on a transcript is unnecessary for testing my hypotheses. To see if critical thought is taking place, I use Edward De Bono’s Hats because they are broad categories that can incorporate other logical functions. Six questions based on De Bono’s Hats are used as sample questions on the PPT for the experimental group, but not the control group. I do not think this creates a bias, as the definitions of the hats encompass questions found naturally in critical dialogue, as discussed later. Edward De Bono (1933) lists the categories of thinking represented by the hats as follows: information, positivity, criticism, creativity, emotion and metacognition. Under these broad terms many cognitive faculties can be found.

De Bono (1933) made the Six Hat method to teach different modes of thinking as distinct, intending them to be used in parallel thinking, in which the whole group uses the same hat at the same time, to avoid conflict and arrive at a solution. He does not say that these types of questions cannot happen in natural discourse, which makes the quasi-experiment fairer to the control group. De Bono (1933) writes ‘when we make decisions on our own, we go through more or less the same process (pros, cons, feelings, facts). The Six Hats method does all that’. Critical dialogue that is stimulated in the control group can therefore be categorised by De Bono’s Hats, even though that group has not been provided with questions based on the hats.


My participants consisted of two RP classes. The control group would have consisted of eighteen students and the experimental group of fourteen. The groups are culturally diverse, consisting of multiple faiths and ethnicities, making it more likely that participant variables will be on a wide spectrum, though this holds true for both groups. Due to participant variables, further experimentation would be needed to validate results. However, as there is a different group for each of the two sessions, order effects will not impact my study. These include the fatigue effect, a decrease in performance due to repetition, and the practice effect, an improvement due to repetition.

I chose both classes for convenience; however, the size of the groups is an issue for the validity of my study. It is easier, in my experience, to facilitate discussion in a smaller group, so the experimental group does have this advantage over the control group. There is nothing I can do about this ethically, as all students need to be taught the content on Virtue Ethics. A week before my quasi-experiment was to take place, I planned to send an email informing the learners what I intended to do and that I needed to audio record the lesson. This email can be seen in the appendix. Before the lesson began, each group would have been asked to fill out a consent form, also seen in the appendix. Lauumann (2020) writes on qualitative data reliability that ‘With qualitative data analyses, the interviews are usually audio-recorded and transcribed, since this more accurately captures what the participants actually say, than the mere analyst notes’. My quasi-experiment begins with teaching and audio recording the control and experimental groups, with the cassette recorder placed in the centre of the seating plan. Whilst Lauumann (2020) agrees that audio recording is a reliable method, she adds that videos, photos, and notes should also be taken to add to the transcript. Were I to take photos or videos, I would not be able to provide the same level of anonymity that I currently offer the learners. Therefore, I also use observation and participant notes as a means of retrieving qualitative data. I will ask a colleague to observe and take notes whilst the quasi-experiment is taking place, and will later cross-reference the notes with my own observations. By using more than one perspective, I can reduce the influence of my own bias on the collected data, and make sure to mark any linguistic features that will illuminate other meanings within the transcript.
By using these observations, it will be easier to tell if the independent variable was responsible for variations in results between the control and experimental groups, increasing the validity of my data.

Research Instruments

For the quasi-experiment I use an audio cassette recorder and tape, two lesson plans, observational data, a transcript, and a tally sheet. To employ the COPI pedagogy and my institution’s planned lesson, I use the corresponding lesson plans in the appendix, making the quasi-experiment repeatable. The quasi-experiment is formatted to perform qualitative content analysis through counting and generating proportions. For content analysis, the data contained in the transcripts are used to measure the effectiveness of both lessons as, unlike Lipman and Banks, I do not have the resources to use exam results as evidence. Thus, I use the transcripts to gain data on learner engagement in critical thinking. In this way it can be seen if COPI increases engagement in critical discussion between learners or not, and if it is worth pursuing further experimentation.

Lacy (2014) writes on proportions ‘because the reference point is 100%, the importance of such a finding is easily grasped, and comparisons are possible across samples’. By counting words and producing percentages of TTT to STT, I can compare the data between both groups and produce meaning. That said, Bucholtz (2000), cited by Jaffe (2001), writes ‘All transcripts take sides, enabling certain interpretations, advancing particular interests, framing specific speakers and so on. The choices made in transcripts link the transcripts to the context in which it is intended to be read’. Capturing discourse on an audio recording is limited by the transcriber. I have produced this quasi-experiment to investigate my hypotheses, which may lead me to interpret the transcript differently from another researcher. Jaffe (2001) warns that if a transcript is interpreted by its author then the transcript’s validity is undermined, as the discourse could be interpreted in a way that better serves the author’s purposes. Using observational data when analysing the transcript will aid in validating it. The observational data will not take a fixed form; rather, it will consist of notes ordered in whatever way the observer chooses, though it will catalogue behaviour from learners that could impact understanding of the transcript.

The tally sheet presents a simple way to catalogue the frequency and variety of critical thought engaged in. Once the transcript had been reviewed in light of observational data and additional notes from participants, I would have used the tally sheet to categorise the breadth and frequency of critical discussion. This tally sheet can be seen in the appendix. The students are anonymised on the tally sheet and the transcript. Lacy (2014) writes ‘raw numbers do not provide a reference point for discerning the meaning of those numbers’. That said, I use the numbers as a point of comparison between both groups, and further comparisons can be made with future experiments. Instances of critical thought are counted from the transcript and added to the tally chart, then transferred into a bar graph so that the data is easy to view.
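The tallying step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of coded turns, the anonymised speaker labels and the function name `tally_critical_turns` are hypothetical, and the decision about which hat category a turn belongs to is still made by the researcher reading the transcript, not by the code. The six category names are those listed in the Method section.

```python
from collections import Counter

# The six categories of thinking attributed to De Bono's Hats in this document.
HATS = ("information", "positivity", "criticism",
        "creativity", "emotion", "metacognition")


def tally_critical_turns(coded_turns):
    """Count coded turns per hat category.

    coded_turns is a hypothetical list of (anonymised_speaker, hat) pairs,
    one pair per turn that the researcher has judged to be critical.
    Returns (counts, frequency, variety), where frequency is the total
    number of critical turns and variety is the number of distinct hats used.
    """
    counts = Counter({hat: 0 for hat in HATS})
    for _speaker, hat in coded_turns:
        if hat not in HATS:
            raise ValueError(f"unknown category: {hat}")
        counts[hat] += 1
    frequency = sum(counts.values())
    variety = sum(1 for hat in HATS if counts[hat] > 0)
    return dict(counts), frequency, variety


# Invented example data, for illustration only.
example_turns = [
    ("S1", "criticism"),
    ("S2", "information"),
    ("S1", "criticism"),
    ("S3", "emotion"),
]
counts, frequency, variety = tally_critical_turns(example_turns)
```

Here `frequency` corresponds to the measure behind hypothesis three and `variety` to the measure behind hypothesis four; the per-category `counts` are what would be plotted as bars for each group.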


My quasi-experiment was structured to follow ethical guidelines set out by the British Educational Research Association (BERA). I chose to follow BERA guidelines due to the international connections of the association (Anon, 2018), which would allow my completed research to be presented to a wide potential audience. As this document will not be published, and as I am the only researcher and have no clients, stakeholders or sponsors, my only duties are to my participants. I was not able to carry out my quasi-experiment, though the aspects I had to consider were consent, transparency, right to withdraw, incentives, harm arising from participation in research, privacy and data storage, and disclosure (Anon, 2018).

As previously mentioned, an email would have been sent to participants a week before the quasi-experiment was to take place. This was so that participants did not feel pressured into signing the consent form on the day, and had an opportunity to decline privately without embarrassment. The email also provided transparency, describing to my participants exactly what I would do with the data I required and why, and offered an early option to withdraw. All participants would have been told before the quasi-experiment that they could withdraw at any point. I did not offer any incentives for participating, as I did not want the learners’ behaviour to be impacted by incentives during the quasi-experiment.

Causing embarrassment was a fear of mine in this quasi-experiment, as I planned to ask participants to contribute to the transcript. Jaffe (2011) writes ‘The researcher needs to take measures to ensure the trustworthiness of the transcripts. One way to do this is to have interviewees validate the transcripts by correcting them if necessary and by clarifying unclear issues’. By asking participants to validate the transcripts I will make the qualitative data more reliable. It will help me to understand what parts of the dialogue were critical, and how better to represent it statistically. That said, Jaffe (2011) also writes ‘Reading transcripts is often cause for embarrassment and anxiety for interviewees both because of the exposure of what was said as well as their perception of how it is presented in the transcripts’. This raises two concerns. The first is that participants must be asked whether they would like to receive the transcript, and whether they are happy for other participants to read it.

Causing non-consensual embarrassment to participants would be highly unethical, so I would make sure participants knew they did not have to read or edit the transcript if they did not wish to. Secondly, there is a clear risk that participants may wish to edit the transcript to make themselves come across in a way that they view as ‘better’. To guard against this possibility, I would request that participants suggest edits to the text, but I would have the final word on whether to include them. If an edit seemed particularly disingenuous, I would not include it. That said, I cannot know the true validity of a claim. Trusting my best judgement gives me the power to protect the integrity of the transcripts from participants’ alterations, though it does not protect them from mistakes I may make myself.

For data protection, after recording both lessons and producing the transcripts I would have destroyed the tape, removing any possibility of mishandling data. Once the statistics were put into graphs, I would also delete and shred the transcripts. All statistics would be interpreted through the lens of the transcript, which would help to distinguish critical dialogue from other linguistic features.

I would calculate the percentage of STT to TTT by dividing the total words enunciated by students by the total words enunciated, and then multiplying the result by one hundred. I would display the result on a pie chart, as two inputs are easier to visualise on a pie chart than with other options. This method cannot account for the speed of different speakers, or for the fact that single words might not carry their intended meaning independent of other words. It does, however, give an indication of the distribution of talking time. The comparative percentage of learner-to-learner critical dialogue would be displayed on a separate pie chart, calculated by dividing the words directed between learners by the total words enunciated, and multiplying the result by one hundred. These pie charts would provide visual aids for addressing hypotheses one and two. The research question could then be addressed by comparing the pie charts and bar graphs of the experimental and control groups to produce meaning, potentially encouraging further experimentation into which pedagogical model is superior.
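The two percentage calculations described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not part of the planned method: it assumes each transcript line has already been reduced to a speaker role, an addressee role and the words spoken. The names `Utterance`, `word_count` and `talk_percentages`, and the sample lines, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str    # "student" or "teacher" (hypothetical coding scheme)
    addressee: str  # who the utterance is directed at, same labels
    text: str


def word_count(text: str) -> int:
    # Words are counted by splitting on whitespace; speed of speech is ignored.
    return len(text.split())


def talk_percentages(utterances):
    """Return (STT %, learner-to-learner %) of all words enunciated."""
    total = sum(word_count(u.text) for u in utterances)
    if total == 0:
        return 0.0, 0.0
    student = sum(word_count(u.text) for u in utterances if u.speaker == "student")
    peer = sum(
        word_count(u.text)
        for u in utterances
        if u.speaker == "student" and u.addressee == "student"
    )
    return 100 * student / total, 100 * peer / total


# Invented sample lines, for illustration only.
sample = [
    Utterance("teacher", "student", "What makes an action virtuous"),
    Utterance("student", "teacher", "Maybe the intention behind it"),
    Utterance("student", "student",
              "But intentions can be mistaken surely so outcomes matter too"),
]
stt, peer = talk_percentages(sample)
print(f"STT: {stt:.1f}%  learner-to-learner: {peer:.1f}%")
```

Running the same function over the experimental and control transcripts would give the two pairs of figures to be compared on the pie charts. Deciding who each utterance is directed at remains a judgement made by the person coding the transcript.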

Feedback and discussion

The following points have been raised during peer review on the above planned quasi-experiment. Four of my peers, all of us studying a Post-Compulsory PGCE at UCL, provided a point each for consideration. Their names are not included for the purpose of this document. Due to the word allowance of this section, I have chosen to focus on two comments, each addressed in its own section. The last section addresses what I will take into my future practice, and my development as a teacher.

There seem to be difficulties with measuring the effectiveness of the discussion

Qualitative data is inherently subjective. As content analysis depends on my interpretation of transcripts I wrote, it is impossible to be completely certain that learners were engaging in critical dialogue. Furthermore, even if the data is accurate, engagement in critical dialogue does not establish that learning took place. This is a weakness of the research question, which was written on the assumption that engagement in critical discussion would be conducive to learning, which is not a certainty. Surface learning is thought to occur when the participant focuses on the thing itself that is supposed to be learnt, memorising it through repetition and similar techniques. Deep learning is thought to occur when the participant focuses on what the thing being learnt means, and how it fits into the participant’s wider frame of knowledge. It is commonly believed that information acquired through surface learning cannot be recalled for as long as information acquired through deep learning (Marton and Säljö, 1976). If this is true, the hypotheses share the same underlying problem as the research question. Calculating the percentages of participants’ talking time does not determine whether critical dialogue was being engaged in, or whether learning took place. Nor do the hypotheses consider factors other than engagement that may increase talking time. For instance, if learners were disengaged and discussing an unrelated topic, learner-to-learner dialogue would appear to increase, even though the learners were not participating in the planned activity.

Better content analysis could help solve this problem. Henri (1991) wrote on cognitive features which indicate surface learning in critical dialogue, and metacognitive features which indicate deep learning in critical dialogue. Kerrin (2001) used these features in his experiment, which consisted of recording prechosen groups of learners engaging in critical thinking tasks. Transcripts were made from audio and video recordings of the groups. The dialogue was categorised using Garrison’s five categories of critical thought, which draw from Dewey’s five stages of reflective thought and Brookfield’s five categories of critical thought (Kerrin, 2001), and Henri’s cognitive and metacognitive indicators were then applied to the categorised dialogue to determine whether surface or deep learning took place. Computer code was used to analyse the data, applying an algorithm to determine the level of learning. Incorporating this method into a future experiment would better answer the research question, helping both to detect engagement with critical discussion and to assess learning. Kerrin (2001) does write, however, that he had a problem with Garrison’s indicators having overlapping concepts, and that some features were impossible to categorise at all. I would keep to De Bono’s Hats for categorisation, as the hats occur in everyday discourse, though I would incorporate Henri’s indicators of surface and deep learning.
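As a rough illustration of what computer-coded content analysis might look like at its simplest, the sketch below tallies utterances that a human coder has already labelled. It is not the algorithm used in the study cited above. It assumes each utterance carries a De Bono hat colour and a surface or deep label. The labels, the sample data and the function name `tally` are all hypothetical.

```python
from collections import Counter

# Each tuple is (hat, depth) as assigned by a human coder; values are invented.
coded_utterances = [
    ("black", "deep"),
    ("white", "surface"),
    ("black", "surface"),
    ("green", "deep"),
    ("white", "surface"),
]


def tally(coded):
    """Count utterances per hat, per depth label, and the variety of hats used."""
    hats = Counter(hat for hat, _ in coded)
    depth = Counter(d for _, d in coded)
    return hats, depth, len(hats)


hats, depth, variety = tally(coded_utterances)
print(hats)
print(depth)
print("distinct hats used:", variety)
```

The counts per hat give the frequency of each kind of critical dialogue, and the number of distinct hats gives its variety. The counts per depth label give a first indication of surface against deep learning. The labelling step itself still depends on the coder’s interpretation.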

The research question and the hypotheses are unanswerable with the data that can be collected from my quasi-experiment. Some approaches would strengthen the validity of my quasi-experiment, such as additional data collection methods like video recordings, or content analysis methods such as applying algorithms to data through computer coding. These would provide more means of determining whether learners were engaging in critical dialogue. However, they still would not provide sufficient evidence that learning took place. More COPI sessions, and an exam after all sessions were completed, would provide further insight into answering the research question.

The age of learners is inconsistent with previous research

In both studies mentioned in the section entitled Rationale, learners are significantly younger. Grade five typically contains ages 10-11 in the US, whilst my learners are 17-18 years old. COPI can be, and has been, used from primary education to adult education. Arie Kizel (2019) has written extensively on the COPI model and has employed the pedagogy internationally with differing age groups, showing that the model can be implemented with young adults. Furthermore, Kuhn (2006) writes ‘The findings presented here suggest that some 12-year-olds have become as capable as many adults in managing the interaction of theory and evidence in their own thinking, in a way that supports effective learning’. Whilst how learning best takes place differs from learner to learner, both young adults and children have been shown to be able to engage in critical thinking and learn from it. Though I do not think the age of the learners will affect the effectiveness of the COPI, it seems likely that age will affect how the learners interact with it.

Herbert (2006) writes that during adolescence ‘They become capable of general propositional thinking, i.e. they can propose hypotheses and deduce consequences’. This new development in cognition, which children in Grade 5 are unlikely to have undergone, may change the way in which learners interact with the pedagogical model, potentially making it less effective with an older age group. As also noted in the section entitled Rationale, 17-18 year old learners could be more likely to be affected by the Hawthorne effect, as adolescents may be more concerned about the opinions of their peers than primary school children are. Herbert (2006) writes ‘adolescents are thought to become preoccupied with finding for themselves a satisfactory answer to the question ‘Who am I?’. They may ‘try out’ a variety of identities’. This focus on the self may make adolescent learners more anxious than children about how they are perceived by their peers, potentially hindering participation. In the future, I need to do more research into previous studies concerning COPI and older learners.

My development as a teacher

Planning this quasi-experiment has taught me the important link between teaching, learning and research. A teacher should be able to use their reflections as a basis for enquiry. As Gregson (2015) writes ‘The scholarship of teaching and learning connects pedagogic development with pedagogical research’. Through this quasi-experiment I would be able to compare two pedagogical models, connecting pedagogical scholarship, research, and an attempt to improve learning. I will continue to use research strategies in my future practice to shape my development as a teacher.

This will best take place through joint professional development (JPD). As Gregson (2015) writes, ‘At the heart of the JPD approach is the idea that when teachers, education leaders and organizations learn from one another as they experiment with putting research findings into practice, real change can happen’. By carrying out experiments with colleagues, data can be collected and seen through multiple lenses, and subsequent experiments can then be carried out based on the previous data until a means of improving teaching practice is found. Through research, we can inspire evidence-based, long-lasting change in our own teaching and within wider educational policy. Whilst this quasi-experiment was designed to be carried out at a micro-level, in a school context, macro-level elements such as national educational policy should also be informed by research and experimentation. My JPD practice could encourage other educational professionals to conduct wider experimentation, which could lead to eventual change and reform within educational policy.

Recommendations and reflections

Though I have not carried out my quasi-experiment, the peer review process raised recommendations to improve the validity and reliability of my method. To address the difficulties of measuring the effectiveness of the discussion, I would increase the number of COPI sessions to eighteen 40-minute sessions over nine weeks, mimicking Lipman’s 1970 experiment. After this course, participants would be asked to take a test in mathematics, reading and reasoning, following Banks’ experiment. These changes should allow sufficient time for the COPI to take effect and would produce quantitative data to better answer the research question. Data from transcripts would still be recorded, though with better data analysis applied through computer coding, allowing algorithms to be run to test for surface or deep learning. If the results showed significant improvements for the experimental group, then facilitators should be asked to participate in experimentation at a macro-level, by which I mean a national level. More content analysis on larger data samples is needed to address the research question.

With regard to learning gained from the quasi-experiment, even if coding and exams were used to test for learning, Gregson (2015) writes ‘Evidence can tell us that something has “worked” in a particular situation for someone else; it cannot tell us if or how we should use that in the complex and unfolding situations which face teachers’. To be able to claim confidently that COPI was more conducive to learning than an instructor-based approach would take vast experimentation with multiple facilitators. As for my own learning, planning a quasi-experiment has taught me the benefits of JPD, and the need to incorporate it into my future practice. Producing the quasi-experiment has taught me about ethical considerations, reliability, validity, and the other considerations necessary to conduct research. Collaborating with colleagues in future experiments will offer a variety of opinions on collected data and will help to construct future experiments, based on data, to improve institutional teaching practice.

Considering the professional standards for teaching in the FE sector, some standards are relevant to the planning of this quasi-experiment. A strength of mine within the category ‘professional values and attributes’ was that my quasi-experiment challenged my creativity and ability to innovate. I attempted to select and adapt an existing pedagogical model to facilitate learning. That said, a weakness in the same category is that I did not challenge or evaluate my practice, values, or beliefs enough. My focus was on challenging the instructor-led model of my institution, rather than challenging my own suggestion of COPI. Critical thinking is not the only skill supposed to be learnt in RP; content also needs to be learnt. The control group may be better at learning information about Virtue Ethics. I personally prefer learner-led methods to instructor-focused methods; in the future I will try to consider the weaknesses of my own suggestions.

Within the category ‘professional knowledge and understanding’, this assignment evidences that I have evaluated my practice with others through peer review, which led to an evaluation of whether engagement in critical discussion would be conducive to learning. My peers provided feedback which I have evaluated and incorporated as action to be taken in a future quasi-experiment. I could, however, work more on maintaining and updating my knowledge of educational research to develop evidence-based practice. It was not until the peer review that I considered critical discussion’s link to learning, and I had not found research that confirmed the COPI would be appropriate for the age group of my learners. I lacked sufficient evidence that the pedagogy was appropriate.

Supporting Mathematics and English capabilities is found within the category ‘professional skills’. Banks has shown evidence that the transfer of the skill of reasoning is possible through prolonged exposure to COPI, and that reasoning aids Mathematics and English. This concept, though not proven, is something I will try to incorporate into future JPD research. Though I have tried to plan for learners to control their own learning, I have not planned to give them the means to self-assess. In the future, the quasi-experiment should incorporate a task at the end so that the students can self-assess their own learning experience and decide which model they believe is more beneficial.


Anon, (2019) NOCN General Religious Education Qualification Specification 7.0, (March 2019)

Anon, (2018) British Educational Research Association [BERA] (2018) Ethical Guidelines for Educational Research, fourth edition, London.

Banks, J. and Ragland, C. (1987) A Study of the Effects of the Critical Thinking Skills Program, Philosophy for Children, on a Standardized Achievement Test. Ph.D. diss., Southern Illinois University at Edwardsville, https://search-proquest-com.libproxy.ucl.ac.uk/docview/303531257?accountid=14511 (accessed April 27, 2020).

Carol, S. and O’Sullivan, P. (2001) Measuring Critical Thinking in Problem-Based Learning Discourse, Teaching and Learning in Medicine, 13:1, 27-35, DOI: 10.1207/S15328015TLM1301_6

Clark, M. and Shadish, W. (2007) Quasi-experimental method. In N. J. Salkind (Ed.), Encyclopedia of measurement and statistics (Vol. 1, pp. 806-808). Thousand Oaks, CA: SAGE Publications, Inc. doi: 10.4135/9781412952644.n369

De Bono, E. (2000) Six thinking hats / Edward de Bono. Rev. and updated., London: Penguin.

Gibbs, G. (2007) Analysing Qualitative Data. Los Angeles, Calif.; London: SAGE.

Green, J. et al. (eds.) (2006) Handbook of Complementary Methods in Education Research. Taylor & Francis Group. ProQuest Ebook Central, https://ebookcentral.proquest.com/lib/ucl/detail.action?docID=446575.

Gregson, M. et al. (eds.) (2015) Readings for Reflective Teaching in Further, Adult and Vocational Education. London: Bloomsbury Academic.

Henri, F. (1991) Computer conferencing and content analysis. In A.R. Kaye (Ed.), Collaborative learning through computer conferencing (pp. 117–36). Heidelberg, Germany: Springer-Verlag.

Herbert, M. (2006) Clinical Child and Adolescent Psychology: from Theory to Practice. Chichester: Wiley.

Kuhn, D. and Pease, M. (2006) Do Children and Adults Learn Differently?, Journal of Cognition and Development, 7:3, 279-293, DOI: 10.1207/s15327647jcd0703_1

Kizel, A. (2019) Enabling Identity as an Ethical Tension in a Community of Philosophical Inquiry with Children and Young Adults. Global Studies of Childhood, 9(2), 145-55.

Laumann, K. (2020) Criteria for qualitative methods in human reliability analysis, Reliability Engineering & System Safety, Volume 194, 2020, 106198, ISSN 0951-8320, https://doi.org/10.1016/j.ress.2018.07.001.

Lipman, M., Sharp, A. M. and Oscanyan, F. S. (1980) Philosophy in the Classroom. 2nd ed. Philadelphia: Temple University Press.

Martin, S. (2009) Environment-Behaviour Studies in the Classroom. Journal of Design & Technology Education,[S.l.], v. 9, n. 2, aug. 2009. ISSN 1360-1431.

McCarney, R. (2007) The Hawthorne Effect: a Randomised, Controlled Trial. BMC Medical Research Methodology, 7, Article 30.

Moon, J. (2008) Critical Thinking: An Exploration of Theory and Practice. London: Routledge.

Marton, F., & Säljö, R. (1976). On qualitative differences in learning: I—Outcome and process. British Journal of Educational Psychology, 46(1), 4–11.

Mero-Jaffe, I. (2011) ‘Is that what I said?’ Interview Transcript Approval by Participants: An Aspect of Ethics in Qualitative Research. International Journal Of Qualitative Methods, 10(3), pp.231–247.

Riffe, D. Lacy, S., Fico, F. (2014) Analyzing Media Messages. New York: Routledge, https://doi-org.libproxy.ucl.ac.uk/10.4324/9780203551691

Riazi, M, and Candlin, C. (2014) Mixed-methods research in language teaching and learning: Opportunities, issues and challenges. Language Teaching 47, (2) (04): 135-173, https://search-proquest-com.libproxy.ucl.ac.uk/docview/1506740898?accountid=14511 (accessed April 27, 2020).

White, C, Holmes, J, and Bhatia, V. (2019) Trading Places, Creating Spaces: Chris Candlin’s contribution to aligning research and practice. Language Teaching, 52(4), 476-489. Doi:10.1017/s0261444818000204


 Appendix currently unavailable on WordPress

Published by Coffee & Alex

Alexander Clarke is a sole trader who writes and teaches. He’s published articles, blog posts, short stories and poems. He’s taught philosophy, theology, ESOL and PSHE.
