
The Language Teacher

Using Oral Interviews at a Junior College

Ann F.V. Smith and Wilma Nederend

Tokushima Bunri University


Finding effective ways to include communicative language goals in oral class assessment is one of the challenges facing teachers in Japan today. As students are acutely aware of any upcoming assessment, be it a test or an exam, teachers need an oral assessment which takes advantage of this positive backwash to influence students' goals, motivation, and daily classroom learning.

As foreign teachers at Tokushima Bunri University (TBU), a private university in Shikoku, we were faced with the task of designing an appropriate summative oral assessment for the end of each semester of the oral communication course. We wanted a format that would reflect the conversational fluency activities used in the classroom and allow students to demonstrate their oral competency. So we decided to use a form of oral interview to assess speaking proficiency.

Here, we will first consider various oral interview formats. Then we will explain the format developed for first and second year college students. Finally we will discuss its merits and drawbacks.

The Interview

The interview is a popular oral assessment framework for eliciting language samples, but it can vary considerably in format and content. According to Scarcella and Oxford (1992), who apply Canale and Swain's framework of communicative competence to speaking proficiency, an oral interview should assess not only traditional grammatical competence, but also sociolinguistic competence in the appropriate use of register, speech acts, and intonation; strategic competence in communication strategies, gestures, and circumlocution; and discourse competence in coherence and cohesion (p. 154).

Initially, we reviewed a number of interview formats before developing one suitable for the TBU situation. For example, an interview may consist of one student, a pair, or a small group. There may be one, two, or three interviewers and/or scorers, who may be native and/or non-native speakers, or instructions on tape. The interview may last from 10 to 30 minutes, be taped, and be scored holistically or objectively (Hughes, 1989; Underhill, 1987; Weir, 1990).

Perhaps the best known is the American Council on the Teaching of Foreign Languages' (ACTFL) individual oral proficiency interview, which follows a 20-minute, four-part format: introduction and warm-up, level check, probe, and wind-down and closure. The Canadian Test for Scholars and Trainees (CanTEST) and the International English Language Testing System (IELTS) follow a similar format.

In this four-part format, part one, the warm-up, is not generally scored. The interviewer puts the candidate at ease by asking predictable personal questions, and IELTS includes completion of a personal information form. In part two, the level check, the interviewer asks familiar questions about school, work, leisure, daily routines, and future plans (Cohen, 1994), and gains an indication of the student's level. Part three, the probe stage, consists of more demanding, in-depth questions on one or two particular topics to assess how well the student can answer before requiring repetition or closure; students may also ask questions in this part. In the final part, or wind-down, the interview returns to easier, predictable questions that build confidence, and then closes with the thank-you and goodbye (CanTEST, 1993; Garbutt & O'Sullivan, 1991; Hughes, 1989; Nagata, 1995; Underhill, 1987).

For our interview, we decided three of the four parts would be appropriate: part one, the warm-up; part two, the level check; and part four, the wind-down. However, we realized the in-depth questioning of the probe stage would not only be too difficult for our first-year students, who are mostly false beginners, but would also limit the interview solely to a question-and-answer format. So we looked to other oral assessments for feasible alternatives.

The Cambridge Preliminary English Test (PET), TOEFL's Test of Spoken English (TSE), IELTS, and the recently introduced Standard Speaking Test (SST), from ACTFL and ALC Press Japan, include a variety of activities between the warm-up and the wind-down. The PET includes a simulated situation based on a task which requires the use of functions such as requests or choices. PET, TSE, and SST all include a response to visual stimuli, such as a map, a picture, a picture sequence, or a photograph (PET, 1996). The response requires description, narration, and/or general conversation. The TSE also includes a response to a topic prompt, such as sports or fashion, as well as interpretation of a graph and a notice (1995-96 Bulletin of Information, pp. 13-14). IELTS adds an elicitation in which the candidate asks the interviewer questions on a task. The SST has a role play with the interviewer (SST, 1996). As we use functions, visual stimuli, and role plays in class, we decided these options could be developed to suit our situation.

We also decided to try scoring with both a holistic and an objective scale, as some interviews rate holistically, while others rate objectively, either on the spot or later on tape, as in the TSE. A holistic rating, such as the IELTS nine-band proficiency scale, assigns a candidate a score on a scale from non-user to expert user (Garbutt & O'Sullivan, 1991; Weir, 1990). The ACTFL scale is subdivided into superior, advanced, intermediate, and novice, with a high, mid, or low definition within each band (Nagata, 1995). "[ACTFL] scale definitions include a wide range of components, including vocabulary, syntax, pronunciation, accuracy, spontaneity, fluency, understanding, coherence, cohesion, functions, and situations" (Bachman & Savignon, 1986, p. 381). Objective rating assigns a candidate a score on specific criteria, such as listening, accuracy, range, fluency, and pronunciation, as used in the CanTEST (1993). So the next step was to clarify administrative procedures, scoring, interpretation, and bias (Chamot & O'Malley, 1994).

Oral Interviews at Tokushima Bunri Junior College

Debate and experimentation preceded the final choice of interview format. As English teachers, we wanted a format that would elicit the students' best performance and reproduce classroom activities. Our interview is not college-wide, but is given by some teachers at the end of each semester to first- and second-year junior college oral communication classes. Class sizes range from ten to twenty-four, and students' language skills range from false beginner to intermediate. The purpose of this summative interview is "...to encourage students to study, review or practice the material being covered..." (Brown, 1995, p. 13) in class and to allow students to demonstrate their oral proficiency, command of language functions, and clarification strategies.

Students are familiar with this interview format from class activities, and receive an interview information sheet ahead of time. While interviews last 12 to 15 minutes, they are scheduled every 20 minutes to allow five minutes for scoring. They take place in a fairly authentic situation around a table in the teacher's office, rather than in a huge, impersonal classroom, and are taped for future reference. The classroom teacher, who is a native speaker, interviews random pairs that students choose by drawing names. Pairs are not always evenly matched, as oral skills can vary considerably. There are also differences based on personality, confidence, and language levels, so the initial idea of scoring each pair jointly was replaced by individual scoring. Hughes agrees that "the performance of one candidate is likely to be affected by that of the others" (1989, p. 105). Each student is rated according to her interview performance, and there is no specified "pass mark" for the interview. The interview counts for 25 per cent of the final mark in the first year and 30 per cent in the second year.
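For teachers adapting this format, the pairing and timetabling described above are mechanical enough to script. The following is a minimal sketch in Python; the start time and the function and student names are purely illustrative, and the policy of folding an odd student out into a trio is borrowed from our discussion below.

    import random
    from datetime import datetime, timedelta

    def schedule_interviews(students, start="09:00", slot_minutes=20):
        """Draw random pairs and give each pair a 20-minute slot:
        12-15 minutes of interview plus about five minutes for scoring."""
        names = list(students)
        random.shuffle(names)  # students "drawing names" at random
        pairs = [names[i:i + 2] for i in range(0, len(names), 2)]
        # If one student is left over, fold her into the last pair as a trio.
        if len(pairs) > 1 and len(pairs[-1]) == 1:
            pairs[-2].extend(pairs.pop())
        slot = datetime.strptime(start, "%H:%M")
        for pair in pairs:
            print(slot.strftime("%H:%M"), " & ".join(pair))
            slot += timedelta(minutes=slot_minutes)

    # A class of five gives two slots: one pair and one trio.
    schedule_interviews(["Aiko", "Yumi", "Keiko", "Mari", "Noriko"])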

The oral interview is often criticized as unnatural and biased, partly because of the imbalance of power in favour of the teacher, who, especially if s/he is a stranger, has a "role of considerable power" (Cohen, 1994, p. 213). In this case, the teacher/interviewer often initiates conversation, and the student seldom has an opportunity to initiate or control the conversation. In this way, the language samples are limited and somewhat biased, as the interviewer usually speaks clearly and often accommodates the student by using "or-questions, slowdowns and display questions..." (Cohen, 1994, p. 268) or repetition. However, we use student pairs and a familiar teacher as the interviewer to counteract this.

In order to make the interviews as natural as possible, the teachers are supportive and give verbal responses to show they are listening (e.g., "mmm," "that's great," "really"), and nonverbal positive feedback such as nods, smiles, and leaning in (Hughes, 1989). We link topics using phrases such as "you said earlier," "let me go back to ...," or "I'd like to move on to another topic now." We also ask a series of questions on one topic, rather than hopping quickly from one item to another, which makes the conversation more coherent and easier for the students to follow. "Yes/no" questions and "or" questions are used less frequently, as open-ended "wh-" questions produce more information (Fradd & McGee, 1994; Garbutt & O'Sullivan, 1991). We try to avoid correcting, interrupting, or finishing students' sentences, and give "as many 'fresh starts' as possible" (Hughes, 1989) when communication breaks down. We let silences run for approximately ten seconds, but rescue students who simply cannot clarify or understand by repeating, rephrasing, or moving on. In this way, we make the interview discourse as realistic and coherent as possible.

Both scoring the oral interview and interpreting the rating are problematic. "Given the variety of norms of language use, the choice of criterion for evaluating language performance is difficult, to say the least, and is often complicated by social and political considerations" (Bachman & Savignon, 1986, p. 383). After experimenting with various scoring criteria, we give each student a holistic score and an objective rating based on five criteria. If the criterion-based score does not agree with the holistic score, the holistic score is reassessed. The holistic score descriptors are used to rate the overall performance from one (weak or limited speaker) to five (very good or advanced speaker). The rating criteria were developed from the CanTEST (1993). There are five: (a) appropriate content, language, and vocabulary; (b) active listening and natural interaction; (c) accurate grammar and range of structures; (d) pace, fluency, and cohesion; and (e) pronunciation, intonation, and volume. The student score chart has a continuum for each criterion from one (weak) to five (very good). The criteria carry equal weighting, and the total criterion score can be doubled to give a holistic score equivalent. The score chart is also used in class for student evaluation and for peer evaluation in the second year.
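To make the arithmetic concrete: five criteria, each rated from one to five with equal weighting, give a total between 5 and 25, and doubling this yields a figure between 10 and 50 that can be set against the one-to-five holistic band (scaled by ten) as a rough cross-check. The Python sketch below is only our reading of that cross-check; the tolerance used to flag disagreement is a hypothetical choice, not something specified in the scoring scheme.

    CRITERIA = [
        "appropriate content, language and vocabulary",
        "active listening and natural interaction",
        "accurate grammar and range of structures",
        "pace, fluency and cohesion",
        "pronunciation, intonation and volume",
    ]

    def cross_check(criterion_scores, holistic, tolerance=5):
        # Each criterion and the holistic band run from 1 (weak) to 5 (very good).
        assert len(criterion_scores) == len(CRITERIA)
        assert all(1 <= s <= 5 for s in criterion_scores) and 1 <= holistic <= 5
        doubled_total = 2 * sum(criterion_scores)  # 10..50
        equivalent = holistic * 10                 # holistic band on the same scale
        return doubled_total, abs(doubled_total - equivalent) <= tolerance

    # A student rated 3, 4, 3, 4, 4 on the criteria with a holistic band of 4:
    total, agrees = cross_check([3, 4, 3, 4, 4], holistic=4)
    print(total, agrees)  # 36 True: within the assumed tolerance of 5

If the two scores disagree, the procedure above is to reassess the holistic score rather than the criterion ratings.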

First year

The first-year interview begins with part one, the warm-up, and part two, the level check, in which students answer factual, descriptive, or narrative questions about home, family, hobbies, pets, and regular daily activities. At this time, the interviewer makes a preliminary holistic assessment of the student's level. The following first-year transcript shows the student understands the questions about home and gives limited, but appropriate answers.

Assessor: Where are you from?
Student: I'm from Tokushima city.
Assessor: And how many people are in your family?
Student: 6 people.
Assessor: Can you tell me about your family?
Student: ...my brother, my mother, grandfather, grandmother, and younger sister.
Assessor: How old is your younger sister?
Student: She...16 years old.

The in-depth probing of part three is replaced, due to the low level of the 1-1 (first-year, first-semester) students, by a response to visual stimuli such as maps or pictures, as in the PET, TSE, and SST. However, choosing an appropriate stimulus is not easy, as it should be understandable, relevant, and culturally sensitive (Underhill, 1987) in order for the task to be clear, predictable, and limited enough to produce an extended sample of description and narration. We use big pictures (from calendars or the textbook) with lots of details, which allow students to select things that relate to their personal experiences. Picture stories, which require specific vocabulary and sequencing, are quite difficult for first-year students unless the stories include the essential vocabulary items; these help generate confidence and improve performance. Map exercises also work well, using an authentic local map or one already familiar to the students. For example, one student initiates by giving directions from the starting point to a map destination. The partner follows and names the final destination aloud.

In the 1-2 assessment, the visual stimuli are replaced by functions similar to those used in the PET, and students demonstrate their sociolinguistic command of particular language functions automatized in class through role play dialogues and other pair activities. Each student selects a brief, specific function card, such as "Call your friend and invite her for dinner at your favourite restaurant." The student then demonstrates the invitation function in a dialogue with her partner, who responds.

Student pairs then also write and perform a two-page role play, similar to those done in class, in order to develop discourse competence. Unlike other tests such as the SST, the role play does not include the teacher. Students draw on classroom dialogues, role plays, and information gap activities to help them prepare a conversation on one of three topics, such as a conversation between a Canadian and a Japanese about Japan, a conversation between friends about a part-time job, or one about spring vacation. A well-written and accurate script with an interesting opening, lots of exchanges, and an appropriate closing is required. Students can use the scripts as prompts, but are encouraged not to read them word for word. If possible, scripts are corrected by the teacher beforehand, as some students memorize them. After this, the interview winds down with a few easy questions about plans for vacation, and then closes. For this dialogue, the pair is scored jointly and a score for the script is added.

We have found that first-year students generally do well on the pictures and the brief, specific functions. Their prepared role plays, where they are innovative and confident, show greater creativity, as they make use of actions, gestures, and sometimes props. They are often better prepared than during the semester. In addition, the 1-1 interview provides valuable feedback to both the teacher and the student about the student's language learning, which can be taken into account in the 1-2 semester.

Second year

The second-year interviews (2-1 and 2-2) also open with the warm-up and the level check, in which students are asked to express opinions and make comparisons on topics such as past events, future careers, or travel plans. More advanced students can justify an opinion, speculate, and deal with hypothetical questions. The representative second-year transcript from the warm-up (below) shows that answers are usually longer, more complex, and more complete than the 1-1 and 1-2 interview responses.

Assessor: Where are you from, (student name)?
Student: I'm from Aizumi.
Assessor: Where is that in relation to Tokushima?
Student: Where is it? It's in the north part.
Assessor: Is it a big city?
Student: Recently, Aizumi is bigger ... getting bigger and bigger...but actually it is not so big.

Part three of the 2-1 interview continues with a response to a topic prompt, similar to the TSE. Each student chooses a small topic card and gives a brief talk (about two minutes) using background knowledge as well as pertinent vocabulary and functions acquired in class. Students practice at home and in class, where we also review the topics by brainstorming vocabulary and possible subtopics. For example, on the environment, students came up with such subtopics as "sorting garbage at home" and "always carry your own chopsticks."

This is followed by a spontaneous role play which puts students "on the spot"; this is more stressful (Halleck, 1995) but builds on the first-year use of role plays. Each student chooses a detailed situation/function card based on themes from class, and, with her partner, creates a dialogue. For example, "Your family will move to a new home. You and your sister disagree on what kind of home you would like." Some of these role plays introduce problem solving or using formal registers, some are informal, or specific to the students' situations. All role plays have some useful vocabulary items on the reverse side of the card.

In the 2-2 assessment, part three changes to a probe into opinions on topics at least partially covered in class. The student chooses a topic card and the teacher initiates; the student responds and also asks questions, as may the second student, until the candidate cannot cope and requires repetition or closure. Finally, the interview winds down and closes. Second-year students respond well to the interview, but prefer the topics and role plays to the probing.

Discussion

The TBU oral interview format provides a positive backwash effect because it reflects class activities and thus students become more aware of the need to speak in class in order to prepare for the interview.

Using pair interviews, rather than individual ones, not only saves time, but also reduces students' anxiety, and allows weaker students to translate and check with a peer. Even with the random pairing of students, most pairs are well-matched and take turns effectively. Some, however, have a dominant partner who takes the initiative, translates, and will speak most of the time if allowed to. The occasional need for a trio has sometimes proved difficult as one student may be left out; so occasionally an individual student is interviewed with a friend, who is not scored.

Interviewing, scoring, and keeping the conversation going can certainly be demanding for a teacher. Regular in-depth training sessions for standardization can greatly improve interview reliability and reduce bias, even for experienced scorers. This is especially important for teachers who have to assess students they have been teaching all year. We have found that the teacher must be aware of a number of biasing factors:

a. past student performance

b. student motivation and class attendance

c. student health

d. student exam anxiety

e. teacher health

f. agreement or disagreement with student's point of view

g. like or dislike of student's personality

h. overly sympathetic listening / teacher interpretation

i. difficulty of questions (too easy or too difficult)

j. speed of questions

k. memorized answers

l. teacher's gender, cultural background, and status.

Some of these factors may be a problem so it is important for the teacher to be alert to them and to try to counterbalance them wherever possible.

Presently, the scoring criteria are working well. The student language sample is taken from parts two and three of the interview, and the teachers find the holistic scoring has become much easier with practice. The students also find the scoring criteria easy to manage and can now assess each other's performances during class. However, occasionally a student's language sample may be considerably better or worse than in the classroom. As with most scoring systems, this needs further investigation to make it more relevant to the learners' needs, more valid, and more reliable.

The oral interview has gained considerable popularity over the past few years, and our use of interviews has shown that the range of possible formats means it can be useful not just at advanced levels, but for false beginners and intermediate students too. Although interviews are somewhat subjective and time-consuming, the positive backwash effect has encouraged student motivation, confidence, and oral proficiency in class. McLean reminds us that "Testers are rarely held accountable for their methods of grading" and that there is little consistency between test criteria (1995, p. 38), but we hope other classroom teachers will be willing to share their methods of assessment in order to promote reflection and accountability.

References

Bachman, L. F., & Savignon, S. J. (1986). The evaluation of communicative language proficiency: A critique of the ACTFL oral interview. Modern Language Journal, 70(4), 380-390.

Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.

Brown, J. D. (1995). Differences between norm-referenced and criterion-referenced tests. In Brown, J. D. & Yamashita, S. O. (Eds.), Language testing in Japan. Tokyo: The Japan Association for Language Teaching.

1995-96 Bulletin of information for the TOEFL, TWE and TSE. (1995). Princeton, NJ: Educational Testing Service.

Canadian Test for Scholars and Trainees: Information booklet for test candidates. (1993). Halifax, Canada: Saint Mary's University.

Chamot, A. U., & O'Malley, J. M. (1994). The CALLA handbook: Implementing the cognitive academic language learning approach. MA: Addison-Wesley Publishing Company.

Cohen, A. D. (1994). Assessing language ability in the classroom. Boston, Mass.: Heinle & Heinle.

Fradd, S. H., & McGee, P. L. (1994). Instructional assessment: An integrative approach to evaluating student performance. MA: Addison-Wesley Publishing Company.

Garbutt, M., & O'Sullivan, K. (1991). IELTS strategies for study. Sydney, Australia: National Centre for English Language Teaching and Research.

Halleck, G. B. (1995). Assessing oral proficiency: A comparison of holistic and objective measures. Modern Language Journal, 79(2), 223-234.

Hughes, A. (1989). Testing for language teachers. Cambridge: Cambridge University Press.

McLean, J. (1995). Cooperative assessment: Negotiating a spoken-English grading scheme with Japanese university students. In Brown, J. D. & Yamashita, S. O. (Eds.), Language testing in Japan (pp. 136-148). Tokyo: The Japan Association for Language Teaching.

Nagata, H. (1995). Testing oral ability: ILR and ACTFL oral proficiency interviews. In Brown, J. D. & Yamashita, S. O. (Eds.), Language testing in Japan (pp. 108-115). Tokyo: The Japan Association for Language Teaching.

Nunan, D. (1989). Understanding language classrooms: A guide to teacher-initiated action. Hemel Hempstead, UK: Prentice Hall.

Preliminary English Test. (1996). Cambridge, UK: UCLES.

Scarcella, R. C., & Oxford, R. L. (1992). The tapestry of language learning: The individual in the communicative classroom. Boston, Mass.: Heinle & Heinle.

Standard Speaking Test. (1996). Tokyo, Japan: ALC Press.

Underhill, N. (1987). Testing spoken language: A handbook of oral testing techniques. Cambridge: Cambridge University Press.

Weir, C. J. (1990). Communicative language testing. Prentice Hall.



Article copyright © 1998 by the author.