Defenders of Certification: Sign Language Interpreters Question “Enhanced” RID NIC Test
At this point in our history, the NIC assessment is the foundation for determining who is “one of us” and, as such, certified members of RID should be the defenders of the certification process. However, the fact that certified RID members are unsure of the validity of the current NIC assessment is unacceptable. I believe that the NIC Task Force and the Board of Directors have implemented changes to the RID assessment process whose validity has not been at all transparent to the certified membership. And so, instead of being defenders of the process, we find ourselves in the position of questioning, challenging and/or belittling the recent RID assessment procedures.
On March 18, 2012, I sent an email letter to each member of the RID Board of Directors in which I raised a number of questions regarding the new “enhancements” to the NIC test. That letter is reprinted below.
Before reading the letter, it is important to me that you understand the spirit in which that letter was sent.
My intent in sending the letter was neither to create or inflame divisiveness within RID nor to attack the current leadership of the RID. Rather, it was a request that the Board provide the information necessary so that the RID membership, especially the certified membership, could feel confident and secure in the knowledge that the “enhanced NIC” was indeed valid and reliable; information that was not made available for the previous iteration of the NIC.
Until the day when RID (and we are RID) has a transparently valid and reliable certification process that determines who will be “one of us”, we will always have division and animus (parenthetically, I believe this can only be avoided if we, RID, decide to divest ourselves of the assessment process). My letter was sent to the Board requesting that all the information and documentation that provided the psychometric basis for the “enhanced NIC” be made available to all of the members. The Board has committed to releasing a report that would address the questions I raised.
On April 22 I received an email from the RID President that stated, in part: “…the board of directors and national office staff agree the comprehensive report would be shared with the entire membership. Therefore, this will take some time and resources to complete and request your patience and continued support to allow us the time to complete this comprehensive report. In fact, the work has been underway since the receipt of your letter.”
To be sure, it is unclear to me why the answers to the questions I raised should “…take some time and resources to complete.” After all, the questions I raise are the essential questions one must ask and the evidence one must have in advance of implementing such a radically new assessment approach. The information should be readily available; if it has to be created in response to the questions I raise, there are even more serious questions about the process by which this iteration of the NIC was developed and implemented. Nevertheless, I applaud the fact that the RID Board will share full information regarding the new NIC with the membership. Hopefully that report will be issued in a timely manner and, in my opinion, it certainly must happen in advance of the regional conferences.
Reactions — Keep Them in Check
Given all of this, I trust you will read the following letter in the spirit in which it was intended. I sincerely hope that any reaction you may have will be held in check until we all receive the “comprehensive report” from the Board. I believe that any action prior to receipt of the “comprehensive report” would be premature and uninformed.
March 18, 2012
To Members of the Board,
I am writing this letter to the Board, one of the very few I have written since 1972, as a concerned and dedicated member of RID for over forty years and as a Past President of RID. Specifically, I am extremely concerned about the new “enhancements” to the NIC test. I think it goes without saying that the last iteration of the NIC was significantly flawed. Claiming, as we did, (lacking both the sophistication and the empirical data) that a three-tiered certification based on a single evaluation test was valid and defensible, was clearly shown to be a serious mistake (one which we made earlier in our first effort at testing – CI/CT/CSC). With this latest unsubstantiated testing attempt, not only did we damage the credibility of the NIC and the RID itself in the minds of many RID members but perhaps more importantly in the minds of many Deaf people. Both interpreters and Deaf people saw that the test results and tiered certifications awarded often did not match the reality experienced by the “eyes on the street”.
I believe that the lesson that must be learned here is clear: we should definitely not advance an approach to testing that is not directly supported by empirical data on sign language interpretation, and we must make those empirical data clearly and widely known to interpreters and Deaf people.
Make no mistake, I applaud some of the changes to the NIC, specifically uploading a candidate’s video data to a secure server and having those video data available to be viewed by multiple raters. Unfortunately, I believe we have made the same fatal mistake – lack of empirical data – with the newest iteration of the NIC as we made with the last iteration and as we made in 1972. Unless there is evidence that has not been made publicly available, I believe that the current NIC testing approach lacks face validity: it does not look like what interpreters regularly do. Perhaps better stated, I believe the current test cannot claim to validly certify a candidate’s ability to interpret in a way that reflects real world practice. Certainly there is nothing in the research literature relevant to sign language interpreters of which I am aware that would support the current testing approach. I make the following statements and raise the following questions and concerns based on the new Candidate Handbook 2011 and on conversations with several candidates who have taken the current NIC.
1. It appears that someone predetermined that the test should last only an hour and then the resultant math determined that each of the two ethical and five performance scenarios would last only 4 minutes. If true, RID members need a more thorough explanation of why time and a simple mathematical formula should be the primary drivers behind the format of the certification test; if this is not true, then a clear explanation should be provided for how the current 4-minute per vignette test segmentation was determined.
2. I agree that it may be possible to make a marginally valid, albeit shallow, determination of one’s approach to ethical decision-making and one’s knowledge of the Code of Professional Conduct from two 4-minute vignettes. However, one would hope that the vignettes are sufficiently complex that they will elicit higher levels of ethical thinking than mere regurgitation of the Code of Professional Conduct. A description of the guiding principles used to develop and/or select the ethical vignettes must be provided to the RID membership. Note that I am not asking for the rating rubrics (I agree that teaching to the rubrics was a significant issue in the last iteration); I am simply asking that an explanation of the process/principles used in the selection and/or development of the vignettes be made known to the membership.
3. I am aware of no research that provides evidence that a 4-minute sample of a piece of interpretation is sufficient to make a determination of overall interpretation competence. What the research does show is that during the first five minutes of a twenty-minute monologue an interpreter’s work is often “less challenging” because it is the most predictable – introductions, niceties, setting an overall tone for a talk or meeting, etc. This is also true of the last five minutes of an interpreter’s work – summaries, next steps, closings, etc. Consequently, if all of the five performance vignettes were from the first five minutes of interactions, we would only be sampling and rating the “less challenging” parts of interactions and thus would not be presented with a true and valid representative sample of a candidate’s overall interpreting proficiency. I might agree that if we had five 20-minute samples of an interpreter’s work and we wished to select 4-minute samples from each 20-minute sample (some from the beginning, some from the middle and some from the end) then perhaps we might have a more thorough and more time-efficient way of rating an interpreter’s work. But what we have here with the current NIC is clearly not 4-minute samples from longer samples of work. A full explanation of the empirical justification for this 4-minute sampling approach must be provided to the membership.
4. According to the Candidate Handbook, however, some of the vignettes will require that the candidate begin interpreting in the middle portions of interactions after providing the candidate with only a written synopsis of what has transpired up to that point in the interaction. Here again, I contend there is no empirical data that can justify this as a valid approach to obtaining a true and valid sample of a candidate’s overall interpreting competence. As any experienced interpreter knows, by the mid-point of any interpreted interaction the interpreter has developed some content background information (which I presume the NIC proposes to present in printed form). But more importantly the interpreter has a sense of communicative preferences, interactional rhythm, signing style, accents, spoken/signing speeds, prosodic features, etc. None of this can be presented in printed form in any manner that assists the candidate nor can it be presented in a manner that validly replicates what happens in real life.
On this basis alone, I would contend that this 4-minute assessment approach does not provide the essential cognitive, discourse or linguistic tools/knowledge that are available and that unfold in “real life” situations. Additionally, and perhaps more importantly, by the halfway point in any interaction the interpreter has acquired an “interactional schema”. As any experienced interpreter knows, this relates directly to critical areas such as over-arching goals, what counts as success and the overall interactional rhythm and flow. Absolutely none of this is accessible to a candidate suddenly instructed to begin in the middle of an interaction for which only written background content information has been provided. Of necessity, the written background will be about content, but none of this is what is most important to interpreters. A clear explanation of the rationale and justification for placing candidates at such an interpreting disadvantage must be provided to the membership.
5. Given that each performance vignette provides only 4 minutes of a candidate’s work, it would appear that we, as an organization, are no longer concerned about the ability to sustain quality of work during an interpreted interaction. For the past forty years the RID evaluations have contained interactions (monologues and/or dialogues) that have lasted 15-20 minutes. This was essentially due to the fact that this most closely reflected the real world work and experience of interpreters and then raters could sample within interactions, not across what are essentially 4-minute, flawed interactions. A detailed explanation of the rationale for, and empirical support for, this decision and this deviation from forty years’ experience is also needed by the membership.
6. Given that each performance vignette provides only 4 minutes of a candidate’s work, it would appear that we, as an organization, are no longer interested in the ability to produce work of sustained quality over time. Clearly, a 4-minute text simply does not allow time for the candidate to demonstrate or time for the rater to assess meaning sustained over time. The rater has no opportunity to assess features such as consistent use of grammatical features (manual and non-manual), consistent use of space, consistent use of deictic markers, etc. Simply put, a 4-minute sample does not provide sufficient opportunity to demonstrate a candidate’s ability to sustain quality work over time. If there is evidence that supports the claim that a 4-minute sample can validly and reliably assess a candidate’s ability to sustain quality over time, then it must be made known to the membership.
7. With a 4-minute segment to assess, the question must be asked: “What are the raters looking for?” It is clear that there is a new rating paradigm (pass/marginal pass, fail/marginal fail) and one could make a solid case for this. Certainly raters for the signed portions should be looking for grammatical features such as agreement, consistent use of “nonce signs” (signs established for this situation only), the use of coordinated and reflexive space, etc. But it is unclear what raters would be asked to assess in a 4-minute sample of work. Certainly raters are unable to assess the full range of linguistic competencies that interpreters must possess in order to be able to interpret (if there is evidence to support this it must be made public). What are the various English and ASL grammatical and semantic features in vignettes that raters will be assessing and do these five 4-minute vignettes provide sufficient linguistic and discourse variation to elicit an appropriate range of English and ASL grammatical and semantic features?
8. As was true with the last iteration of the NIC we offer the candidate no opportunity to demonstrate the exercise of discretion. This clearly begs the question of whether there is any research that demonstrates that the five performance vignettes somehow represent “seminal” vignettes, i.e. vignettes for which no candidate would ever deem that he or she was an unsuitable fit. Clearly the message sent to candidates taking the NIC and to interpreters in general that one “must interpret everything presented to them” stands in stark contrast to our long held organizational belief that discretion in accepting assignments is critical. Since using discretion in selecting assignments is one of the core operating principles of our long-standing Code, the rationale for adopting an “all or nothing” approach must be made clear to the membership.
9. Virtually all of the candidates with whom I have spoken have the same reaction and response to the 4-minute performance vignettes. They state “They [the vignettes] were too short”; “I was just getting warmed up”; “I didn’t have the right information to start in the middle [of a vignette]”; “I don’t think it was a fair sample of my work”; “I needed more time to get over my nerves”; “This isn’t what I do every day”. These comments are, to me as I hope they are to you, extremely troubling. Even if we assume there is a valid and reliable empirical basis for the “4-minute vignette” approach, the experience of the candidates is quite at odds with that basis. The danger here is that the candidates will, rightly or wrongly, begin to spread these perceptions to certified and not-yet-certified interpreters. The end result will be that we return to the set of circumstances that resulted in abandoning the former iteration of the NIC – acting in the absence of empirical data to guide our decision-making. A clear, empirically supported explanation of why the current NIC assessment is valid and can be reliably assessed by raters must be provided to the membership.
The issue of how and the process by which we determine who will be viewed “as one of us” (i.e. who is certified) is of grave concern to many in the membership. As you should well know, it has clearly created some very, very deep rifts within the organization. So deep are the rifts that there is on-going discussion of creating an alternate organization. Yet, we in RID continue to move forward without the necessary empirical support we need to offer a credible approach to the testing process. The “alphabet soup” of certification that we have produced sadly moves us closer and closer to being quite laughable in the eyes of those who view professional organizations as knowing clearly how to determine who will be viewed as “one of us”.
In an ideal world, we would out-source the testing process so that RID could be the “assessment watch-dog” and thus RID could avoid any appearance of conflict of interest. Lacking that possibility at the present time, I believe that the Board should muster the political and moral will to insist on a truly valid and reliable certification test, accepted by the certified members. Then the Board should declare a phased-in process by which ALL former certificates (save SC:L and CDI) would be declared invalid and no longer recognized. A staggered timeline would be put in place by which ALL those holding any certificate prior to the valid and reliable test would have to be retested and the “alphabet soup” would eventually no longer exist.
But we are where we are and that is that we have the current iteration of the NIC.
On behalf of the membership and all those who have served in positions of leadership, I am asking for a much greater level of transparency regarding the crafting of the current iteration of the NIC. If there is research data to support the decisions underlying the format of this iteration of the NIC those data must be made very public. I, for one, need to see the consultant’s report on why they believe this approach/format is valid and reliable before I can support this approach. I know that many of my colleagues, who are both members and organizational leaders, feel the same way.
Please know that I raise these questions and ask for this unprecedented level of public transparency in the best interests of RID the organization, of RID members and of Deaf people. I am happy to discuss any of these questions and concerns with the Board, individually or collectively, and/or the psychometric consultants hired to oversee the new NIC test.
Please let me know if you have any questions or need further clarification on any of the issues/questions raised. I eagerly await and expect your response to the questions and issues I have raised in this letter in a timely manner.
Sincerely,
Dennis Cokely
Director, American Sign Language Program
Director, World Languages Center
Chair, Department of Languages, Literatures & Cultures
The Questions that Need Answers
1. RID members need a more thorough explanation of why time and a simple mathematical formula should be the primary drivers behind the format of the certification test; if this is not true, then a clear explanation should be provided for how the current 4-minute per vignette test segmentation was determined.
2. An explanation of the process/principles used in the selection and/or development of the vignettes must be made known to the membership.
3. A full explanation of the empirical justification for this 4-minute approach must be provided to the membership.
4. A clear explanation of the rationale and justification for placing candidates at such an interpreting disadvantage must be provided to the membership.
5. A detailed explanation of the rationale for, and empirical support for, this decision and this deviation from forty years’ experience is also needed by the membership.
6. If there is evidence that supports the claim that a 4-minute sample can validly and reliably assess a candidate’s ability to sustain quality over time, then it must be made known to the membership.
7. What are the various English and ASL grammatical and semantic features in vignettes that raters will be assessing and do these five 4-minute vignettes provide sufficient linguistic and discourse variation to elicit an appropriate range of English and ASL grammatical and semantic features?
8. Since using discretion in selecting assignments is one of the core operating principles of our long-standing Code, the rationale for adopting an “all or nothing” approach must be made clear to the membership.
9. A clear, empirically supported explanation of why the current NIC assessment is valid and can be reliably assessed by raters must be provided to the membership.