Author:
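Affiliation: Faculty of English Language and Culture, and Center for Linguistics and Applied Linguistics, Guangdong University of Foreign Studies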
Source: 《考试研究》 (Examinations Research), 2008, No. 4, pp. 65-78 (14 pages)
Abstract: Speaking tests, as a relatively authentic and direct means of measurement, are increasingly widely used in language testing practice. However, the subjective judgment of raters, together with the design and use of rating criteria and rating scales, exposes scores to sources of variability other than examinees' speaking proficiency. Based on raw scores from the 2007 administration of the PETS (Public English Test System) Band 3 speaking test at one test site, the present study uses the Many-Facet Rasch Model (MFRM) to conduct post hoc quality control of the scoring. MFRM integrates the multiple factors involved in a language performance test into a single mathematical model. It measures all facets on a common scale, and it also supports detailed analysis of a single facet and even of each individual rater and examinee. This makes it possible to pinpoint potential "problem raters" who exhibit aberrant rating patterns and examinees who may have received scores higher or lower than they deserve. MFRM is therefore an effective quality-monitoring tool for subjective scoring: it detects potential problems in ratings and provides specific feedback for test improvement.
Field: [Language and Writing]