Objective: Many statistics have been recommended in the literature to test the agreement (consistency) between two clinicians (raters) for a diagnostic test with two categories. Scott's π statistic and Cohen's kappa statistic are the most frequently used. As an alternative to these statistics, the AC1 statistic was developed by Gwet. The purpose of this study was to determine whether the agreement statistics used to test the agreement between two raters are affected by prevalence.
Material and Methods: To examine the effect of prevalence, all agreement statistics were formulated under the assumption of marginal homogeneity and their partial derivatives with respect to prevalence were calculated.
Results: It was found that π, kappa, the G-index, and AC1 gave similar results when the prevalence was equal to 0.50. Moreover, the G-index remained constant across all prevalence values. In addition, the π and kappa statistics were equal to 0 when the prevalence was high (equal to 1) or low (equal to 0), and it was observed that these values do not accurately reflect the agreement between the raters.
Conclusion: In this study, agreement statistics between raters in 2x2 trial designs were investigated. It was concluded that, unlike the other agreement statistics, the G-index and AC1 statistics are not affected by sensitivity, specificity, or prevalence, and therefore showed better performance.
Key Words: Cohen's Kappa Statistic; AC1 Statistic; Prevalence; Agreement Between Raters.
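The prevalence effect summarized in the Results can be reproduced with a short computation. The following Python sketch is illustrative only: the 2x2 table counts are hypothetical, not data from this study, and the statistics are computed from their standard two-category definitions (Scott's π, Cohen's kappa, the Holley-Guilford G-index, and Gwet's AC1).

```python
def agreement_stats(a: int, b: int, c: int, d: int) -> dict:
    """Agreement statistics for a 2x2 table.

    a = both raters positive, d = both raters negative,
    b and c = the two disagreement cells (hypothetical counts).
    """
    n = a + b + c + d
    po = (a + d) / n                    # observed agreement
    p1 = (a + b) / n                    # rater 1 "positive" proportion
    p2 = (a + c) / n                    # rater 2 "positive" proportion
    pbar = (p1 + p2) / 2                # average "positive" proportion

    pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)  # Cohen: product of marginals
    pe_pi = pbar ** 2 + (1 - pbar) ** 2       # Scott: squared average marginals
    pe_ac1 = 2 * pbar * (1 - pbar)            # Gwet: small at extreme prevalence

    return {
        "pi":    (po - pe_pi) / (1 - pe_pi),
        "kappa": (po - pe_kappa) / (1 - pe_kappa),
        "G":     2 * po - 1,                  # chance agreement fixed at 1/2
        "AC1":   (po - pe_ac1) / (1 - pe_ac1),
    }

# Prevalence near 0.5: all four statistics agree.
print(agreement_stats(40, 5, 5, 50))
# pi ~ 0.80, kappa ~ 0.80, G = 0.80, AC1 ~ 0.80

# Prevalence near 1: pi and kappa collapse toward 0 despite 90%
# observed agreement, while G and AC1 remain high.
print(agreement_stats(90, 5, 5, 0))
# pi ~ -0.05, kappa ~ -0.05, G = 0.80, AC1 ~ 0.89
```

A near-extreme table is used in the second call because at a prevalence of exactly 0 or 1 the chance-agreement term for π and kappa reaches 1 and their formulas become degenerate (division by zero).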