Table 2

Statistical methods used in the included papers to compare direct and proxy measures of behaviour

| Report | ni | nj | nk | Statistics used | Notes |
|---|---|---|---|---|---|
| Item-by-item comparisons: items treated as distinct | | | | | |
| Flocke, 2004[9] | 10 | 19 | 138 | Sensitivity = a/(a + c) | |
| Stange, 1998[19] | 79 | 32 | 138 | Sensitivity = a/(a + c) | |
| Ward, 1996[20] | 2 | 26 | 41 | Sensitivity = a/(a + c) | |
| Wilson, 1994[21] | 3 | 20 | 16 | Sensitivity = a/(a + c) | |
| Zuckerman, 1975[22] | 15 | 17 | 3 | Sensitivity = a/(a + c) | |
| Stange, 1998[19] | 79 | 32 | 138 | Specificity = d/(b + d) | |
| Ward, 1996[20] | 2 | 26 | 41 | Specificity = d/(b + d) | |
| Wilson, 1994[21] | 3 | 20 | 16 | Specificity = d/(b + d) | |
| Zuckerman, 1975[22] | 15 | 17 | 3 | Specificity = d/(b + d) | |
| Dresselhaus, 2000*[8] | 7 | 8 | 20 | Agreement: comparison of (i) (a + b)/T and (ii) (a + c)/T | Agreement was assessed by comparing the proportion of recommended behaviours performed as measured by the direct and proxy measures. Three reports performed hypothesis tests, using analysis of variance [8], Cochran's Q-test [15], and McNemar's test [18]. |
| Gerbert, 1988[11] | 4 | 3 | 63 | Agreement: comparison of (i) (a + b)/T and (ii) (a + c)/T | |
| Pbert, 1999*[15] | 15 | 9 | 12 | Agreement: comparison of (i) (a + b)/T and (ii) (a + c)/T | |
| Rethans, 1987*[18] | 24 | 1 | 25 | Agreement: comparison of (i) (a + b)/T and (ii) (a + c)/T | |
| Wilson, 1994[21] | 3 | 20 | 16 | Agreement: comparison of (i) (a + b)/T and (ii) (a + c)/T | |
| Gerbert, 1988*[11] | 4 | 3 | 63 | kappa = 2(ad - bc)/{(a + c)(c + d) + (b + d)(a + b)} | All three reports used kappa statistics to summarise agreement; two reports [11,15] also used them for hypothesis testing. |
| Pbert, 1999*[15] | 15 | 9 | 12 | kappa = 2(ad - bc)/{(a + c)(c + d) + (b + d)(a + b)} | |
| Stange, 1998[19] | 79 | 32 | 138 | kappa = 2(ad - bc)/{(a + c)(c + d) + (b + d)(a + b)} | |
| Gerbert, 1988[11] | 4 | 3 | 63 | Disagreement = (i) c/T, (ii) b/T, (iii) (b + c)/T | Disagreement was assessed as the proportion of items recorded as performed by one measure but not by the other. |
| Item-by-item comparisons: items treated as interchangeable within categories of behaviour | | | | | |
| Luck, 2000[12] | NR | 8 | 20 | Sensitivity = a/(a + c) | |
| Page, 1980[14] | 16-17 | 1 | 30 | Sensitivity = a/(a + c) | |
| Rethans, 1994[17] | 25-36 | 3 | 35 | Sensitivity = a/(a + c) | |
| Luck, 2000[12] | NR | 8 | 20 | Specificity = d/(b + d) | |
| Page, 1980[14] | | 1 | 30 | Specificity = d/(b + d) | |
| Gerbert, 1986[10] | 20 | 3 | 63 | Convergent validity = (a + d)/T | Convergent validity was assessed as the proportion of items showing agreement. |
| Page, 1980[14] | 16-17 | 1 | 30 | Convergent validity = (a + d)/T | |
| Comparisons of summary scores for each consultation: summary scores were the number (or proportion) of recommended items performed | | | | | |
| Luck, 2000*[12] | NR | 8 | 20 | Analysis of variance to compare means of scores on direct measure and proxy. | Summary score: sum over items i of xijk |
| Pbert, 1999*[15] | 15 | 9 | 12 | Analysis of variance to compare means of scores on direct measure and proxy. | |
| Rethans, 1987*[18] | 24 | 1 | 25 | Paired t-tests to compare means of scores on direct measure and proxy. | |
| Pbert, 1999*[15] | 15 | 9 | 12 | Pearson correlation of the scores on direct measure and proxy. | |
| Comparisons of summary scores for each clinician: summary scores were the number (or proportion) of recommended items performed | | | | | |
| O'Boyle, 2001[13] | 1 | NA | 120 | Comparison of means of scores on direct measure and proxy. | Summary score: sum over items i and consultations j of xijk |
| O'Boyle, 2001*[13] | 1 | NA | 120 | Pearson correlation of scores on direct measure and proxy. | |
| Rethans, 1994*[17] | 25-36 | 3 | 25 | Pearson correlation of scores on direct measure and proxy. | |
| Comparisons of summary scores for each consultation: summary scores were weighted sums of the number of recommended items performed | | | | | |
| Peabody, 2000*[16] | 21 | 8 | 28 | Analysis of variance to compare means of scores on direct measure and proxy. | Summary score: sum over items i of ωi × xijk |
| Page, 1980*[14] | 16-17 | 1 | 30 | Pearson correlation of scores on direct measure and proxy. | |

a, b, c, d, and T are defined in Table 1. i = item; j = consultation; k = clinician; ni = average number of items per consultation; nj = average number of consultations per clinician; nk = average number of clinicians assessed; ωi = weight for the ith item; xijk = 1 if the item is performed and 0 if it is not. NR = not reported; NA = not applicable. * This study used this method for hypothesis testing.
Dickinson et al. Implementation Science 2010 5:20 doi:10.1186/1748-5908-5-20 |
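
The item-level formulas in Table 2 can be illustrated with a minimal Python sketch. Table 1 is not reproduced here, so the cell layout is an assumption inferred from the sensitivity and specificity formulas: a = item recorded as performed by both measures, b = by the proxy only, c = by the direct measure only, d = by neither, with T = a + b + c + d. The function names and the example data are hypothetical and are not taken from any of the included reports.

```python
def two_by_two(direct, proxy):
    """Count the cells of the 2x2 table from paired 0/1 item records.

    Assumed layout (inferred, not taken from Table 1):
    a = both measures, b = proxy only, c = direct only, d = neither.
    """
    pairs = list(zip(direct, proxy))
    a = sum(1 for x, y in pairs if x == 1 and y == 1)
    b = sum(1 for x, y in pairs if x == 0 and y == 1)
    c = sum(1 for x, y in pairs if x == 1 and y == 0)
    d = sum(1 for x, y in pairs if x == 0 and y == 0)
    return a, b, c, d


def item_statistics(a, b, c, d):
    """Apply the Table 2 formulas; assumes no denominator is zero."""
    T = a + b + c + d
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        # Agreement compares these two proportions of items performed.
        "proxy_proportion": (a + b) / T,
        "direct_proportion": (a + c) / T,
        "kappa": 2 * (a * d - b * c) / ((a + c) * (c + d) + (b + d) * (a + b)),
        "disagreement": (b + c) / T,
        "convergent_validity": (a + d) / T,
    }


def summary_score(x, weights=None):
    """Number of items performed, or a weighted sum if weights are given."""
    if weights is None:
        return sum(x)
    return sum(w * xi for w, xi in zip(weights, x))


# Hypothetical records for ten items in one consultation (1 = performed).
direct = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
proxy = [1, 0, 1, 0, 1, 1, 0, 1, 0, 0]

cells = two_by_two(direct, proxy)
for name, value in item_statistics(*cells).items():
    print(f"{name}: {value:.3f}")
print("direct summary score:", summary_score(direct))
print("proxy summary score:", summary_score(proxy))
```

The kappa expression is algebraically the same as the usual (observed agreement - chance agreement)/(1 - chance agreement) form for a 2x2 table, so the sketch can be checked against that definition. Comparing summary scores across consultations or clinicians (paired t-tests, analysis of variance, Pearson correlation) would take these scores as input and is not shown.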