
Cognitive Diagnostic Attribute-Level Discrimination Indices

Date post: 14-May-2023

Cognitive Diagnostic Discrimination 1

Running head: Cognitive Diagnostic Discrimination

Cognitive Diagnostic Attribute Level Discrimination Indices

Robert Henson, William Stout, Jeff Douglas, Xuming He and Louis Roussos

University of Illinois

Submitted For ETS Review for Publication

Correspondent:

Robert Henson
604 Haines Blvd.
Champaign, IL 61820

The research reported here was completed under the auspices of the External Diagnostic Research Team, supported by Educational Testing Service. The opinions and statements expressed herein are those of the authors and do not necessarily reflect the position of ETS.


Abstract

Cognitively Diagnostic Models (CDMs) model the probability of correctly answering

an item as a function of an examinee’s attribute mastery pattern. Since estimation

of the mastery pattern involves more than a continuous measure of ability,

reliability concepts introduced by CTT and IRT do not apply. Henson and Douglas

(2004) define the CDI as a measure of an item’s overall discrimination power, which

indicates an item’s usefulness in examinee attribute pattern estimation. Because of

its relationship with correct classification rates, the CDI was shown to be

instrumental in cognitively diagnostic test assembly. This paper generalizes the

CDI to attribute level discrimination indices for an item. Three different attribute

level discrimination indices are defined and their relationship with correct

classification rates is explored using Monte Carlo simulations. It is found that there

are strong relationships between the defined attribute indices and correct

classification rates. Because of their relationship with attribute level correct

classification rates, one important potential application of these indices is test

assembly from a CDM calibrated item bank.


Cognitive Diagnostic Attribute Level Discrimination Indices

Introduction

If the goal of a test is to accurately measure an examinee’s general ability,

indices exist that can provide an indication of how reliably a test measures that

ability. However, modern methods for skills diagnosis are interested in determining

mastery on K traits rather than assessing general ability. For instance, to correctly

respond to an item there is often a set of required steps that must be completed

successfully. If an examinee has mastered all of the attributes required for these

steps, it is likely that the item will be answered correctly.

Cognitive diagnostic models (CDMs) are used to estimate a (1 x K) vector, α,

indicating each examinee’s pattern of mastery for a set of attributes given the

dichotomously scored responses to the items in a test for a set of examinees. In

addition, a (J x K) indicator matrix, Q, where J is the number of items, is used to

specify which items require what attributes. Specifically, if qjk = 1 the kth attribute

is required by the jth item and if qjk = 0 the kth attribute is not required by the jth

item.

An example of a cognitive diagnostic model, which will be used throughout

this paper as an example,¹ is the Reparameterized Unified Model (DiBello, Stout, and Roussos, 1995; Hartz, 2002). Other examples of cognitively diagnostic models

exist and include the DINA (Junker and Sijtsma, 2001), NIDA (Maris, 1999), and

several others. The Reparameterized Unified Model, RUM, includes three different

item parameters, π∗j, r∗jk, and Pcj, for j = 1, . . . , J (number of items) and k = 1, . . . , K (number of attributes). The probability of a correct response for the ith examinee given αi and ηi is:

\[
P(X_{ij} = 1 \mid \alpha_i, \eta_i) = \pi^*_j \prod_{k=1}^{K} {r^*_{jk}}^{(1-\alpha_{ik})\,q_{jk}} \, P_{c_j}(\eta_i), \tag{1}
\]

Here, π∗j is the probability of correctly applying all required attributes for the jth

item given the examinee has mastered all required attributes for that item, r∗jk

represents the discrimination of the jth item for the kth attribute (notice r∗ is a J x K matrix), qjk is an indicator for whether the jth item requires mastery of the kth attribute, Pcj is the Rasch Model with difficulty parameter −cj, and ηi is a general measure of the ith examinee’s knowledge not otherwise specified by the Q-matrix. In the current research cj is assumed to equal ∞ and therefore Pcj = 1 and is excluded

from the model.
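As a rough illustration of equation (1) with Pcj = 1, the response probability can be sketched in a few lines of Python. The function name `rum_prob` and the item parameters below are hypothetical, chosen only to show how each required but unmastered attribute scales π∗j by its r∗jk.

```python
from math import prod

def rum_prob(pi_star, r_star, q, alpha):
    """P(X_ij = 1 | alpha_i) under the RUM, assuming P_cj = 1 (c_j = infinity)."""
    # The exponent (1 - alpha_k) * q_k is 1 only when attribute k is required
    # by the item and not mastered by the examinee; otherwise the factor is 1.
    return pi_star * prod(r ** ((1 - a) * qk) for r, qk, a in zip(r_star, q, alpha))

# Hypothetical item that requires attribute 1 only (q = (1 0)).
p_master = rum_prob(0.8, [0.125, 1.0], [1, 0], [1, 0])      # attribute 1 mastered
p_nonmaster = rum_prob(0.8, [0.125, 1.0], [1, 0], [0, 1])   # attribute 1 not mastered
```

Because the second attribute is not required, its mastery status has no effect on either probability.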

As was mentioned, CDMs model the probability of correctly answering an

item as a function of an examinee’s attribute mastery pattern. Since estimation of

the mastery pattern involves more than estimation of a continuous ability, measures

of reliability initially introduced by CTT and IRT do not apply. For example, the

concept of reliability in CTT is defined as the proportion of the variance of the

observed score that can be accounted for by the variance of the continuous latent

true score (Lord & Novick, 1968). However, in CDMs, an individual is or is not

correctly classified and therefore the interpretation of a reliability coefficient such as

Cronbach’s α is not the same as the interpretation when using CTT. Also, the

concept of Fisher information is no longer applicable. Mathematically, Fisher

information is defined as the negative expectation of the second derivative of the

log-likelihood for a specified ability (Lord, 1980). Since attribute patterns are in a


discrete space, it is not possible to compute the Fisher information at a specific

attribute pattern. In summary, there is not a clear choice of index that measures

the effectiveness of a skills diagnostic test such as CTT’s reliability or IRT’s Fisher

information.

Instead of the indices, or measures, that are traditionally used in CTT or IRT,

Henson and Douglas (2004) suggest using the CDI as a measure of an item’s

(or test’s) discrimination power. The CDI is a Kullback-Leibler based index that is

related to the distances between the item response probability distributions for each

attribute pattern. They show that the CDI strongly relates to the average correct

classification rates of examinees for a test. Because of this, Henson and Douglas

(2004) show that the CDI can be a useful index for item selection in test assembly.

Specifically, to assemble a test from an item bank, those items with the largest

CDIs should be selected first. Given the relationship between the CDI and average

correct classification rates across attributes, this test will have a high correct

classification rate when compared to all other tests that could be constructed from

the same item bank.

Because the CDI is a summary of the item’s overall discriminating power, it

does not indicate an item’s discrimination power for a specific attribute. In

addition, by its definition, the CDI ignores which attributes are required by which

items (that is Q). If items are selected based only on the CDI, it would be possible

to construct a test whose items measure one or more of the test’s attributes poorly.

Therefore, it is necessary to expand the CDI to a set of indices that measure the

discrimination power of an item for each attribute, which incorporates Q. If the

attribute level indices are constructed similar to the CDI, one would expect an


attribute discrimination index to have a strong association with correct

classification rates for that attribute. Tests composed only of items with large attribute indices will also have high correct classification rates. In addition, item selection for test assembly based on attribute discrimination indices will not suffer from the same limitations as the CDI. In particular, constructing tests that satisfy correct classification rate requirements for each attribute should be possible,

solving an important test assembly problem.

We propose three attribute discrimination indices. Because the discrimination

indices are based on the Kullback-Leibler information, as is the CDI, we provide a

description of the Kullback-Leibler information. Then, the three attribute

discrimination indices are discussed and a Monte Carlo simulation study is used to

demonstrate the strong relationship between each index and correct classification

rates.

Kullback-Leibler Information

Kullback-Leibler information (Lehmann & Casella, 1998) is generally thought

of as a measure of distance between any two probability distributions, f(x) and

g(x). The Kullback-Leibler information (about the distribution f) is defined as

\[
K[f, g] = E_f\left[\log\left[\frac{f(X)}{g(X)}\right]\right], \tag{2}
\]

where the measure K[f, g] is equal to the expectation, assuming f(x) is the true

distribution, of the log-likelihood ratio of any two probability density functions f(x)

and g(x). X denotes the random data and can be a scalar or vector. K[f, g] is

similar to a distance measure in that as it increases it is easier to statistically

discriminate between the two distributions (Lehmann & Casella, 1998, p. 259). In addition, K[f, g] ≥ 0, with equality when and only when f equals g.

Kullback-Leibler information is not new to educational assessment. Chang and

Ying (1996) suggest using the Kullback-Leibler information instead of Fisher

information as a more effective index for item selection in computer adaptive tests

based on IRT models. The Kullback-Leibler information in IRT can be thought of

as global information where Fisher information is local. More specifically, Fisher

information describes the ability to differentiate among abilities that are close to

one another. Specialized to the unidimensional IRT setting, Kullback-Leibler

information is defined for all ability pairs θ and θ′ (Chang & Ying, 1996). Unlike

Fisher information, Kullback-Leibler information does not require that the

parameter space is a continuum and is hence suitable for CDMs where the attribute

vector, α, is a discrete parameter. Therefore, it is our intent to generalize Chang

and Ying’s (1996) results using Kullback-Leibler information as a basis for skills

diagnostic test construction with CDMs.

As in IRT, for the CDMs considered here the item response, X, is a

dichotomous variable (i.e., an examinee either gets the item right or wrong). In

addition, the probability distribution of X, Pα(X), depends on the pattern of

attribute mastery, α, and therefore the results from Chang and Ying (1996) easily

generalize to CDMs. According to Kullback-Leibler information, an item is useful in

determining the difference between the true attribute mastery pattern, α, and an

alternative attribute mastery pattern, α′, if Kullback-Leibler information for the

comparison of Pα(X) and Pα′(X),

\[
K[\alpha, \alpha'] = E_\alpha\left[\log\left[\frac{P_\alpha(X)}{P_{\alpha'}(X)}\right]\right], \tag{3}
\]

is large, where Pα(X) and Pα′(X) are the probability distributions of X


conditional on α and α′, respectively.

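Since X is dichotomous, (3) can be written as

\[
\sum_{x=0}^{1} P_\alpha(x)\log\left[\frac{P_\alpha(x)}{P_{\alpha'}(x)}\right],
\]

namely

\[
P_\alpha(1)\log\left[\frac{P_\alpha(1)}{P_{\alpha'}(1)}\right] + P_\alpha(0)\log\left[\frac{P_\alpha(0)}{P_{\alpha'}(0)}\right].
\]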

Pα(1) and Pα′(1) are defined as the probability of a correct response using the

Reparameterized Unified Model (RUM), and Pα(0) = 1− Pα(1).
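The two-term form above translates directly into code. The following is a minimal sketch, assuming natural logarithms and an alternative probability strictly between 0 and 1; the function name `kl_bernoulli` and the probabilities 0.9 and 0.3 are hypothetical.

```python
from math import log

def kl_bernoulli(p, p_alt):
    """K[alpha, alpha'] for one dichotomous item.

    p     = P_alpha(1), correct-response probability under the true pattern
    p_alt = P_alpha'(1), the same probability under the alternative pattern
    Assumes 0 < p_alt < 1 so that both log ratios are defined.
    """
    total = 0.0
    for px, qx in ((p, p_alt), (1 - p, 1 - p_alt)):
        if px > 0:  # a term with P_alpha(x) = 0 contributes nothing
            total += px * log(px / qx)
    return total

k_forward = kl_bernoulli(0.9, 0.3)   # expectation taken under p = 0.9
k_reverse = kl_bernoulli(0.3, 0.9)   # expectation taken under p = 0.3
```

The two directions generally give different values, which is the asymmetry the text returns to when counting pattern comparisons.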

It is also possible to compute attribute pattern based Kullback-Leibler

information at the test level. Kullback-Leibler information for a test compares the

probability distribution for a test vector of J item responses, X, given attribute

pattern, α, when compared to the probability distribution of X given an alternative

attribute pattern, α′. The Kullback-Leibler test information can be written as

\[
K_t[\alpha, \alpha'] = E_\alpha\left[\log\left[\frac{P_\alpha(\mathbf{X})}{P_{\alpha'}(\mathbf{X})}\right]\right]. \tag{4}
\]

Since one assumption of latent cognitive diagnostic models is independence among

items conditional on the attribute patterns α2, (4) can be written as

\[
K_t[\alpha, \alpha'] = E_\alpha\left[\sum_{j=1}^{J}\log\left[\frac{P_\alpha(X_j)}{P_{\alpha'}(X_j)}\right]\right],
\]

which simplifies to

\[
K_t[\alpha, \alpha'] = \sum_{j=1}^{J} E_\alpha\left[\log\left[\frac{P_\alpha(X_j)}{P_{\alpha'}(X_j)}\right]\right]. \tag{5}
\]

Equation 5 is the sum of the Kullback-Leibler information for each item in the

exam. Thus, the Kullback-Leibler test information is additive over items, an

important and useful property. Kt[α,α′] has an interpretation related to the power


of the likelihood ratio test for the null hypothesis that the true parameter is α

versus the alternative hypothesis that the true parameter is α′, conducted at a fixed

significance level (Rao, 1962). To be specific, if βJ(α, α′) denotes the probability of

a type II error for a test of length J , the following relationship holds,

\[
\lim_{J \to \infty} \frac{\log[\beta_J(\alpha, \alpha')]}{-K_{tJ}[\alpha, \alpha']} = 1.
\]

Thus, the Kullback-Leibler information for discriminating between α and α′ is (for

a long test) monotonically related to the power of the most powerful test (the

likelihood ratio test) of α versus α′.
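The additivity in Equation 5 can be checked numerically on a short test. The sketch below is illustrative only: it enumerates all response vectors to evaluate the test-level expectation in (4) and compares the result with the sum of item-level values. The function names and the three pairs of response probabilities are hypothetical.

```python
from itertools import product
from math import log, prod

def item_kl(p, q):
    """Item-level Kullback-Leibler information for a dichotomous item."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

def kl_test_level(ps, qs):
    """Equation (4): expectation over all 2^J response vectors under alpha,
    using conditional independence of the items to build each likelihood."""
    total = 0.0
    for x in product((0, 1), repeat=len(ps)):
        like_true = prod(p if xi else 1 - p for p, xi in zip(ps, x))
        like_alt = prod(q if xi else 1 - q for q, xi in zip(qs, x))
        total += like_true * log(like_true / like_alt)
    return total

ps = [0.8, 0.7, 0.9]    # hypothetical P_alpha(X_j = 1) for J = 3 items
qs = [0.1, 0.35, 0.6]   # hypothetical P_alpha'(X_j = 1)
brute = kl_test_level(ps, qs)
summed = sum(item_kl(p, q) for p, q in zip(ps, qs))
```

Under conditional independence the two quantities agree up to floating-point error, so the enumeration is needed only as a check and never for actual test assembly.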

One complication of Kullback-Leibler information is that it only compares two

attribute patterns when there are 2^K possible attribute mastery patterns. Since Kullback-Leibler information is not symmetric, there are a total of 2^K(2^K − 1) possible comparisons. To organize the 2^K(2^K − 1) comparisons of all attribute pattern pairs for the jth item, it is natural to define a (2^K x 2^K) matrix, KLj, such that each u, v element (u, v indexing possible attribute patterns) equals

\[
KL_{juv} = E_{\alpha_u}\left[\log\left[\frac{P_{\alpha_u}(x_j)}{P_{\alpha_v}(x_j)}\right]\right].
\]

For example, if the RUM is used

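\[
\begin{aligned}
KL_{juv} ={}& \pi^*_j \prod_{k=1}^{K} {r^*_{jk}}^{(1-\alpha_{uk})q_{jk}}
\log\left[\frac{\pi^*_j \prod_{k=1}^{K} {r^*_{jk}}^{(1-\alpha_{uk})q_{jk}}}{\pi^*_j \prod_{k=1}^{K} {r^*_{jk}}^{(1-\alpha_{vk})q_{jk}}}\right] \\
&+ \left(1 - \pi^*_j \prod_{k=1}^{K} {r^*_{jk}}^{(1-\alpha_{uk})q_{jk}}\right)
\log\left[\frac{1 - \pi^*_j \prod_{k=1}^{K} {r^*_{jk}}^{(1-\alpha_{uk})q_{jk}}}{1 - \pi^*_j \prod_{k=1}^{K} {r^*_{jk}}^{(1-\alpha_{vk})q_{jk}}}\right]
\end{aligned}
\tag{6}
\]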

where it is assumed for simplicity that all cj = ∞ and αuk represents the kth

element of the attribute mastery vector, αu. Notice that if one has an item bank of

N items, KLj can be computed for each item j. The elements of each KLj that are


large indicate the attribute pattern pairs that the jth item is most useful in

discriminating among. In addition, the total Kullback-Leibler information matrix

can be defined for any test of J items, KLt, by simply summing across the KLj of

the items selected. To construct a test one might, for example, choose items such

that all of the elements in KLt are large and therefore the power to discriminate

between any two attribute patterns is high.
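One possible way to build KLj from equation (6) and sum it into KLt is sketched below. It assumes natural logarithms, Pcj = 1, and patterns ordered (0 0), (0 1), (1 0), (1 1) for K = 2; the three items and all function names are hypothetical.

```python
from itertools import product
from math import log, prod

def rum_prob(pi_star, r_star, q, alpha):
    """RUM correct-response probability with P_cj = 1."""
    return pi_star * prod(r ** ((1 - a) * qk) for r, qk, a in zip(r_star, q, alpha))

def kl_matrix(pi_star, r_star, q):
    """KL_j from equation (6): rows index alpha_u (the pattern under which the
    expectation is taken), columns index the alternative alpha_v."""
    patterns = list(product((0, 1), repeat=len(q)))
    p = [rum_prob(pi_star, r_star, q, a) for a in patterns]
    return [[pu * log(pu / pv) + (1 - pu) * log((1 - pu) / (1 - pv)) for pv in p]
            for pu in p]

def sum_matrices(mats):
    """KL_t: element-wise sum of the selected items' KL_j matrices."""
    n = len(mats[0])
    return [[sum(m[u][v] for m in mats) for v in range(n)] for u in range(n)]

# Hypothetical items given as (pi*, r* row, Q-matrix row).
items = [(0.9, [0.4, 1.0], [1, 0]),
         (0.85, [1.0, 0.3], [0, 1]),
         (0.8, [0.5, 0.6], [1, 1])]
kl_items = [kl_matrix(*item) for item in items]
kl_test = sum_matrices(kl_items)
```

With these items every off-diagonal element of `kl_test` is positive, because each pair of distinct patterns differs on an attribute that at least one item requires.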

Discrimination

Theoretically, each element of KLt could function as an indicator of how well

α is measured when compared to α′. However, in applications, it is not reasonable

to simultaneously consider the 2^K(2^K − 1) discrimination indices of an exam.

Therefore to produce a useful indication of test discrimination, it is imperative that

a single index, such as the CDI or 2K attribute level indices of discrimination,

based on the entries of KLt, be defined. Specifically, the kth discrimination index

should be useful to predict the correct classification rate achieved by an effective

statistical procedure for the kth attribute for a given test.

It should be noted that, while correct classification of an attribute has been discussed generically, there are two important components to correct classification: the correct classification rate of the masters, p(α̂k = 1|αk = 1), and the correct classification rate of the nonmasters, p(α̂k = 0|αk = 0). The purpose of the test (e.g., including differing costs and differing benefits of correct classification) can often determine whether the

correct classification of masters, or the correct classification of nonmasters, is more

important. Therefore, instead of only defining a single discrimination index for the


kth attribute, a discrimination index will be defined to help predict the correct

classification rate of the masters for the kth attribute, δk(1), and a discrimination

index to help predict the correct classification rate of nonmasters, δk(0).

It is our intent to show that effective indices at the attribute level can be

computed from the Kullback-Leibler matrix, KLt. In addition, by using only linear

combinations of the elements of KLt, the defined attribute discrimination index for

a test is simply the sum of each corresponding item attribute discrimination index

(that is additivity holds across items), which will allow attribute specific test

construction in a similar manner as described by Henson and Douglas (2004). The

following subsections provide definitions of three promising indices of attribute

discrimination, each with its own benefits and limitations.

Attribute Discrimination Index A (δAk )

Recall that some elements within KL are more important than others. Notice

that by using the attribute patterns that only differ on the kth attribute, the

corresponding KLjuv’s describe the extent to which a master can be discriminated

from a nonmaster, or a nonmaster from a master, on the kth attribute while holding

attribute mastery constant on the remaining (K − 1) attributes.

Of the attribute comparisons that differ only by the kth attribute, there are

2^(K−1) comparisons describing the discrimination power of masters from nonmasters on the kth attribute (i.e., comparing attribute patterns such that αk = 1 and α′k = 0), and there are 2^(K−1) comparisons describing the discrimination power of nonmasters from masters on the kth attribute (i.e., attribute patterns such that αk = 0 and α′k = 1). The first index will compute the mean of the elements in KLj that satisfy the constraints just defined. Specifically, equations (7) and (8) provide formal definitions of δAjk(1) and δAjk(0) in terms of the comparisons made in KL.

\[
\delta^A_{jk}(1) = \frac{1}{2^{K-1}} \sum_{(\alpha,\alpha') \in \Omega_1} KL_j(\alpha, \alpha') \tag{7}
\]

\[
\delta^A_{jk}(0) = \frac{1}{2^{K-1}} \sum_{(\alpha,\alpha') \in \Omega_0} KL_j(\alpha, \alpha') \tag{8}
\]

where

\[
\Omega_1 = \{(\alpha,\alpha') : \alpha_k = 1 \,\cap\, \alpha'_k = 0 \,\cap\, \alpha_v = \alpha'_v \;\forall v \neq k\} \tag{9}
\]

and

\[
\Omega_0 = \{(\alpha,\alpha') : \alpha_k = 0 \,\cap\, \alpha'_k = 1 \,\cap\, \alpha_v = \alpha'_v \;\forall v \neq k\}. \tag{10}
\]

Index δAjk provides a simple measure of the average discrimination that an item

contains about attribute k while controlling for the remaining attributes. It does

not incorporate prior knowledge about the testing population and therefore assumes

that all attribute patterns are equally likely. If the jth item does not measure the

kth attribute (i.e., the j, k element of the Q-matrix is 0) then that item contains no

information about attribute mastery for the kth attribute and therefore δAjk(1) and

δAjk(0) are zero. While the index has been defined at the item level, the test

discrimination, δAtk, is the sum of the item discriminations, as given in (11).

\[
\delta^A_{tk} = \sum_{j=1}^{J} \delta^A_{jk} \tag{11}
\]

Item discrimination is additive because the elements in KLj are additive, as described previously.
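Equations (7) and (8) amount to averaging selected cells of KLj. The sketch below is one way to do this, assuming patterns are ordered (0 0), (0 1), (1 0), (1 1) and attributes are indexed from 0 in code; the function name `delta_a` is hypothetical. The matrix is the one printed in the one-item example of the Examples section.

```python
from itertools import product

def delta_a(kl, n_attr, k, master=True):
    """Equations (7)/(8): mean of KL_j over pattern pairs that differ only
    on attribute k (0-based), with alpha_k = 1 (master) or 0 (nonmaster)."""
    patterns = list(product((0, 1), repeat=n_attr))
    row = {a: i for i, a in enumerate(patterns)}
    target = 1 if master else 0
    total = 0.0
    for a in patterns:
        if a[k] != target:
            continue
        a_alt = a[:k] + (1 - a[k],) + a[k + 1:]  # flip attribute k only
        total += kl[row[a]][row[a_alt]]
    return total / 2 ** (n_attr - 1)

# KL matrix as printed in the paper's one-item example.
KL = [[0, 0, 1.36, 1.36],
      [0, 0, 1.36, 1.36],
      [1.14, 1.14, 0, 0],
      [1.14, 1.14, 0, 0]]

d_a1_master = delta_a(KL, 2, 0, master=True)    # attribute 1, masters
d_a2_master = delta_a(KL, 2, 1, master=True)    # attribute 2, masters
```

For attribute 1 the selected cells are KL(3, 1) and KL(4, 2); for attribute 2 they are KL(2, 1) and KL(4, 3), which are both zero because the item does not require attribute 2.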


Attribute Discrimination Index B (δBk )

Though there are times when an individual may not have prior knowledge of

the specific population, often prior testing has been used to calibrate the items and

therefore there is some knowledge of the population characteristics. For example,

Hartz (2002) estimates attribute associations and the population probability of

mastery using the Fusion Model to fit the RUM. If the Fusion Model is fitted, prior

probabilities of attribute patterns are estimated. In addition, it can be argued that,

in general, there are not many cases such that all attribute patterns are equally

likely. Therefore, a second index, δBjk, is defined, as in equations (12) and (13), such

that the expectation given the distribution of α is used (i.e., the prior probabilities,

or estimates of the prior probabilities, of the attribute patterns are used to weight

the appropriate elements of KLj).

\[
\delta^B_{jk}(1) = E_\alpha[KL_j(\alpha, \alpha') \mid \Omega_1] \tag{12}
\]

\[
\delta^B_{jk}(0) = E_\alpha[KL_j(\alpha, \alpha') \mid \Omega_0] \tag{13}
\]

where Ω1 and Ω0 are defined in (9) and (10), respectively.

Provided that the distribution of α is known, or can be estimated, equation (12) can be rewritten as

\[
\delta^B_{jk}(1) = \sum_{\Omega_1} w\, KL_j(\alpha, \alpha'), \tag{14}
\]

where

\[
w = P(\alpha \mid \alpha_k = 1),
\]

and (13) can be rewritten as

\[
\delta^B_{jk}(0) = \sum_{\Omega_0} w\, KL_j(\alpha, \alpha'), \tag{15}
\]

where

\[
w = P(\alpha \mid \alpha_k = 0).
\]

Like δAjk, δBjk provides a simple measure of discrimination, but prior population information is used to weight the elements of KLj, giving comparisons for which α is more likely higher weights than comparisons for less likely attribute patterns. δBjk is interpreted as the amount of information, about attribute k, provided by an item. It should be noticed that, if all P (α|αk = 1) are equal, then δBjk(1) = δAjk(1) and, if all P (α|αk = 0) are equal, then δBjk(0) = δAjk(0). Therefore δAjk is a special case of δBjk.

Again, as with the discrimination index δAjk, additivity holds:

\[
\delta^B_{tk} = \sum_{j=1}^{J} \delta^B_{jk}. \tag{16}
\]
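The weighted sums in (14) and (15) can be sketched as follows. The function name `delta_b` is hypothetical, attributes are indexed from 0 in code, and the weights are formed by normalizing the pattern probabilities within the group of masters (or nonmasters) of attribute k. The matrix and the pattern probabilities are those used in the one-item example of the Examples section.

```python
def delta_b(kl, patterns, prior, k, master=True):
    """Equations (14)/(15): prior-weighted mean of KL_j over pattern pairs
    that differ only on attribute k (0-based)."""
    row = {a: i for i, a in enumerate(patterns)}
    target = 1 if master else 0
    chosen = [a for a in patterns if a[k] == target]
    norm = sum(prior[a] for a in chosen)             # P(alpha_k = target)
    total = 0.0
    for a in chosen:
        a_alt = a[:k] + (1 - a[k],) + a[k + 1:]      # flip attribute k only
        total += (prior[a] / norm) * kl[row[a]][row[a_alt]]  # w = P(alpha | alpha_k)
    return total

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
KL = [[0, 0, 1.36, 1.36],
      [0, 0, 1.36, 1.36],
      [1.14, 1.14, 0, 0],
      [1.14, 1.14, 0, 0]]
prior = {(0, 0): 0.27, (0, 1): 0.43, (1, 0): 0.03, (1, 1): 0.27}

d_b1_master = delta_b(KL, patterns, prior, 0, master=True)
```

Because both selected cells hold the same value in this example, the weighting does not change the result; with an item requiring both attributes the weighted and unweighted indices would generally differ.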

Attribute Discrimination Index C (δCk )

Both δAjk and δBjk are useful indices for discrimination in that they measure the discriminating power of an item in assessing the kth attribute. However, the intent is to define an index that is most strongly associated with correct classification rates. It is possible that δAk and δBk are not taking full advantage of all the

information available. As an illustrative example, if two attributes k and k′ are

perfectly correlated (i.e., if an examinee is a master of attribute k then he or she is

also a master of attribute k′, and if an examinee is a nonmaster of attribute k then

he or she is also a nonmaster of attribute k′), then by knowing attribute k is

mastered by an examinee and the fact that the correlation between k and k′ is 1,

attribute k′ is also known to be mastered by the examinee. Therefore, an item that

contains information about attribute k can also provide information about k′, even

if the item does not require k′ for its solution. A discrimination index may need to


incorporate all the information provided from the association between attributes, if

such information is available.

The index δCjk assumes that if attributes are associated, the discrimination of

the kth attribute provided by an item is a function of both the information about αk

contained in the item and the information provided from the known or estimated

associations of αk with other attributes measured by the test. Abstractly speaking,

to incorporate the additional information provided from the association of other

attributes, associated attributes are first re-expressed as a function of a set of newly

defined independent attributes. The likelihood functions used to compute entries of

a KLj can then be re-expressed as a function of the independent attributes and the

Kullback-Leibler information computed for all attribute pairs. Notice that since the

true attributes are associated, each attribute will typically be a function of more

than one of the independent attributes. For this reason, it is possible for an item

that does not measure αk to provide information about αk.

Specifically, we define a set of independent attributes for the ith subject,

α∗1, · · · , α∗K, such that P (α∗k′ = 1|α∗k ∀k ≠ k′) = P (α∗k′ = 1) for all k ≠ k′. To compute the discrimination index for the kth attribute, the association of each attribute with the kth attribute is modeled by expressing the true attributes for the ith examinee, αi1, · · · , αiK, as a function of the independent attributes as given in (17),

\[
\alpha_{im} = b_{im}\,\alpha^*_{ik} + (1 - b_{im})\,\alpha^*_{im}; \quad \forall i = 1, \ldots, I. \tag{17}
\]

Here, bim is a random Bernoulli variable for the ith examinee with probability pbm

and all bim are assumed to be independent in m for each fixed i. By definition, as

the association between the attributes increases the pbm are chosen to be larger.

However, one must consider that for a randomly selected examinee all 2^K sequences of the bm’s for m = 1, . . . , K are possible (since all bm are random independent Bernoulli variables). B^l will be used to denote the vector of the lth possible combination of b1, · · · , bK, where l = 1, . . . , 2^K.

The RUM is written in terms of the independent attributes using equation (17) and the Kullback-Leibler matrix with respect to attribute k, KL^l_jk, for the jth item and lth combination of (b1, · · · , bK), denoted B^l = (B^l_1, · · · , B^l_K), can be computed. It should be noted that in the Kullback-Leibler equation (6), all of the attributes α∗1, . . . , α∗K now potentially play a role, as determined by B^l via Equation (17). In addition, for any KL^l_jk, discrimination indices for the kth attribute, ∆^l_jk(1) and ∆^l_jk(0), can be computed using the equations analogous to δBk(1) and δBk(0) written in terms of the independent attributes, α∗1, . . . , α∗K. Specifically,

\[
\Delta^l_{jk}(1) = \sum_{\alpha^* \in \Omega^*_{1k}} w\, KL^l_{jk}(\alpha^*, \alpha'^*), \tag{18}
\]

where

\[
w = P(\alpha^* \mid \text{elements in } \Omega^*_{1k}),
\]

and

\[
\Delta^l_{jk}(0) = \sum_{\alpha^* \in \Omega^*_{0k}} w\, KL^l_{jk}(\alpha^*, \alpha'^*), \tag{19}
\]

where

\[
w = P(\alpha^* \mid \text{elements in } \Omega^*_{0k}).
\]

Here Ω∗1k and Ω∗0k are defined as

\[
\Omega^*_{1k} = \alpha^*_k = 1 \,\cap\, \alpha'^*_k = 0 \,\cap\, \alpha^*_v = \alpha'^*_v \;\forall v \neq k \tag{20}
\]

and

\[
\Omega^*_{0k} = \alpha^*_k = 0 \,\cap\, \alpha'^*_k = 1 \,\cap\, \alpha^*_v = \alpha'^*_v \;\forall v \neq k. \tag{21}
\]

Notice that KL^l_jk is computed for all 2^K vectors B^l. In addition, ∆^l_jk(1) and ∆^l_jk(0) can be computed for each KL^l_jk. We define the discrimination indices δCjk(1) and δCjk(0) as the expectation of ∆^l_jk(1) and ∆^l_jk(0), respectively, across all possible combinations B^l for all l = 1, . . . , 2^K, as determined by the Bernoulli trials distribution for B^l.

\[
\delta^C_{jk}(1) = E_{B^l}\left[\Delta^l_{jk}(1)\right] \tag{22}
\]

\[
\delta^C_{jk}(0) = E_{B^l}\left[\Delta^l_{jk}(0)\right] \tag{23}
\]

Equations (22) and (23) can be written as

\[
\sum_{l=1}^{2^K} w_l\, \Delta^l_{jk}(1) \tag{24}
\]

and

\[
\sum_{l=1}^{2^K} w_l\, \Delta^l_{jk}(0) \tag{25}
\]

where

\[
w_l = \prod_{m=1}^{K} \left[ p_{b_m}^{B^l_m} (1 - p_{b_m})^{1 - B^l_m} \right]. \tag{26}
\]
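The weights in (26) and the weighted sums in (24) and (25) can be sketched as below. The function names are hypothetical. The Bernoulli probabilities 0.70 and 0.26 are an assumption: they were chosen because they approximately reproduce the rounded probabilities 0.22, 0.08, 0.52, and 0.18 quoted for B^1 to B^4 in the Examples section. The ∆ values are placeholders, not results from the paper.

```python
from itertools import product
from math import prod

def b_vector_weights(p_b):
    """Equation (26): probability w_l of each of the 2^K vectors B^l,
    treating the K components as independent Bernoulli trials."""
    weights = {}
    for B in product((0, 1), repeat=len(p_b)):
        weights[B] = prod(p if b else 1 - p for p, b in zip(p_b, B))
    return weights

def expected_delta(deltas, weights):
    """Equations (24)/(25): weighted sum of Delta^l over all B^l."""
    return sum(weights[B] * deltas[B] for B in weights)

weights = b_vector_weights([0.70, 0.26])   # assumed p_b1, p_b2
# Placeholder Delta^l values, one per vector B^l (hypothetical numbers).
deltas = {(0, 0): 1.0, (0, 1): 0.9, (1, 0): 1.2, (1, 1): 1.1}
delta_c = expected_delta(deltas, weights)
```

The weights always sum to 1, so δCjk lies between the smallest and largest ∆^l values.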

As in the previous discrimination index, δBk , this discrimination index

incorporates information about the population by using prior probabilities of all

attribute patterns as weights to determine comparisons that are more likely. In

addition to using the prior probabilities of each attribute pattern to determine

weights, δCk also uses the association between each attribute pattern pair in defining

the individual Kullback-Leibler elements. By incorporating the association between

attributes, the discrimination of the kth attribute is a function of both the

information contained about attribute k in the item, or test, and information

provided by the estimated correlations of the other attributes with the kth attribute.

Also, additivity will hold as in δA and δB.

It should be noted that if the attributes are uncorrelated, pbk = 0 for all k = 1, · · · , K and therefore δCk = δBk. In addition, if all attributes are uncorrelated and all conditional probabilities used to produce the weights for index B are equal, then it is also true that δCk = δBk = δAk.

Discrimination Index Discussion

The CDIt, δAk, δBk, and δCk are indices based on the Kullback-Leibler

information. However, there are some basic philosophical differences between the

indices that should be addressed. Specifically, the indices differ in the extent that

the characteristics of the population influence the value of the index and they differ

in their interpretation.

Both the test discrimination, CDI, and the attribute discrimination index A,

δAk , are computed from the Kullback-Leibler information matrix and the Hamming

distances between all pairs of attributes. They do not incorporate any

characteristics of the population (i.e., distribution probabilities of attribute

patterns) and therefore they must be considered only as a general index of the

amount of discrimination power a test provides to differentiate between any two

attribute patterns. Recall that both the discrimination indices for CTT and the

efficiency of a test in IRT are functions of the population characteristics. However,

when using either the CDI or δAk , there is an implicit assumption that items are

equally discriminating for all populations. There may be instances when a test is

less informative because many of the attribute comparisons for which the test is

highly discriminating are unlikely in the population.

Secondly, δBk and δCk are influenced by the population parameters, but are conceptually different in the extent to which such information is incorporated into the index. Notice that δBk weights the comparisons by the prior probability of the attribute pattern. Therefore, if an attribute pattern is unlikely in a population, those comparisons will have small weights. However, δBk only incorporates the amount of discrimination contained by a test. If a test does not measure an attribute, δBk, as well as δAk, will indicate that the test contains no direct discrimination power about that attribute, which is different from δCk.

The attribute discrimination index δCk allows for information, in addition to

what is provided directly by the test, to contribute to the discriminating power of a

test. Specifically, if an association exists between the attributes, then δCk is defined

by the information provided by the test and the information provided from the

association with any additional attributes measured by the test. Therefore, δCk

should be interpreted as the total information provided about an attribute.

Examples

Now a simple one-item example calibrated using the RUM will illustrate the calculations of the four indices (i.e., the CDI, δA1(1), δB1(1) and δC1(1)). The single item has an r∗ equal to 0.125, a π∗ = .8, and a Q-matrix entry equal to (1 0). Notice that the Q-matrix entry indicates that only the first of the two attributes is required to correctly answer the item.

To compute the CDI, δAk, and δBk the matrix KL must be calculated using equation (??). For the example,

\[
KL = \begin{bmatrix}
0 & 0 & 1.36 & 1.36 \\
0 & 0 & 1.36 & 1.36 \\
1.14 & 1.14 & 0 & 0 \\
1.14 & 1.14 & 0 & 0
\end{bmatrix}.
\]

In KL, rows 1-4 represent examinees who have not mastered either attribute, (0 0),

examinees who have mastered only the second attribute, (0 1), examinees who have

mastered only the first attribute, (1 0), and examinees who have mastered both

attributes (1 1), respectively. The same is true for columns 1-4. The i, j element of

KL is the Kullback-Leibler information of the ith attribute pattern versus the jth

attribute pattern, K[i, j].
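As a check on the two nonzero values in KL, they can be recomputed from the item parameters. This is a minimal sketch assuming natural logarithms and Pcj = 1; the function and variable names are illustrative. Under these assumptions the expectation taken under a master of attribute 1 comes out near 1.36 and the expectation taken under a nonmaster near 1.15 (1.14 when truncated), so which value occupies which block of the printed matrix depends on whether rows or columns index the pattern under which the expectation is taken.

```python
from math import log

pi_star, r_star = 0.8, 0.125
p_master = pi_star               # attribute 1 mastered: P(X = 1) = pi*
p_nonmaster = pi_star * r_star   # attribute 1 not mastered: P(X = 1) = pi* r*

def kl(p, p_alt):
    """Kullback-Leibler information with the expectation taken under p."""
    return p * log(p / p_alt) + (1 - p) * log((1 - p) / (1 - p_alt))

kl_master_true = kl(p_master, p_nonmaster)      # master versus nonmaster
kl_nonmaster_true = kl(p_nonmaster, p_master)   # nonmaster versus master
```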

To compute δA1 (1) only the elements that correspond to comparisons of

examinee patterns (1 x) to (0 x) are considered, where x is either a 1 or 0, as

defined in equation (9). Specifically, only the bold elements in

\[
KL = \begin{bmatrix}
0 & 0 & 1.36 & 1.36 \\
\mathit{0} & 0 & 1.36 & 1.36 \\
\mathbf{1.14} & 1.14 & 0 & 0 \\
1.14 & \mathbf{1.14} & \mathit{0} & 0
\end{bmatrix} \tag{27}
\]

are considered. For example, KL(3, 1) represents the comparison of examinee pattern (1 0) to examinee pattern (0 0). Since δA1(1) is the average of the bold numbers,

\[
\delta^A_1(1) = \frac{1.14 + 1.14}{2} = \frac{2.28}{2} = 1.14.
\]

The discrimination index can also be computed for attribute 2, δA2(1), using the italicized values in (27). Since the item does not require attribute 2, δA2(1) = 0. Using similar equations, δA1(0) and δA2(0) can be computed.

Next, to compute $\delta^B_1(1)$, the same bold elements in (27) are used, only now it is assumed that information about the population is known, or has been estimated. The index $\delta^B_1(1)$ is the weighted mean of the elements used for the index $\delta^A_1(1)$. For this example, assume that a random examinee has the attribute pattern (0 0) with probability 0.27, the attribute pattern (0 1) with probability 0.43, the attribute pattern (1 0) with probability 0.03, and the attribute pattern (1 1) with probability 0.27. Therefore,

\[
\delta^B_1(1) = \frac{.03(1.14) + .27(1.14)}{.3} = 1.14.
\]

Again, as with $\delta^A_2(1)$, $\delta^B_2(1) = 0$, and the indices $\delta^B_1(0)$ and $\delta^B_2(0)$ can be computed using similar equations.
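The weighted mean for index B can be sketched in the same way. The helper `index_b` is hypothetical; it weights each selected KL element by the prior probability of the pattern with the stated mastery status and normalizes by the sum of those weights, which reproduces the arithmetic shown above for $\delta^B_1(1)$.

```python
# KL matrix, pattern ordering, and prior probabilities from the worked example.
KL = [
    [0.0, 0.0, 1.36, 1.36],
    [0.0, 0.0, 1.36, 1.36],
    [1.14, 1.14, 0.0, 0.0],
    [1.14, 1.14, 0.0, 0.0],
]
PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]
PRIOR = [0.27, 0.43, 0.03, 0.27]

def index_b(kl, patterns, prior, k, mastery=1):
    """Prior-weighted mean of kl[u][v], where u has alpha_k == mastery and
    v is the same pattern with attribute k flipped."""
    numerator = 0.0
    denominator = 0.0
    for u, pu in enumerate(patterns):
        if pu[k] != mastery:
            continue
        flipped = list(pu)
        flipped[k] = 1 - mastery
        v = patterns.index(tuple(flipped))
        numerator += prior[u] * kl[u][v]
        denominator += prior[u]
    return numerator / denominator

delta_b1 = index_b(KL, PATTERNS, PRIOR, k=0)  # (.03(1.14) + .27(1.14)) / .3
delta_b2 = index_b(KL, PATTERNS, PRIOR, k=1)
print(round(delta_b1, 2), delta_b2)
```

Because both selected elements equal 1.14 in this one-item example, the weights do not change the result; they matter once the selected elements differ.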

Lastly, index $\delta^C_1(1)$ assumes that an association between attributes 1 and 2 is known, or can be estimated, using tetrachoric correlations. Tetrachoric correlations assume that there is a continuous normally distributed variable, $\tilde{\alpha}$, underlying the dichotomous 0-1 attribute $\alpha$. Assume that the tetrachoric correlation between attributes 1 and 2 is 0.5, that the proportion of examinees in the population who have mastered attribute 1 is 0.3, and that the proportion who have mastered attribute 2 is 0.7. Therefore, using equation (17), the associated attributes can be expressed as a set of independent attributes, and $KL^1_1$ to $KL^4_1$ are computed as:

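\[
KL^1_1 = \begin{pmatrix}
0 & 0 & 1.36 & 1.36 \\
0 & 0 & 1.36 & 1.36 \\
1.14 & 1.14 & 0 & 0 \\
1.14 & 1.14 & 0 & 0
\end{pmatrix}, \tag{28}
\]

\[
KL^2_1 = \begin{pmatrix}
0 & 0 & 1.36 & 1.36 \\
0 & 0 & 1.36 & 1.36 \\
1.14 & 1.14 & 0 & 0 \\
1.14 & 1.14 & 0 & 0
\end{pmatrix}, \tag{29}
\]

\[
KL^3_1 = \begin{pmatrix}
0 & 0 & 1.36 & 1.36 \\
0 & 0 & 1.36 & 1.36 \\
1.14 & 1.14 & 0 & 0 \\
1.14 & 1.14 & 0 & 0
\end{pmatrix}, \tag{30}
\]

and

\[
KL^4_1 = \begin{pmatrix}
0 & 0 & 1.36 & 1.36 \\
0 & 0 & 1.36 & 1.36 \\
1.14 & 1.14 & 0 & 0 \\
1.14 & 1.14 & 0 & 0
\end{pmatrix}, \tag{31}
\]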

where $B_1$ = (0 0), $B_2$ = (0 1), $B_3$ = (1 0), and $B_4$ = (1 1). In addition, $B_1$ has probability 0.22, $B_2$ has probability 0.08, $B_3$ has probability 0.52, and $B_4$ has probability 0.18. The probabilities for $B_1$ to $B_4$ are computed (see footnote 3) such that the association between the attributes is equal to the estimated tetrachoric correlations, as explained in Section 3.3.3. To compute $\delta^C_1(1)$ using the weights specified in equation (22), the probability distribution of the $\alpha^*$’s (i.e., the independent attributes) must also be estimated. Because the $\alpha^*$’s are independent, the probability of the joint $\alpha^*$ distribution is the product of the marginal probabilities for each attribute, $\alpha^*_k$. In this example, the probability that a random examinee is a master of attribute 1, $P(\alpha^*_1 = 1)$, is 0.300 and $P(\alpha^*_2 = 1) = 0.818$, so it follows that $\alpha^*$ = (0 0) has probability 0.13 (i.e., (1 − .3)(1 − .818)), $\alpha^*$ = (0 1) has probability 0.57, $\alpha^*$ = (1 0) has probability 0.05, and $\alpha^*$ = (1 1) has probability 0.25. So,

\[
\Delta^1_1(1) = \frac{.05(1.14) + .25(1.14)}{.3} = 1.14,
\]

\[
\Delta^2_1(1) = \frac{.05(1.14) + .25(1.14)}{.3} = 1.14,
\]

\[
\Delta^3_1(1) = \frac{.05(1.14) + .25(1.14)}{.3} = 1.14,
\]

and

\[
\Delta^4_1(1) = \frac{.05(1.14) + .25(1.14)}{.3} = 1.14.
\]

Finally,

\[
\delta^C_1(1) = \frac{.22(1.14) + .08(1.14) + .52(1.14) + .18(1.14)}{1} = 1.14.
\]
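The two numerical ingredients of this calculation, the joint probabilities of the independent attributes and the final weighted sum, can be reproduced with a short sketch. The marginals, the B probabilities, and the four Δ values are taken from the example; the helper `joint_independent` is a hypothetical name, and the sketch does not attempt to reproduce equations (17) or (22) themselves.

```python
from itertools import product

def joint_independent(marginals):
    """Joint probability of each 0/1 pattern when attributes are independent,
    given P(attribute k = 1) for each k."""
    joint = {}
    for pattern in product((0, 1), repeat=len(marginals)):
        p = 1.0
        for a, m in zip(pattern, marginals):
            p *= m if a == 1 else 1.0 - m
        joint[pattern] = p
    return joint

# P(alpha*_1 = 1) and P(alpha*_2 = 1) from the example.
joint = joint_independent((0.300, 0.818))
rounded = {pattern: round(p, 2) for pattern, p in joint.items()}
print(rounded)

# Probabilities of B1..B4 and the four Delta values for attribute 1.
b_probs = [0.22, 0.08, 0.52, 0.18]
deltas_attr1 = [1.14, 1.14, 1.14, 1.14]
delta_c1 = sum(w * d for w, d in zip(b_probs, deltas_attr1)) / sum(b_probs)
print(round(delta_c1, 2))
```

The rounded joint probabilities match the values .13, .57, .05, and .25 quoted in the text.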

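In addition, $\delta^C_2(1)$ can be computed using the bold values in the newly computed $KL^1_2$ to $KL^4_2$ with respect to attribute 2 (matrices 32 to 35).

\[
KL^1_2 = \begin{pmatrix}
0 & 0 & 1.36 & 1.36 \\
\mathbf{0} & 0 & 1.36 & 1.36 \\
1.14 & 1.14 & 0 & 0 \\
1.14 & 1.14 & \mathbf{0} & 0
\end{pmatrix}, \tag{32}
\]

\[
KL^2_2 = \begin{pmatrix}
0 & 1.36 & 0 & 1.36 \\
\mathbf{1.14} & 0 & 1.14 & 0 \\
0 & 1.36 & 0 & 1.36 \\
1.14 & 0 & \mathbf{1.14} & 0
\end{pmatrix}, \tag{33}
\]

\[
KL^3_2 = \begin{pmatrix}
0 & 0 & 1.36 & 1.36 \\
\mathbf{0} & 0 & 1.36 & 1.36 \\
1.14 & 1.14 & 0 & 0 \\
1.14 & 1.14 & \mathbf{0} & 0
\end{pmatrix}, \tag{34}
\]

and

\[
KL^4_2 = \begin{pmatrix}
0 & 1.36 & 0 & 1.36 \\
\mathbf{1.14} & 0 & 1.14 & 0 \\
0 & 1.36 & 0 & 1.36 \\
1.14 & 0 & \mathbf{1.14} & 0
\end{pmatrix}. \tag{35}
\]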

Specifically,

\[
\Delta^1_2(1) = \frac{.05(0) + .25(0)}{.3} = 0,
\]

\[
\Delta^2_2(1) = \frac{.05(1.14) + .25(1.14)}{.3} = 1.14,
\]

\[
\Delta^3_2(1) = \frac{.05(0) + .25(0)}{.3} = 0,
\]

and

\[
\Delta^4_2(1) = \frac{.05(1.14) + .25(1.14)}{.3} = 1.14.
\]

Finally,

\[
\delta^C_2(1) = \frac{.22(0) + .08(1.14) + .52(0) + .18(1.14)}{1} = .30.
\]

The indices $\delta^C_1(0)$ and $\delta^C_2(0)$ can also be computed using similar equations. Notice that, because of the association between attributes 1 and 2, the discrimination index $\delta^C$ is nonzero for attribute 2, whereas indices $\delta^A$ and $\delta^B$ equaled zero.
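Written out, the arithmetic behind the reported value for attribute 2 is as follows. Only the terms weighted by the probabilities of $B_2$ and $B_4$ are nonzero, so the sum reduces to a single product; this is a restatement of the numbers given in the example, not an additional result.

\[
\delta^C_2(1) = \frac{.08(1.14) + .18(1.14)}{1} = (.08 + .18)(1.14) = .26 \times 1.14 = .2964 \approx .30
\]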


A Simulation Study of the Performance of the Three Indices

Using Monte Carlo simulation, random tests are generated where items are calibrated using the RUM. For each test, $\delta^A_k$, $\delta^B_k$, and $\delta^C_k$, and the responses of 10,000 simulated examinees are computed. In addition, attribute patterns of each simulated examinee are estimated and correct classification rates for both the masters and nonmasters are computed. Next, correlations are computed between correct classification rates and each of the three discrimination indices in order to assess the performance of the indices.
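The correct classification rates used as the criterion can be computed per attribute, separately for masters and nonmasters. The following is a minimal sketch; the function name `classification_rates` and the four-examinee data are made up for illustration and are not from the study.

```python
def classification_rates(true_patterns, estimated_patterns, k):
    """Return (rate among true masters, rate among true nonmasters) of
    correctly classified examinees on attribute k."""
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for truth, estimate in zip(true_patterns, estimated_patterns):
        totals[truth[k]] += 1
        if truth[k] == estimate[k]:
            hits[truth[k]] += 1
    master_rate = hits[1] / totals[1] if totals[1] else float("nan")
    nonmaster_rate = hits[0] / totals[0] if totals[0] else float("nan")
    return master_rate, nonmaster_rate

# Made-up true and estimated patterns for four examinees, two attributes.
true_patterns = [(1, 0), (1, 1), (0, 1), (0, 0)]
estimated_patterns = [(1, 0), (0, 1), (0, 1), (1, 0)]

rates_attr1 = classification_rates(true_patterns, estimated_patterns, k=0)
rates_attr2 = classification_rates(true_patterns, estimated_patterns, k=1)
print(rates_attr1, rates_attr2)
```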

In the simulation study the RUM item parameters were assumed known, as

would be the case with a test bank of precalibrated items. Given a set of item

parameters comprising a test, 10,000 examinees are simulated. When generating

each set of 10,000 examinee attribute patterns it is important to consider two

aspects of the attribute pattern distribution. These are the proportion of examinees

that have mastered the kth attribute, pk, and the associations between the

attributes. In the simulation study all pk = .5 were chosen in order to control for

the influence of pk on correct classification rates. In addition, it is reasonable to

believe that the attributes are associated in the population. If an individual has

mastered one attribute, he is more likely to have mastered a second attribute

measured by the exam. Therefore, an appropriate simulation of examinees would

incorporate both specified pk’s and a specified relationship between the attributes.

A total of 10,000 multivariate normal K-dimensional vectors ($\tilde{\alpha} \sim MVN(0, \rho)$) are generated to simulate attributes that have a positive relationship, where $\rho$ represents a correlation matrix with equal off-diagonal elements. Using the $p_k$’s, a cutoff $\kappa_k$ is computed for each attribute such that $P(\tilde{\alpha}_k \ge \kappa_k) = p_k$; because all $p_k = .5$, every $\kappa_k = 0$. The ith individual’s mastery for attribute k is thus:

\[
\alpha_{ik} =
\begin{cases}
1 & \text{if } \tilde{\alpha}_{ik} \ge 0 \\
0 & \text{otherwise}
\end{cases}
\]
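One way to generate such patterns with only the Python standard library is sketched below. It relies on the fact that a correlation matrix with equal nonnegative off-diagonal elements ρ can be produced from one shared standard normal factor plus independent noise; this construction is a choice made for the sketch, not necessarily how the authors drew their samples. The function name and the seed are illustrative.

```python
import math
import random
from statistics import NormalDist

def simulate_mastery(n, num_attrs, rho, p_master, seed=0):
    """Draw n attribute patterns by thresholding equicorrelated standard normals.

    Each latent value is sqrt(rho) * g + sqrt(1 - rho) * e_k, so every pair of
    latent variables has correlation rho (requires 0 <= rho <= 1).
    """
    rng = random.Random(seed)
    # Cutoff kappa_k chosen so that P(latent >= kappa_k) = p_k.
    cutoffs = [NormalDist().inv_cdf(1.0 - p) for p in p_master]
    shared_weight = math.sqrt(rho)
    unique_weight = math.sqrt(1.0 - rho)
    patterns = []
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)  # factor shared by all attributes
        latent = [
            shared_weight * g + unique_weight * rng.gauss(0.0, 1.0)
            for _ in range(num_attrs)
        ]
        patterns.append(tuple(int(z >= c) for z, c in zip(latent, cutoffs)))
    return patterns

sample = simulate_mastery(n=10_000, num_attrs=5, rho=0.5, p_master=[0.5] * 5, seed=1)
prop_attr1 = sum(p[0] for p in sample) / len(sample)
agree_12 = sum(p[0] == p[1] for p in sample) / len(sample)
print(round(prop_attr1, 3), round(agree_12, 3))
```

With all $p_k = .5$ the cutoffs are 0, so the proportion of masters should be close to .5, and positively correlated latent variables should agree on mastery more often than half the time.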

For reasons that will become clear in the following subsections, for each test

administration a second sample, comprised of 10,000 examinees, is generated

separate from the other simulated sample of 10,000 examinees. The second sample

will be used for an accurate Monte Carlo approximation of the prior distribution of

attribute patterns, as needed for indices B and C.

Given each examinee’s attribute pattern, scores for each item are simulated based on the RUM. Given the probability of a correct response, $P(X_{ij} = 1 \mid \alpha)$, a random U(0, 1) variable u is generated and the score $X_{ij}$ is:

\[
X_{ij} =
\begin{cases}
1 & \text{if } u \le P(X_{ij} = 1 \mid \alpha) \\
0 & \text{otherwise}
\end{cases}
\]

Here, $P(X_{ij} = 1 \mid \alpha)$ is the probability of a correct response using the RUM.
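The scoring rule can be sketched as below. The response probability uses the reduced RUM form with all c set to ∞, which is the setting stated for this simulation; the function names and the item parameters in the usage lines are illustrative.

```python
import random

def rum_prob(alpha, q_row, pi_star, r_star):
    """P(X = 1 | alpha) under the reduced RUM (assumption: all c = infinity)."""
    p = pi_star
    for a_k, q_k, r_k in zip(alpha, q_row, r_star):
        if q_k == 1 and a_k == 0:
            p *= r_k
    return p

def simulate_response(p_correct, rng):
    """Score 1 if a U(0, 1) draw is at most p_correct, else 0."""
    u = rng.random()
    return 1 if u <= p_correct else 0

rng = random.Random(0)
p_master = rum_prob((1, 0), q_row=(1, 0), pi_star=0.8, r_star=(0.125, 1.0))
scores = [simulate_response(p_master, rng) for _ in range(10_000)]
observed_rate = sum(scores) / len(scores)
print(p_master, round(observed_rate, 3))
```

Over many draws the observed proportion correct should be close to the model probability, here 0.8 for a master of the required attribute.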

Since the item parameters are known, instead of using an MCMC approach to carry out a Bayesian analysis, Bayesian classification is accomplished by computing the likelihood for all possible attribute patterns given the examinee’s scores and multiplying by the prior probabilities of the attribute patterns (estimated from the second sample of 10,000 subjects). The Bayesian posterior mode is then used to classify the attribute pattern for that individual. Given the estimated attribute patterns, for each attribute the proportion of examinees for which the attribute was correctly classified is recorded.
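The posterior-mode classification described above amounts to an exhaustive search over the $2^K$ attribute patterns. The sketch below assumes the reduced RUM with all c set to ∞ and local independence of item responses given the pattern; the names `classify` and `rum_prob`, the four-item Q-matrix, and the uniform prior are illustrative and not taken from the study.

```python
from itertools import product

def rum_prob(alpha, q_row, pi_star, r_star):
    """P(X = 1 | alpha) under the reduced RUM (assumption: all c = infinity)."""
    p = pi_star
    for a_k, q_k, r_k in zip(alpha, q_row, r_star):
        if q_k == 1 and a_k == 0:
            p *= r_k
    return p

def classify(responses, q_matrix, pi_stars, r_stars, prior):
    """Return the attribute pattern maximizing likelihood times prior."""
    num_attrs = len(q_matrix[0])
    best_pattern, best_posterior = None, -1.0
    for alpha in product((0, 1), repeat=num_attrs):
        likelihood = 1.0
        for x, q_row, pi_s, r_row in zip(responses, q_matrix, pi_stars, r_stars):
            p = rum_prob(alpha, q_row, pi_s, r_row)
            likelihood *= p if x == 1 else 1.0 - p
        posterior = likelihood * prior[alpha]  # unnormalized posterior
        if posterior > best_posterior:
            best_pattern, best_posterior = alpha, posterior
    return best_pattern

# Illustrative 4-item test: items 1-2 require attribute 1, items 3-4 attribute 2.
q_matrix = [(1, 0), (1, 0), (0, 1), (0, 1)]
pi_stars = [0.9, 0.9, 0.9, 0.9]
r_stars = [(0.1, 1.0), (0.1, 1.0), (1.0, 0.1), (1.0, 0.1)]
prior = {pattern: 0.25 for pattern in product((0, 1), repeat=2)}

print(classify([1, 1, 0, 0], q_matrix, pi_stars, r_stars, prior))
```

For much longer tests the product of probabilities can underflow, in which case summing log-probabilities is the usual alternative.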

Next, to determine the characteristics of the simulated tests, one must remember that the purpose of this study is to examine the relationship between each discrimination index and correct classification rates. Tests are generated that are realistic while intentionally creating variability in the measurement quality of the tests (i.e., some tests have high correct classification rates while others do not perform as well). A total of 1000 randomly generated 40-item tests are constructed to measure 5 attributes. On average, each item in each test requires 2 attributes. In addition, item parameters are generated such that tests range from low to high cognitive structure. In this context, low cognitive structure is defined as a situation in which the absence of one or more of the required attributes has a relatively small influence on an examinee’s probability of a correct item response, so that the items are at best moderately informative. In high cognitive structure, the probability of a correct response is strongly influenced by the presence or absence of one or more of the required attributes. The characteristics of the randomly generated item parameters for each test are as follows:

$\pi^*$’s are randomly generated from a uniform distribution, U(.85, .95), for all 1000 tests.

$r^*$’s are used to modify the cognitive structure. Specifically, for the ith simulation, i = 1, . . . , 1000, $r^*$’s are randomly generated from a uniform distribution, $U\left(.1 + \frac{.6(i-1)}{999},\ .3 + \frac{.6(i-1)}{999}\right)$. Notice that for the first simulation the $r^*$’s are generated to resemble high cognitive structure (i.e., $r^*$’s range from .1 to .3). With each successive simulation, the range shifts slowly toward lower cognitive structure, until the last simulated test contains $r^*$’s that range from .7 to .9, which is low cognitive structure indeed.

c’s are all set to ∞ and therefore $P_c(\eta) = 1$.
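The parameter-generation rules listed above can be sketched as follows. The function names are hypothetical, and two details are assumptions of the sketch rather than statements from the text: one $r^*$ is drawn for each attribute the item requires, and attributes the item does not require are given the neutral value 1.0. How the Q-matrix rows themselves are drawn is not specified here, so a row is simply passed in.

```python
import random

def r_star_bounds(i, n_tests=1000):
    """Lower and upper bound of the uniform r* distribution for test i (1-based)."""
    shift = 0.6 * (i - 1) / (n_tests - 1)
    return 0.1 + shift, 0.3 + shift

def draw_item_parameters(i, q_row, rng):
    """Draw pi* and the r* values for one item of the i-th simulated test."""
    low, high = r_star_bounds(i)
    pi_star = rng.uniform(0.85, 0.95)
    # Assumption: one r* per required attribute; 1.0 where the attribute is not required.
    r_star = [rng.uniform(low, high) if q == 1 else 1.0 for q in q_row]
    return pi_star, r_star

rng = random.Random(0)
first_bounds = r_star_bounds(1)      # high cognitive structure
last_bounds = r_star_bounds(1000)    # low cognitive structure
pi_star, r_star = draw_item_parameters(500, (1, 0, 1, 0, 0), rng)
print(first_bounds, last_bounds)
print(round(pi_star, 3), [round(r, 3) for r in r_star])
```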

It is important to remember that each index incorporates the dependence between the attributes to a different degree. Specifically, $\delta^A_k$ totally ignores the association between attributes, $\delta^B_k$ incorporates the association only in the form of multiplicative weights (based on the prior probabilities) of the Kullback-Leibler information, and $\delta^C_k$ fully uses the known (or, in application, estimated) association between attributes. Three different simulations of 1000 tests are run: a simulation with all of the off-diagonal elements of $\rho$ equal to .5, a simulation with all of the off-diagonal elements of $\rho$ equal to .75, and a simulation with all of the off-diagonal elements of $\rho$ equal to .95.

Lastly, a separate simulation study is used to give an extreme example that provides evidence for the inability of $\delta^A_k$ and $\delta^B_k$ to use collateral information in the absence of direct information about the attribute. A total of 1000 20-item tests are generated to measure 2 attributes such that all items measure only the second attribute and the known correlation between attributes 1 and 2 is .999. A high correlation between attribute 1 and attribute 2 indicates that each item that measures attribute 2 also provides information about attribute 1. That is, because of the strong association between attributes, if attribute 2 is known, attribute 1 is known with near certainty. It should be noted that since none of the twenty items directly measures attribute 1, $\delta^A_1(1)$, $\delta^A_1(0)$, $\delta^B_1(1)$, and $\delta^B_1(0)$ are equal to 0, whereas, as is appropriate, it can be shown that $\delta^C_1(1) \approx \delta^C_2(1)$ and $\delta^C_1(0) \approx \delta^C_2(0)$. It is also true that, for attribute 2, the corresponding discrimination indices A, B, and C should be approximately equal. The example illustrates a situation in which it is advantageous to fully incorporate the association between the two attributes, as in $\delta^C_k$. The test contains no direct information about attribute 1, yet knowing attribute 2 provides a large amount of information about attribute 1. Indices $\delta^A_k$ and $\delta^B_k$, which do not take the correlation into consideration from the empirical Bayes perspective, are therefore misleading.
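To give a sense of how much attribute 2 reveals about attribute 1 at this correlation, the sketch below uses the standard orthant probability for a bivariate normal with zero means, P(both > 0) = 1/4 + arcsin(ρ)/(2π). It assumes that .999 is the correlation of the underlying normal variables and that both cutoffs are 0 (i.e., $p_k$ = .5); the text does not state the $p_k$ used for this separate study, so that is an assumption of the illustration.

```python
import math

def p_master1_given_master2(rho):
    """P(alpha_1 = 1 | alpha_2 = 1) for latent bivariate normal attributes
    with correlation rho and both cutoffs at 0 (assumed p_k = .5)."""
    both_masters = 0.25 + math.asin(rho) / (2.0 * math.pi)
    return both_masters / 0.5

for rho in (0.0, 0.5, 0.95, 0.999):
    print(rho, round(p_master1_given_master2(rho), 4))
```

Under these assumptions the conditional probability is 0.5 when the attributes are independent and rises above 0.98 at ρ = .999, which is the sense in which knowing attribute 2 nearly determines attribute 1.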

Results

Three Simulation Studies.

For each of the three basic simulation studies, descriptive statistics of the discrimination indices and correct classification rates are provided. In addition, the correlations between the discrimination indices and the corresponding correct classification rates are computed. The following paragraphs first summarize the results from the three simulations; the results from the separate simulation study are then provided.

To begin, the minimum, maximum, and mean values, over the 1000 simulation replications, of the correct classification rates (CC) and of $\delta^A_k$, $\delta^B_k$, and $\delta^C_k$, for masters and nonmasters respectively, are summarized in Tables 1 to 3. It should be noted that, since tests are randomly generated with all $p_k = .5$ and all attribute correlations are the same within a study, the results for the 5 attributes are indistinguishable. Therefore, the basic descriptive statistics are summarized across all attributes, which provides more efficient estimates of their true values.

Insert Table 1 about here


Insert Table 2 about here

Insert Table 3 about here

In general, while tests were developed to allow for a large range of correct classification rates, it is clear that as the correlation between attributes increases, the range of correct classification rates is reduced. For example, the correct classification rates for the simulation study with attribute correlations of .5 range from .75 to 1.00 with an average of approximately .92, while they range from approximately .90 to 1.00 with an average of .97 in the simulation study with attribute correlations of .95. In addition, while the intent of this study is only to define indices that correlate with correct classification rates, it can be seen that $\delta^A_k$ and $\delta^B_k$ appear to be on similar scales, whereas $\delta^C_k$, on average, is much larger.

Next, Table 4 provides the means (see footnote 4) of the correlations between the correct classification rates and $\delta^A_k$, $\delta^B_k$, and $\delta^C_k$, for the masters and nonmasters. The table shows that in general correlations are quite high and therefore it is reasonable to use the discrimination indices as indicators of correct classification rates.

Insert Table 4 about here


One assumption when using the correlation is that the relationship is approximately linear. Therefore, it is important that the scatter plots be explored for linearity. Figures 1 and 2 are scatter plots of attribute 1, as an example, for all three indices to visually explore the relationships between discrimination and correct classification rates for masters (Figure 1) and nonmasters (Figure 2). Columns 1, 2, and 3 represent the plots for the discrimination indices $\delta^A_k$, $\delta^B_k$, and $\delta^C_k$, respectively, crossed with the rows, which represent the three simulations (i.e., when correlations between attributes are .5, .75, and .95).

Insert Figure 1 about here

Insert Figure 2 about here

Clearly the relationship is not linear, due to the asymptotic behavior of correct classification rates (i.e., they approach 1). Therefore, the true relationship is stronger than what is indicated by the correlation coefficients (i.e., if the values were transformed such that the relationship were linear, the correlations would be higher). It is also possible that some correlations are smaller due to a restricted range, which may explain the reduction of the correlations for the simulation where all attributes have a correlation of .95.

So, to explore the true strength of the relationship between discrimination, or a monotonic function of discrimination, and correct classification, the log-transformation of the discrimination indices can be used so that the relationship is approximately linear. Figures 3 and 4 are new plots, again using attribute 1 as an example, of the transformed discrimination indices, such that now it is reasonable to compute the correlations between the logarithm of each discrimination index and the correct classification rates.

In addition, Table 5 provides the correlation between the logarithm of each discrimination index and correct classification rates, again providing strong evidence of a useful relationship.
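The effect of the log transformation on the correlation can be illustrated with a small sketch. The five data points are made up so that the classification rate is exactly linear in the logarithm of the index; they are not results from the study, and `pearson` is a plain implementation of the product-moment correlation.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values: classification rate rises by .05 each time the index doubles.
index_values = [1.0, 2.0, 4.0, 8.0, 16.0]
cc_rates = [0.80, 0.85, 0.90, 0.95, 1.00]

raw_r = pearson(index_values, cc_rates)
log_r = pearson([math.log(d) for d in index_values], cc_rates)
print(round(raw_r, 3), round(log_r, 3))
```

When the relationship flattens as the classification rate approaches 1, the correlation with the raw index understates the strength of the relationship, while the correlation with the log index recovers it.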

Insert Table 5 about here

Insert Figure 3 about here

Insert Figure 4 about here

The High Correlation Example.

Finally, as was suggested, the example with correlations between attributes equal to 0.999 provides a situation in which the information provided by attribute 2, through its high correlation with attribute 1, must be used to estimate attribute 1. Since no item directly measures attribute 1, all values of $\delta^A_1$ and $\delta^B_1$ equal zero. Because $\delta^C_1$ incorporates information other than what is contained in the items, for the non-directly measured attribute 1 the correlation between $\delta^C_1(1)$ and correct classification rates of masters is .72 and the correlation between $\delta^C_1(0)$ and correct classification rates of nonmasters is .77, both moderately large.

For the directly measured attribute 2, all three indices of discrimination are nearly equal (as was predicted previously) and therefore it is only necessary to provide one correlation for the relationship between the discrimination indices and correct classification rates for the masters and one correlation for the relationship between the discrimination indices and correct classification rates for the nonmasters. Specifically, the correlation between any discrimination index (i.e., $\delta^A_2(1)$, $\delta^B_2(1)$, or $\delta^C_2(1)$) and correct classification rates for the masters is .73, and the correlation between any discrimination index (i.e., $\delta^A_2(0)$, $\delta^B_2(0)$, or $\delta^C_2(0)$) and correct classification rates for the nonmasters is .78. Note that these values are almost identical to those for the non-directly measured attribute 1, as desired.

Again, the functions are explored for linearity. Figures 5 and 6 show the scatter plots for the masters and nonmasters, respectively, of the three discrimination indices. Columns 1, 2, and 3 represent the scatter plots for $\delta^A_k$, $\delta^B_k$, and $\delta^C_k$, respectively, and rows 1 and 2 represent scatter plots for attribute 1 and attribute 2, respectively. It is clear that the functions are not linear and therefore the correlation is smaller than one might expect if the values were transformed.

Insert Figure 5 about here

Insert Figure 6 about here


Discussion

The results provide strong evidence that all three indices are good candidates

as possible indicators for an attribute’s correct classification rates for any given test.

Given this relationship, it is now possible to define the discriminating power, and

hence the usefulness of each item for accurately estimating each attribute. Those

items with a high discrimination index for an attribute contribute more to the

estimation of that attribute than those with small values. In addition, as in the case

of the Henson and Douglas (2004) CDI, the set of 2K attribute discrimination

indices for each item can be used to construct an effective attribute diagnostic test

from an item bank. Specifically, by selecting items where the test attribute

discrimination index is large for all attributes, the test will have high correct

classification rates for every attribute when compared to all possible tests that can

be constructed from the same item bank. A future study will compare test

construction based on the CDI to test construction based on the attribute level

discrimination indices.

Finally, the results do not support any one of the discrimination indices over any other. However, it is clear, in particular from the example where all off-diagonal correlations in $\rho$ are .999, that there are situations in which one or more of the indices are unreasonable. Further research will explore the conditions under which specific indices are more useful than others.


References

Chang, H., & Ying, Z. (1996). A global information approach to computerized adaptive testing. Applied Psychological Measurement, 20, 213-229.

DiBello, L. V., Stout, W. F., & Roussos, L. A. (1995). Unified cognitive/psychometric diagnostic assessment likelihood-based classification techniques. In P. D. Nichols, D. F. Chipman, & R. L. Brennan (Eds.), Cognitively diagnostic assessment (pp. 361-389). Hillsdale, NJ: Erlbaum.

Hambleton, R., & Swaminathan, H. (2000). Item response theory. Boston, MA: Kluwer Nijhoff Publishing.

Hartz, S. (2002). A Bayesian framework for the Unified Model for assessing cognitive abilities: Blending theory with practicality. Unpublished doctoral dissertation.

Henson, R., & Douglas, J. (2004). Test construction for cognitive diagnostic models. Accepted by APM.

Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 12, 55-73.

Lehmann, E., & Casella, G. (1998). Theory of point estimation (2nd ed.). New York: Springer-Verlag.

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F., & Novick, M. (1968). Statistical theories of mental test scores, with contributions from Allan Birnbaum. Reading, MA: Addison-Wesley.

Maris, E. (1999). Estimating multiple classification latent class models. Psychometrika, 64, 187-212.

McDonald, R. (1999). Test theory: A unified treatment. Mahwah, NJ: Lawrence Erlbaum Associates.

Rao, C. R. (1962). Efficient estimates and optimum inference procedures in large samples. Journal of the Royal Statistical Society, Series B, 24, 46-72.


Footnotes

1. While the RUM is used, the proposed indices apply to any cognitive diagnostic model with a discrete latent examinee space.

2. It should be noted that if the RUM is used, independence is conditional on $\alpha$ and $\eta$. However, if all $c_j$’s are assumed to be ∞ (as in the computation of the three indices presented in this paper), the test Kullback-Leibler information is equal to the sum of the Kullback-Leibler information for all items, as in (5).

3. Actual computation of the probabilities involves a Monte Carlo simulation of examinees with the specified tetrachoric correlation and proportion of masters.

4. All standard errors are less than or equal to .01.

Table 1

Basic Descriptive Statistics When Correlation between Attributes is .5

Group      Index              Min     Mean     Max
Master     CC                 0.75    0.92     1.00
           $\delta^A_k(1)$    0.57    4.92    17.81
           $\delta^B_k(1)$    0.63    5.41    19.36
           $\delta^C_k(1)$    2.26   14.41    41.10
Nonmaster  CC                 0.75    0.93     1.00
           $\delta^A_k(0)$    0.69    5.97    19.93
           $\delta^B_k(0)$    0.78    6.76    21.83
           $\delta^C_k(0)$    2.30   10.08    23.24

Table 2

Basic Descriptive Statistics When Correlation between Attributes is .75

Group      Index              Min     Mean     Max
Master     CC                 0.80    0.93     1.00
           $\delta^A_k(1)$    0.60    4.92    15.73
           $\delta^B_k(1)$    0.67    5.57    18.24
           $\delta^C_k(1)$    3.68   19.99    52.25
Nonmaster  CC                 0.84    0.95     1.00
           $\delta^A_k(0)$    0.72    5.96    16.82
           $\delta^B_k(0)$    0.83    7.01    20.25
           $\delta^C_k(0)$    3.43   12.22    23.41

Table 3

Basic Descriptive Statistics When Correlation between Attributes is .95

Group      Index              Min     Mean     Max
Master     CC                 0.89    0.96     1.00
           $\delta^A_k(1)$    0.64    4.91    15.82
           $\delta^B_k(1)$    0.68    5.67    17.70
           $\delta^C_k(1)$    5.05   26.21    70.35
Nonmaster  CC                 0.92    0.98     1.00
           $\delta^A_k(0)$    0.77    5.96    17.38
           $\delta^B_k(0)$    0.86    7.17    20.00
           $\delta^C_k(0)$    4.83   15.52    28.58

Table 4

Correlation between the Discrimination Indices and Correct Classification

Group        Study          ρA      ρB      ρC
Masters      Att Cor .5     0.87    0.86    0.82
             Att Cor .75    0.87    0.85    0.80
             Att Cor .95    0.85    0.83    0.79
Non-Masters  Att Cor .5     0.90    0.90    0.94
             Att Cor .75    0.88    0.88    0.93
             Att Cor .95    0.83    0.83    0.89

Table 5

Correlation between the Transformed Discrimination Indices and Correct Classification

Group        Study          ρA      ρB      ρC
Masters      Att Cor .5     0.97    0.97    0.97
             Att Cor .75    0.95    0.94    0.89
             Att Cor .95    0.93    0.92    0.87
Non-Masters  Att Cor .5     0.95    0.94    0.90
             Att Cor .75    0.96    0.96    0.96
             Att Cor .95    0.93    0.92    0.92


Figure Captions

Figure 1. Scatter plots of the Discrimination Indices with Correct Classification for Masters

Figure 2. Scatter plots of the Discrimination Indices with Correct Classification for Nonmasters

Figure 3. Scatter plots of the Transformed Discrimination Indices with Correct Classification for Masters

Figure 4. Scatter plots of the Transformed Discrimination Indices with Correct Classification for Nonmasters

Figure 5. Discrimination Indices with Correct Classification for Masters in High Correlation Example

Figure 6. Discrimination Indices with Correct Classification for Nonmasters in High Correlation Example

Cognitive Diagnostic Discrimination, Figure 1

[Figure 1: 3 × 3 grid of scatter plots. Column titles: Reliability Index A, Reliability Index B, Reliability Index C. Row labels: Simulation with Corr .5, Simulation with Corr .75, Simulation with Corr .95.]

Cognitive Diagnostic Discrimination, Figure 2

[Figure 2: 3 × 3 grid of scatter plots. Column titles: Reliability Index A, Reliability Index B, Reliability Index C. Row labels: Simulation with Corr .5, Simulation with Corr .75, Simulation with Corr .95.]

Cognitive Diagnostic Discrimination, Figure 3

[Figure 3: 3 × 3 grid of scatter plots. Column titles: Reliability Index A, Reliability Index B, Reliability Index C. Row labels: Simulation with Corr .5, Simulation with Corr .75, Simulation with Corr .95.]

Cognitive Diagnostic Discrimination, Figure 4

[Figure 4: 3 × 3 grid of scatter plots. Column titles: Reliability Index A, Reliability Index B, Reliability Index C. Row labels: Simulation with Corr .5, Simulation with Corr .75, Simulation with Corr .95.]

Cognitive Diagnostic Discrimination, Figure 5

[Figure 5: 2 × 3 grid of scatter plots. Column titles: Reliability Index A, Reliability Index B, Reliability Index C. Row labels: Attribute 1, Attribute 2.]

Cognitive Diagnostic Discrimination, Figure 6

[Figure 6: 2 × 3 grid of scatter plots. Column titles: Reliability Index A, Reliability Index B, Reliability Index C. Row labels: Attribute 1, Attribute 2.]

