Which of the following is an assumption made by the classical model of decision making?

Classical (Psychometric) Test Theory

R. Steyer, in International Encyclopedia of the Social & Behavioral Sciences, 2001

Classical Test Theory (CTT) was developed to quantify measurement error and to solve related problems, such as correcting observed dependencies between variables (e.g., correlations) for the attenuation due to measurement error. The basic concepts of CTT are the true score and measurement error variables. These concepts are defined as a specific conditional expectation and its residual, respectively. The definitions of these concepts already imply a number of properties that were treated as axioms in early presentations of CTT. Models of CTT consist of assumptions about the true score and error variables that allow the theoretical parameters (such as true score variance and error variance) to be identified from the variances and covariances of the observable measurements (test score variables). A number of implications of the assumptions defining models of CTT may be tested empirically via structural equation modeling. The article concludes with a brief look at more recent theories and their goals, such as Item Response Theory, Generalizability Theory, and Latent State-Trait Theory.
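
To make the identification idea concrete, here is a minimal numerical sketch (not part of the original article) assuming a model with two parallel test forms, X1 = T + e1 and X2 = T + e2, with uncorrelated errors of equal variance: the covariance of the two forms then recovers the true-score variance, and the remainder of the observed variance is error variance.

# A minimal sketch (assumed parallel-forms setup, not from the article) of how
# CTT identifies true-score and error variance from observable test scores.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_score = rng.normal(50, 10, n)        # latent true scores T (Var = 100)
x1 = true_score + rng.normal(0, 5, n)     # parallel form 1 (error Var = 25)
x2 = true_score + rng.normal(0, 5, n)     # parallel form 2

cov_x1_x2 = np.cov(x1, x2)[0, 1]          # estimates Var(T)
var_x1 = x1.var(ddof=1)                   # estimates Var(T) + Var(e)

print("estimated true-score variance:", cov_x1_x2)        # ~100
print("estimated error variance:", var_x1 - cov_x1_x2)    # ~25
print("estimated reliability:", cov_x1_x2 / var_x1)       # ~0.8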

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B008043076700721X

Reliability: Measurement

D. Rindskopf, in International Encyclopedia of the Social & Behavioral Sciences, 2001

1.2 Weaknesses of Classical Test Theory

CTT has several weaknesses that have led to the development of other models for test scores. First, the concept of reliability is dependent on the group used to develop the test. If the group has a wide range of skill or abilities, then the reliability will be higher than if the group has a narrow range of skill or abilities. Thus reliability is not invariant with respect to the sample of test-takers, and is therefore not a characteristic of the test itself; in addition, neither are the common measures of item discrimination (such as the item-total correlation) or item difficulty (percent getting the item correct). As if this were not bad enough, the usual assumption that the standard error is the same for test-takers at all ability levels is usually incorrect. (In some extensions of CTT, this assumption is dropped, but these extensions are not well-known or widely used.) CTT also does not adequately account for observed test score distributions that have floor and ceiling effects, where a large proportion of test-takers score at either the low or high end of the test score range.
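
The dependence of reliability on the spread of the group can be illustrated with a short simulation; the setup below (normally distributed true scores with a fixed error variance) is an assumption for illustration, not taken from the article.

# A rough illustration (assumed setup) of why reliability depends on the group:
# with the same error variance, a group with a wide range of true ability
# yields a higher reliability estimate than a restricted-range group.
import numpy as np

rng = np.random.default_rng(1)
n, error_sd = 50_000, 5.0

def estimated_reliability(true_sd):
    t = rng.normal(0, true_sd, n)                  # true scores
    x = t + rng.normal(0, error_sd, n)             # observed scores
    return t.var(ddof=1) / x.var(ddof=1)           # Var(t) / Var(X)

print("wide-range group:  ", round(estimated_reliability(10.0), 3))  # ~0.80
print("narrow-range group:", round(estimated_reliability(3.0), 3))   # ~0.26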

CTT also has difficulties in handling some typical test development problems, namely horizontal and vertical equating. The problem of horizontal equating arises when one wishes to develop another test with the same properties as (or at least with a known relationship to) an existing test. For example, students who take college admissions tests such as the ACT or SAT should get the same score regardless of which version of the test they take. Vertical equating involves developing a series of tests that measure a wide range of ability or skill. For example, while we could develop completely independent tests of arithmetic for each elementary school grade, we might instead want to link them to have one continuous scale of mathematical skill for grades 1 through 6. While vertical and horizontal equating are not impossible within CTT, they are much more straightforward with item response theory (IRT).

These problems of CTT are partly due to some fuzziness in the theory (the population to be sampled is usually not considered in any detail in the theory). But they are also due to the failure of most data collection to be carried out on a random sample from any population, let alone a population considered appropriate for the test being investigated. In practice, convenience samples are used; these generally fit some criteria that the investigator specifies, but are otherwise taken as they are available to the researcher. Finally, CTT as originally conceptualized was never intended to address some of the practical testing problems described above.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0080430767007221

Alpha Reliability

Doris McGartland Rubio, in Encyclopedia of Social Measurement, 2005

Definitions

Classical test theory indicates that:

(1) X = T + E

where X is the observed score, T is the true score, and E is the error. Reliability can be expressed as the proportion of observed-score variance that is not error variance:

(2) r = 1 − Var(E)/Var(X)

Because error and reliability directly correspond to one another, the type of reliability that we assess for a measure depends on the type of error that we seek to evaluate. When the measurement error within a measure is of concern, we seek to ascertain how much variability in the scores can be attributed to true variability as opposed to error. Measurement error within a measure arises from content sampling and the heterogeneity of the behavior sampled. Content sampling refers to the sampling of items that make up the measure. If the sampled items are drawn from the same domain, measurement error within a measure will be lower. Heterogeneity of behavior can increase measurement error when the items represent different domains of behavior. Other sources of measurement error within a test include guessing, mistakes, and scoring errors.

Internal consistency indicates the extent to which the responses on the items within a measure are consistent. Coefficient alpha is the most widely used measure of internal consistency reliability. Other measures of internal consistency include split-half reliability and the Kuder-Richardson 20.
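
As an illustration, here is a minimal sketch of coefficient alpha computed from an item-response matrix; the data and the function name are hypothetical, and the formula used is the standard alpha expression, k/(k − 1) · [1 − (sum of item variances)/(variance of total scores)].

# A hedged sketch of computing coefficient alpha for a small item-response
# matrix (rows = respondents, columns = items); the data are invented.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array of shape (n_respondents, n_items)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Illustrative 5-person, 4-item data (hypothetical Likert-type responses).
responses = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
    [3, 3, 4, 3],
])
print(round(cronbach_alpha(responses), 3))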

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985003959

Reliability Assessment

Edward G. Carmines, James A. Woods, in Encyclopedia of Social Measurement, 2005

Reliability

Random error is present in any measure. Reliability focuses on assessing random error and estimating its consequences. Although it is always desirable to eliminate as much random error from the measurement process as possible, it is even more important to be able to detect the existence and impact of random error. Because random error is always present to at least a minimum extent, the basic formulation in classical test theory is that the observed score is equal to the true score that would be obtained if there were no measurement error plus a random error component, or X = t + e, where X is the observed score, t is the true score, and e is the random disturbance. The true score is an unobservable quantity that cannot be directly measured. Theoretically, it is the average that would be obtained if a particular phenomenon were measured an infinite number of times. The random error component, or random disturbance, indicates the differences between observations.

Classical test theory makes the following assumptions about measurement error:

Assumption 1: The expected value of the random error is zero,

E(e) = 0.

Assumption 2: The correlation between the true score and random error is zero,

ρ(t, e) = 0.

Assumption 3: The correlation between the random error of one variable and the true score of another variable is zero,

ρ(e1, t2) = 0.

Assumption 4: The correlation between errors on distinct measurements is zero,

ρ(e1, e2) = 0.

From these assumptions, we see that the expected value of the observed score is equal to the expected value of the true score plus the expected value of the error:

E(X) = E(t) + E(e).

However, because, by Assumption 1, the expected value of e is zero, E(e) = 0, it follows that

E(X) = E(t).

This formula applies to repeated measurements of a single variable for a single person. However, reliability refers to the consistency of repeated measurements across persons and not within a single person. The equation for the observed score may be rewritten so that it applies to the variances of the single observed score, true score, and random error:

Var(X) = Var(t + e) = Var(t) + 2 Cov(t, e) + Var(e).

Assumption 2 stated that the correlation (and covariance) between the true score and random error is zero, so 2 Cov(t, e) = 0. Consequently,

Var(X) = Var(t) + Var(e).

So the observed score variance equals the sum of the true score variance and the random error variance. Reliability can be expressed as the ratio of the true score variance to the observed score variance:

ρx = Var(t)/Var(X).

That is, ρx is the reliability of X as a measure of t. Alternatively, reliability can be expressed in terms of the error variance as a proportion of the observed variance:

ρx = 1 − [Var(e)/Var(X)].

This equation makes it clear that reliability varies between 0 and 1. If all observed variance consists of error, then reliability will be 0, because 1 − (1/1) = 0. At the other extreme, if there were no random error in the measurement of some phenomenon, then reliability would be 1, because 1 − (0/1) = 1.
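
A short simulation (an illustrative sketch, not from the article) can confirm these relationships numerically: with true scores and uncorrelated random errors generated as assumed above, the observed-score variance is approximately the sum of the two components, and the two reliability expressions agree.

# A small numerical check of Var(X) = Var(t) + Var(e) and of the two
# equivalent reliability expressions derived above; values are simulated.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
t = rng.normal(100, 15, n)           # true scores
e = rng.normal(0, 6, n)              # random error: E(e) = 0, uncorrelated with t
x = t + e                            # observed scores

var_t, var_e, var_x = t.var(ddof=1), e.var(ddof=1), x.var(ddof=1)
print("Var(t) + Var(e):", round(var_t + var_e, 1), " Var(X):", round(var_x, 1))
print("Var(t)/Var(X):    ", round(var_t / var_x, 3))
print("1 - Var(e)/Var(X):", round(1 - var_e / var_x, 3))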

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985001985

Inter-Rater Reliability

Robert F. DeVellis, in Encyclopedia of Social Measurement, 2005

Classical Test Theory and Reliability

According to classical test theory, a score obtained in the process of measurement is influenced by two things: (1) the true score of the object, person, event, or other phenomenon being measured and (2) error (i.e., everything other than the true score of the phenomenon of interest). Reliability, in general, is a proportion corresponding to a ratio between two quantities. The first quantity (denominator) represents the sum total of all influences on the obtained score. The other quantity (numerator) represents the subportion of that total that can be ascribed to the phenomenon of interest, often called the true score. The reliability coefficient is the ratio of variability ascribable to the true score relative to the total variability of the obtained score. Inter-rater reliability is merely a special case of this more general definition. The distinguishing assumption is that the primary source of error is the observers, or raters as they are often called.
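
One common way to quantify this ratio for raters is an intraclass correlation; the sketch below uses the one-way random-effects ICC(1), which is an assumed choice for illustration (the passage above does not commit to a particular coefficient), with invented ratings.

# A hedged sketch of an intraclass correlation, ICC(1), as one way to express
# inter-rater reliability: between-target variance relative to total variance,
# treating disagreement among raters as the error. The ratings are invented.
import numpy as np

def icc1(ratings: np.ndarray) -> float:
    """ratings: shape (n_targets, n_raters); one-way random-effects ICC(1)."""
    n, k = ratings.shape
    target_means = ratings.mean(axis=1)
    grand_mean = ratings.mean()
    ms_between = k * ((target_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((ratings - target_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

ratings = np.array([   # 6 targets rated by 3 raters (hypothetical scores)
    [9, 8, 9],
    [6, 5, 6],
    [8, 8, 7],
    [4, 5, 4],
    [7, 6, 7],
    [3, 3, 4],
])
print(round(icc1(ratings), 3))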

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985000955

Classical Test Theory

Wim J. van der Linden, in Encyclopedia of Social Measurement, 2005

Parameter Estimation

The statistical treatment of CTT is not well developed. One reason is that its model is not based on the assumption of parametric families for the distributions of Xjt and Tjt in Eqs. (5) and (6). Direct application of standard likelihood or Bayesian theory to the estimation of classical item and test parameters is therefore less straightforward. Fortunately, nearly all classical parameters are defined in terms of first-order and second-order (product) moments of score distributions. Such moments are well estimated by their sample equivalents (with the usual correction for the variance estimator if we are interested in unbiased estimation). CTT item and test parameters are therefore often estimated using "plug-in estimators," that is, with sample moments substituted for population moments in the definition of the parameter.

A famous plug-in estimator for the true score of a person is the one based on Kelley's regression line. Kelley showed that, under the classical model, the least-squares regression line for the true score on the observed score is equal to

(33) E(T | X = x) = ρXT² x + (1 − ρXT²) μX.

An estimate of a true score is obtained if estimates of the reliability coefficient and the population mean are plugged into this expression. This estimator is interesting because it is based on a linear combination of the person's observed score and the population mean, with weights based on the reliability coefficient. If ρXT² = 1, the true-score estimate is equal to the observed score x; if ρXT² = 0, it is equal to the population mean, μX. Precision-weighted estimators of this type are typical of Bayesian statistics. For this reason, Kelley's result has been hailed as the first Bayesian estimator known in the statistical literature.
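
The following sketch (with invented numbers) shows Kelley's estimator from Eq. (33) as a plug-in calculation: the estimate shrinks the observed score toward the population mean in proportion to the unreliability of the test.

# A brief illustrative sketch of Kelley's plug-in true-score estimate, Eq. (33);
# the reliability coefficient plays the role of rho_XT squared.
def kelley_true_score(x: float, reliability: float, pop_mean: float) -> float:
    return reliability * x + (1 - reliability) * pop_mean

# Hypothetical numbers: observed score 70, reliability 0.8, population mean 50.
print(kelley_true_score(70, 0.8, 50))   # 66.0: shrunk toward the mean
print(kelley_true_score(70, 1.0, 50))   # 70.0: perfectly reliable test
print(kelley_true_score(70, 0.0, 50))   # 50.0: X carries no information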

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985004497

Generalizability Theory

Richard J. Shavelson, Noreen M. Webb, in Encyclopedia of Social Measurement, 2005

Generalizability Coefficient

The generalizability coefficient is analogous to classical test theory's reliability coefficient (the ratio of the universe-score variance to the expected observed-score variance; an intraclass correlation). For relative decisions and a p × I × O random-effects design, the generalizability coefficient is:

(7) Eρ²(XpIO, μp) = Eρ² = Ep(μp − μ)² / [EO EI Ep(XpIO − μIO)²] = σp² / (σp² + σδ²)
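
As a numerical illustration (not part of the chapter), the sketch below evaluates Eq. (7) from assumed variance components; the decomposition of the relative error variance σδ² for a p × I × O random design, with interaction components divided by the numbers of items and occasions, follows the usual G-theory convention and is an assumption here.

# A hedged numerical sketch of Eq. (7): universe-score variance over itself
# plus relative error variance. The composition of the relative error term
# (person-by-item, person-by-occasion, and residual components divided by the
# numbers of items and occasions) is assumed, not quoted from the text.
def g_coefficient(var_p, var_pi, var_po, var_pio_e, n_items, n_occasions):
    var_delta = (var_pi / n_items
                 + var_po / n_occasions
                 + var_pio_e / (n_items * n_occasions))
    return var_p / (var_p + var_delta)

# Illustrative variance components (invented) for 10 items and 2 occasions.
print(round(g_coefficient(var_p=0.50, var_pi=0.20, var_po=0.10,
                          var_pio_e=0.60, n_items=10, n_occasions=2), 3))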

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985001936

Graded Response Model

Fumiko Samejima, in Encyclopedia of Social Measurement, 2005

Unique Maximum Condition for the Likelihood Function

The beauty of IRT lies in the fact that, unlike in classical test theory, θ can be estimated directly from the response pattern, without the intervention of the test score. Let Axg(θ) be:

(28) Axg(θ) ≡ (∂/∂θ) log Pxg(θ) = Σu≤xg (∂/∂θ) log Mu(θ) + (∂/∂θ) log[1 − M(xg+1)(θ)].

Because, from Eqs. (21) and (28), the likelihood equation can be written:

(29) (∂/∂θ) log L(x|θ) = (∂/∂θ) log Pv(θ) = Σxg∈v (∂/∂θ) log Pxg(θ) = Σxg∈v Axg(θ) = 0,

a straightforward computer program can be written for any model in such a way that Axg(θ) is selected and computed for each xg ∈ v, g = 1, 2,…, n, and added, and the value of θ that makes the sum of these n functions equal to zero is located as the MLE of θ for the individual whose performance is represented by v. Samejima called Axg(θ) the basic function, because this function provides the basis for computer programming for estimating θ from the response pattern v.

From Eqs. (28) and (29), it is obvious that a sufficient condition for the likelihood function to have a unique modal point for any response pattern is that the basic function Axg(θ) be strictly decreasing in θ, with a nonnegative upper asymptote and a nonpositive lower asymptote. It can be seen from Eqs. (22) and (28) that this unique maximum condition will be satisfied if the item response information function Ixg(θ) is positive for all θ, except at a finite or enumerably infinite number of points.

The normal ogive model, the logistic model, the logistic positive exponent model, the acceleration model, and models derived from Bock's nominal response model all satisfy the unique maximum condition. Notably, however, the three-parameter logistic model for dichotomous responses, which has been widely used for multiple-choice test data, does not satisfy the unique maximum condition, and multiple MLEs of θ may exist for some response patterns.
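
To show how the basic functions are used in practice, here is a rough sketch for the logistic graded response model, one of the models listed above as satisfying the unique maximum condition; the item parameters and response pattern are invented, the basic function is approximated by a numerical derivative, and the MLE of θ is located at the zero crossing of their sum.

# A hedged sketch (logistic graded response model, invented parameters) of
# estimating theta by finding the zero of the summed basic functions.
import numpy as np

def category_prob(theta, a, b, x):
    """P_x(theta) under the logistic graded response model.
    b holds the ordered boundary (difficulty) parameters of the item."""
    def p_star(k):                       # probability of scoring at least k
        if k == 0:
            return 1.0
        if k == len(b) + 1:
            return 0.0
        return 1.0 / (1.0 + np.exp(-a * (theta - b[k - 1])))
    return p_star(x) - p_star(x + 1)

def basic_function(theta, a, b, x, h=1e-5):
    """Numerical approximation of A_x(theta) = d/dtheta log P_x(theta)."""
    return (np.log(category_prob(theta + h, a, b, x))
            - np.log(category_prob(theta - h, a, b, x))) / (2 * h)

# Three hypothetical items: (discrimination a, boundary parameters b).
items = [(1.2, [-1.0, 0.0, 1.0]), (0.8, [-0.5, 0.7]), (1.5, [-1.5, -0.2, 0.9])]
pattern = [2, 1, 3]                      # graded responses x_g, one per item

grid = np.linspace(-4, 4, 2001)
totals = np.array([sum(basic_function(t, a, b, x)
                       for (a, b), x in zip(items, pattern)) for t in grid])
crossing = np.where(np.diff(np.sign(totals)) < 0)[0]   # + to - sign change
print("approximate MLE of theta:", round(grid[crossing[0]], 2))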

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985004515

Test–Retest Reliability

Chong Ho Yu, in Encyclopedia of Social Measurement, 2005

Generalizability Theory for Addressing Multiple Sources of Error

Addressing multiple sources of error is an interesting idea, but classical test theory directs researchers to focus on one source of error at a time, depending on the computing method used. For example, if one computes a test–retest reliability coefficient, the variation over time in the observed score is counted as error, but the variation due to item sampling is not. If one computes Cronbach's coefficient alpha, the variation due to the sampling of different items is counted as error, but the time-based variation is not. This creates a problem if the reliability estimates yielded by different methods are substantively different. To counteract this problem, Marcoulides suggested reconceptualizing classical reliability within the broader notion of generalizability. Instead of asking how stable, how equivalent, or how consistent the test is, and to what degree the observed scores reflect the true scores, generalizability theory asks how well the observed scores enable the researcher to generalize about the examinees' behaviors, given that multiple sources of error are taken into account.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985000943

Optimal Test Construction

Bernard P. Veldkamp, in Encyclopedia of Social Measurement, 2005

Classical Test Construction

Even though classical item parameters depend on the population and on the other items in the test, in practice classical test theory is often applied to construct tests. When it can be assumed that the population for the test hardly changes, test construction for classical test forms may be possible. In general, the objective for these tests is to maximize the reliability of the test. The reliability of the test is hard to estimate, but Cronbach's α defines a lower bound for it. The objective function for maximizing Cronbach's α can be defined for a fixed-length test as:

(19) max (n/(n − 1)) [1 − Σi=1..n σi² / (Σi=1..n σi ρiX)²],

where σi² is the observed score variance of item i, and ρiX is the item-test correlation of item i. These parameters are based on earlier administrations of the items. The expression for Cronbach's α is a nonlinear function of the decision variables. In order to formulate the test construction problem as a linear programming problem, the following modification is often made. Instead of maximizing the expression in Eq. (19), the denominator of the last term is maximized and its numerator is bounded from above:

(20) max Σi=1..n σi ρiX

subject to:

(21) Σi=1..n σi ≤ c.
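
As an illustration of the linearized problem in Eqs. (20) and (21), the sketch below selects a fixed number of items maximizing the sum of σi ρiX subject to a bound on the sum of σi; the item statistics are invented, and brute-force enumeration stands in for the integer or linear programming solver that would be used in practice.

# A hedged sketch of the linearized selection problem in Eqs. (20)-(21):
# choose a fixed number of items maximizing sum(sigma_i * rho_iX) subject to
# an upper bound on sum(sigma_i). Item statistics are hypothetical.
from itertools import combinations

# (item id, sigma_i, rho_iX) from hypothetical earlier item administrations
items = [(1, 1.0, 0.45), (2, 0.8, 0.52), (3, 1.2, 0.30),
         (4, 0.9, 0.48), (5, 1.1, 0.40), (6, 0.7, 0.55)]

test_length, sigma_bound = 3, 3.0

best = max(
    (combo for combo in combinations(items, test_length)
     if sum(sigma for _, sigma, _ in combo) <= sigma_bound),
    key=lambda combo: sum(sigma * rho for _, sigma, rho in combo),
)
print("selected items: ", [item_id for item_id, _, _ in best])
print("objective value:", round(sum(s * r for _, s, r in best), 3))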

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B0123693985004473

Which of the following is an assumption of the classical model of decision making?

Classical decision theory assumes that decisions should be completely rational and optimal; thus, the theory employs an optimizing strategy that seeks the best possible alternative to maximize the achievement of goals.

What are the assumptions of the classical decision theory?

The classical model prescribes the best way to make decisions, based on four assumptions: a clearly defined problem, eliminated uncertainty, access to full information, and rational behavior of the decision-maker.

Which of the following is an assumption made by the classical model?

The adjustment of prices until quantities supplied and demanded are equal is a critical assumption made in the classical model.

Which of the following is not an assumption of the classical decision model?

Managers will use intuition rather than rational analysis to make sound decisions when information is incomplete.