Keith Markus Professional and Personal Information

Frontiers of Test Validity Theory (book)

Now in its second edition, this important book examines test validity in the behavioral, social, and educational sciences by exploring three fundamental problems: measurement, causation, and meaning. Psychometric and philosophical perspectives and unresolved issues receive attention, as the authors explore how measurement is conceived from both the classical and modern perspectives.

Split into three accessible sections, the first contrasts theories of measurement as applied to the validity of behavioral science measures, and the second considers causal theories of measurement as well as alternative theories of causation. The final section explores the meaning and interpretation of test scores as they apply to test validity, offering a conceptual overview of the field and its current state. Each carefully revised chapter begins with an overview of key theories and literature, concludes with a list of suggested readings, and features boxes with real-life situations that connect theory to practice. Examples of specific issues include:

How tests can assess an attribute without measuring it.
The role of values in test validity.
Interpreting responses to the same question in different languages.

Researchers, practitioners, and policy makers interested in test validity or developing tests will appreciate the book's cutting-edge review of test validity. Focusing on both the underlying concepts and the practical challenges of test construction and use, it also serves as a supplement in graduate or advanced undergraduate courses on test validity, psychometrics, testing, or measurement taught in psychology, education, sociology, social work, political science, business, criminal justice, and other fields. The book does not assume a background in measurement.

Routledge book web page.

Errata Sheet (April 14, 2014)

Denny Borsboom's web page.

Contents:

1. Introduction: Surveying the Field of Test Validity Theory.
Part I:
2. Philosophical Theories of Measurement.
3. Psychometric Models.
4. Open Issues in Measurement Theory and Psychometrics.
Part II:
5. Test Scores as Samples: Behavior Domain Theory.
6. Causality in Measurement.
7. Causation, Correlation, and Reflective Measurement Models.
8. Problems in Causation and Validity: Formative Measurement, Networks, and Individual Differences.
Part III:
9. Interpreting Test Responses: Validity, Values, and Evaluation.
10. A Model of Test Score Interpretation.
11. Open Questions About Test Score Meaning.
Part IV:
12. An Integrative View of Test Validity.
13. Epilogue as Dialog: The Future of Test Validity Theory.

Synopsis:

This book examines test validity in the behavioral, social, and educational sciences by exploring three fundamental problems: measurement, causation and meaning. Psychometric and philosophical perspectives receive attention along with unresolved issues. The authors explore how measurement is conceived from both the classical and modern perspectives. The importance of understanding the underlying concepts as well as the practical challenges of test construction and use receives emphasis throughout. The book summarizes the current state of the test validity theory field. Necessary background on test theory and statistics is presented as a conceptual overview where needed.

Each chapter begins with an overview of key material reviewed in previous chapters, concludes with a list of suggested readings, and features boxes with examples that connect theory to practice. These examples reflect actual situations that occurred in psychology, education, and other disciplines in the US and around the globe, bringing theory to life. Critical thinking questions related to the boxed material engage and challenge readers. A few examples include:

What is the difference between intelligence and IQ?

Can people disagree on issues of value but agree on issues of test validity?

Is it possible to ask the same question in two different languages?

The first part of the book contrasts theories of measurement as applied to the validity of behavioral science measures. The next part considers causal theories of measurement in relation to alternatives such as behavior domain sampling, and then unpacks the causal approach in terms of alternative theories of causation. The final section explores the meaning and interpretation of test scores as they apply to test validity. Each set of chapters opens with a review of the key theories and literature and concludes with a review of related open questions in test validity theory.

Researchers, practitioners and policy makers interested in test validity or developing tests will appreciate the book's cutting-edge review of test validity. The book also serves as a supplement in graduate or advanced undergraduate courses on test validity, psychometrics, testing or measurement taught in psychology, education, sociology, social work, political science, business, criminal justice and other fields. The book does not assume a background in measurement.

Reviews:

Aros, J. R. (2013). Gobstoppers: Psychometric brain candy in the frontiers and boonies of test validity theory. A review of Frontiers of Test Validity Theory: Measurement, Causation and Meaning. PsycCRITIQUES, 58, Release 44, Article 6.

"[I]s this book worth reading? The answer is a resounding yes for those who are looking to make more sense of generalizability theory, IRT, and SEM applications in a testing/assessment-centered arena, as a user, developer, or both." (p. 2)

No Author (2013). Reference & Research Book News, 28(3), 8-12.

"Assuming no previous background in the relevant philosophical areas, they discuss such topics as philosophical theories of measurement, open issues in measurement theory and psychometrics, problems in causation and validity, a model of test score interpretation, and an integrative view of test validity." (p. 8).

Heberle, J. F. (2013). Frontiers of test validity theory: measurement, causation and meaning (book review). Choice, 51(4), 51-2153. doi: 10.5860/CHOICE.51-2153

"Highly recommended. Graduate students, faculty, researchers, professionals." (p. 684).

Kane, M. T. (2014). Frontiers of test validity theory: measurement, causation, and meaning (book review). Assessment in Education: Principles, Policy & Practice, 21, 238-244. doi: 10.1080/0969594X.2013.878684.

"Markus and Borsboom provide thought-provoking analyses of the roles played by measurement, causality and meaning in validity theory." (p. 243).

Gotch, C. M. (2014). Frontiers of test validity theory: measurement, causation, and meaning (book review). Journal of Educational Measurement, 51, 463–469.

"Simply put, Frontiers of Test Validity Theory is an essential volume in the library of the measurement professional or advanced graduate student who wants to approach their work with purpose and understanding." (p. 466).

Abstract: Rubin and Pearl offered approaches to causal effect estimation and Lewis and Pearl offered theories of counterfactual conditionals. Arguments offered by Pearl and his collaborators support a weak form of equivalence such that notation from the rival theory can be re-purposed to express Pearl’s theory in a way that is equivalent to Pearl’s theory expressed in its native notation. Nonetheless, the many fundamental differences between the theories rule out any stronger form of equivalence. A renewed emphasis on comparative research can help to guide applications, further develop each theory, and better understand their relative strengths and weaknesses.

Abstract: Bollen and colleagues have advocated the use of formative scales despite the fact that formative scales lack an adequate underlying theory to guide development or validation such as that which underlies reflective scales. Three conceptual impediments impede the development of such theory: the redefinition of measurement restricted to the context of model fitting, the inscrutable notion of conceptual unity, and a systematic conflation of item scores with attributes. Setting aside these impediments opens the door to progress in developing the needed theory to support formative scale use. A broader perspective facilitates consideration of standard scale development concerns as applied to formative scales including scale development, item analysis, reliability, and item bias. While formative scales require a different pattern of emphasis, all five of the traditional sources of validity evidence apply to formative scales. Responsible use of formative scales requires greater attention to developing the requisite underlying theory.

Abstract: Justification of testing practice involves moving from one state of knowledge about the test to another. Theories of test validity can (a) focus on the beginning of the process, (b) focus on the end, or (c) encompass the entire process. Analyses of four case studies test and illustrate three claims: (a) Restrictions on validity entail a supplement required to obtain justification from validity. (b) Rationales for restrictions assume particular contexts. (c) Claims can be translated between contrasting vocabularies. Implications for consumers of test validity theory include encouragement to focus on content instead of form and to write and read mindfully of the multiplicity of validity vocabularies. Implications for producers of test validity theory include encouragement to consider multiple reconstructions of a particular theory of test validity, clearly distinguish validity theories from validity definitions, and focus on contributing arguments that constrain possible theories rather than contributing definitions or broad frameworks.

Abstract: Causal inference plays a central role in behavioral science. Historically, behavioral science methodologies have typically sought to infer a single causal relation. Each of the major approaches to causal inference in the behavioral sciences follows this pattern. Nonetheless, such approaches sometimes differ in the causal relation that they infer. Incremental causal inference offers an alternative to this conceptualization of causal inference that divides the inference into a series of incremental steps. Different steps infer different causal relations. Incremental causal inference is consistent with both causal pluralism and anti-pluralism. However, anti-pluralism places greater constraints on the possible topology of sequential inferences. Arguments against causal inference include questioning consistency with causation as an explanatory principle, charging undue complexity, and questioning the need for it. Arguments in favor of incremental inference include better explanation of diverse causal inferences in behavioral science, tailored causal inference, and more detailed and explicit description of causal inference. Incremental causal inference offers a viable and potentially fruitful alternative to approaches limited to a single causal relation. (Copyright 2013 Springer Science+Business Media Dordrecht)

Abstract: Both qualitative and quantitative methods contribute to research on intimate partner violence (IPV). In this chapter we first report a tabular review of IPV research, focusing on the use of qualitative, quantitative, and mixed methods. We then explore the conceptual basis for distinguishing between qualitative and quantitative methods and argue that in at least one important sense, all methods are mixed methods. (Copyright 2013 Northeastern University)

Abstract: According to Kane (this issue) “the validity of a proposed interpretation or use depends on how well the evidence supports the claims being made”. Because truth and evidence are distinct, this means that the validity of a test score interpretation could be high even though the interpretation is false. As an illustration, we discuss the case of phlogiston measurement as it existed in the 18th century. At face value, Kane’s theory would seem to imply that interpretations of phlogiston measurement were valid in the 18th century (because the evidence for them was strong), even though amounts of phlogiston do not exist and hence cannot be measured. We suggest that this neglects an important aspect of validity and suggest various ways in which Kane’s theory could meet this challenge. (Copyright 2013 National Council on Measurement in Education)

Abstract: Haig and Borsboom advocate for psychological science to adopt a correspondence theory of truth. However, their argument requires a hidden premise that only correspondence theories of truth bring the benefits that they ascribe to correspondence. This premise is not plausible and their argument therefore does not support their recommendation. Additionally, considerations extraneous to Haig and Borsboom’s argument speak in favor of considering alternative theories of truth. (Copyright 2013 Keith A. Markus)

Markus, K. A. (2013). Theories of causality: From antiquity to the present & The Oxford handbook of causation [book review]. Structural Equation Modeling, 20, 708-714.

Abstract: These two books hold interest for structural equation modeling in part because methodological literature often passes over the task of explaining what is meant by causation. Losee offers a brief, lively and to-the-point introduction and overview revolving around three key questions: (a) what sorts of things serve as causes and effects, (b) what relation holds between them, and (c) how might one assess a causal claim. Beebee, Hitchcock & Menzies offer a comprehensive and up-to-date compendium of chapters introducing and summarizing various topics in causation. Losee offers an ideal starting point while Beebee, Hitchcock and Menzies offer a more in-depth treatment that can be read cover to cover or used as a reference book. (Copyright 2013 Keith A. Markus)

Abstract: Verbal and written performance feedback for improving preschool and kindergarten teachers’ treatment integrity of behavior plans was compared using a combined multiple-baseline and multiple-treatment design across teacher–student dyads with order counterbalanced as within-series conditions. Supplemental generalized least squares regression analyses were included to evaluate significance. Maintenance of treatment integrity following termination of performance feedback was included and correspondence between treatment integrity and student behavior change was examined. Results suggested that both forms of feedback were effective for improving treatment integrity but that verbal performance feedback resulted in immediate and sustained improvements with moderate to strong correspondence with student behavior change. (Copyright 2013 Taylor and Francis Group, LLC)

Abstract: Causal theories of measurement view test items as effects of a common cause. Behavior domain theories view test item responses as behaviors sampled from a common domain. A domain score is a composite score over this domain. The question arises whether latent variables can simultaneously constitute domain scores and common causes of item scores. One argument to the contrary holds that behavior domain theory offers more effective guidance for item construction than a causal theory of measurement. A second argument appeals to the apparent circularity of taking a domain score, which is defined in terms of a domain of behaviors, as a cause of those behaviors. Both arguments require qualification and behavior domain theory seems to rely on implicit causal relationships in two respects. Three strategies permit reconciliation of the two theories: One can take a causal structure as providing the basis for a homogeneous domain. One can construct a homogeneous domain and then investigate whether a causal structure explains the homogeneity. Or, one can take the domain score as linked to an existing attribute constrained by indirect measurement. (Copyright 2011 Elsevier Ltd.)

Abstract: This book provides a welcome update and compilation of Stanley Mulaik's contributions to structural equation modeling. The book is not organized in a manner that would make it easy to use in a semester-long course, but it covers a range of important topics not customarily covered in an introductory course. Coupled with a comprehensive introductory text, the book provides sufficient background to read primary literature on structural equation modeling methods. The book offers a useful presentation of the author's views on causation in structural equation modeling despite some faulty arguments directed at alternative perspectives. The book also sheds new light on the tensions between realist and empiricist influences in the author's views on objectivity. The primary audiences for the book will be those with an interest in philosophy of science issues as they apply to structural equation modeling, instructors seeking supplementary reading for courses, and those new to structural equation modeling looking for an accessible introduction to more advanced topics. (Copyright 2012 Keith A. Markus)

Abstract: The possibility or impossibility of quantitative measurement in psychology has important ramifications for the nature of psychology as a discipline. Trendler’s (2009) argument for the impossibility of psychological measurement suggests a general and potentially fruitful strategy for further research on this question. However, the specific argument offered by Trendler appears flawed in several respects. It seems to conflate what must hold true with what one must know and also equivocate on the necessary evidence. Moreover, if the argument supported its conclusion, it would rule out qualitative discourse on psychology as well as psychological measurement. Taking Trendler’s argument as an example, one can formulate a general structure to arguments adopting the same basic strategy. An overview of the requirements that such arguments should meet provides a metatheoretical perspective that can assist authors in constructing such arguments and readers in critically evaluating them. (Copyright 2011 the authors)

Abstract: Newton provides a thoughtful and valuable contribution to test validity theory. I question the notion of an attribute as contrasted with a construct. I question the strict requirement that a test must measure the attribute entailed by a decision for which the test is used. I also question the rejection of degrees of validity. (Copyright 2012 Keith A. Markus)

Abstract: The third edition of Kline's text offers a welcome update and restructuring of the text. Much is improved from the previous version although the strain of squeezing an adequate introduction to all things SEM into a single semester course cannot be avoided. The text continues to provide a highly articulate and accessible introduction, but not an adequate preparation for the technical literature for most readers. (Copyright 2012 Keith A. Markus)

Abstract: Causal inference using statistical models plays a central role in many areas of behavioral science, but the underlying metatheory of causal explanation remains poorly developed. Mulaik's work on causation offers a useful foray into this topic. Evaluation of two negative arguments applied to a broad range of theories of causation offers overdue critical assessment of this contribution. More broadly, the critical evaluation of Mulaik's arguments speaks to the need for better integration of substantive theories and statistical models in causal research. (Copyright 2010 Springer Science+Business Media B.V.)

Abstract: Weisberg offers a useful integrative treatment of approaches to causal inference across a variety of disciplines. The book includes useful original thinking through of key issues in causal inference primarily from a potential outcomes perspective. However, the book does not provide a non-circular analysis of causation itself. (Copyright 2012 Keith A. Markus)

Abstract: Pearl's work on causation has helped focus new attention on the nature of causal reasoning and causal inference in behavioral science. Pearl takes an axiomatic approach, presenting axioms as first principles, but these may be better understood as boundary conditions for the application of the theory. Pearl adopts a non-eliminative but instrumental approach to causation which creates some tension with the tradition of ruling out rival hypotheses in the behavioral sciences. Finally, much causal reasoning in the behavioral sciences involves reasoning across possible worlds that differ in their causal structure, which becomes awkward within the basic architecture of Pearl's system. A neighborhood semantics approach could represent this type of reasoning more naturally. Consideration of these issues may be helpful both to behavioral scientists working to incorporate Pearl's work and also to those working outside the behavioral sciences attempting to explain causal reasoning within those sciences. (Copyright 2011 Oxford University Press)

Abstract: One common application of structural equation modeling (SEM) involves expressing and empirically investigating causal explanations. Nonetheless, several aspects of causal explanation that have an impact on behavioral science methodology remain poorly understood. It remains unclear whether applications of SEM should attempt to provide complete explanations or partial explanations. Moreover, it remains unclear what sorts of things researchers can best take as causes and effects. Finally, the meaning of causal assertions itself remains poorly understood. Attempting to clarify the use of structural equations as causal explanations by addressing these issues has implications for behavioral science methodology because applications of SEM typically remain vague about causation and thus about their substantive conclusions. Research aimed at clarifying these issues can lead to a sharper and more refined use of SEM for causal explanation, and by extension, clarify behavioral science methodology more generally. (Copyright 2010 Taylor & Francis Group LLC)

Abstract: Nancy Cartwright seeks to question the traditional division of labor between the policy analyst, social or behavioral science methodologists, and the philosopher, encouraging instead a closer collaboration between the three. A series of 13 articles explore various aspects of this central theme. The articles provide detailed thoughtful probing of a number of important issues in applied causal inference. Some of these issues reflect active areas of research and others reflect neglected issues that warrant greater attention. The collection offers little in terms of synthesis across the 13 essays, and makes few concessions to readers lacking prior familiarity with the formal notation used in examples at various points in the book. Nonetheless, the book offers a valuable compilation including four new articles that will be of interest to anyone concerned with issues of causation and causal inference in the context of public policy and the research that informs it. (Copyright 2010 Keith A. Markus)

Abstract: Cramer et al. present a thoughtful application of network analysis to symptoms, but certain questions remain open. These questions involve the intended causal interpretation, the critique of latent variables, individual variation in causal networks, Borsboom’s idea of networks as measurement models, and how well the data support the stability of the network results. (Copyright 2010 Cambridge University Press)

Abstract: Provides an overview of construct validity as the term is used in test validity theory.

Abstract: Provides an overview of content validity as the term is used in test validity theory.

Abstract: Researchers often wish to understand the relationship between two continuous predictors and a common continuous outcome. Many options for graphing such relationships, including conditional regression lines or 3D regression surfaces, depend on an underlying model of the data. The veridicality of the graph depends upon the veridicality of the model, and poor models can result in misleading graphs. An enhanced 2D scatterplot or bubble plot that represents values of a variable using the size of the plotted circles offers a model-free alternative. The R function bp3way() implements the bubble plot with a variety of user-specifiable parameters. An empirical study demonstrates the comparability of bubble plots to other model-free plots for exploring three-way continuous data. (Copyright 2010 Springer Science + Business Media)

Abstract: Abductive inference often involves inference to the best explanation. A focus on the bestness of explanations facilitates a comparative analysis of how abductive inference would differ if approached with four contrasting sets of assumptions about how scientific inference works: positivism, realism, and two kinds of pragmatism. As a thought experiment, one can imagine a situation in which competing models of psychopathy differ in parsimony and fit to the data, but produce a tie when considering both virtues in combination. The thought experiment demonstrates that Steiger's (1990) question about how best to combine competing virtues in scientific inference applies to abductive inference and that the answers depend upon other assumptions about how science works. The comparative analysis helps focus some of the issues that require clarification before abductive inference can enter the Pantheon of standard research methods in psychology. More constructively, the analysis also demonstrates that one need not accept scientific realism to accept the use of abductive inference. (Copyright 2008 Wiley Periodicals, Inc.)

Markus, K. A. (2008). Putting concepts and constructs into practice: A reply to Cervone and Caldwell, Haig, Kane, Mislevy, and Rupp. Measurement, 6, 147-154.

Abstract: The commentary has greatly enriched the discussion initiated by the three target articles. The distinction between constructs and concepts contributes to both the top-down and bottom-up aspects of the dialectic between measurement theory and practice. The distinction also illuminates the abstraction from observation to manifest variables. However, the semantic analysis of variables in terms of ordered pairs of individuals and values of variables does not seek to describe a procedure for defining constructs as part of the test development process. As such, uncertainty about population membership does not pose a pragmatic constraint on the application of the distinction between concepts and constructs. Finally, meaning and reference in the context of test development can best be understood as jointly determined by both how the world is and the nature of the vocabularies chosen to describe it. Thus, both of these factors bear on the definition of specific constructs and concepts measured by individual tests. Distinguishing constructs from concepts can help clarify and advance discourse on testing and measurement across a wide range of domains. (Copyright 2008 Taylor & Francis Group, LLC)

Abstract: A theoretical variable such as integrity, conscientiousness, or academic honesty may correspond to either a construct or a concept, but the standard idiom does not distinguish the two. One can describe the difference between constructs and concepts in terms of set theory. Constructs extend over actual cases, whereas concepts extend over both actual and possible cases. As such, theoretical claims made about, say, integrity as a construct differ from claims about integrity as a concept. The restriction of constructs to a specified population plays a central role in test validation and psychometric analyses aimed at distinguishing constructs from one another. The extension of concepts over possible populations plays a central role in the adoption of nonactual possibilities as goals in making efforts toward systemic change and also in the comparison of constructs across populations. The failure of the standard idiom, which conflates constructs with concepts, to provide a vocabulary that captures both population-dependent and population-independent aspects of variables recommends the modification of that idiom to distinguish constructs from concepts. This distinction suggests various changes in practice such as including the intended population in the names of constructs but not concepts. (Copyright 2008 Taylor & Francis Group, LLC)

Abstract: One can distinguish statistical models used in causal modeling from the causal interpretations that align them with substantive hypotheses. Causal modeling typically assumes an efficient causal interpretation of the statistical model. Causal modeling can also make use of mereological causal interpretations in which the state of the parts determines the state of the whole. This interpretation shares several properties with efficient causal interpretations but also differs in terms of other important properties. The availability of alternative causal interpretations of the same statistical models has implications for hypothesis specification, research design, causal inference, data analysis, and the interpretation of research results. (Copyright 2008 Taylor & Francis Group, LLC)

Abstract: Yu has not written a book that offers significant contributions to cutting-edge work on the philosophical underpinnings of quantitative methods, nor has he written a systematic survey of philosophical foundations of quantitative methods. Yu has written an accessible and engaging book that provides an excellent introductory overview for nonspecialists. Even readers with a background in these issues can appreciate the book for these latter qualities. (Copyright 2008 Taylor & Francis Group, LLC)

Abstract: The scholarly community recently lost two luminary contributors to the causation literature, David Lewis and Wesley Salmon, but the two remain very much present in this book by James Woodward. Woodward aligns himself with Lewis in advocating a counterfactual account of causation (explained later), whereas Salmon surfaces throughout the book as the representative of noncounterfactual accounts of causation and the prime target for criticism. As such, the book holds interest both as a contribution to the primary literature, by offering a distinct theory of causal assertion and causal explanation, and also as a contribution to the secondary literature, by illuminating the work of Lewis, Salmon, and others. (Copyright 2008 Taylor & Francis Group, LLC)

Abstract: First, a case is made that the processes and assumptions underlying judgments of whether someone is lying during a high stakes interview may be similar to movement interpretation processes in a clinical context, and that the former is easier to research than the latter. Graduate students judged the credibility of utterances from actual criminal confessions, explained their decisions, and rated how confident they felt in each decision. Four of the items contained a conventional but invalid nonverbal cue to deception and one contained two conventional, but incorrect, cues to truth-telling. Groups of 30 judged either content only transcripts, verbatim transcripts, audio, or audio/video. Comparison of rationales, confidence level, and accuracy across modality provided evidence of which cues misled judges, how nonverbal cues modified verbal content judgments, and detection patterns that warranted further research. The implications of the results for movement observation and interpretation in dance/movement therapy are discussed. (Copyright 2006 American Dance Therapy Association)

Abstract: For a study of modality differences in deception detection accuracy, groups of graduate students judged segments selected from videotapes of criminal confessions. Twenty brief utterances were presented in four ways: content only transcript, verbatim transcript, audio only, and audio/videotape. No modality difference in unbiased truth hit rate was found, but unbiased lie hit rate varied by modality, with judges of transcripts stripped of pause indications, word repeats, and umms and uhhs less accurate than verbatim transcript judges, audio judges, and audio/video judges. The 62% overall accuracy and 61% lie detection accuracy of audio judges was highest and, in contrast to other judges, audio judgments did not display a response bias. The results remain consistent with the presence of valid visual cues but suggest that at least in some situations focus on valid vocal cues may offer more accuracy. (Copyright 2006 Springer Science+Business Media, Inc)

Abstract: Introduces basic concepts of SEM. Statistical modeling concepts include model specification, parameter estimation, model fit, and model interpretation. Causal modeling with SEM, SEM resources, and alternatives to SEM are also discussed. (Abstract Copyright 2006 Keith A. Markus)

Abstract: Collins, Hall and Paul provide an outstanding resource for those interested in counterfactual theories of causation. The introductory essay would be an ideal supplementary reading in a methodology course. The book systematically develops current advances and problems in counterfactual theories of causation. However, someone looking for a general overview of theories of causation would want to cast his or her net more widely. (Copyright 2006 Keith A. Markus)

Abstract: Coding statements of criminal suspects facilitated tests of four hypotheses about differences between behavioral cues to deception and the incriminating potential (IP) of the topic. Information from criminal investigations corroborated the veracity of 337 brief utterances from 28 videotaped confessions. A four-point rating of topic IP measured the degree of potential threat per utterance. Cues discriminating true vs. false comprised word/phrase repeats, speech disfluency spikes, nonverbal overdone, and protracted headshaking. Non-lexical sounds discriminated true vs. false in the reverse direction. Cues that distinguished IP only comprised speech speed, gesticulation amount, nonverbal animation level, soft weak vocal and "I (or we) just" qualifier. Adding "I don't know" to an answer discriminated both IP and true vs. false. The results supported hypotheses about differentiating deception cues from incriminating potential cues in high-stakes interviews, and suggested that extensive research on distinctions between stress-related cues and cues to deception would improve deception detection. (Copyright 2005 American Psychology-Law Society/Division 41 of the American Psychological Association)

Abstract: Mellor provides a readable and valuable discussion of current issues in the theory of causation. He argues in favor of facts as causes and effects. A number of points from his discussion have direct relevance for causal modeling. (Copyright 2005 Keith A. Markus)

Abstract: Structural equation models allow for interpretation as causal models within a variety of explanatory strategies. Literal explanatory strategies locate causation in the process modeled whereas non-literal strategies locate causation in the theoretical description itself. Robust strategies apply the model to possible cases as well as actual cases whereas non-robust strategies restrict application to actual cases. Crossing these two basic distinctions yields a fourfold explanatory strategy typology (FEST). The four explanatory strategies differ in their implications for research design, including what makes for the best measures, what form generalization takes, what makes for a good replication, and what makes for a good extension. The best choice of explanatory strategy may depend upon the state of research in a specific topic area. By demonstrating a many-to-one mapping of substantive interpretations onto statistical models, the FEST illustrates syntactical equivalence, an extension of the statistical equivalence concept in which different substantive models share the same structural equation model with similar implications for inferences from data to theory. (Copyright 2004 Keith A. Markus)

Abstract: Statistically equivalent models produce the same range of moment matrices over the domain of their parameter spaces. Raykov and Marcoulides (2001) proposed a proof that leads to the conclusion that all structural equation (SE) models with certain minimal components have infinitely many statistically equivalent models. A variation on their proof covers an even broader class of models. This conclusion has important implications for the application of at least one notion of eliminative induction to structural equation modeling (SEM). Normally, assertions of statistical equivalence imply that the models differ in meaning, giving statistical equivalence its interest. Consequently, a particular complex causal structure provides a counterexample to the proposed proof. This counterexample suggests that a successful proof may require more detailed attention to the concept of semantic equivalence as characterized by different substantive implications. A formal account of semantic equivalence rests on translation between SE models and a model-neutral descriptive language. (Copyright 2002 Lawrence Erlbaum Associates, Inc.)
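
As an informal illustration of the definition of statistical equivalence in the abstract above (not an example taken from the article), the sketch below compares two hypothetical two-variable regression models, one with X predicting Y and one with Y predicting X. All function names and parameter values are assumptions made for illustration. For any parameters of the first model, the second model can be given parameters that imply the same covariance matrix; by symmetry the same holds in the other direction.

```python
def implied_cov_x_to_y(var_x, b, var_e):
    """Covariance matrix of (X, Y) implied by Y = b*X + e."""
    cov_xy = b * var_x
    var_y = b * b * var_x + var_e
    return [[var_x, cov_xy], [cov_xy, var_y]]


def implied_cov_y_to_x(var_y, c, var_d):
    """Covariance matrix of (X, Y) implied by X = c*Y + d."""
    cov_xy = c * var_y
    var_x = c * c * var_y + var_d
    return [[var_x, cov_xy], [cov_xy, var_y]]


def reverse_parameters(var_x, b, var_e):
    """Map parameters of the X -> Y model to matching Y -> X parameters."""
    var_y = b * b * var_x + var_e
    c = b * var_x / var_y
    var_d = var_x - c * c * var_y
    return var_y, c, var_d


# Hypothetical parameter values, chosen only for illustration.
sigma_a = implied_cov_x_to_y(var_x=2.0, b=0.5, var_e=1.0)
sigma_b = implied_cov_y_to_x(*reverse_parameters(var_x=2.0, b=0.5, var_e=1.0))

print("X -> Y model:", sigma_a)
print("Y -> X model:", sigma_b)
```

The two printed matrices agree up to floating-point rounding, so data alone could not distinguish these two models even though they make different substantive claims about direction.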

Abstract: In advocating Bayesian inference as applied to Null Hypothesis Significance Tests, Krueger (2001) took for granted a conceptual framework that dichotomizes beliefs as either objective (invariant over observers) or subjective (relative to an observer). Despite having claimed a pragmatic basis for his argument, Krueger overlooked the fact that a more thoroughgoing pragmatic approach would avoid this problematic framework altogether. As a consequence, Krueger drew two unjustified conclusions. He could have avoided these premature conclusions by considering beliefs as grounded in collectively accepted but continuously evolving norms for the justification of knowledge claims. Such a view avoids the false choice between objectivity and subjectivity and thus undermines any inference from an inability to attain the former to an inability to avoid the latter. (Copyright 2002 Keith A. Markus.)

Abstract: Critics have put forth several arguments against the use of tests of statistical significance (TOSSes). Among these, the converse inequality argument stands out but remains sketchy, as does criticism of it. The argument states that we want P(H|D) (where H and D represent hypothesis and data, respectively), we get P(D|H), and the 2 do not equal one another. Each of the terms in 'P(D|H) =/= P(H|D)' requires clarification. Furthermore, the argument as a whole allows for multiple interpretations. If the argument questions the logic of TOSSes, then defenses of TOSSes fall into 2 distinct types. Clarification and analysis of the argument suggests more moderate conclusions than previously offered by friends and critics of TOSSes. Furthermore, the general method of clarification through formalization may offer a way out of the current impasse. (Copyright 2001 American Psychological Association.)
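
As a rough illustration of the inequality at the center of the converse inequality argument described above, the sketch below applies Bayes' theorem to made-up numbers. The prior and both conditional probabilities are assumptions chosen for illustration, not values from the article, and the function name is hypothetical.

```python
def posterior(prior_h, p_d_given_h, p_d_given_not_h):
    """Return P(H|D) computed with Bayes' theorem."""
    # Total probability of the data under H and not-H.
    p_d = p_d_given_h * prior_h + p_d_given_not_h * (1.0 - prior_h)
    return p_d_given_h * prior_h / p_d


# Hypothetical values, for illustration only.
prior_h = 0.5          # assumed prior probability of the hypothesis
p_d_given_h = 0.05     # assumed probability of the data if H is true
p_d_given_not_h = 0.5  # assumed probability of the data if H is false

p_h_given_d = posterior(prior_h, p_d_given_h, p_d_given_not_h)

print(f"P(D|H) = {p_d_given_h:.3f}")
print(f"P(H|D) = {p_h_given_d:.3f}")
```

With these assumed inputs the two conditional probabilities differ, and changing the prior or P(D|not H) changes P(H|D) while leaving P(D|H) fixed.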

Abstract: The present study examined the use of the Graduate Record Examination (GRE-Verbal and GRE-Quantitative) and undergraduate grade point average (UGPA) to predict long-term performance in an MA program in forensic psychology. The criterion measures were graduate grade point average (GGPA) and time to completion (TTC). Data were available for 206 graduates. Regression analysis indicated that a linear combination of GRE-V, GRE-Q, and UGPA correlated 0.63 with GGPA. Predictive efficiency was reduced by only 2% of the variance when GRE subscores were combined into a total score. The correlation with TTC was smaller (R = 0.31) but nonetheless translated into meaningful differences in student performance. Most noteworthy, GRE scores and UGPA appear to predict better for forensic psychology than for social sciences in general. (Copyright 2001 Sage Publications, Inc.)

Abstract: The attitudes toward the police (ATP) of a group of young inner city adolescents were investigated within the context of a program designed to teach dispute resolution skills and promote a dialogue with local police. ATP were measured using a 23 item questionnaire. The results indicated that while ATP were generally positive, girls held more positive ATP than boys and adolescents who reported negative experiences with the police had less favorable ATP. A confirmatory factor analysis of the questionnaire yielded three factors: attitudes toward police behavior, attitudes toward interaction with the police, and attitudes toward interaction with other adults. The results are in general agreement with earlier studies with other populations and have implications for programs designed to improve adolescent relationships with the police. (Copyright 2000 Society of Police and Criminal Psychology.)

Abstract: The traditional static view of organizational culture (OC) takes the persistence of OC for granted and seeks to explain culture change. Through this chapter I seek to engage the reader in a different kind of conversation about OC: a dynamic view that takes change as primary and seeks to use dynamic processes to explain the persistence of OC over time. Three types of processes occurring in organizations offer potential explanations of the day to day reproduction of OC. Intentional processes involve meanings consciously projected from individual minds into their environment. Unconscious processes involve meanings projected from individual minds, but done so without conscious awareness. Discursive processes involve meanings that take shape in the communicative actions that connect individual minds and that may never have been conscious to any member of the organization practicing the culture. Finally, to put discourse theory to good use we must remain mindful that discursive processes are material processes. (Copyright 2000 Keith A. Markus.)

Abstract: The exchange between Hayduk and Glaser (2000) and Mulaik and Millsap (2000) sheds new light on the use of multistep procedures for testing structural equation models. Nonetheless, the fundamental concepts of the discussion remain murky. The notion of a correct number of constructs (interpreted latent variables) rests on a conflation of the model with the reality it models. The articulation of what is tested in terms of model constraints encounters an analogous difficulty. Finally, the appeal to analysis into clear and distinct ideas holds the potential to clarify some of these issues, but still awaits the necessary exposition and application to structural equation modeling. A common thread shows itself in an over-reliance on single languages of description. This calls for greater attention to the active engagement of multiple languages of description. (Copyright 2000 Lawrence Erlbaum Associates, Inc.)

Abstract: Messick's (1989) theory of test validity is profoundly influential (Hubley and Zumbo, 1996; Angoff, 1988) in part because it brings together disparate contributions into a unified framework for building validity arguments. At the heart of Messick's theory lies a synthesis of realism and constructivism with respect to both scientific facts and measurement. Within this synthesis there remains a tension between the evidential basis and the consequential basis for test interpretation and use. This cannot be sidestepped simply by limiting the evidential basis to test interpretation and the consequential basis to test use: Interpretation and use are not so easily held separate. The roles of constructivism and context in Messick's theory underline the inherent link between facts and values, but the assumption that facts are objective and values are subjective goes unquestioned in Messick's theory. The inherent link between facts and values combines with this assumption to produce the unresolved tension in Messick's theory. This suggests that a unified theory of test validity requires a theory of value justification. (Copyright 1998 Kluwer Academic Publishers.)

Frontiers of Test Validity Theory: Measurement, Causation and Meaning

Keith A. Markus and Denny Borsboom

2013 Routledge

Publications