Keith Markus's Urban Sprawl


Frontiers of Test Validity Theory:  Measurement, Causation and Meaning

Keith A. Markus and Denny Borsboom

2013 Routledge

Illustration by Lisa Haney
Routledge book web page.

Errata Sheet (April 14, 2014)

Denny Borsboom's web page.


1. Introduction: Surveying the Field of Test Validity Theory.
Part I:
2. Philosophical Theories of Measurement.
3. Psychometric Models.
4. Open Issues in Measurement Theory and Psychometrics.
Part II:
5. Test Scores as Samples: Behavior Domain Theory.
6. Causality in Measurement.
7. Causation, Correlation, and Reflective Measurement Models.
8. Problems in Causation and Validity: Formative Measurement, Networks, and Individual Differences.
Part III:
9. Interpreting Test Responses: Validity, Values, and Evaluation.
10. A Model of Test Score Interpretation.
11. Open Questions About Test Score Meaning.
Part IV:
12. An Integrative View of Test Validity.
13. Epilogue as Dialog: The Future of Test Validity Theory.


This book examines test validity in the behavioral, social, and educational sciences by exploring three fundamental problems: measurement, causation and meaning. Psychometric and philosophical perspectives receive attention along with unresolved issues. The authors explore how measurement is conceived from both the classical and modern perspectives. The importance of understanding the underlying concepts as well as the practical challenges of test construction and use receive emphasis throughout. The book summarizes the current state of the test validity theory field. Necessary background on test theory and statistics is presented as a conceptual overview where needed.

Each chapter begins with an overview of key material reviewed in previous chapters, concludes with a list of suggested readings, and features boxes with examples that connect theory to practice. These examples reflect actual situations that occurred in psychology, education, and other disciplines in the US and around the globe, bringing theory to life. Critical thinking questions related to the boxed material engage and challenge readers. A few examples include:

What is the difference between intelligence and IQ?

Can people disagree on issues of value but agree on issues of test validity?

Is it possible to ask the same question in two different languages?

The first part of the book contrasts theories of measurement as applied to the validity of behavioral science measures. The next part considers causal theories of measurement in relation to alternatives such as behavior domain sampling, and then unpacks the causal approach in terms of alternative theories of causation. The final section explores the meaning and interpretation of test scores as it applies to test validity. Each set of chapters opens with a review of the key theories and literature and concludes with a review of related open questions in test validity theory.

Researchers, practitioners and policy makers interested in test validity or developing tests will appreciate the book's cutting-edge review of test validity. The book also serves as a supplement in graduate or advanced undergraduate courses on test validity, psychometrics, testing or measurement taught in psychology, education, sociology, social work, political science, business, criminal justice and other fields. The book does not assume a background in measurement.


Aros, J. R. (2013).  Gobstoppers: Psychometric brain candy in the frontiers and boonies of test validity theory.  A review of Frontiers of Test Validity Theory: Measurement, Causation and Meaning.  PsycCRITIQUES, 58, Release 44, Article 6.
"[I]s this book worth reading? The answer is a resounding yes for those who are looking to make more sense of generalizability theory, IRT, and SEM applications in a testing/assessment-centered arena, as a user, developer, or both." (p. 2)
No Author (2013).  Reference & Research Book News, 28(3), 8-12.
"Assuming no previous background in the relevant philosophical areas, they discuss such topics as philosophical theories of measurement, open issues in measurement theory and psychometrics, problems in causation and validity, a model of test score interpretation, and an integrative view of test validity." (p. 8).
Heberle, J. F. (2013).  Frontiers of test validity theory: measurement, causation and meaning (book review).  Choice, 51(4), 51-2153. doi: 10.5860/CHOICE.51-2153
"Highly recommended.  Graduate students, faculty, researchers, professionals." (p. 684).
Kane, M. T. (2014). Frontiers of test validity theory: measurement, causation, and meaning (book review).  Assessment in Education: Principles, Policy & Practice, 21, 238-244. doi: 10.1080/0969594X.2013.878684.
"Markus and Borsboom provide thought-provoking analyses of the roles played by measurement, causality and meaning in validity theory." (p. 243).
Gotch, C. M. (2014).  Frontiers of test validity theory: measurement, causation, and meaning (book review).  Journal of Educational Measurement, 51, 463–469.
"Simply put, Frontiers of Test Validity Theory is an essential volume in the library of the measurement professional or advanced graduate student who wants to approach their work with purpose and understanding." (p. 466).



Markus, K. (2021). Causal effects and counterfactual conditionals: Contrasting Rubin, Lewis and Pearl. Economics and Philosophy, 1-21. doi:10.1017/S0266267120000437

Rubin and Pearl offered approaches to causal effect estimation and Lewis and Pearl offered theories of counterfactual conditionals. Arguments offered by Pearl and his collaborators support a weak form of equivalence such that notation from the rival theory can be re-purposed to express Pearl’s theory in a way that is equivalent to Pearl’s theory expressed in its native notation. Nonetheless, the many fundamental differences between the theories rule out any stronger form of equivalence. A renewed emphasis on comparative research can help to guide applications, further develop each theory, and better understand their relative strengths and weaknesses.

Markus, K. A. (2021). Philosophical methodology and axiomatic measurement theory: A comment on Uher (2021). Journal of Theoretical and Philosophical Psychology, 41(1), 85–90.

Uher (2021) provided a valuable integrative synthesis of varied issues related to the philosophical assumptions of psychometrics viewed from the perspective of axiomatic measurement theory.  However, much of the presentation is unlikely to effectively persuade psychometricians.  Three methodological principles, internal criticism, parity and charity, can help render such work more accessible and persuasive to psychometricians.  Uher's article offers a helpful opportunity to consider how these principles might be applied.  Ultimately, however, abstract argument is likely to be less persuasive than successful concrete applications of axiomatic measurement theory.

Markus, K. A. (2020).  On epistemic violence in psychological science.  Theory and Psychology, 30, 478-482.  doi:

Markus, K. A. (2019).  Review of Measurement Theory and Applications for the Social Sciences.  Psychometrika.  84, 646-648.  doi: 10.1007/s11336-018-9637-6

Markus, K. A. (2018).  Three conceptual impediments to developing scale theory for formative scales.  Methodology, 14, 156-164.  doi: 10.1027/1614-2241/a000154

Abstract: Bollen and colleagues have advocated the use of formative scales despite the fact that formative scales lack an adequate underlying theory to guide development or validation such as that which underlies reflective scales. Three conceptual impediments impede the development of such theory: the redefinition of measurement restricted to the context of model fitting, the inscrutable notion of conceptual unity, and a systematic conflation of item scores with attributes. Setting aside these impediments opens the door to progress in developing the needed theory to support formative scale use. A broader perspective facilitates consideration of standard scale development concerns as applied to formative scales including scale development, item analysis, reliability, and item bias. While formative scales require a different pattern of emphasis, all five of the traditional sources of validity evidence apply to formative scales. Responsible use of formative scales requires greater attention to developing the requisite underlying theory.

Markus, K. A. (2017).  Review of Causal Inference in Statistics: A Primer, by Judea Pearl, Madelyn Glymour, and Nicholas P. Jewell.  Structural Equation Modeling: A Multidisciplinary Journal, 24, 636-642.  doi: 10.1080/10705511.2017.1299012

Markus, K. A. (2016). Validity bites: comments and rejoinders.  Assessment in Education: Principles, Policy & Practice, 23, 312-315.  doi: 10.1080/0969594X.2016.1156643

Markus, K. A. (2016).  Alternative vocabularies in the test validity literature.  Assessment in Education: Principles, Policy & Practice, 23, 252-267.  doi: 10.1080/0969594X.2015.1060191

Abstract:  Justification of testing practice involves moving from one state of knowledge about the test to another. Theories of test validity can (a) focus on the beginning of the process, (b) focus on the end or (c) encompass the entire process. Analyses of four case studies test and illustrate three claims: (a) restrictions on validity entail a supplement required to obtain justification from validity. (b) Rationales for restrictions assume particular contexts. (c) Claims can be translated between contrasting vocabularies. Implications for consumers of test validity theory include encouragement to focus on content instead of form and to write and read mindfully of the multiplicity of validity vocabularies. Implications for producers of test validity theory include encouragement to consider multiple reconstructions of a particular theory of test validity, clearly distinguish validity theories from validity definitions, and focus on contributing arguments that constrain possible theories rather than contributing definitions or broad frameworks.

Markus, K. A. (2016).  Consistent treatment of variables and causation poses a challenge for behavioral research methods: A commentary on Nesselroade and Molenaar (2016).  Multivariate Behavioral Research, 51, 413-418.  doi: 10.1080/00273171.2015.1081089

Markus, K. A. (2015).  Roger Ellis Millsap (1954-2014).  American Psychologist, 70, 931. doi: 10.1037/a0039885

Markus, K. A. (2014).  Unfinished Business in Clarifying Causal Measurement: Commentary on Bainter and Bollen.  Measurement, 12, 146-150.  doi: 10.1080/15366367.2014.980106

Markus, K. A. (2014). Theory, observation, and validation: Commentary on Almond, Kim, Velasquez, & Shute.  Measurement, 12, 47-50. doi: 10.1080/15366367.2014.921033

Markus, K. A. (2013).  An incremental approach to causal inference in the behavioral sciences.  Synthese (online).  DOI: 10.1007/s11229-013-0386-x.

Abstract: Causal inference plays a central role in behavioral science.  Historically, behavioral science methodologies have typically sought to infer a single causal relation.  Each of the major approaches to causal inference in the behavioral sciences follows this pattern.  Nonetheless, such approaches sometimes differ in the causal relation that they infer.  Incremental causal inference offers an alternative to this conceptualization of causal inference that divides the inference into a series of incremental steps.  Different steps infer different causal relations.  Incremental causal inference is consistent with both causal pluralism and anti-pluralism.  However, anti-pluralism places greater constraints on the possible topology of sequential inferences.  Arguments against causal inference include questioning consistency with causation as an explanatory principle, charging undue complexity, and questioning the need for it.  Arguments in favor of incremental inference include better explanation of diverse causal inferences in behavioral science, tailored causal inference, and more detailed and explicit description of causal inference.  Incremental causal inference offers a viable and potentially fruitful alternative to approaches limited to a single causal relation. (Copyright 2013 Springer Science+Business Media Dordrecht)

Markus, K. A., Loveland, J. E., Ha, D. T. & Raghavan, C. (2013).  Publication trends in intimate partner violence: Bridging the division in qualitative and quantitative methods.  In C. Raghavan & S. J. Cohen (Eds.), Domestic Violence: Methodologies in Dialog (pp. 233-253).  Boston: Northeastern University Press.

Abstract:  Both qualitative and quantitative methods contribute to research on intimate partner violence (IPV).  In this chapter we first report a tabular review of IPV research, focusing on the use of qualitative, quantitative, and mixed methods.  We then explore the conceptual basis for distinguishing between qualitative and quantitative methods and argue that in at least one important sense, all methods are mixed methods. (Copyright 2013 Northeastern University)

Borsboom, D. & Markus, K. A. (2013).  Truth and evidence in validity theory.  Journal of Educational Measurement, 50, 109-113.

Abstract:  According to Kane (this issue) “the validity of a proposed interpretation or use depends on how well the evidence supports the claims being made”. Because truth and evidence are distinct, this means that the validity of a test score interpretation could be high even though the interpretation is false. As an illustration, we discuss the case of phlogiston measurement as it existed in the 18th century. At face value, Kane’s theory would seem to imply that interpretations of phlogiston measurement were valid in the 18th century (because the evidence for them was strong), even though amounts of phlogiston do not exist and hence cannot be measured. We suggest that this neglects an important aspect of validity and suggest various ways in which Kane’s theory could meet this challenge.  (Copyright 2013 National Council on Measurement in Education)

Markus, K. A. (2013).  Correspondence without correspondence theory:  Comment on Haig and Borsboom.  Theory and Psychology, 23, 806-811.

Abstract:  Haig and Borsboom advocate for psychological science to adopt a correspondence theory of truth.  However, their argument requires a hidden premise that only correspondence theories of truth bring the benefits that they ascribe to correspondence. This premise is not plausible and their argument therefore does not support their recommendation. Additionally, considerations extraneous to Haig and Borsboom’s argument speak in favor of considering alternative theories of truth.  (Copyright 2013 Keith A. Markus)

Markus, K. A. (2013).  Theories of causality:  From antiquity to the present & The Oxford handbook of causation [book review].  Structural Equation Modeling, 20, 708-714.

Abstract:  These two books hold interest for structural equation modeling in part because methodological literature often passes over the task of explaining what is meant by causation.  Losee offers a brief, lively and to-the-point introduction and overview revolving around three key questions: (a) what sorts of things serve as causes and effects, (b) what relation holds between them, and (c) how might one assess a causal claim.  Beebee, Hitchcock and Menzies offer a comprehensive and up-to-date compendium of chapters introducing and summarizing various topics in causation.  Losee offers an ideal starting point while Beebee, Hitchcock and Menzies offer a more in-depth treatment that can be read cover to cover or used as a reference book.  (Copyright 2013 Keith A. Markus)

Kaufman, D., Codding, R. S., Markus, K. A., Tryon, G. S. & Nagler Kyse, E. (2013).  Effects of verbal and written performance feedback on treatment adherence:  Practical application of two delivery formats.  Journal of Educational and Psychological Consultation, 23, 264-299.  DOI: 10.1080/10474412.2013.845494

Abstract:  Verbal and written performance feedback for improving preschool and kindergarten teachers’ treatment integrity of behavior plans was compared using a combined multiple-baseline and multiple-treatment design across teacher–student dyads with order counterbalanced as within-series conditions. Supplemental generalized least squares regression analyses were included to evaluate significance. Maintenance of treatment integrity following termination of performance feedback was included and correspondence between treatment integrity and student behavior change was examined. Results suggested that both forms of feedback were effective for improving treatment integrity but that verbal performance feedback resulted in immediate and sustained improvements with moderate to strong correspondence with student behavior change.  (Copyright 2013 Taylor and Francis Group, LLC)

Markus, K. A. & Borsboom, D. (2012). Reflective Measurement Models, Behavior Domains, and Common Causes. New Ideas in Psychology, 31,  54-64.

Abstract: Causal theories of measurement view test items as effects of a common cause. Behavior domain theories view test item responses as behaviors sampled from a common domain. A domain score is a composite score over this domain. The question arises whether latent variables can simultaneously constitute domain scores and common causes of item scores. One argument to the contrary holds that behavior domain theory offers more effective guidance for item construction than a causal theory of measurement. A second argument appeals to the apparent circularity of taking a domain score, which is defined in terms of a domain of behaviors, as a cause of those behaviors. Both arguments require qualification and behavior domain theory seems to rely on implicit causal relationships in two respects. Three strategies permit reconciliation of the two theories: One can take a causal structure as providing the basis for a homogeneous domain. One can construct a homogeneous domain and then investigate whether a causal structure explains the homogeneity. Or, one can take the domain score as linked to an existing attribute constrained by indirect measurement. (Copyright 2011 Elsevier Ltd.)

Markus, K. A. (2012). Linear causal modeling with structural equations [book review]. Structural Equation Modeling, 19, 703-710.

Abstract:  This book provides a welcome update and compilation of Stanley Mulaik's contributions to structural equation modeling.  The book is not organized in a manner that would make it easy to use in a semester long course, but it covers a range of important topics not customarily covered in an introductory course.  Coupled with a comprehensive introductory text, the book provides sufficient background to read primary literature on structural equation modeling methods.  The book offers a useful presentation of the author's views on causation in structural equation modeling despite some faulty arguments directed at alternative perspectives.  The book also sheds new light on the tensions between realist and empiricist influences in the author's views on objectivity.  The primary audiences for the book will be those with an interest in philosophy of science issues as they apply to structural equation modeling, instructors seeking supplementary reading for courses, and those new to structural equation modeling looking for an accessible introduction to more advanced topics.  (Copyright 2012 Keith A. Markus)

Markus, K. A. & Borsboom, D. (2012). The cat came back: Evaluating arguments against psychological measurement. Theory and Psychology, 22, 452-466.

Abstract: The possibility or impossibility of quantitative measurement in psychology has important ramifications for the nature of psychology as a discipline. Trendler’s (2009) argument for the impossibility of psychological measurement suggests a general and potentially fruitful strategy for further research on this question. However, the specific argument offered by Trendler appears flawed in several respects. It seems to conflate what must hold true with what one must know and also equivocate on the necessary evidence. Moreover, if the argument supported its conclusion, it would rule out qualitative discourse on psychology as well as psychological measurement. Taking Trendler’s argument as an example, one can formulate a general structure to arguments adopting the same basic strategy. An overview of the requirements that such arguments should meet provides a metatheoretical perspective that can assist authors in constructing such arguments and readers in critically evaluating them.  (Copyright 2011 the authors)

Markus, K. A. (2012). Constructs and attributes in test validity: Reflections on Newton's account.  Measurement: Interdisciplinary Research and Perspectives, 10, 84-87.

Abstract:  Newton provides a thoughtful and valuable contribution to test validity theory.  I question the notion of an attribute as contrasted with a construct.  I question the strict requirement that a test must measure the attribute entailed by a decision for which the test is used.  I also question the rejection of degrees of validity.  (Copyright 2012 Keith A. Markus)

Markus, K. A. (2012). Principles and practice of structural equation modeling, 3rd ed. [book review].  Structural Equation Modeling, 3, 509-512.

Abstract:  The third edition of Kline's text offers a welcome update and restructuring of the text.  Much is improved from the previous version although the strain of squeezing an adequate introduction to all things SEM into a single semester course cannot be avoided.  The text continues to provide a highly articulate and accessible introduction, but not an adequate preparation for the technical literature for most readers.  (Copyright 2012 Keith A. Markus)

Markus, K. A. (2012).  Mulaik on atomism, contraposition and causation. Quality and Quantity, 46, 559–571.

Abstract: Causal inference using statistical models plays a central role in many areas of behavioral science, but the underlying metatheory of causal explanation remains poorly developed.  Mulaik's work on causation offers a useful foray into this topic. Evaluation of two negative arguments applied to a broad range of theories of causation offers overdue critical assessment of this contribution. More broadly, the critical evaluation of Mulaik's arguments speaks to the need for better integration of substantive theories and statistical models in causal research. (Copyright 2010 Springer Science+Business Media B.V.)


Markus, K. A. (2012). Review of Bias and causation: Models and judgment for valid comparison.   Journal of Educational and Behavioral Statistics, 37, 475-476.

Abstract: Weisberg offers a useful integrative treatment of approaches to causal inference across a variety of disciplines.  The book includes useful original thinking through of key issues in causal inference primarily from a potential outcomes perspective.  However, the book does not provide a non-circular analysis of causation itself.  (Copyright 2012 Keith A. Markus)

Markus, K. A. (2011). Real causes and ideal manipulations: Pearl's theory of causal inference from the point of view of psychological research methods. In P. McKay Illari, F. Russo & J. Williamson (Eds.), Causality in the sciences (pp. 240-269). Oxford, UK: Oxford University Press.

Abstract: Pearl's work on causation has helped focus new attention on the nature of causal reasoning and causal inference in behavioral science. Pearl takes an axiomatic approach, presenting axioms as first principles, but these may be better understood as boundary conditions for the application of the theory. Pearl adopts a non-eliminative but instrumental approach to causation which creates some tension with the tradition of ruling out rival hypotheses in the behavioral sciences. Finally, much causal reasoning in the behavioral sciences involves reasoning across possible worlds that differ in their causal structure, which becomes awkward within the basic architecture of Pearl's system. A neighborhood semantics approach could represent this type of reasoning more naturally. Consideration of these issues may be helpful both to behavioral scientists working to incorporate Pearl's work and also to those working outside the behavioral sciences attempting to explain causal reasoning within those sciences. (Copyright 2011 Oxford University Press)


Markus, K. A. (2010). Structural Equations and Causal Explanations: Some Challenges for Causal SEM. Structural Equation Modeling, 17, 654-676.

Abstract: One common application of structural equation modeling (SEM) involves expressing and empirically investigating causal explanations. Nonetheless, several aspects of causal explanation that have an impact on behavioral science methodology remain poorly understood. It remains unclear whether applications of SEM should attempt to provide complete explanations or partial explanations. Moreover, it remains unclear what sorts of things researchers can best take as causes and effects. Finally, the meaning of causal assertions itself remains poorly understood. Attempting to clarify the use of structural equations as causal explanations by addressing these issues has implications for behavioral science methodology because applications of SEM typically remain vague about causation and thus about their substantive conclusions. Research aimed at clarifying these issues can lead to a sharper and more refined use of SEM for causal explanation, and by extension, clarify behavioral science methodology more generally. (Copyright 2010 Taylor & Francis Group LLC)

Markus, K. A. (2010). Hunting causes and using them: Approaches in philosophy and economics [book review]. Structural Equation Modeling, 17, 535-540.

Abstract: Nancy Cartwright seeks to question the traditional division of labor between the policy analyst, social or behavioral science methodologists, and the philosopher, encouraging instead a closer collaboration between the three. A series of 13 articles explore various aspects of this central theme. The articles provide detailed thoughtful probing of a number of important issues in applied causal inference. Some of these issues reflect active areas of research and others reflect neglected issues that warrant greater attention. The collection offers little in terms of synthesis across the 13 essays, and makes few concessions to readers lacking prior familiarity with the formal notation used in examples at various points in the book. Nonetheless, the book offers a valuable compilation including four new articles that will be of interest to anyone concerned with issues of causation and causal inference in the context of public policy and the research that informs it. (Copyright 2010 Keith A. Markus)

Markus, K. A. (2010). Questions about networks, measurement and causation. Behavioral and Brain Sciences, 33, 164-165.

Abstract: Cramer et al. present a thoughtful application of network analysis to symptoms, but certain questions remain open. These questions involve the intended causal interpretation, the critique of latent variables, individual variation in causal networks, Borsboom’s idea of networks as measurement models, and how well the data support the stability of the network results. (Copyright 2010 Cambridge University Press)

Markus, K. A. & Lin, J.-Y. (2010). Construct validity. In N. Salkind (Ed.), Encyclopedia of research design (pp. 229-233). Thousand Oaks, CA: Sage Publications.

Abstract: Provides an overview of construct validity as the term is used in test validity theory.

Markus, K. A. & Smith, K. M. (2010). Content validity. In N. Salkind (Ed.), Encyclopedia of research design (pp. 238-243). Thousand Oaks, CA: Sage Publications.

Abstract: Provides an overview of content validity as the term is used in test validity theory.

Markus, K. A. & Gu, W. (2010). Bubble plots as a model free graphical tool for continuous variables. In H. D. Vinod (Ed.), Advances in social science research using R (pp. 65-94). New York: Springer.

Abstract: Researchers often wish to understand the relationship between two continuous predictors and a common continuous outcome. Many options for graphing such relationships, including conditional regression lines or 3D regression surfaces, depend on an underlying model of the data. The veridicality of the graph depends upon the veridicality of the model, and poor models can result in misleading graphs. An enhanced 2D scatterplot or bubble plot that represents values of a variable using the size of the plotted circles offers a model-free alternative. The R function bp3way() implements the bubble plot with a variety of user specifiable parameters. An empirical study demonstrates the comparability of bubble plots to other model-free plots for exploring three-way continuous data. (Copyright 2010 Springer Science + Business Media)

Markus, K. A., Hawes, S. W., & Thasites, R. J. (2008). Abductive inferences to psychological variables: Steiger's question and best explanations of psychopathy. Journal of Clinical Psychology, 64, 1069-1088.

Abstract: Abductive inference often involves inference to the best explanation. A focus on the bestness of explanations facilitates a comparative analysis of how abductive inference would differ if approached with four contrasting sets of assumptions about how scientific inference works: positivism, realism, and two kinds of pragmatism. As a thought experiment, one can imagine a situation in which competing models of psychopathy differ in parsimony and fit to the data, but produce a tie when considering both virtues in combination. The thought experiment demonstrates that Steiger's (1990) question about how best to combine competing virtues in scientific inference applies to abductive inference and that the answers depend upon other assumptions about how science works. The comparative analysis helps focus some of the issues that require clarification before abductive inference can enter the Pantheon of standard research methods in psychology. More constructively, the analysis also demonstrates that one need not accept scientific realism to accept the use of abductive inference. (Copyright 2008 Wiley Periodicals, Inc.)

Markus, K. A. (2008). Putting concepts and constructs into practice: A reply to Cervone and Caldwell, Haig, Kane, Mislevy, and Rupp. Measurement, 6, 147-154.

Abstract: The commentary has greatly enriched the discussion initiated by the three target articles. The distinction between constructs and concepts contributes to both the top-down and bottom-up aspects of the dialectic between measurement theory and practice. The distinction also illuminates the abstraction from observation to manifest variables. However, the semantic analysis of variables in terms of ordered pairs of individuals and values of variables does not seek to describe a procedure for defining constructs as part of the test development process. As such, uncertainty about population membership does not pose a pragmatic constraint on the application of the distinction between concepts and constructs. Finally, meaning and reference in the context of test development can best be understood as jointly determined by both how the world is and the nature of the vocabularies chosen to describe it. Thus, both of these factors bear on the definition of specific constructs and concepts measured by individual tests. Distinguishing constructs from concepts can help clarify and advance discourse on testing and measurement across a wide range of domains. (Copyright 2008 Taylor & Francis Group, LLC)

Markus, K. A. (2008). Constructs, concepts and the worlds of possibility: Connecting the measurement, manipulation, and meaning of variables. Measurement, 6, 54-77.

Abstract: A theoretical variable such as integrity, conscientiousness, or academic honesty may correspond to either a construct or a concept, but the standard idiom does not distinguish the two. One can describe the difference between constructs and concepts in terms of set theory. Constructs extend over actual cases, whereas concepts extend over both actual and possible cases. As such, theoretical claims made about, say, integrity as a construct differ from claims about integrity as a concept. The restriction of constructs to a specified population plays a central role in test validation and psychometric analyses aimed at distinguishing constructs from one another. The extension of concepts over possible populations plays a central role in the adoption of nonactual possibilities as goals in making efforts toward systemic change and also in the comparison of constructs across populations. The failure of the standard idiom, which conflates constructs with concepts, to provide a vocabulary that captures both population-dependent and population-independent aspects of variables recommends the modification of that idiom to distinguish constructs from concepts. This distinction suggests various changes in practice such as including the intended population in the names of constructs but not concepts. (Copyright 2008 Taylor & Francis Group, LLC)

Markus, K. A. (2008). Hypothesis formulation, model interpretation, and model equivalence: Implications of a mereological causal interpretation of structural equation models. Multivariate Behavioral Research, 43, 177-209.

Abstract: One can distinguish statistical models used in causal modeling from the causal interpretations that align them with substantive hypotheses. Causal modeling typically assumes an efficient causal interpretation of the statistical model. Causal modeling can also make use of mereological causal interpretations in which the state of the parts determines the state of the whole. This interpretation shares several properties with efficient causal interpretations but also differs in terms of other important properties. The availability of alternative causal interpretations of the same statistical models has implications for hypothesis specification, research design, causal inference, data analysis, and the interpretation of research results. (Copyright 2008 Taylor & Francis Group, LLC)

Markus, K. A. (2007). Philosophical foundations of quantitative research methodology [book review]. Structural Equation Modeling, 14, 527-533.

Abstract: Yu has not written a book that offers significant contributions to cutting-edge work on the philosophical underpinnings of quantitative methods, nor has he written a systematic survey of philosophical foundations of quantitative methods. Yu has written an accessible and engaging book that provides an excellent introductory overview for nonspecialists. Even readers with a background in these issues can appreciate the book for these latter qualities. (Copyright 2008 Taylor & Francis Group, LLC)

Markus, K. A. (2007). Making Things Happen by James Woodward [book review]. Structural Equation Modeling, 14, 170-178.

Abstract: The scholarly community recently lost two luminary contributors to the causation literature, David Lewis and Wesley Salmon, but the two remain very much present in this book by James Woodward. Woodward aligns himself with Lewis in advocating a counterfactual account of causation (explained later), whereas Salmon surfaces throughout the book as the representative of noncounterfactual accounts of causation and the prime target for criticism. As such, the book holds interest both as a contribution to the primary literature, by offering a distinct theory of causal assertion and causal explanation, and also as a contribution to the secondary literature, by illuminating the work of Lewis, Salmon, and others. (Copyright 2008 Taylor & Francis Group, LLC)

Davis, M. & Markus, K. A. (2006). Misleading cues, misplaced confidence: an analysis of deception detection patterns. American Journal of Dance Therapy, 28, 107-126.

Abstract: First, a case is made that the processes and assumptions underlying judgments of whether someone is lying during a high stakes interview may be similar to movement interpretation processes in a clinical context, and that the former is easier to research than the latter. Graduate students judged the credibility of utterances from actual criminal confessions, explained their decisions, and rated how confident they felt in each decision. Four of the items contained a conventional but invalid nonverbal cue to deception and one contained two conventional, but incorrect, cues to truth-telling. Groups of 30 judged either content only transcripts, verbatim transcripts, audio, or audio/video. Comparison of rationales, confidence level, and accuracy across modality provided evidence of which cues misled judges, how nonverbal cues modified verbal content judgments, and detection patterns that warranted further research. The implications of the results for movement observation and interpretation in dance/movement therapy are discussed. (Copyright 2006 American Dance Therapy Association)

Davis, M., Markus, K. A. & Walters, S. B. (2006). Judging the Credibility of Criminal Suspect Statements: Does Mode of Presentation Matter? Journal of Nonverbal Behavior, 30, 181-198.

Abstract: For a study of modality differences in deception detection accuracy, groups of graduate students judged segments selected from videotapes of criminal confessions. Twenty brief utterances were presented in four ways: content only transcript, verbatim transcript, audio only, and audio/videotape. No modality difference in unbiased truth hit rate was found, but unbiased lie hit rate varied by modality, with judges of transcripts stripped of pause indications, word repeats, and umms and uhhs less accurate than verbatim transcript judges, audio judges, and audio/video judges. The 62% overall accuracy and 61% lie detection accuracy of audio judges was highest and, in contrast to other judges, audio judgments did not display a response bias. The results remain consistent with the presence of valid visual cues but suggest that at least in some situations focus on valid vocal cues may offer more accuracy. (Copyright 2006 Springer Science+Business Media, Inc)

Markus, K. A. (2006). Structural Equation Modeling. In S. G. Rogelberg (Ed.), The Encyclopedia of Industrial and Organizational Psychology (pp. 773-776). Thousand Oaks, CA: Sage Publications.

Abstract: Introduces basic concepts of SEM. Statistical modeling concepts include model specification, parameter estimation, model fit, and model interpretation. Causal modeling with SEM, SEM resources, and alternatives to SEM are also discussed. (Abstract Copyright 2006 Keith A. Markus)

Markus, K. A. (2006).  Causation and counterfactuals [book review].  Structural Equation Modeling, 13, 142-151.

   Abstract:  Collins, Hall and Paul provide an outstanding resource for those interested in counterfactual theories of causation.  The introductory essay would be an ideal supplementary reading in a methodology course.  The book systematically develops current advances and problems in counterfactual theories of causation.  However, someone looking for a general overview of theories of causation would want to cast his or her net more widely.  (Copyright 2006 Keith A. Markus)

Davis, M., Markus, K. A., Walters, S. B., Vorus, N., & Connors, B. (2005).  Behavioral cues to deception vs. topic incriminating potential in criminal confessions.  Law and Human Behavior, 29, 683-704.

  Abstract:  Coding statements of criminal suspects facilitated tests of four hypotheses about differences between behavioral cues to deception and the incriminating potential (IP) of the topic.  Information from criminal investigations corroborated the veracity of 337 brief utterances from 28 videotaped confessions.  A four-point rating of topic IP measured the degree of potential threat per utterance.  Cues discriminating true vs. false comprised word/phrase repeats, speech disfluency spikes, nonverbal overdone, and protracted headshaking.  Non-lexical sounds discriminated true vs. false in the reverse direction.  Cues that distinguished IP only comprised speech speed, gesticulation amount, nonverbal animation level, soft weak vocal and "I (or we) just" qualifier.  Adding "I don't know" to an answer discriminated both IP and true vs. false.  The results supported hypotheses about differentiating deception cues from incriminating potential cues in high-stakes interviews, and suggested that extensive research on distinctions between stress-related cues and cues to deception would improve deception detection.  (Copyright 2005 American Psychology-Law Society/Division 41 of the American Psychological Association)

Markus, K. A. (2005). The Facts of Causation by D.H. Mellor [book review].  Structural Equation Modeling: A Multidisciplinary Journal, 12, 506-512.

   Abstract:  Mellor provides a readable and valuable discussion of current issues in the theory of causation.  He argues in favor of facts as causes and effects.  A number of points from his discussion have direct relevance for causal modeling.  (Copyright 2005 Keith A. Markus)

Markus, K. A. (2004).  Varieties of causal modeling:  How optimal research design varies by explanatory strategy.  In K. van Montfort, J. Oud & A. Satorra (Eds.), Recent developments on structural equation models:  Theory and applications (pp. 175-196).  Dordrecht:  Kluwer Academic Publishers.

   Abstract:  Structural equation models allow for interpretation as causal models within a variety of explanatory strategies.  Literal explanatory strategies locate causation in the process modeled whereas non-literal strategies locate causation in the theoretical description itself.  Robust strategies apply the model to possible cases as well as actual cases whereas non-robust strategies  restrict application to actual cases.  Crossing these two basic distinctions yields a fourfold explanatory strategy typology (FEST).  The four explanatory strategies differ in their implications for research design, including what makes for the best measures, what form generalization takes, what makes for a good replication, and what makes for a good extension.  The best choice of explanatory strategy may depend upon the state of research in a specific topic area.  By demonstrating a many-to-one mapping of substantive interpretations onto statistical models, the FEST illustrates syntactical equivalence, an extension of the statistical equivalence concept in which different substantive models share the same structural equation model with similar implications for inferences from data to theory.  (Copyright 2004 Keith A. Markus)

Markus, K. A. (2002).  Statistical equivalence, semantic equivalence, eliminative induction, and the Raykov-Marcoulides proof of infinite equivalence. Structural Equation Modeling, 9, 503-522.

    Abstract:  Statistically equivalent models produce the same range of moment matrices over the domain of their parameter spaces.  Raykov and Marcoulides (2001) proposed a proof that leads to the conclusion that all structural equation (SE) models with certain minimal components have infinitely many statistically equivalent models.  A variation on their proof covers an even broader class of models.  This conclusion has important implications for the application of at least one notion of eliminative induction to structural equation modeling (SEM).  Normally, assertions of statistical equivalence imply that the models differ in meaning, giving statistical equivalence its interest.  Consequently, a particular complex causal structure provides a counterexample to the proposed proof.  This counterexample suggests that a successful proof may require more detailed attention to the concept of semantic equivalence as characterized by different substantive implications.  A formal account of semantic equivalence rests on translation between SE models and a model-neutral descriptive language.  (Copyright 2002 Lawrence Erlbaum Associates, Inc.)

Markus, K. A. (2002). Beyond objectivity and subjectivity. American Psychologist, 57, 68-69.

    Abstract:  In advocating Bayesian inference as applied to Null Hypothesis Significance Tests, Krueger (2001) took for granted a conceptual framework that dichotomizes beliefs as either objective (invariant over observers) or subjective (relative to an observer).  Despite having claimed a pragmatic basis for his argument, Krueger overlooked the fact that a more thoroughgoing pragmatic approach would avoid this problematic framework altogether.  As a consequence, Krueger drew two unjustified conclusions.  He could have avoided these premature conclusions by considering beliefs as grounded in collectively accepted but continuously evolving norms for the justification of knowledge claims.  Such a view avoids the false choice between objectivity and subjectivity and thus undermines any inference from an inability to attain the former to an inability to avoid the latter.  (Copyright 2002 Keith A. Markus.)

Markus, K. A. (2001).  The converse inequality argument against tests of statistical significance.  Psychological Methods, 6, 147-160.

    Abstract:  Critics have put forth several arguments against the use of tests of statistical significance (TOSSes).  Among these, the converse inequality argument stands out but remains sketchy, as does criticism of it.  The argument states that we want P(H|D) (where H and D represent hypothesis and data, respectively), we get P(D|H), and the 2 do not equal one another.  Each of the terms in 'P(D|H) =/= P(H|D)' requires clarification.  Furthermore, the argument as a whole allows for multiple interpretations.  If the argument questions the logic of TOSSes, then defenses of TOSSes fall into 2 distinct types.  Clarification and analysis of the argument suggests more moderate conclusions than previously offered by friends and critics of TOSSes.  Furthermore, the general method of clarification through formalization may offer a way out of the current impasse.  (Copyright 2001 American Psychological Association.)
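    Illustration (not part of the abstract):  The inequality at the center of the argument can be made concrete with Bayes' theorem.  The sketch below assumes that H and D are events with nonzero probability; the numerical values are hypothetical and chosen only for arithmetic convenience.

```latex
% Bayes' theorem links the two conditional probabilities named in the abstract.
\[
  P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)},
  \qquad
  P(D) = P(D \mid H)\,P(H) + P(D \mid \neg H)\,P(\neg H).
\]
% Consequently P(H | D) = P(D | H) only if P(H) = P(D) or P(D | H) = 0.
% Hypothetical values: P(D | H) = 0.05, P(H) = 0.5, P(D | not-H) = 0.5.
\[
  P(D) = (0.05)(0.5) + (0.5)(0.5) = 0.275,
  \qquad
  P(H \mid D) = \frac{(0.05)(0.5)}{0.275} \approx 0.09 \neq 0.05 .
\]
```

    With these assumed values, the probability of the data given the hypothesis is 0.05 while the probability of the hypothesis given the data is about 0.09, so neither quantity can stand in for the other without information about P(H) and P(D).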

Fenster, A., Markus, K. A., Wiedemann, C. F., Brackett, M. A. & Fernandez, J. (2001).  Selecting tomorrow's forensic psychologists:  A fresh look at some familiar predictors. Educational and Psychological Measurement, 61, 336-348.

    Abstract:  The present study examined the use of the Graduate Record Examination (GRE-Verbal and GRE-Quantitative) and undergraduate grade point average (UGPA) to predict long-term performance in an MA program in forensic psychology.  The criterion measures were graduate grade point average (GGPA) and time to completion (TTC).  Data were available for 206 graduates.  Regression analysis indicated that a linear combination of GRE-V, GRE-Q, and UGPA correlated 0.63 with GGPA.  Predictive efficiency was reduced by only 2% of the variance when GRE subscores were combined into a total score.  The correlation with TTC was smaller (R = 0.31) but nonetheless translated into meaningful differences in student performance.  Most noteworthy, GRE scores and UGPA appear to predict better for forensic psychology than for social sciences in general.  (Copyright 2001 Sage Publications, Inc.)
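    Illustration (not part of the abstract):  Because the abstract reports multiple correlations but describes the loss of predictive efficiency in terms of variance, it may help to square the two reported values.  This is plain arithmetic on the figures given above and assumes that both values are multiple correlations R.

```latex
% Proportion of criterion variance accounted for, R^2, from the reported R.
\[
  \text{GGPA: } R = 0.63 \;\Rightarrow\; R^{2} = 0.63^{2} = 0.3969 \approx 0.40,
  \qquad
  \text{TTC: } R = 0.31 \;\Rightarrow\; R^{2} = 0.31^{2} = 0.0961 \approx 0.10 .
\]
```

    On this reading, the predictors account for roughly 40% of the variance in GGPA and roughly 10% of the variance in TTC.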

Brandt, D. E. & Markus, K. A. (2000).  Adolescent attitudes towards the police:  a new generation.  Journal of Police and Criminal Psychology, 15, 10-16.

    Abstract:  The attitudes toward the police (ATP) of a group of young inner city adolescents were investigated within the context of a program designed to teach dispute resolution skills and promote a dialogue with local police.  ATP were measured using a 23 item questionnaire.  The results indicated that while ATP were generally positive, girls held more positive ATP than boys and adolescents who reported negative experiences with the police had less favorable ATP.  A confirmatory factor analysis of the questionnaire yielded three factors:  attitudes toward police behavior, attitudes toward interaction with the police, and attitudes toward interaction with other adults.  The results are in general agreement with earlier studies with other populations and have implications for programs designed to improve adolescent relationships with the police.  (Copyright 2000 Society of Police and Criminal Psychology.)

Markus, K. A. (2000).  Twelve testable assertions about cultural dynamics and the reproduction of organizational culture.  In N. M. Ashkanasy, C. P. M. Wilderom & M. F. Peterson (Eds.),  Handbook of organizational culture and climate (pp. 297-308).  Thousand Oaks, CA:  Sage.

    Abstract:  The traditional static view of organizational culture (OC) takes the persistence of OC for granted and seeks to explain culture change.  Through this chapter I seek to engage the reader in a different kind of conversation about OC:  a dynamic view that takes change as primary and seeks to use dynamic processes to explain the persistence of OC over time.   Three types of processes occurring in organizations offer potential explanations of the day to day reproduction of OC.  Intentional processes involve meanings consciously projected from individual minds into their environment.  Unconscious processes involve meanings projected from individual minds, but done so without conscious awareness.  Discursive processes involve meanings that take shape in the communicative actions that connect individual minds and that may never have been conscious to any member of the organization practicing the culture.  Finally, to put discourse theory to good use we must remain mindful that discursive processes are material processes.  (Copyright 2000 Keith A. Markus.)

Markus, K. A. (2000).  Conceptual shell games in the four-step debate.  Structural Equation Modeling, 7, 163-173.

    Abstract:  The exchange between Hayduk and Glaser (2000) and Mulaik and Millsap (2000) sheds new light on the use of multistep procedures for testing structural equation models.  Nonetheless, the fundamental concepts of the discussion remain murky.  The notion of a correct number of constructs (interpreted latent variables) rests on a conflation of the model with the reality it models.  The articulation of what is tested in terms of model constraints encounters an analogous difficulty.  Finally, the appeal to analysis into clear and distinct ideas holds the potential to clarify some of these issues, but still awaits the necessary exposition and application to structural equation modeling.  A common thread shows itself in an over-reliance on single languages of description.  This calls for greater attention to the active engagement of multiple languages of description.  (Copyright 2000 Lawrence Erlbaum Associates, Inc.)

Markus, K. A. (1998). Science, measurement, and validity: Is completion of Samuel Messick's synthesis possible? Social Indicators Research, 45, 7-34.

    Abstract:  Messick's (1989) theory of test validity is profoundly influential (Hubley and Zumbo, 1996; Angoff, 1988) in part because it brings together disparate contributions into a unified framework for building validity arguments.  At the heart of Messick's theory lies a synthesis of realism and constructivism with respect to both scientific facts and measurement.  Within this synthesis there remains a tension between the evidential basis and the consequential basis for test interpretation and use.  This cannot be sidestepped simply by limiting the evidential basis to test interpretation and the consequential basis to test use:  Interpretation and use are not so easily held separate.  The roles of constructivism and context in Messick's theory underline the inherent link between facts and values, but the assumption that facts are objective and values are subjective goes unquestioned in Messick's theory.  The inherent link between facts and values combines with this assumption to produce the unresolved tension in Messick's theory.  This suggests that a unified theory of test validity requires a theory of value justification.  (Copyright 1998 Kluwer Academic Publishers.)

Markus, K. A. (1998).  Validity, facts, and values sans closure:  Reply to Messick, Reckase, Moss, and Zimmerman. Social Indicators Research, 45, 73-82.

    Abstract:  Three intertwined issues are woven through the comments on Markus (1998).  The  commentators disagree regarding the idea of a unitary validity for a given test interpretation.  A similar issue arises regarding a unified concept of test validity.  The third involves the possibility of a unified validity theory, a theory without internal tensions.  In suggesting that facts differ from values, but perhaps not in the way typically envisioned, I am also caught between commentators pulling me in opposing directions.  Additional arguments support the moderate position taken in the target article.  Among my concessions, however, I recognize that the target article misleadingly aligned a unified validity theory with closure in the discussion of test validity.  (Copyright 1998 Keith A. Markus.)

Markus, K. A. (1998). Psychological processes and mental states. American Psychologist, 53, 1077-1078.

    Abstract:  Kipnis (1997) offers two arguments against the use of psychological processes as a basis for the explanation of social behavior.  First, psychological processes cannot be directly measured.  Second, "scientific explanation of behavior should be anchored in societal events" (p. 208).  Kipnis's arguments conflate psychological processes with mental states.  Discursive Psychology provides a conceptualization of psychological processes as interpersonal processes rather than intrapersonal mental states.  Kipnis's arguments against mental states do not apply to this conceptualization of psychological processes.  (Copyright 1998 Keith A. Markus.)

Markus, K. A. (1998). Judging Rules. The Journal of Experimental Education, 66, 261-265.

    Abstract:  With the help of some familiar interlocutors, I have come to see that Marsh and Hau (1996) is open to more than one reading.  On my reading, the assertion that we have to use judgment in assessing model fit, as opposed to simply following rules, is an important theme of the article.  This theme is a familiar refrain.  Nevertheless, my several attempts to formulate the theme are open to criticism.  Judgment presupposes rules rather than being opposed to the following of rules.  Refining our rules is what the game is all about.  Anyone for croquet?  (Copyright 1998 Helen Dwight Reid Educational Foundation.)

Rice, R. W., Markus, K., Moyer, R. P., & McFarlin, D. B. (1991). Tests of Locke's range of affect hypothesis. Journal of Applied Social Psychology, 24, 1977-1987.

    Abstract:  Two 2 X 2 (Facet Importance X Facet Amount) factorial experiments tested Locke's (1969, 1976) hypothesis that facet importance moderates the range of affective reactions.  Written excerpts from letters, interview transcripts, and employee handbooks were used to create scenarios which manipulated the importance and current amount of two target job facets:  freedom to do things one's own way on the job, and amount of face-to-face contact with customers/clients.  As predicted, significant Facet Importance X Facet Amount interactions showed that facet satisfaction was influenced more strongly by differences in facet amount when the facet was high in importance than when it was low in importance.  Subjects in the high-importance condition, relative to subjects in the low-importance condition, were more satisfied with high facet amounts and more dissatisfied with low facet amounts.  Discussion focused on the convergence of results from research using different methods to test the range of affect hypothesis.  (Copyright 1991 V. H. Winston & Son, Inc.)



Markus, K. A. (2020).  Panelist in Validity Evidence Based on Testing Consequences moderated by Debbi Bandalos.  51st Annual Meeting of the Northeastern Educational Research Association.  14 October.  Virtual meeting. 

Markus, K. A. (2020).  Counterfactual conditionals and causal effects.  Convention of the American Psychological Association.  6 August.  Virtual meeting.

Markus, K. A. (2020).  Non-causal determination:  implications for causal explanation and causal modeling.  International Meeting of the Psychometric Society.  14 July.  Virtual meeting.

Markus, K. A. (2019).  Psychometrics' inherited ontologies:  nomological networks, causal structures, and measurement.  International Meeting of the Psychometric Society.  18 July.  Santiago, Chile.

Markus, K. A. (2018).  Formative measurement theory:  Three conceptual impediments to progress.  International Meeting of the Psychometric Society.  10 July.  New York City, NY.

Markus, K. A. (2017). Lewis, Rubin, Pearl: Compare and contrast.  International Meeting of the Psychometric Society.  18 July, Zurich, Switzerland.

Markus, K. A. (2016). Testing causal hypotheses: Precisely formulating causal hypotheses to maximize testability.  Consortium for the Advancement of Research Methods and Analysis.  7 April.  Lincoln, NE.

Markus, K. A. (2015).  Degrees of validity: some new simulation results.  International Meeting of the Psychometric Society.  13 July 2015.  Beijing, China.

*Markus, K. A. (2014).  Problems and pseudo-problems in test validity theory. 5 April.  Philadelphia, PA.

Markus, K. A. (2014).  Test validity: Looking back, looking forward, and a closer look at some current work.  James Madison University.  26 September.  Harrisonburg, VA.

Markus, K. A., *Jeglic, E. L., & *Calkins, C. (2014).  MnSOST-R differential item functioning: White, Black and Latino male sex offenders.  International Conference on the Rule of Law in an Era of Change: Security, Social Justice and Inclusive Governance.  12 June.  Athens, Greece.

Markus, K. A. (2014).  Test Development and Validation Viewed as Mixed Methods Research.  A Critical, Pluralist and Mixed-methods History of Measurement and Evaluation in Psychology.  Symposium organized by Valerie Futch.  American Psychological Association.  August 2014, Washington D.C.

Markus, K. A. & Borsboom, D. (2014).  A model of test score interpretation.  American Psychological Association.  August 2014, Washington D.C.

Markus, K. A., Jeglic, E. L., & Calkins, C.  (2014).  Fairness and race in sex offender assessment:  differential item functioning in the MnSOST-R.  The Rule of Law in an Era of Change: Security, Social Justice and Inclusive Governance.  June, 2014, Athens, Greece.

Markus, K. A. (2014).  Test Validity: Looking Back, Looking Forward, and a Closer Look at Some Current Work.  James Madison University.  September 2014, video-conferenced symposium.

Markus, K. A. (2014).  Problems and Pseudo-problems in test validity theory.  What is the best way to use the term 'validity'? Symposium organized by Paul E. Newton.  National Council on Measurement in Education, April 5, Philadelphia, PA.

Markus, K. A. (2013). Test validity: Looking back and looking forward.  CUNY Graduate Center, 7 November 2013.

Markus, K. A. (2013).  Implications of different forms of explanation for the theory and application of logic models.  American Evaluation Association.  16 October, Washington, DC.

Markus, K. A. & Satuliri, S. (2013).  Sensitivity to base rates of alternative distance metrics in SSA and MDS.  14th International Facet Theory Conference.  21 August, Olinda, PE, Brazil.

Markus, K. A. (2013).  Comparing two accounts of degrees of validity:  Deductive strength versus belief centrality.  National Council on Measurement in Education.  29 April, San Francisco, CA.

Markus, K. A. (2012).  All causation is not created equal:  Theoretical and methodological considerations for causal research.  Invited talk, Columbia University, October 2, 2012.

Markus, K. A. (2011). Test validity: Looking back and looking forward. ETS. 7 October 2011. 

Markus, K. A. (2011). R in an hour. Forensic psychology doctoral program, John Jay College of Criminal Justice, CUNY. 23 September 2011.

Markus, K. A. (2011). Pearl's theory of causal reasoning from the perspective of psychological research methods. Convention of the American Psychological Association. 5 August 2011. Washington, DC.

Markus, K. A. (2011). Parallelism, ergodicity, and psychological explanations. International Meeting of the Psychometric Society. 20 July 2011. Hong Kong.

Markus, K. A. (2011). State of the Art Talk: Test validity: Looking back and looking forward. International Meeting of the Psychometric Society. 19 July 2011. Hong Kong.

Markus, K. A. (2011). Score interpretations: The Goldilocks Model. International Meeting of the Psychometric Society. 19 July 2011. Hong Kong.

Markus, K. A. (2010). Measurement, causation and test validity: Theoretical puzzles and practical problems. Lehman College, CUNY. October 4, 2010. Bronx, NY.

Markus, K. A. (2010). Measurement, causation, and test validity: Theoretical puzzles and practical problems. Fordham University. September 22, 2010. Bronx, NY.

Markus, K. A. (2010). Measurement, causation, and test validity: Theoretical puzzles and practical problems. Doctoral program in social personality, CUNY Graduate Center. September 15, 2010. Manhattan, NY.

Markus, K. A. (2010). Behavioral science research ethics: Tensions between individual good and common good. Paper presented to Societies in transition: Balancing security, social justice, and tradition conference, June 4 2010 as part of a symposium entitled "Ethics in criminal justice research: Balancing needs for research with ethical responsibilities" organized by Martin Wallenstein. Marrakech, Morocco.

Markus, K. A. (2009). All methods are mixed methods. Paper presented to the convention of the American Psychological Association, August 8, 2009, as part of a session entitled "Role of Mixed Methods in Psychological Research" and organized by Gwyneth M. Boodoo. Toronto, Canada.

Markus, K. A. (2009). How can validity come in degrees? Paper presented at the 2009 International Meeting of the Psychometric Society. 21 July 2009, Cambridge, UK.

Markus, K. A. & Gu, W. (2009). Bubble plots as a model-free graphical tool for three continuous variables and a flexible R function to plot them. Conference on quantitative social science research using R. 18 June 2009, Fordham University, New York.

Markus, K. A., Hawes, S. W. & Thasites, R. J. (2008, August 16). Abductive Inferences to Psychological Variables: Weighting Competing Criteria. Poster presented to the Convention of the American Psychological Association, Boston MA.

Markus, K. A. (2008, July 30). Construct validity and causal modeling. Invited paper presented to the Psychology Department, University of Canterbury, Christchurch, NZ.

Hawes, S. W. & Markus, K. A. (August, 2007). Evaluating the Construct Validity of the PCL-R. Poster accepted to the 115th Annual Conference of the American Psychological Association. San Francisco, CA.

Chajewski, M. & Markus, K. A. (August, 2007). Biases in social influence reporting: Social desirability as deception. Poster accepted to the 115th Annual Conference of the American Psychological Association. San Francisco, CA.

Chajewski, M., & Markus, K. A. (June, 2007). The psychometric evaluation of the Sexual Offender Need Assessment Rating-Self Report (SONAR-SR). Paper accepted to the 68th Canadian Psychological Association Annual Convention. Ottawa, Canada.

Chajewski, M. & Markus, Keith A. (March, 2007). Detecting effects of social desirability in the Sexual Offender Need Assessment Rating-Self Report (SONAR-SR). Poster presented to the Eastern Psychological Association Conference. Philadelphia, PA.

Markus, K. A. & Perillo, A. D. (2007, March). Judges as Psycho-technocrats: Three Challenges Facing Fondacaro’s Ecological Jurisprudence. Paper presented as part of the symposium entitled Comments on Fondacaro’s Ecological Jurisprudence to the conference entitled Off The Witness Stand: Using Psychology in the Practice of Justice. New York, NY.

Markus, K. A. (2007, August). Using Longitudinal Data to Distinguish Distinct Causal Explanations. Poster accepted for presentation to the Convention of the American Psychological Association. San Francisco, CA.

Markus, K. A. (2007, July). A Thrice Parameterized Tale: Interpreting Quadratic Models. Paper presented to the 2007 International Meeting of the Psychometric Society. Tokyo, Japan.

Markus, K. A. (2006, November 2). Cautions on casual causal analysis: The perspective of causal pluralism. (Inaugural lecture in the Fordham Council on Applied Psychometrics Brown Bag Lunch Lecture Series, New York, NY).

Markus, K. A. (2006, July 9).  Causation, Generalization and Deductive Strength:  A Discussion of Stephen G. West’s "Rubin's and Campbell's perspectives on causality."  Invited symposium on Causality in Experiments and Quasi-Experiments.  Jena, Germany.

Chajewski, M., & Markus, K. (November, 2006). A self-report version of the Sexual Offender Need Assessment Rating (SONAR). Paper presented to the 2006 American Society of Criminology Conference. Los Angeles, CA.

Daftary, T., Kirschner, S., Markus, K. A., Broadwater, A. (2006).  Mock Juror Verdict Selections: Effect of Media and the Guilty but Mentally Ill Verdict Option. (Paper accepted for presentation at the American Society of Criminology Conference, Los Angeles, CA).

Markus, K. A. (2006, August 10).  Two Fallacies About Causation:  Atomism and Contraposition.  Poster presented at the 114th Annual Convention of the American Psychological Association.  New Orleans, LA.

Markus, K. A. (2006, August 10).  Participant in Statistical Consulting Workshop at the 114th Annual Convention of the American Psychological Association.  New Orleans, LA.

Markus, K. A. (2006, June 15).  Causation de novo versus causation in sequence:  Implications for potential-response theory applied to linear causal models.  Paper presented at the 2006 International Meeting of the Psychometric Society.  Montreal, CA.

Markus, K. A., Sothmann, F. C., & Raghavan, C. (2006, March 3).  Kinds of Intimate Partner Violence:  Applying Latent Class Analysis to the Revised Conflict Tactics Scale.  Paper presented to the American Psychology and Law Society.

Markus, K. A. (2005, July 13).  Distinguishing constructs from concepts:  A semantic approach.  University of Amsterdam, invited address.

Markus, K. A. (2005, July 8).  Concepts, constructs and Psychological Measurement.  Paper presented at the 2005 International Meeting of the Psychometric Society.

Markus, K. A. (2005, August 20).  Profile plots for clustered data:  multiple error bars.  Poster presented to the 113th Annual Convention of the American Psychological Association.

Markus, K. A. (2004, July 30).  The case for pluralism about causation.  Paper presented to the 112th Annual Convention of the American Psychological Association as part of a symposium entitled Understanding and modeling causation organized by Bruce Brown.

Markus, K. A. & Davis, M. (2004, May 3).  Toward a typology of deception:  deception rates for response types.  John Jay College of Criminal Justice, CUNY.  New York, NY.

Markus, K. A. (2003, August 7).  Varieties of causal modeling:  How explanatory strategy affects research design.  Poster presented to the 111th Annual Convention of the American Psychological Association, Toronto, CA.

Markus, K. A. (2002, December 9).   Varieties of causal modeling:  How explanatory strategy affects research design.  Paper presented to the Doctoral Program in Psychology, Industrial and Organizational Subprogram, Baruch College, CUNY.  New York, NY.

Markus, K. A, Davis, M. & Walters, S. B. (2002, August 23).  Toward a psychological typology of deception:  latent class cluster analysis.  Poster presented to the 110th Annual Convention of the American Psychological Association, Chicago, IL.

Markus, K. A. (2002, June 23).  Semantically Equivalent Structural Equation Models:  Definition, Examples, and Theoretical Significance.  North American Meeting of the Psychometric Society, Chapel Hill, NC.

Markus, K. A. (2002, April 18).  A formal-causal interpretation of structural equation models.  Expanded paper presented to the Doctoral Program in Educational Psychology, CUNY Graduate Center.  New York, NY.

Fenster, A., Markus, K. A., & Palmer, J. (2002, January 4).  Predicting diagnostic skills from measures of academic knowledge of psychopathology.  Poster presented to the 24th Annual Conference of the National Institute on the Teaching of Psychology, St. Petersburg Beach, FL.

Markus, K. A. (2001, August 27).  Activity theories of causality and latent causes.  Poster presented to the 109th Annual Convention of the American Psychological Association, San Francisco, CA.

Markus, K. A. (2001, July 16).  A formal-causal interpretation of structural equation models.  Paper presented to the International Meeting of the Psychometric Society, Osaka, Japan.

Markus, K. A. (2001, June 22).  Eliminative Induction and The Raykov-Marcoulides Proof of Infinite Equivalence.  Paper presented to the North American Meeting of the Psychometric Society, Valley Forge, PA.

Markus, K. A. (2000, August 4).  Defeasibility and degrees of support in tests of statistical significance.  Poster presented to the 108th Annual Convention of the American Psychological Association, Washington DC.

Davis, M., Walters, S. B., Vorus, N., Meiland, P. A., & Markus, K. A. (2000, August 4).  Verbal and nonverbal cues to false testimony in criminal investigations.  Poster presented to the 108th Annual Convention of the American Psychological Association, Washington DC.

Markus, K. A. (2000, July 6).  Making true assertions from false models:  Addressing a central paradox in structural equation modeling.  Paper presented at the 2000 Annual Meeting of the Psychometric Society.  Vancouver, British Columbia.

Brandt, D. E. & Markus, K. A. (2000, March 25). An examination of adolescent attitudes towards the police:  A new generation.  Paper presented to the Annual Conference of the Eastern Psychological Association, Baltimore, MD.

Markus, K. A. (1999, August 23).  Concepts versus constructs in theories of integrity.  Paper presented to the 107th Annual Convention of the American Psychological Association, Boston, MA.

Markus, K. A. (1999, August 21).  Introduction to the NHST problem.  Paper presented to the 107th Annual Convention of the American Psychological Association, Boston, MA.

Markus, K. A., Fenster, A., Wiedemann, C. F., Brackett, M. A., & Fernandez, J. (1999, August 20).  Selecting tomorrow's forensic psychologists:  GRE-V, GRE-Q, and undergraduate GPA.  Poster presented to the 107th Annual Convention of the American Psychological Association, Boston, MA.

Markus, K. A. (1999, June 25).  Compare and contrast Pearl and Markus on the semantics of structural equation models.    Paper presented at the 1999 Annual Meeting of the Psychometric Society, Lawrence, Kansas.

Markus, K. A. (1999, March 23).  The new Standards for Educational and Psychological Testing.  Presentation sponsored by Forensic Psychology MA program at John Jay College of Criminal Justice, New York, NY.

Markus, K. A. (1998, August 14).  Science, measurement, and validity:  Is  completion of Messick's synthesis possible?  Poster presented to the 106th Annual Convention of the American Psychological Association, San Francisco, CA.

Markus, K. A.  (1998, August 16). Discussant for Innovations in program and test evaluation.  Paper session during the 106th Annual Convention of the American Psychological Association, San Francisco, CA.

Markus, K. A. (1998, June 19).  Fit functions as selection functions:  Toward a Possible-World Semantics of Structural Equation Modeling.  Paper presented at the 1998 Joint Annual Meeting of the Classification Society of North America and the Psychometric Society, Urbana, Illinois.

Markus, K. A. (1997, August 18).  The converse inequality argument against tests of statistical significance.  Poster presented at 105th Annual Convention of the American Psychological Association, Chicago, IL.

Markus, K. A. (1997, May 31).  Discursive processes in organizational membership.  Paper presented at The Fifth A. F. Jacobson Symposium in Communications.  Omaha, NE.

Markus, K. A. (1996, May 23).  Is Measurement Invariance sufficient evidence of invariance in measurement?  Pace University, Department of Psychology, New York, NY.

Markus, K. A. (1993, November 11). Comment:  Framing self and other in deconstructive versus dialogic reading.  New School for Social Research, Taiwan Study Group, New York, NY.

Markus, K. A. (1993, March 24). Subject, culture, text.  New School for Social Research, Taiwan Study Group, New York, NY.

Markus, K. A. (1992, October 16). The distinctiveness of the Facet Importance and Wanted Amount job satisfaction constructs.  Baruch College, Organization and Policy Studies/Industrial and Organizational Psychology Student Research Conference, New York, NY.

Rice, R.W., Markus, K., Moyer, R.P. & McFarlin, D.B. (1990, April). Tests of Locke's range of affect hypothesis.  Poster presented at the annual meeting of the Society for Industrial and Organizational Psychology, Miami, FL.

Markus, K. (1988, May 7). Concepts of causality and their implications for the theory and practice of psychological science. Sixteenth Annual Hunter College Psychology Convention, New York, NY.

Markus, K., Rice, R.W., & McFarlin, D.B. (1988, April 22). Causal relations among the determinants of Facet Satisfaction.  Poster presented at the annual meeting of the Eastern Psychological Association, Buffalo, NY.

Contact Information
Keith A. Markus, Ph.D.


Psychology Department
John Jay College of Criminal Justice
The City University of New York
524 West 59th Street
New York, NY 10019

Phone:  (212) 237-8784
Fax:  (212) 237-8742

  Address reprint requests to either the email or paper mail address.  Due to the American Psychological Association's position on electronic publication, I do not make papers downloadable through this web site.  If you copy an abstract, please retain the copyright information.

Biographical Information

I am currently a member of the Psychology Department at John Jay College of Criminal Justice.  I also serve on the graduate faculty of the CUNY Graduate Center in both the Industrial and Organizational Psychology and Forensic Psychology subprograms within the Psychology Doctoral Program, in the Quantitative Psychology subprogram of the Educational Psychology Doctoral Program, and in the Criminal Justice Doctoral Program.  I teach or have taught research methods and quantitative methods courses at the undergraduate, master's, and doctoral levels, and my syllabi are available on this web page.  I completed my Ph.D. in Industrial and Organizational Psychology at The City University of New York in January 1996.  I have previously taught Cognitive Psychology and Theories of Motivation at Baruch College.

My methodological interests include test validity, causation, structural equation modeling, latent class analysis, discourse analysis, psychometrics and test development, program evaluation, research design, and the logic of statistical significance testing.  My methodological interests center around the justification of inferences drawn from social science research.

My substantive research interests include intimate partner violence/domestic violence (IPV/DV), deception detection, and discourse processes in organizations as they relate to organizational socialization, organizational culture, organizational membership, organizational communication, and other aspects of organization.  I also have an interest in the logic of conditionals that informs my methodological work on causal inference, statistical inference, and test validation.

I also have general interests in ethical theory and in philosophy of language.  The former relates to my interests in protection of human subjects and also to the role of values in test validation, program evaluation, and scientific inference.  The latter relates to my interests in logic and also in discourse analysis.

I belong to the American Psychological Association, the Academy of Management, the Society for Industrial and Organizational Psychology, the Society for Philosophy and Psychology, the American Educational Research Association, the Psychometric Society, the National Council on Measurement in Education, the American Evaluation Association, and the Philosophy of Science Association.

Criminal Justice Statistics Links
    A number of different agencies make criminal justice statistics about the United States available online.  The Bureau of Justice Statistics Web site contains a wealth of summary statistics, complete data sets, research reports, research grants, and other information.  The Sourcebook of Criminal Justice Statistics offers mostly tables and summary statistics.  Fedstats: One Stop Shopping for Federal Statistics contains a wealth of information, but it is not always easy to find what you are looking for.  To access more Criminal Justice links (from library home page) click and scroll down for statistics links.
    Although not restricted to Criminal Justice, a very large number of data sets are available from the Inter-university Consortium for Political and Social Research.  There is also data available from the Current Population Survey.

Consulting Services

  I have largely curtailed my consulting activities.  If you need a structural equation modeling consultant or a test construction consultant, please email me with a description of the project, your needs, and the estimated time frame.

Migrated October 31, 2008
Migrated 17 August 2011
Updated 16 February 2013, 14 January 2014, 14 April 2014, 16 September 2016, March 17 2021, March 19 2021