Keith Markus' Urban Sprawl:

  PSYC U76000 JJ / EPSY U73000 GC
Psychometric Methods / Introduction to Psychometrics

Course Information

Spring 2011

EPSY U73000 GC:
Tuesday 4:15-6:15 PM
GSUC 6417

PSYC U76000 JJ:
  Wednesday 6:30-8:30 PM
Room:  2437 North Hall, John Jay College

Contact Information:
Professor Keith A. Markus  (This is the best way to contact me.)
212-237-8784 (Email will generally reach me before voice mail.)
Room 2127N (North Hall, John Jay College)
Psychology Department, John Jay College of Criminal Justice, 445 W 59th Street, New York, NY 10019 USA

Office Hours:
  Tuesday 3 PM to 4 PM (GC) & Wednesday 5 PM to 6 PM (JJ).

Course Description:  The course offers a general introduction to psychometric methods primarily emphasizing classical test theory, test construction and validation, and test use.  The emphasis lies with developing a firm understanding of basic psychometric concepts.  This course lays a foundation for more advanced courses in specific topics introduced here. The course understands psychometrics and testing as applying broadly, not just to paper and pencil tests but also to performance assessments, behavioral observations, measured variables in experiments and quasi-experiments, surveys, and other forms of behavioral data collection. However, much of the material will emphasize measurement involving multiple indicators of a common construct.

Course Objectives: The course assumes a foundation in basic statistics and a healthy curiosity but little more. The more you put into the course, the more you will get out of the course. The course design reflects the following objectives.
1. Students will gain a basic understanding of the foundations of test theory that will prepare them to pursue more advanced topics (e.g., item response theory, structural equation modeling).
2. Students will gain the background and confidence to critically read technical manuals and other documentation in conjunction with use of published tests.
3. Students will gain facility with conceptual tools for thinking through issues of validity and reliability as applied to all measures from dependent variables in experiments to large scale testing programs.
4. Students will gain a level of comfort with algebraic representations of test scores and the use of these to think through applied problems related to test use and interpretation.
5. Students will gain an increased sensitivity to the fallibility of educational and psychological tests and the limits to their use and interpretation.
6. Students will gain exposure to the use of statistical software for conducting psychometric analyses and some experience with such analyses.
7. Students who choose to take advantage of the opportunity will leave the course with additional experience using various software packages for psychometric data analysis.

Text Book:
Crocker, L. & Algina, J. (2006). Introduction to classical and modern test theory.  Mason, OH: Cengage Learning. (Originally published 1986.)

Additional Reading:
American Educational Research Association, American Psychological Association & National Council on Measurement in Education (1999).  Standards for educational and psychological testing.  Washington, DC:  AERA. (Note: This edition will be replaced by a new edition in the near future.)

Byrne, B. M. (2005). Factor analytic models: Viewing the structure of an assessment instrument from three perspectives. Journal of Personality Assessment, 85, 17-32.

Gregorich, S. E. (2006). Do self-report instruments allow meaningful comparisons across diverse population groups? Testing measurement invariance using the confirmatory factor analysis framework. Medical Care, 44, S78-S94.

Hubley, A. M. & Zumbo, B. D. (1996). A dialectic on validity: Where we have been and where we are going. Journal of General Psychology, 123, 207-215.

Kane, M.T. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527-535.

Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741-749.

Reise, S. P., & Haviland, M. G.  (2005). Item response theory and the measurement of clinical change. Journal of Personality Assessment, 84, 228-238.

Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach's alpha. Psychometrika, 74, 107-120.

Streiner, D. L. (2003). Starting at the beginning: An introduction to coefficient alpha and internal consistency. Journal of Personality Assessment, 80, 99-103.

Zumbo, B.D. (2007). Three generations of differential item functioning (DIF) analyses: Considering where it has been, where it is now, and where it is going.  Language Assessment Quarterly, 4, 223-233. (reprint at

Books are available through the GC online bookshop: Ordering through the GC online bookshop helps to support the Mina Rees Library.

Course Flow:
  Familiarize yourself with the reading material before the corresponding lecture.  Lectures will summarize and clarify the reading.  In general, I would rather answer your questions than lecture. I will use class time to illustrate and amplify particularly tricky points based on past experience. I will not focus on unimportant material in class, but there will be some important material that I do not focus on in class.

I will illustrate psychometric concepts using a variety of software packages.  Familiarity with the software is not a course requirement. However, learning psychometrics simply by reading about it is akin to learning to swim, ski, or play a musical instrument simply by reading about it. Actual practice is a much more effective method. Whether you use a simple calculator, a spreadsheet, or advanced statistical software, it is a good habit to play around with the material by constructing concrete examples and taking a try-and-see attitude toward the material. If something seems puzzling, make up an example and try it out. If something seems counter-intuitive to you, try to construct a counterexample. The more concrete you make psychometrics, the more comfortable you will feel with the material, the better you will understand it, and the more skills you will develop that you can apply outside of the class. None of this is required for the course, but it will make it more fun, more interesting, and more practical.

Examinations:  The examinations will not be cumulative but later material will always presuppose a familiarity with prior material.  You are allowed one two-sided 8.5 x 11 inch hand-written page of notes and a calculator to be used during each examination.  Examinations will emphasize your ability to reason using psychometric principles studied in the course.  Although examinations will not emphasize computations, they will require some computation.  Exercises in the text book offer the best test preparation.

Homework:  Come prepared to turn in homework assignments at the beginning of class.  Given that the homework comes due before the corresponding lecture (and the fact that you can look up the answers if you get stuck), I will grade more on completeness than accuracy.  The assignments primarily serve the purpose of allowing you to test your understanding of the reading before the lecture and thus better recognize where you have questions about the material.

Course Project: Write a proposal for the validation of a test of your construction following the format below. Double space the proposal and use APA format and style. However, printing on both sides of the page is allowed to save paper.

A. Title page including your name and affiliation.
B. Abstract (180 words max).
C. Purpose of the test (250 words max). Describe the intended use of the test. Describe the intended users and the intended test taking population. Explain what the test would contribute over and above existing tests. Describe the theoretical rationale behind the test.
D. Test Blueprint (500 words max).
  1. Define the constructs to be assessed by the test. Your test should include at least two constructs and at least six items per construct. Describe the relationship(s) between the constructs, conceptually and statistically.
  2. Specify the format of the items and response options.
  3. Specify the content of the items. If a scale on your test includes more than one kind of item, specify the number of items of each type.
  4. Specify the acceptable range of item statistics (mean or proportion correct, standard deviation) for each item and test statistics (mean, standard deviation, reliability) for each subscore.
E. Draft test. Provide a draft version of the test including instructions and a full set of items that conform to parts 1-3 of the test blueprint.
F. Proposed validation plan (750 words max). Describe five validation studies for your test (one paragraph each). Design one study for each of the five main sources of test validity evidence listed in the Standards (content, response processes, internal structure, relationships with other variables, and consequences of test use). Explain the rationale behind the intended interpretation of the test and how each study tests an assumption of that rationale (validity argument).
G. Factor model (250 words max). Download the SPSS code for simulating item response data from an assumed factor model. Enter plausible values for the item parameters (loadings, error variances, intercepts, factor correlation[s]). If necessary, tweak your values until your items satisfy part 4 of the test blueprint. Report the final set of values in a table. In the text, describe your general interpretation of the resulting factor model. Include your interpretation of each factor, a description of which items assess which factors, and a general description of the strengths and weaknesses of the item set.
H. Pilot study (500 words max). Report this as if you had completed an empirical pilot study, but use the simulated data from part G.
  1. Conduct an item analysis of the data set. Report the item statistics (means, standard deviations, item total correlations and regression R-square values). Describe how differences between item statistics relate to differences between item parameters in the factor model.
  2. Report the scale statistics (means, standard deviations). Compare and contrast the scale statistics for each scale.
  3. Report both alpha and lambda 2 reliability estimates. Describe the alphas-if-item-deleted.
  4. Relate the results from parts H1 to H3 to the test blueprint. Provide an overall evaluation of the functioning of the draft test based on these results.
J. Appendices: Include the SPSS syntax used for your simulation as Appendix A, and the SPSS output as Appendix B.
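
For students who want to try parts G and H outside of SPSS before running the course's SPSS code, the following is a minimal Python sketch of the same general idea. It is not the course's simulation code. The function names, the parameter values, the sample size of 500, and the choice of continuous normally distributed item scores are all assumptions made for illustration. It simulates two correlated factors with six items each, then computes item statistics, coefficient alpha, Guttman's lambda 2, and alpha-if-item-deleted for the first scale.

```python
import numpy as np


def simulate_items(loadings, error_vars, intercepts, factor_corr, n, seed=0):
    """Draw n simulated respondents from a linear common factor model."""
    rng = np.random.default_rng(seed)
    loadings = np.asarray(loadings, dtype=float)
    n_items, n_factors = loadings.shape
    factors = rng.multivariate_normal(np.zeros(n_factors), factor_corr, size=n)
    errors = rng.normal(0.0, np.sqrt(error_vars), size=(n, n_items))
    return np.asarray(intercepts) + factors @ loadings.T + errors


def cronbach_alpha(cov):
    """Coefficient alpha from an item covariance matrix."""
    k = cov.shape[0]
    return k / (k - 1) * (1 - np.trace(cov) / cov.sum())


def guttman_lambda2(cov):
    """Guttman's lambda 2 from an item covariance matrix."""
    k = cov.shape[0]
    off = cov - np.diag(np.diag(cov))  # off-diagonal covariances only
    return (off.sum() + np.sqrt(k / (k - 1) * (off ** 2).sum())) / cov.sum()


def item_analysis(scores):
    """Per-item mean, SD, corrected item-total r, R-square, alpha-if-deleted."""
    k = scores.shape[1]
    cov = np.cov(scores, rowvar=False)
    # R-square from regressing each item on the remaining items in the scale.
    r_square = 1 - 1 / np.diag(np.linalg.inv(np.corrcoef(scores, rowvar=False)))
    total = scores.sum(axis=1)
    rows = []
    for i in range(k):
        keep = [j for j in range(k) if j != i]
        rows.append({
            "mean": scores[:, i].mean(),
            "sd": scores[:, i].std(ddof=1),
            "item_rest_r": np.corrcoef(scores[:, i], total - scores[:, i])[0, 1],
            "r_square": r_square[i],
            "alpha_if_deleted": cronbach_alpha(cov[np.ix_(keep, keep)]),
        })
    return rows


# Hypothetical parameter values, chosen only for illustration.
loadings = np.zeros((12, 2))
loadings[:6, 0] = [0.8, 0.7, 0.7, 0.6, 0.6, 0.5]   # items 1-6 load on factor 1
loadings[6:, 1] = [0.8, 0.7, 0.7, 0.6, 0.6, 0.5]   # items 7-12 load on factor 2
error_vars = 1.0 - (loadings ** 2).sum(axis=1)     # gives unit item variances
intercepts = np.full(12, 3.0)
factor_corr = np.array([[1.0, 0.4], [0.4, 1.0]])

scores = simulate_items(loadings, error_vars, intercepts, factor_corr, n=500)
scale1 = scores[:, :6]
cov1 = np.cov(scale1, rowvar=False)
alpha1 = cronbach_alpha(cov1)
lambda2_1 = guttman_lambda2(cov1)
table1 = item_analysis(scale1)

print(f"Scale 1: alpha = {alpha1:.3f}, lambda 2 = {lambda2_1:.3f}")
for number, row in enumerate(table1, start=1):
    print(number, {name: round(float(value), 3) for name, value in row.items()})
```

Changing a single loading or error variance and rerunning shows how the item statistics respond to the item parameters, which is the comparison part H1 asks for. Lambda 2 is never smaller than alpha for the same covariance matrix, so a lambda 2 below alpha would signal a coding error.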

Convert your paper to portable document format (PDF), and turn in both a hard copy and a PDF file. (If you cannot save directly to PDF, download a free PDF print driver such as Cute PDF.) Note the due dates on the course schedule. Proposals will be graded using the following rubric.

Completeness (50% of grade, 13 points total)
A&B = 1 point
C = 1 point
D = 2 points. Each of four sections = .5 points.
E = 1 point.
F = 3 points.
G = 1 point.
H = 2 points. Each of four sections = .5 points.
J = 2 points.

Overall quality dimensions (50% of grade, 40 points total)
Clarity of presentation (1 - 10)
Technical accuracy of reporting (1 - 10)
Depth with which issues are presented within allowed space (1 - 10)
Overall conceptualization and design of proposed test and test development (1 - 10)
1-5 = unsatisfactory.
6 = minimally satisfactory.
7 = some significant weaknesses.
8 = generally good with a few weak points.
9 = overall very well done.
10 = outstanding effort.
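
As a worked illustration of how the two halves of the rubric might combine, here is a short Python sketch. The syllabus states only the weights and point totals, so the linear combination below is an assumed reading of the rubric, and the function name and example scores are hypothetical.

```python
def project_percent(completeness_points, quality_ratings):
    """Combine the two rubric halves into a 0-100 project score.

    Assumes each half is scaled to 50 points: completeness out of 13,
    quality as four ratings of 1-10 summed out of 40.
    """
    if not 0 <= completeness_points <= 13:
        raise ValueError("completeness points must be between 0 and 13")
    if len(quality_ratings) != 4:
        raise ValueError("expected four quality ratings")
    if any(not 1 <= rating <= 10 for rating in quality_ratings):
        raise ValueError("each quality rating must be between 1 and 10")
    return 50 * completeness_points / 13 + 50 * sum(quality_ratings) / 40


# Hypothetical proposal: 12 of 13 completeness points, ratings of 8, 9, 7, 8.
example_project = project_percent(12, [8, 9, 7, 8])
print(round(example_project, 1))
```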

Grading:  Each of the two examinations is worth 25% of your total grade.  The course project is worth another 30%. That leaves 20% for the homework assignments.  Letter grades will be assigned as indicated below.

Letter Grade
Percent Grade
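
The percent grade follows from the weights given under Grading. A minimal Python sketch, assuming every component is first expressed on a 0-100 scale (the function name and the example scores are hypothetical):

```python
def course_percent(midterm, final, project, homework):
    """Weighted course percent: two exams at 25% each, project 30%, homework 20%."""
    return 0.25 * midterm + 0.25 * final + 0.30 * project + 0.20 * homework


# Hypothetical component scores, each on a 0-100 scale.
example_course = course_percent(midterm=80, final=90, project=85, homework=100)
print(round(example_course, 1))
```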

Special Needs:
To request accommodations please contact the Office of the Vice President for Student Affairs (Room 7301 Graduate Center; (212) 817-7400). Information about accommodations can be found in the Graduate Center Student Handbook 05-06 (pp. 51-52).

Academic Honesty:    
The Graduate Center of The City University of New York is committed to the highest standards of academic honesty. Acts of academic dishonesty include—but are not limited to—plagiarism, (in drafts, outlines, and examinations, as well as final papers), cheating, bribery, academic fraud, sabotage of research materials, the sale of academic papers, and the falsification of records. An individual who engages in these or related activities or who knowingly aids another who engages in them is acting in an academically dishonest manner and will be subject to disciplinary action in accordance with the bylaws and procedures of The Graduate Center and the Board of Trustees of The City University of New York.  

Each member of the academic community is expected to give full, fair, and formal credit to any and all sources that have contributed to the formulation of ideas, methods, interpretations, and findings. The absence of such formal credit is an affirmation representing that the work is fully the writer’s. The term “sources” includes, but is not limited to, published or unpublished materials, lectures and lecture notes, computer programs, mathematical and other symbolic formulations, course papers, examinations, theses, dissertations, and comments offered in class or informal discussions, and includes electronic media. The representation that such work of another person is the writer’s own is plagiarism.

Care must be taken to document the source of any ideas or arguments. If the actual words of a source are used, they must appear within quotation marks. In cases that are unclear, the writer must take due care to avoid plagiarism.

The source should be cited whenever:
(a) a text is quoted verbatim
(b) data gathered by another are presented in diagrams or tables
(c) the results of a study done by another are used
(d) the work or intellectual effort of another is paraphrased by the writer

    Because the intent to deceive is not a necessary element in plagiarism, careful note taking and record keeping are essential in order to avoid unintentional plagiarism.

    For additional information, please consult “Avoiding and Detecting Plagiarism,” available in the Office of the Vice President for Student Affairs, the Provost’s Office, or at

(From The Graduate Center Student Handbook 05-06, pp. 36-37)

EPSY U73000 GC | PSYC U76000 JJ | Reading Assignments Due | Topic | Assignments Due
T 2/1 | W 2/2 | - | Course Overview | -
T 2/8 | W 2/9 | Crocker & Algina (CA) 1, 2, & 5 | Statistical Review & Test Theory Basics | CA 1: Exercises 1 & 2; CA 2: Exercises 6 & 7; CA 5: Exercise 3.
T 2/15 | W 2/16 | CA 6; Standards (S) Introduction & 2 | Reliability and Classical Test Theory | CA 6: Exercises 1 & 3.
T 2/22 | W 3/2 (W 2/23 follows a Monday schedule.) | CA 7; Streiner, 2003; Sijtsma, 2009 | The Estimation of Reliability | CA 7: Exercises 1 & 4. Choice of test for the course project.
T 3/1 | W 3/9 | CA 10; S 1; Hubley & Zumbo, 1996 | Test Validity | CA 10: Exercises 5 & 7.
T 3/8 | W 3/16 | Kane, 1992; Messick, 1995 | Test Validation | Write a paragraph comparing and contrasting the two readings.
T 3/15 | W 3/23 | CA 4; S 3 | Test Construction | CA 4: Exercises 3 & 5.
T 3/22 | W 3/30 | - | Midterm Examination. | -
T 3/29 | W 4/6 | CA 13; Byrne, 2005 | Factor Analysis | CA 13: Exercise 2.
T 4/5 | W 4/13 | CA 14; Gregorich, 2006 | Classical Item Analysis & Measurement Invariance | CA 14: Exercises 2 & 3.
T 4/12 | W 4/27 (Classes do not meet W 4/20) | CA 15; Reise & Haviland, 2005 | Introduction to Item Response Theory | CA 15: Exercises 3 & 4.
T 5/3 (Classes do not meet 4/19 or 4/26) | W 5/4 | CA 16; Zumbo, 2007 | Detecting Item Bias: DIF | CA 16: Exercises 1 & 4.
T 5/10 | W 5/11 | CA 12; S 7 | Bias in Selection | CA 12: Exercise 2. Course project.
T 5/17 | W 5/18 | CA 19; S 4 | Norms and Standard Scores | CA 19: Exercises 5 & 6.
T 5/24 | W 5/25 | - | Final Examination. | -

Created January 27, 2008
Updated January 26, 2011