Keith Markus' Urban Sprawl: http://web.jjay.cuny.edu/~kmarkus
PSYC U80100.04 GC
Seminar in Special Topics:
Introduction to Data Analysis and Programming
with R and Python
54445
(Equivalent to EPSY 88000-03 53058)
Syllabus
Spring 2025
Time: Wednesday 6:30-8:30 PM
Room: 6418, CUNY Graduate Center, 365 Fifth Avenue
Contact Information:
Professor Keith A. Markus
kmarkus@aol.com (This is the best way to contact me.)
212-237-8784 (For some reason I no longer receive voice messages as
email, so I do not recommend voice messages.)
Office: 10.65.04
New Building, John Jay College
Address: Psychology Department, 10th Floor
John Jay College of Criminal Justice, CUNY
524 W59th Street, New York, NY, 10019
Office Hours: Wednesdays, 5 PM to 6 PM, when classes are in
session, at 3204.02 CUNY Graduate Center. Priority will be given
to students who make an appointment beforehand.
Course Description: R and Python offer widely used
programming environments. The course offers a basic
introduction to R and Python programming for data analysis and data
management. The focus is on providing a firm foundation for
further self-guided learning in both environments. The course
is aimed at behavioral science researchers and methodologists and
assumes a basic familiarity with behavioral science data analysis,
commonly used statistical distributions and statistical tests.
The course provides a basic introduction to flow charts and program
design. The course explores the basic environments (R packages
and Python modules), including key elements of syntax, data types,
and programming basics. The course emphasizes structured
programming with functions in R and object oriented programming with
classes in Python.
Course Objectives:
1. Students will gain a basic understanding of the process of
writing clear, readable, and reusable code.
2. Students will gain a basic level of comfort and familiarity
with both the R and Python programming environments.
3. Students will gain hands-on experience with structured
programming using functions in R.
4. Students will gain hands-on experience with object oriented
programming in Python.
5. Students will gain sufficient familiarity with both
environments to explore further topics on their own.
Course Flow:
Familiarize yourself with the reading before class meets.
We will illustrate concepts from the reading in class and
reinforce them with in-class practice problems. In
addition to the reading listed below, various handouts will be
made available through the learning management system.
Reading:
M1: Markus, K. A. (2025). Structured
Programming for Data Analysis Using Functions: A Tutorial. Draft
manuscript.
M2: Markus, K. A. (2025). Structured Programming for Data
Analysis Using Classes: A Tutorial. Draft manuscript.
ITR: Venables, W. N., Smith, D. M. & the R Core Team
(2021). An Introduction to R: Notes on R: A Programming
Environment for Data Analysis and Graphics. Version 4.1.2
(2021-11-01).
https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
URDAG: Maindonald, J. H. (2008). Using R for Data Analysis
and Graphics: Introduction, Code and Commentary.
https://cran.r-project.org/doc/contrib/usingR.pdf
TPT: The Python Tutorial
https://docs.python.org/3/tutorial/index.html
ITNSM: No Author (No Date). Introduction to numpy, scipy
and matplotlib*
https://www.patnauniversity.ac.in/e-content/science/physics/MScPhy21.pdf
P1: No Author (No Date). Pandas 1: Introduction*
https://www.acme.byu.edu/wp-content/uploads/2021/09/pandas1_2021.pdf
(Your browser may give a security warning for this URL. I will
post all the pdf files on Blackboard so that you can download them
from there.)
RSG: Google's R Style Guide https://google.github.io/styleguide/Rguide.html
PEP8: Style Guide for Python Code
https://www.python.org/dev/peps/pep-0008/
ZEN: The Zen of Python (Type 'import this' in the Python
console.)
*If you are the author of one of these documents and wish to be
identified as such, please let me know (kmarkus@aol.com).
Viewing:
The following three YouTube playlists are very strongly
recommended. I would suggest watching both DataDaft
playlists all the way through. The Socratica list includes
some topics that we will not directly address in this course but
also covers some topics not covered by the DataDaft list. In
particular, the episode on Python classes is highly recommended
because we will use those extensively.
DataDaft: Introduction to R https://www.youtube.com/playlist?list=PLiC1doDIe9rDjk9tSOIUZJU4s5NpEyYtE
DataDaft: Python for Data Analysis https://www.youtube.com/playlist?list=PLiC1doDIe9rCYWmH9wIEYEXXaJ4KAi3jc
Socratica: Learn Python https://www.youtube.com/playlist?list=PLi01XoE8jYohWFPpC17Z-wWhPOSuh8Er-
Software:
Software installation details will depend on your operating
system.

You can install base R here: https://cran.r-project.org/
Recommended: You can install R Studio here: https://www.rstudio.com/
This is a popular graphical user interface (GUI) for R that runs R
in the background and makes many routine tasks easier. I
will use R Studio in class. Note: When you install R Studio
the installer will look for an existing R installation. So,
it is best to install R first, then R Studio.

You can install base Python here: https://www.python.org/
This is not recommended for this course. See
recommended installation below.
Recommended: You can install Anaconda Python here: https://www.anaconda.com/
The Anaconda Python distribution includes all the special modules
used for data analysis applications. These can be tricky to
install yourself unless you have a high comfort level using the
command line terminal on your operating system. I will be
using the Spyder integrated development environment (IDE, which is
like a GUI for our purposes) to work with Anaconda Python in
class. Spyder comes packaged with Anaconda Python. (If
installation does not provide an icon or other shortcut, try
typing 'spyder' at the command line with no quotes to open the
application for the first time.)
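Once Python is installed, a quick check along the following lines can confirm that the data analysis modules are available. This is only a minimal sketch: the list of module names is an assumption based on the modules named in the schedule (NumPy, SciPy, Pandas, Statsmodels, Matplotlib), and `check_modules` is a helper name made up for this illustration.

```python
# Check which data analysis modules can be imported in this Python installation.
from importlib.util import find_spec

# Assumed list, based on the modules named in the course schedule.
COURSE_MODULES = ["numpy", "scipy", "pandas", "statsmodels", "matplotlib"]


def check_modules(names):
    """Return a dict mapping each module name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(COURSE_MODULES).items():
        status = "found" if found else "MISSING"
        print(f"{name:12s} {status}")
```

If any module is reported as missing under Anaconda Python, the installation may not have completed, or Spyder may be running a different Python installation than the one Anaconda provided.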

Recommended: You can use the Dia program to draw flow
charts. You can install it from the Dia homepage here:
http://dia-installer.de/
Article Reviews:
The course requires two article reviews, one each for M1 and M2
listed above. A form will be provided on Blackboard. See the
schedule for due dates. (Your detailed feedback will be
greatly appreciated.)
Coding Shares:
You are responsible for posting at least 3 contributions to the
"Coding Knowledge Exchange" bulletin board on Blackboard. As
you explore R and Python, you will discover various things that
are helpful to you. This might be a particular function, R
package or Python module. It might be a particular coding
idiom used to complete a particular kind of task. It might
be a blog post, help forum post (e.g., Stack Overflow or Cross
Validated), or other web resource. It might be a helpful
book. It might be something that you found confusing at
first and now have a clear idea how to explain to others. In
any event, each share should contain at least a short paragraph of
original text that explains (a) what the share comprises, (b) why
you found it helpful, and (c) one or more examples of potential
use cases (applications). A share should never be just a URL
link or direct quotation.
If you come up with a particularly elegant or otherwise
interesting solution to an in-class exercise, that can also be a
coding share. If someone posts such a solution and you have a
helpful suggestion related to what the other student posted, that
can also serve as a coding share.
Further use of the forum is not graded but I strongly encourage
robust discussion. I encourage you to continue sharing after
you have made three posts and to reply to one another's posts with
questions, comments, or additions.
Course Projects:
There will be separate projects in R and Python. Details about
the project assignments will be posted to Blackboard. Both
will involve writing a reusable suite of functions (in R) or classes
and methods (in Python) to complete a concrete data
management/analysis task using concepts from the course.
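To give a rough sense of what "classes and methods" for a data management task can look like, here is a small sketch. It is not the project assignment: the `ScaleScorer` class, its parameters, and the example data are all hypothetical and chosen only for illustration.

```python
from statistics import mean


class ScaleScorer:
    """Hypothetical example: score a questionnaire scale from item responses."""

    def __init__(self, n_items, min_valid):
        self.n_items = n_items      # number of items on the scale
        self.min_valid = min_valid  # fewest non-missing items needed for a score

    def score(self, responses):
        """Return the mean of the non-missing responses, or None if too few."""
        if len(responses) != self.n_items:
            raise ValueError("wrong number of responses")
        valid = [r for r in responses if r is not None]
        if len(valid) < self.min_valid:
            return None
        return mean(valid)

    def score_all(self, rows):
        """Score every respondent in a list of response lists."""
        return [self.score(row) for row in rows]


# Made-up responses for three respondents; None marks a missing item.
scorer = ScaleScorer(n_items=4, min_valid=3)
scores = scorer.score_all([[1, 2, 3, 4], [5, None, 4, 3], [None, None, 2, 1]])
```

Keeping the scoring rule in one class means that the same code can be reused across data sets by changing only the arguments passed when the object is created.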
Grading: The final grade comprises the two article reviews, coding
shares and the two course projects. Each article review counts for 5% of your
grade, coding shares count for 30% of your grade and each project
counts for 30% of your grade. Letter grades will be assigned
as indicated below.
Letter Grade | Percent Grade |
A | 92-100 |
A- | 84-91 |
B+ | 76-83 |
B | 68-75 |
B- | 60-67 |
C+ | 52-59 |
C | 44-51 |
C- | 36-43 |
F | 0-35 |
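The grade arithmetic can be sketched in a few lines of Python. The weights and cutoffs come from the Grading section and the table above; the component names and the example scores are hypothetical, and the handling of fractional percentages (a score counts toward the band whose lower bound it reaches, with no rounding) is an assumption rather than stated course policy.

```python
# Weights from the Grading section: two article reviews at 5% each,
# coding shares at 30%, and two projects at 30% each.
WEIGHTS = {
    "review_m1": 0.05,
    "review_m2": 0.05,
    "coding_shares": 0.30,
    "r_project": 0.30,
    "python_project": 0.30,
}

# Lower bound of each letter grade band, from the grade table.
CUTOFFS = [(92, "A"), (84, "A-"), (76, "B+"), (68, "B"), (60, "B-"),
           (52, "C+"), (44, "C"), (36, "C-"), (0, "F")]


def final_percent(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(percent):
    """Map a percent grade to a letter by the lower bound of each band.

    Assumption: no rounding; 91.5 falls in the A- band.
    """
    for lower, letter in CUTOFFS:
        if percent >= lower:
            return letter
    raise ValueError("percent must be non-negative")


# Hypothetical scores for one student.
example = {"review_m1": 100, "review_m2": 100, "coding_shares": 90,
           "r_project": 80, "python_project": 85}
example_percent = final_percent(example)   # 5 + 5 + 27 + 24 + 25.5
example_letter = letter_grade(example_percent)
```

The weights sum to 100%, so a student who earns the same score on every component receives that score as the final percent grade.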
Diversity:
Everyone should feel welcomed as a member of the R and Python user
communities. The broader software development community has
recognized that it has struggled with diversity and has responded
with efforts to address this issue (https://en.wikipedia.org/wiki/Silicon_Valley#Demographics).
Jessica McKellar served as a figurehead for early efforts to pursue
greater diversity and inclusion in the Python community. Her
message has been that there are no shortcuts or easy fixes; instead,
it takes long hours of networking and sending individual email
invitations to diversify conference participation. The Python Software
Foundation has published a diversity statement (https://www.python.org/community/diversity/)
and offers grants that can support, among other things, local
diversity and inclusion efforts (https://www.python.org/psf-landing/).
The R Consortium has a high-level project called R Community IDEA
that pursues inclusion, diversity, equity and accessibility. A
presentation by Heather Turner describes various aspects of these
efforts. The R Foundation has endorsed R conferences in a
variety of languages (https://www.r-project.org/conferences/)
to include people outside Europe and North America and to target
under-supported regions globally. Diversity
and inclusion statements have become a key instrument in
making the field more welcoming to anyone interested in getting
involved. Reactionary events (e.g., Gamer Gate and the
infamous Google employee memo) often garner more press attention
than the positive work that is being done. Overall, it is my
understanding that such efforts have made up more ground promoting
gender inclusion (https://en.wikipedia.org/wiki/Sexism_in_the_technology_industry)
than other forms of inclusion. Moreover, one sometimes still
sees prominent personalities express counter-productive views that
demonstrate their struggle to adjust to the cultural changes taking
place. So, plenty of work remains to be done but do not let
that discourage you. Those interested in contributing to these
efforts will find a receptive audience and ample opportunities
through user organizations and conferences. I encourage anyone
interested to attend an R user conference or Python user conference
to meet other users and learn about things you might not otherwise
explore on your own. UseR! and PyData are two prominent options.
The R Foundation requires conferences that it endorses, including
UseR!, to have a code of conduct statement. PyData also has a code
of conduct statement. Such statements have become key
tools in establishing norms of professional behavior.
Education is another area targeted by efforts to diversify coding
and include more under-represented groups. Organizations like
Code.org work to bring coding instruction to diverse students to help pave
the way for more inclusiveness in the field. To sum up,
software development has a recognized diversity problem and
addressing that problem remains a work in progress but you will find
many people and initiatives in the field dedicated to progress in
this area.
I hope that this course can serve as an entry point to both user
communities and help promote diversity and inclusion in that
way. Likewise, I hope that you will share what you learn with
others. At a philosophical level, my choice to combine two
languages in this course reflects my deeper commitment to dialogism:
the view that the world is best understood through multiple
representations, that discussing one language system from the
perspective of another plays an important role in minding the gap
between what we communicate about and how we communicate about it,
and that no single system of representation is ever sufficient in
itself to the exclusion of others. In turn, these
philosophical commitments shape my understanding of and approach to
advancing diversity and inclusion.
Special Needs:
To request accommodations please contact the Office of the Vice
President for Student Affairs (Room 7301 Graduate Center; (212)
817-7400). Information about accommodations can be found in the
Graduate Center Student Handbook 05-06, pp. 51-52.
Academic Honesty:
The Graduate Center of The City University of New York is committed
to the highest standards of academic honesty. Acts of academic
dishonesty include—but are not limited to—plagiarism, (in drafts,
outlines, and examinations, as well as final papers), cheating,
bribery, academic fraud, sabotage of research materials, the sale of
academic papers, and the falsification of records. An individual who
engages in these or related activities or who knowingly aids another
who engages in them is acting in an academically dishonest manner
and will be subject to disciplinary action in accordance with the
bylaws and procedures of The Graduate Center and the Board of
Trustees of The City University of New York.
Each member of the academic community is expected to give full,
fair, and formal credit to any and all sources that have contributed
to the formulation of ideas, methods, interpretations, and findings.
The absence of such formal credit is an affirmation representing
that the work is fully the writer’s. The term “sources” includes,
but is not limited to, published or unpublished materials, lectures
and lecture notes, computer programs, mathematical and other
symbolic formulations, course papers, examinations, theses,
dissertations, and comments offered in class or informal
discussions, and includes electronic media. The representation that
such work of another person is the writer’s own is plagiarism.
Care must be taken to document the source of any ideas or arguments.
If the actual words of a source are used, they must appear within
quotation marks. In cases that are unclear, the writer must take due
care to avoid plagiarism.
The source should be cited whenever:
(a) a text is quoted verbatim
(b) data gathered by another are presented in diagrams or tables
(c) the results of a study done by another are used
(d) the work or intellectual effort of another is paraphrased by the
writer
Because the intent to deceive is not a necessary
element in plagiarism, careful note taking and record keeping are
essential in order to avoid unintentional plagiarism.
For additional information, please consult
“Avoiding and Detecting Plagiarism,” available in the Office of the
Vice President for Student Affairs, the Provost’s Office, or at
http://web.gc.cuny.edu/provost/pdf/AvoidingPlagiarism.pdf.
(From The Graduate Center Student Handbook 05-06, pp. 36-37)
Schedule
Date | Topics | Reading Due | Assignments Due |
Week 1 W 2/5 (No classes 1/29) | Introduction to R and Python environments, course overview, flow charts and programming basics | | |
Week 2 W 2/19 (College closed 2/12) | R basics & imperative programming: Data types, indexing, missing data, loops, reading and writing data, etc. | ITR, RSG | |
Week 3 W 2/26 | Defining your own R functions & Functional programming | M1, URDAG | |
Week 4 W 3/5 | Survey of some statistical analysis functions in R | ITR | |
Week 5 Th 3/6 (!) | R graphics | | |
Week 6 W 3/12 | Test driven programming in R | | M1 Review |
Week 7 W 3/19 | R statistical distribution functions and a general framework for simulation studies in R | URDAG | |
Week 8 W 3/26 | General considerations for refactoring and writing re-usable code in R | | |
Week 9 W 4/2 | Python basics: Importing modules, data types, loops, variable scoping, defining functions, file handling, etc. | TPT, PEP8, ZEN | R project |
Week 10 W 4/9 | Object oriented programming: Classes, attributes, methods & composition | M2, TPT | |
Week 11 W 4/23 (Spring Break 4/16) | Test driven programming in Python, unit tests and assertion checks | | M2 Review |
Week 12 W 4/30 | Data types, data management and statistical distributions: NumPy, SciPy and Pandas | ITNSM | |
Week 13 W 5/7 | Data Analysis: Statsmodels and Matplotlib | ITNSM, P1 | |
Week 14 W 5/14 | General considerations for refactoring and writing re-usable code in Python | | |
Week 15 W 5/21 | Student project presentations | | Python project |
Created 5 September 2021, updated 26
November 2021, 3 December 2021, 20 January 2022, 24 January 2025,
2 February 2025