Syllabus

Contact Information

  • Instructor: James Henderson (jbhender@umich.edu)
  • Instructor office hours: Monday 2:30-4pm, Thursday 2:30-4pm (Zoom).
  • GSI: Yichao Chen (yichaoc@umich.edu)
  • GSI office hours: Friday 12-3pm (Zoom)
  • See Canvas or Google Calendar for Zoom links

Key Dates

  • Midterm Group Project:
    • Assigned: W Oct 21 (originally W Oct 7)
    • Draft due: F Nov 13 (originally M Oct 26)
    • Revisions due: W Nov 25 (originally M Nov 9)
  • Final Project:
    • Assigned: W Nov 11 (originally W Nov 4)
    • Draft due: M Dec 7
    • Revision due: Tu Dec 15, 6pm
  • Final Class: Tu Dec 8
  • No Class: Tu Nov 3 (Election), Tu/Th Nov 24/26 (Holiday)
  • Office of the Registrar

If project due dates interfere with religious holidays you observe, please let me know before October 1 so we can work out alternative arrangements.

Grading information

  • Problem sets (6 sets, top 5 scores): 50%
  • Midterm Project: 15%
  • Final Project: 15%
  • Readings & Quizzes: 10%
  • Participation in Weekly Activities: 10%

Grades on individual assignments will be given as points with 1,000 total points available in the course, so that 1pt = 0.1% of the course grade. In addition, a small number (~20-50) of extra credit points will be awarded for participating in course surveys. The latter will be sufficient to cover 1-2 missed activities.
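As a rough illustration of how the problem set component and the point scale fit together, the sketch below keeps the top 5 of 6 problem set scores and converts points to a percentage of the course grade. It assumes each problem set is scored out of 100 points (which would make 5 counted sets equal 500 points, or 50%); the function name and the sample scores are hypothetical.

```python
def problem_set_points(scores, keep=5):
    """Sum the `keep` highest scores; lower scores are dropped."""
    return sum(sorted(scores, reverse=True)[:keep])


# Hypothetical scores on six problem sets, assumed to be out of 100 each.
scores = [90, 85, 0, 100, 95, 80]

points = problem_set_points(scores)   # the 0 is dropped
course_percent = points / 10          # 1 pt = 0.1% of the course grade

print(points, course_percent)  # 450 45.0
```

In this example the missed problem set does not count, so the student earns 450 of the 500 available problem set points.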

Textbooks

There are no required texts for this course. There are three recommended texts.

The Art of R Programming, by Norman Matloff, is recommended for those with little to no previous experience in R.

Advanced R, by Hadley Wickham, is recommended for those who would like to develop a deep understanding of R and its inner workings.

R for Data Science, by Garrett Grolemund and Hadley Wickham, is a helpful bridge between these two.

Prerequisites

You should have taken or be currently enrolled in an intermediate applied statistics course such as Stats 500, and you should be comfortable programming in at least one programming language or scripting in a statistical software language.

Course Format

This term the course is designed for an online format requiring a limited amount of synchronous participation. Lecture material and instructor demonstrations will be recorded and distributed weekly via Canvas, with new material generally appearing on Mondays. Students are expected to view recorded lectures and complete related readings prior to the course meeting on the following Tuesday. Readings will generally be accompanied by a short quiz in Canvas to assess comprehension, also typically due the following Tuesday.

Synchronous course meetings will begin at 10am on Tuesdays and Thursdays and always end by 11:20am. The first Tuesday, Sep 1, will be a live presentation of the course overview. Thereafter, each Tuesday prior to the Thanksgiving recess will consist of a live Q & A (~10 minutes) followed by a synchronous group activity related to the topic from the prior week. Participation in weekly activities is required (10 points each). All Thursday course meetings, and Tuesday meetings in December, will consist of live Q & A only. Attendance at Q & A meetings is optional but strongly recommended. I expect Thursday Q & A to generally last less than 30 minutes, but we will take as much time as needed up to 11:20am.

The information above is summarized briefly below:

  • Tuesdays (Sep 8 - Nov 17): synchronous activities (Attendance recorded)
  • Tuesdays (Dec 1 & Dec 8): synchronous Q & A (Attendance optional)
  • Thursdays (Sep 3 - Dec 3): synchronous Q & A (Attendance optional)

Synchronous meetings will be recorded and posted to Canvas.

Course Recordings

Synchronous course meetings, including activities and question and answer (Q & A) sessions, may be audio/video recorded and made available to other students in this course through Canvas. As part of your participation in this course, you may be recorded. If you do not wish to be recorded, please contact me (jbhender@umich.edu) during the first week of class (or as soon as you enroll in the course, whichever is later) to discuss alternative arrangements.

Students are prohibited from distributing recordings of any class activity without prior written permission from the instructor, except as necessary as part of approved accommodations for students with disabilities. Any approved recordings may only be used for the student’s own private use.

Course Description

Stats 506 covers a variety of topics related to the use of computers for analyzing, managing, and presenting data. The goals of the course are for students to:

  1. gain fluency with common computing tools, methods, and concepts used by data scientists and statistical analysts;
  2. develop good habits for coding, documentation, and workflow;
  3. demonstrate growth in presenting and communicating data and analysis.

The topics below represent an approximate plan for the course. I reserve the right to modify this plan. Topics will be supplemented with computing and data analysis case studies.

  • Week of September 1:
    • Course Overview
    • Linux shell and utilities
  • Week of September 8:
    • R basics: vectors, arrays, objects, I/O, functions, programming.
    • R: style guidelines
  • Week of September 15:
    • git, version control
    • markdown, R markdown
  • Week of September 22:
    • R topics: the tidyverse, emphasizing the dplyr and tidyr packages
    • R topics: graphics using ggplot2
  • Week of September 29:
    • Stata basics: syntax, I/O, data management, aggregations by group
    • Stata topics: regression, macros, iteration
  • Week of October 6:
    • R topics: vectorization, Monte Carlo studies
  • Week of October 13:
    • SAS basics: syntax, I/O, data management, aggregations by group
    • SAS topics: regression, macros
  • Week of October 20:
    • SQL
    • Regular expressions
    • R topics: working with strings and dates
  • Week of October 27:
    • R topics: understanding and managing memory utilization
    • R topics: R's object-oriented systems
  • Week of November 3:
    • R topics: the data.table package
  • Week of November 10:
    • R topics: parallel computing and asynchronous computing with futures
  • Week of November 17:
    • re-sampling methods: the bootstrap, permutation testing, cross validation
  • Week of December 1:
    • Batch computing in an HPC environment

Computing Resources

All of the software in this course is available without charge for UM students. Some software, such as R, is free and open source and can be installed on your personal machine. You should install both R and RStudio as described in steps 1 and 2 here.

You will also need an ssh client, such as Terminal (Mac), PuTTY (Windows), or Git for Windows (Windows).

Both Stata and SAS are available on university Linux servers including the Statistics and Computation Service and the Great Lakes HPC environment. Both programs can also be used in a virtual environment by visiting midesktop.umich.edu. Finally, SAS offers a free University Edition.

Submitting work late

You are expected to turn in all problem sets by the due dates listed on Canvas. However, to accommodate unexpected circumstances, you may turn in problem sets up to 72 hours late for a 5-point reduction. In addition, we will count only your 5 highest problem set scores. The 72-hour policy does not apply to quizzes or projects.

Problem Set Revisions

You may revise any one of the first 5 problem sets after the grades are released. However, you may only submit revisions for a problem set which you originally submitted and may only revise those questions which you originally attempted. The spirit of the policy is to allow you to correct mistakes or oversights, not to reward you for submitting the example solutions in place of your original work. I reserve the right to unilaterally determine which revisions are in accord with the spirit of the policy.

Revisions will be graded according to the original guidelines, subject to a 20% penalty on the points recovered, rounded down. For example, if you attempted the entire assignment on time, received an 80, and made perfect revisions, your revised score would be: 80 + floor((1 - .2) * 20) = 96.
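The revision arithmetic can be sketched in a few lines. This is an illustration of the formula in the example above, not an official grading script; the function name and parameters are hypothetical, and the penalty is passed as a whole-number percentage so that integer division reproduces the floor without floating point rounding surprises.

```python
def revised_score(original, regraded, penalty_pct=20):
    """Original score plus recovered points, less the penalty, rounded down.

    `regraded` is the score the revised work would earn under the
    original guidelines, before the penalty is applied.
    """
    recovered = regraded - original
    # Integer division matches floor((1 - penalty) * recovered) for recovered >= 0.
    return original + recovered * (100 - penalty_pct) // 100


print(revised_score(80, 100))  # 96, matching the example in the text
print(revised_score(70, 85))   # 82: 15 points recovered, 12 kept
```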

To submit a revision, use the "problem set revisions" assignment on Canvas. You may submit revisions at any time up to 7 days after problem set 5 grades are posted. However, once you've used your revision you may not "replace" it with a future revision.

Accommodations for students with disabilities

If you think you need an accommodation for a disability, please let me know at your earliest convenience. Some aspects of this course, such as assignments, in-class activities, and recorded lectures may be modified to facilitate your participation and progress. As soon as you make me aware of your needs, we can work with the Services for Students with Disabilities (SSD) office to help us determine appropriate academic accommodations. SSD (734-763-3000; http://ssd.umich.edu) typically recommends accommodations through a Verified Individualized Services and Accommodations (VISA) form. Any information you provide is private and confidential and will be treated as such.

Academic Integrity

Unless specifically stated otherwise, students are expected to complete homework and other assignments independently without copying code or text from other students in this course. Students are encouraged to discuss assignments and to help one another with concepts and syntax, but such discussions should not reach the point of representing one student’s work as another’s.

Use of materials from other sources should fall within the license for those materials and include proper attribution.

For more on academic integrity, please be advised of Rackham policy: http://www.rackham.umich.edu/policies/academic-policies/section11.

Mental Health and Wellbeing

The University of Michigan is committed to advancing the mental health and wellbeing of its students. If you or someone you know is feeling overwhelmed, depressed, and/or in need of support, services are available. For help, contact Counseling and Psychological Services (CAPS) at 734.764.8312 and caps.umich.edu during and after hours, on weekends and holidays, or through its counselors physically located in schools on both North and Central Campus. You may also consult University Health Service (UHS) at 734.764.8320 and https://www.uhs.umich.edu/mentalhealthsvcs, or for alcohol or drug concerns, see https://www.uhs.umich.edu/aodresources.

For a listing of other mental health resources available on and off campus, visit http://umich.edu/~mhealth/

Mandatory Reporting and Sexual Misconduct

Title IX prohibits sex discrimination to include sexual misconduct: harassment, domestic and dating violence, sexual assault, and stalking. If you or someone you know has been harassed or assaulted, you can receive confidential support and academic advocacy at the Sexual Assault Prevention and Awareness Center (SAPAC). SAPAC can be contacted on their 24-hour crisis line, 734-936-3333 and online at sapac.umich.edu. Alleged violations can be reported non-confidentially to the Office for Institutional Equity (OIE) at institutional.equity@umich.edu. Reports to law enforcement can be made to University of Michigan Police Department at 734-763-3434.

As an instructor, one of my responsibilities is to help create a safe learning environment on our campus. I also have a mandatory reporting responsibility. I am required to share information regarding sexual misconduct or information about a crime that may have occurred on U-M's campus with the University. Students may speak to someone confidentially by contacting SAPAC’s Crisis Line at (734) 936-3333.