The group project will be completed in groups of 3-4. Groups have been assigned according to an algorithm and are posted at Project Groups.
Each group will prepare and explain an example related to one of the topics below. Each member of the group will prepare a version of the example using a different statistical software (e.g. R, Stata, SAS, Python, Matlab) and write a short tutorial on the topic in that software as it relates to the example.
Here is a collection of topics from which to choose. Some topics should be considered “umbrella” topics from within which you should choose 1-3 specific concepts to cover in your tutorial.
Each member of the group must contribute a version of the example in a different statistical software (R, Stata, SAS, Python, Matlab).
When applicable, two examples may come from a single software if they use different tools (e.g. different R packages).
As a whole, each group’s examples must meet the following criteria:
Here are examples of how to meet the software criteria for a group of 3:
- R, Stata, SAS
- R, Stata, Python
- R (tidyverse), R (data.table), SAS
To get an idea of scope, you can compare to completed projects from previous years:
Those projects are slightly different in that some are tutorials and some are focused more specifically on a data analysis. Your projects can be simpler than these.
Each group should write a short proposal containing:
Languages other than those listed above (R, Stata, SAS, Python, Matlab) may be included provided you meet the software requirements and receive approval from me on your proposal.
A group liaison should submit the group’s proposal to me via email with the subject header “Stats 506 Group Project Proposal” before Monday November 2, 5pm.
Write an introduction 3-5 paragraphs in length covering the following topics.
This section should consist of tabbed sections for each version of the example. Make the examples as similar and parallel to one another as possible. Use tabbed sections to include code and make it easy to compare the examples. Use common language in the code comments to facilitate comparisons.
If your core example is fairly simple, you should either provide multiple examples or collaborate on extending a subset of the examples as described below.
Extend one or more of the examples beyond the common scope of the others. Use this section to extend a subset of the examples to:
motivate a reader about why the topic is important, e.g. if your topic was “re-shaping data” you might show an analysis making use of the re-shaped data;
demonstrate functionality available in one tool, but not in another, e.g. pivoting multiple columns to long at once using data.table::melt();
explain or demonstrate something else you’d like a reader (e.g. this class) to know.
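To make the re-shaping example above concrete, here is a minimal sketch of what "pivoting multiple columns to long at once" means, written in plain Python with no third-party packages. The data, the column names, and the helper `melt_multiple` are all hypothetical and chosen only for illustration; this is not how data.table::melt() is implemented, and a real tutorial would use the reshaping tools of the software being demonstrated.

```python
# Hypothetical wide-format records: one row per subject, with two measured
# variables (height, weight) each recorded at visits 1 and 2.
wide = [
    {"id": 1, "height_1": 150, "height_2": 152, "weight_1": 40, "weight_2": 42},
    {"id": 2, "height_1": 160, "height_2": 161, "weight_1": 55, "weight_2": 54},
]


def melt_multiple(rows, id_col, stubs, sep="_", suffix_col="visit"):
    """Pivot several groups of columns to long format in one pass.

    Columns named "<stub><sep><suffix>" are collapsed so that each output
    row holds one (id, suffix) pair and one column per stub.
    """
    long_rows = []
    for row in rows:
        # Collect the suffixes (here, visit numbers) seen for any stub.
        suffixes = sorted(
            {
                key.split(sep, 1)[1]
                for key in row
                if key != id_col and sep in key and key.split(sep, 1)[0] in stubs
            }
        )
        for suffix in suffixes:
            out = {id_col: row[id_col], suffix_col: suffix}
            for stub in stubs:
                # A missing stub/suffix combination becomes None.
                out[stub] = row.get(f"{stub}{sep}{suffix}")
            long_rows.append(out)
    return long_rows


long_rows = melt_multiple(wide, "id", ["height", "weight"])
for r in long_rows:
    print(r)
```

Each subject contributes one output row per visit, with height and weight kept as separate columns rather than stacked into a single value column.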
Include a section comparing and contrasting the tools used. Emphasize any important differences in functionality or defaults necessary to understand similarities or differences in the separate versions of the example.
Groups should use git and GitHub to coordinate their work. One group member should create a public repository for the project with others submitting pull requests to them. Your git repo is considered part of the final submission and should include at a minimum:
Excluding extraordinary circumstances, all group members will receive the same grade. However, I reserve the right to modify this policy in cases where one or more group members clearly put in less effort than the others.
Proposals Due: Monday November 2 by 5pm. Your group must have your proposal approved by me prior to this time. Groups are required to select unique topics so it is to your benefit to submit early.
Draft Due Date: Friday November 13 by 5pm. Submit drafts to Canvas as a link to the official git repo. Each member should submit a link and reference the files which they worked on to facilitate peer review from other groups. At a minimum, your draft must include:
Peer Review Due: Wednesday November 18 by 5pm. You will be asked to provide constructive feedback to two peers following the peer review guidelines. Additional guidelines for how to structure this feedback will be provided.
Final Due Date: Wednesday November 25, noon. This is the deadline to submit the final version of your tutorial. Please submit a link to the official git repo and the final html document.
Please make edits in response to peer feedback.
I will post group proposals here as they are approved. Two groups may not choose the same or closely related topics.
I also plan to post your final analyses to this page.
You may include or omit your name from the author information at your own discretion. If omitting your name from the html document, also include a second copy of your scripts with "_anon" appended to the file name and your name scrubbed from the header.