10. Writing data science reports
The book Report Writing for Data Science in R by Roger D. Peng (available as a free download from LeanPub) is a useful reference for data science report writing.
Please read the PDF on “Punchline reports” carefully.
Each of the two project reports should be written as a self-contained report with the following sections:
- Title (on title page)
- Give an informative title to your project.
- List of authors
- Approximate contributions of each author (see here for examples)
- Assessment: Does the title give an accurate preview of what the report is about? Is it informative, specific and precise? Are all authors listed? Is each author’s contribution clearly stated?
- The issues
- A section of 1/2 to 1 page, and no more than 2 pages, stating what the report addresses in simple, accurate, non-technical language.
- Assessment: Are the main issues addressed in the report described clearly, succinctly, in non-technical language?
- Findings
- A section describing the main findings of your analysis.
- Assessment: Are the findings clear? Do the findings address the issues of the previous section?
- Discussion
- A section discussing implications of the findings.
- Assessment: Is the discussion straightforward and easy to read? Does it relate to the issues of the analysis and the findings?
- Appendix A: Method
- Data collection. Explain where, how, and why you obtained the data.
- Variable creation. Detail the variables in your analysis and how they are defined (if necessary). For example, if you created a combined (frequency times quantity) drinking variable, you should describe how it was computed. If you are talking about gender, no further explanation is really needed.
- Analytic Methods. Explain the statistical procedures used to analyze your data. For example: boxplots are used to illustrate differences in GPA across gender and class standing; correlations are used to assess the impacts of gender and class standing on GPA.
- Assessment: Could the study be repeated based on the information given here? Is the material organized into logical categories (like those above)?
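As an illustration of the "Variable creation" item above, the following is a minimal Python sketch of how a combined (frequency times quantity) drinking variable might be defined and documented. The field names (`drinking_days_per_week`, `drinks_per_day`, `drinks_per_week`) and the data values are hypothetical and chosen only for the example. Your own report should describe the variables you actually used.

```python
# Hypothetical survey records; the field names and values are illustrative only.
respondents = [
    {"id": 1, "drinking_days_per_week": 2, "drinks_per_day": 3},
    {"id": 2, "drinking_days_per_week": 0, "drinks_per_day": 0},
    {"id": 3, "drinking_days_per_week": 5, "drinks_per_day": 2},
]


def add_weekly_drinks(record):
    """Return a copy of the record with the combined variable added.

    drinks_per_week = drinking_days_per_week (frequency)
                      * drinks_per_day (quantity)
    """
    combined = dict(record)  # copy so the raw data stay unchanged
    combined["drinks_per_week"] = (
        record["drinking_days_per_week"] * record["drinks_per_day"]
    )
    return combined


analysis_data = [add_weekly_drinks(r) for r in respondents]
weekly = [r["drinks_per_week"] for r in analysis_data]
print(weekly)
```

Stating the definition once in the Method appendix and once in the code comments makes it easier for a reader to repeat the study from the information given.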
- Appendix B: Results
- Typically, results sections start with descriptive statistics, e.g. what percent of the sample is male/female, what the mean GPA is overall and in the different groups, etc. Figures can be a good way to illustrate these differences. However, the information presented must be relevant to answering the question(s) of interest. Inferential statistics (i.e. hypothesis tests) typically come next. Tables can often be helpful for results from multiple regression. Do not give raw computer output here! Tables and figures should be labeled, embedded in the text, and referenced appropriately. The results section typically makes for fairly dry reading: it does not explain the impact of the findings, it merely highlights and reports statistical information.
- Assessment: Is the content appropriate for a results section? Is there a clear description of the results? Are the results/data analyzed well? Given the data in each figure/table, is the interpretation accurate and logical? Is the analysis of the data thorough? Is anything relevant ignored? Are the figures/tables appropriate for the data being discussed? Are the figure legends and titles clear and concise?
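The descriptive statistics mentioned above (percent of the sample in each group, mean GPA overall and by group) could be computed and laid out as a labeled table along the following lines. This is only a sketch using Python's standard library. The data are made up, and `describe_by_group` is a hypothetical helper name, not part of any package.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample; the values are made up purely for illustration.
students = [
    {"gender": "female", "gpa": 3.6},
    {"gender": "male", "gpa": 3.0},
    {"gender": "female", "gpa": 3.2},
    {"gender": "male", "gpa": 2.8},
    {"gender": "female", "gpa": 3.4},
]


def describe_by_group(rows, group_key, value_key):
    """Return per-group count, percent of sample, and mean of value_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    total = len(rows)
    return {
        name: {
            "n": len(values),
            "percent": round(100 * len(values) / total, 1),
            "mean": round(mean(values), 2),
        }
        for name, values in sorted(groups.items())
    }


summary = describe_by_group(students, "gender", "gpa")
overall_mean = round(mean(s["gpa"] for s in students), 2)

# Print a labeled table rather than pasting raw computer output.
print("Table 1. GPA by gender (hypothetical data)")
print(f"{'Group':<10}{'n':>4}{'% of sample':>14}{'Mean GPA':>10}")
for name, stats in summary.items():
    print(f"{name:<10}{stats['n']:>4}{stats['percent']:>14}{stats['mean']:>10}")
print(f"{'Overall':<10}{len(students):>4}{100.0:>14}{overall_mean:>10}")
```

The table is given a number and a title so that the text of the results section can refer to it directly.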
- Appendix C: Data and code
- References (if any)
Writing quality
- Is your report well-organized, with paragraphs organized in a logical manner?
- Is each paragraph well-written, with a clear topic sentence and a single major point?
- Is your report generally well-written, with good use of language and sentence structure?
- Are tables and figures labeled correctly and referenced accordingly?
- Does the entire report flow and answer any question(s) sufficiently?
- Is there extraneous information presented (if so, delete it)?
Percentages for aspects of the projects: