Please watch the video on tests of independence and homogeneity which extends your knowledge of chi-square tests to two-by-two contingency tables (if you are feeling rusty with chi-square tests, ask in class and we’ll bring you up to speed). Also, please watch the video on odds ratios and relative risk.
The link to the Google Doc for this week’s group work is here
In small groups please discuss…
What features in experimental design and analysis are crucial for good hypothesis driven scientific practice?
Would an analysis framework using a design thinking process be compatible with good science?
When would a design thinking approach be appropriate or inappropriate for data analysis?
Using the Framingham Heart Study, choose one interesting question that you formulated in week 1 or that you have generated for your lab reports.
Some examples of what you might consider are:
Next, identify a stakeholder that you feel would be most invested in each of the STEEPLE implications.
What would be the best medium to communicate the question, its answer and its implications to the stakeholders?
This dataset contains the respiratory status of patients recruited for a randomised multicentre clinical trial. In each of two centres, eligible patients were randomly assigned to active treatment or placebo. During the treatment, the respiratory status (categorised as poor or good) was determined at each of four monthly visits. The trial recruited 111 participants (54 in the active group, 57 in the placebo group) and there were no missing data for either the responses or the covariates. The question of interest is whether the treatment is effective and, if so, how large its effect is.
respiratory <- read.delim("https://wimr-genomics.vip.sydney.edu.au/AMED3002/data/respiratory.txt",
sep = "\t")
tab <- table(respiratory$treatment, respiratory$status)
tab
Q2: We would like to test whether there is any evidence that receiving the treatment altered a patient's chance of having a good respiratory status. Rephrase this question as a null and an alternative hypothesis that are consistent with a chi-square test.
Q3: Perform a chi-square test
chisq.test(tab)
## Check the chi-square assumption that every expected cell count is at least 5.
test <- chisq.test(tab)
test$expected >= 5
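To see where the expected counts come from, here is a minimal sketch that computes them by hand and compares them with what chisq.test() stores. The counts in tab_demo are invented purely for illustration and are not the respiratory data; the object names tab_demo, expected_by_hand and test_demo are also just placeholders.

```r
## Hypothetical 2x2 counts, invented for illustration (NOT the respiratory data).
tab_demo <- matrix(c(30, 20,
                     18, 32),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(group = c("active", "placebo"),
                                   status = c("good", "poor")))

## Expected count for each cell under independence:
## (row total * column total) / grand total.
expected_by_hand <- outer(rowSums(tab_demo), colSums(tab_demo)) / sum(tab_demo)
expected_by_hand

## chisq.test() stores the same values in $expected.
test_demo <- chisq.test(tab_demo)
all.equal(unname(expected_by_hand), unname(test_demo$expected))

## The rule of thumb checked in the lab code: all expected counts at least 5.
all(test_demo$expected >= 5)

## If that check fails, Fisher's exact test is one common alternative.
fisher.test(tab_demo)$p.value
```

The same by-hand calculation should work on tab from the lab, provided tab is a two-way table of counts.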
Q5: What is your conclusion for this test?
Q6: Interpret the relationship further by calculating a relative risk or odds ratio. Are both appropriate in this case?
## Odds ratio as the cross-product ratio of the 2x2 table.
OR <- (tab[1, 1] * tab[2, 2])/(tab[2, 1] * tab[1, 2])
OR
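The lab code computes only the odds ratio, so here is a rough sketch of how a relative risk could be computed alongside it. It assumes rows are treatment groups and columns are outcomes; the counts, the labels ("active", "placebo", "good", "poor") and the object names are hypothetical, so check the row and column order of your own tab before adapting it.

```r
## Hypothetical counts, invented for illustration (NOT the respiratory data).
## Assumed layout: rows are groups, columns are outcomes.
tab_demo <- matrix(c(30, 20,
                     18, 32),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(group = c("active", "placebo"),
                                   status = c("good", "poor")))

## Risk of a good outcome in each group: good count / row total.
risk <- tab_demo[, "good"] / rowSums(tab_demo)
risk

## Relative risk of a good outcome, active versus placebo.
RR_demo <- risk["active"] / risk["placebo"]
RR_demo

## Odds ratio from the cross-product, using the same formula as the lab code.
OR_demo <- (tab_demo[1, 1] * tab_demo[2, 2]) / (tab_demo[2, 1] * tab_demo[1, 2])
OR_demo
```

Note that the two measures differ in this example: the relative risk compares probabilities, while the odds ratio compares odds.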
Let's perform a test of independence with randomly generated data.
## Set a seed so that results are reproducible.
set.seed(51773)
## Generate two random vectors of size n. We can do this with the 'sample'
## function.
n = 50
AB = sample(c("A", "B"), n, replace = TRUE)
CD = sample(c("C", "D"), n, replace = TRUE)
head(AB)
head(CD)
tabABCD = table(AB, CD)
tabABCD
chisq.test(tabABCD)
chisq.test(tabABCD)$expected
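Because AB and CD are generated independently, the null hypothesis is true by construction, so any small p-value here is a false positive. As a rough sketch of what that means in practice, the simulation can be repeated many times to see how often the test rejects at the 5% level. The helper simulate_p and the choice of 1000 repeats are assumptions made for this illustration.

```r
## Reuse the seed from the lab so the sketch is reproducible.
set.seed(51773)

n <- 50         # observations per simulated dataset, as above
n_sims <- 1000  # number of repeats (an arbitrary choice for this sketch)

## Simulate two independent labels, tabulate them and return the p-value.
## Fixing the factor levels keeps the table 2x2 even if a label never appears.
simulate_p <- function(n) {
  AB <- factor(sample(c("A", "B"), n, replace = TRUE), levels = c("A", "B"))
  CD <- factor(sample(c("C", "D"), n, replace = TRUE), levels = c("C", "D"))
  chisq.test(table(AB, CD))$p.value
}

p_values <- replicate(n_sims, simulate_p(n))

## Proportion of simulated datasets declared "significant" at the 5% level.
mean(p_values < 0.05, na.rm = TRUE)
```

The proportion should be in the region of 0.05 or lower, since chisq.test() applies a continuity correction to two-by-two tables by default and that makes the test somewhat conservative.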
Only this section needs to be included in your Module 1 Lab Report to be handed in by 23:59 on Friday 11th March.
While I expect that you should explore the Framingham data, I am not opposed to you submitting a report on a dataset that you are incredibly engaged with.
Report guidelines
There are no hard and fast guidelines to the final content of your submitted lab reports. For this lab you will be assessed on your ability to generate statistical questions, explore these with graphical summaries and interpret your findings. Your report will also need to be well-presented:
It is expected that your report will construct and communicate an interesting story in 4 - 6 paragraphs (ish). To do this, you should be a ‘bad’ scientist and explore the data until you find something that you think is interesting or that you can use to address the marking criteria. When preparing your report, always ask yourself “is your report something that you would be proud to show your friends?”, “would your family be interested in the conclusions you made?” and “would they find it easy to read?”
Marking criteria
Lab instructions
Continue with the Framingham data from week 1. If you are comfortable, feel free to:
Provide your data analytics code as well as a summary of your findings in a reproducible report (e.g. an R Markdown report).