University of Sydney

Populations & Samples

Population & Sample

  • The population is the full amount of information being studied, collected through a census.
  • A sample is part of the population.


Why do we need samples? What issues arise when analysing samples?

Limitations of a census

Collecting every unit of a population

  • is hard
  • takes lots of time
  • costs a lot of money
  • requires lots of resources.

Parameter & Estimate


  • A parameter is a numerical fact about the population which we are interested in. For example, the population mean \(\mu\).
  • An estimate (or statistic) is a quantity calculated from the sample values, chosen to approximate the parameter as closely as possible. For example, the sample mean \(\bar{x}\).
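
The distinction between parameter and estimate can be illustrated with a minimal sketch. The population below is made-up data (hypothetical heights in cm), and the sample size of 100 is an arbitrary choice; in practice \(\mu\) is unknown because a census is not available.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 10,000 people (made-up data).
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameter: the population mean, normally unknown without a census.
mu = statistics.mean(population)

# Estimate: the mean of a simple random sample of 100 people.
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)

print(f"parameter mu    = {mu:.2f}")
print(f"estimate  x_bar = {x_bar:.2f}")
```

Re-running with a different seed gives a different \(\bar{x}\) each time, while \(\mu\) is a fixed property of the population.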


"The estimate is what the investigator knows. The parameter is what the investigator wants to know."


Finding the best estimate of the parameter

  • Much statistical theory is concerned with how to find the best estimate of a parameter.
  • Two critical issues are:
    • How was the sample chosen? Is it representative of the population?
    • What estimate is closest to the parameter?

Selecting a sample

4 Common Types of Bias

  1. Selection bias: A systematic tendency to exclude or include one type of person from the sample.

  2. Non-response bias: Caused by participants who fail to complete surveys.
    • What was the response rate?
    • Non-respondents can be very different to respondents.
  3. Interviewer bias: Arises when the interviewer has to choose the participants in the survey, or when characteristics of the interviewer affect the answers given by participants.

  4. Measurement bias: Arises when the form of the question in the survey affects the response to the question.
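
To make non-response bias (item 2 above) concrete, here is a small simulation sketch. Everything in it is an assumption for illustration: the survey topic (weekly hours worked), the numbers, and the made-up rule that people who work longer hours are less likely to reply.

```python
import random
import statistics

random.seed(4)  # fixed seed so the sketch is reproducible

# Hypothetical survey: 5,000 people are mailed a form asking weekly hours worked.
invited = [random.gauss(38, 8) for _ in range(5_000)]

def replies(hours):
    # Made-up rule: reply probability falls from 0.7 (at 30 hours or fewer)
    # to 0.2 (at 55 hours or more).
    p = min(0.7, max(0.2, 0.7 - 0.02 * (hours - 30)))
    return random.random() < p

respondents = [h for h in invited if replies(h)]

response_rate = len(respondents) / len(invited)
print(f"response rate          = {response_rate:.0%}")
print(f"mean of everyone asked = {statistics.mean(invited):.1f} hours")
print(f"mean of respondents    = {statistics.mean(respondents):.1f} hours")
```

Under these assumptions the respondents' mean understates the mean of everyone who was asked, because the non-respondents differ systematically from the respondents.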

Statistical Thinking

Suggest how the following could occur:

  • Selection bias in a phone survey
  • Non-response bias in the Australian postal poll on same-sex marriage
  • Interviewer bias, when an employee of a polling company is given a quota

Examples of Measurement Bias

1. Bias in question wording

Statistical Thinking

How is this wording biased? Suggest a rewording.

  • Should a woman have control over her own body, including her reproductive system?
  • Should a doctor be allowed to murder unborn children who can't defend themselves?

Examples of Measurement Bias Cont.

2. Recall bias: People forget details.

3. Sensitive questions: People may not tell the truth.

  • Do you use illegal drugs?
  • How often do you bathe?
  • Do you pay tax?
  • What is your age?

4. Lack of clarity in question

  • People may misinterpret the question.
  • Certain words may be ambiguous and so mean different things to different people.

Bias and sample size


Statistical Thinking

Would increasing the sample size overcome the bias?
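
One way to explore this question is by simulation. The sketch below uses a made-up population and a made-up selection mechanism (a survey that can only reach part of the population); the proportions 0.3, 0.8 and 0.4 are assumptions chosen purely for illustration.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical population of 100,000 people: 1 = supports a proposal, 0 = does not.
# Assumption: 30% are "hard to reach" and support at rate 0.8; the rest at rate 0.4.
population = []
for _ in range(100_000):
    hard_to_reach = random.random() < 0.3
    supports = random.random() < (0.8 if hard_to_reach else 0.4)
    population.append((hard_to_reach, int(supports)))

# Parameter: the true proportion of supporters in the whole population.
true_p = statistics.mean(s for _, s in population)

# Selection bias: the survey can only ever reach the easy-to-reach group.
reachable = [s for hard, s in population if not hard]

def biased_estimate(n):
    """Sample proportion from n people drawn only from the reachable group."""
    return statistics.mean(random.sample(reachable, n))

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: estimate = {biased_estimate(n):.3f}, true = {true_p:.3f}")
```

In this sketch a larger sample makes the estimate more stable, but it settles near the proportion in the reachable group rather than near the population proportion: the chance variation shrinks while the bias remains.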

How to pick a good sample?

  • A sampling procedure should be fair, i.e. it should give a representative cross-section of the population.
  • We use a probability method to pick the sample, so that

    • the interviewer is not involved in the selection: "Blind chance is impartial".
    • the interviewer knows the chance of any particular person being chosen.
  • For example, simple random sampling involves drawing at random without replacement.

As this is often not practical, organisations commonly use multi-stage cluster sampling instead.
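
Both sampling methods can be sketched in a few lines. The sampling frame below (20 suburbs of 50 households each) is hypothetical, as are the sample sizes, and the cluster design shown has only two stages; real multi-stage designs typically have more stages and use weights.

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 20 suburbs (clusters) of 50 households each.
frame = {f"suburb_{i}": [f"hh_{i}_{j}" for j in range(50)] for i in range(20)}
all_households = [hh for households in frame.values() for hh in households]

# Simple random sample: 40 households drawn at random without replacement,
# so no household can be chosen twice.
srs = random.sample(all_households, 40)

# Two-stage cluster sample: first draw 4 suburbs at random,
# then draw 10 households at random within each chosen suburb.
chosen_suburbs = random.sample(list(frame), 4)
cluster_sample = [
    hh for suburb in chosen_suburbs for hh in random.sample(frame[suburb], 10)
]

# With a probability method, each household's chance of selection is known.
chance = len(srs) / len(all_households)
print(f"SRS inclusion chance per household: {chance:.3f}")
print("Suburbs visited by the cluster sample:", sorted(chosen_suburbs))
```

The cluster sample requires visiting only 4 suburbs rather than potentially all 20, which is the practical appeal; the selection is still made by chance rather than by the interviewer.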

Common methods of surveys

  • Face-to-Face Interviews
  • Phone Interviews
  • Self-administered Surveys
  • Mail
  • Online


What are the pros and cons of each method?

Summary

Bias is any factor that favours certain outcomes or responses, or influences an individual's responses. Bias may be unintentional (accidental), or intentional (to achieve certain results).

  • When looking at data from a survey think about
    • Selection bias / sampling bias: The sample does not accurately represent the population. Example: Attendees at a Star Trek convention may report that their favorite genre is science fiction.
    • Non-response bias: Certain groups are under-represented because they elect not to participate. Example: A restaurant may give each table a "customer satisfaction" survey with their bill, and only some customers choose to fill it in.
    • Measurement or design bias: Features of the survey design or method influence the data obtained. Example: A respondent may answer questions in the way she thinks the questioner wants her to answer.