Calculate the standard error.
Determine confidence intervals.
Understand the difference between dependent and independent variables.
Understand the steps for hypothesis testing.
A population is the entire group of items, people, or events of interest. Due to practical limitations, it’s often impossible to study the entire population. A sample, which is a subset of the population, is used to make inferences about the population.
A sample needs to be of a sufficient size and randomly selected to accurately represent the population (per the central limit theorem).
The central limit theorem states that, under certain conditions, the sum (or mean) of a large number of random variables is approximately normally distributed, and the spread of the distribution of the mean decreases as the sample size grows.
In this example, we know what the original distribution is, but when we do research (and we sample from the population), we don’t know the true mean of the distribution. In this example, we know that \(\mu\) equals 3.5.
However, we can estimate by how much our observed mean will differ from the population mean (\(\mu\)). To do this, we take the average of our sample (the sum of the observations divided by the sample size) repeatedly, which results in a new distribution of means with its own \(\mu\) and standard deviation. The standard deviation of the distribution of the mean is calculated as \(\frac{\sigma}{\sqrt n}\), where \(\sigma\) is the standard deviation of rolling one die. Note that the larger the sample size, the smaller this standard deviation is going to be.
If this was not clear, you can watch: https://www.youtube.com/watch?v=zeJD6dqJ5lo&ab_channel=3Blue1Brown
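As a sketch (not from the original notes), we can simulate the dice example in R; the sample size of 30 rolls and the number of repetitions are assumptions chosen for illustration:

```r
# Sketch: simulate the dice example. One die is uniform on 1..6 with mu = 3.5;
# sigma is the standard deviation of a single roll (~1.71).
set.seed(42)                          # for reproducibility
mu    <- 3.5
sigma <- sqrt(mean((1:6 - mu)^2))     # population sd of one die
n     <- 30                           # rolls averaged per sample (assumed)
reps  <- 10000                        # number of samples drawn (assumed)

sample_means <- replicate(reps, mean(sample(1:6, n, replace = TRUE)))

mean(sample_means)                    # close to mu = 3.5
sd(sample_means)                      # close to sigma / sqrt(n), ~0.31
```

The mean of the simulated sampling distribution sits near \(\mu = 3.5\), and its spread shrinks toward \(\frac{\sigma}{\sqrt n}\), as the theorem predicts.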
Implications of the central limit theorem:
The mean of the distribution of sample means is identical to the mean of the “parent population,” the population from which the samples are drawn.
The variance of the sampling distribution is equal to the population variance divided by the sample size.
The distribution of the sample mean is approximately normal. This is the basis for statistical inference for means.
Recall that the variance of the distribution of the sample mean is equal to the population variance divided by the sample size.
Standard error: The standard error of the mean, or simply standard error, indicates how different the population mean is likely to be from a sample mean.
Formula:
\(\text{standard error} = \frac{\sigma}{ \sqrt n}\)
\(\sigma\): the population standard deviation. If the population standard deviation is not known, you can substitute the sample standard deviation, s, in the numerator to approximate the standard error.
\(\sqrt n\): the square root of the sample size
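The formula translates directly into a small helper function; this is a minimal sketch, and the grades vector is a made-up example:

```r
# Sketch: standard error of the mean from a sample, substituting the sample
# sd for sigma (sigma is usually unknown in practice).
standard_error <- function(x) sd(x) / sqrt(length(x))

grades <- c(6, 7, 5, 8, 6, 7, 9, 5, 6, 7)   # hypothetical grades (1-10)
standard_error(grades)                       # 0.4
```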
Imagine that we are sampling grades (1 to 10) from a class with 100 students.
## [1] 3.162278
## [1] 0.6324555
Therefore, the standard error for our grades is 0.63.
Now, notice that the larger the sample size, the more accurately we are estimating our population parameter.
## [1] 4.472136
## [1] 0.4474273
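The code that produced the numbers above is not shown; as a hedged sketch, here is one way the comparison might be run (the class of 100 students and the particular sample sizes are assumptions for illustration):

```r
# Sketch: draw grade samples of increasing size from a hypothetical class
# of 100 and watch the standard error sd(sample) / sqrt(n) shrink.
set.seed(1)
class_grades <- sample(1:10, 100, replace = TRUE)  # hypothetical class

ns  <- c(10, 25, 50, 100)
ses <- sapply(ns, function(n) {
  s <- sample(class_grades, n)
  sd(s) / sqrt(n)                  # standard error for this sample size
})
ses                                 # decreases as n grows
```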
Remember that the standard error is calculated from the standard deviation of our sample and the sample size (the more observations, the closer the mean of our sampling distribution will be to the \(\mu\) of the population).
A confidence interval, in statistics, refers to a range of values that is likely to contain a population parameter with a certain level of confidence. In other words, a confidence interval is the mean of your estimate plus and minus the variation in that estimate.
For example, if you construct a confidence interval with a 95% confidence level, you are confident that 95 out of 100 times the estimate will fall between the upper and lower values specified by the confidence interval.
To calculate a confidence interval you need to know: the point estimate, the standard error, the confidence level, and the critical value.
Imagine that you are a speech-language pathologist and you are assessing a 2-year-old child who produces words with a mean of 3.2 syllables per word. You want to know the percentage of children who do better than the child you are assessing.
This is a vector with the means of 30 children:
means_syllables_child = c(3.2,3,4.5,3,2,2,2,2.8,3.2,2.5,2,3,4.5,3,2,2,2,2.8,3.2,2.5,2,3,4.5,3,2,2,2,2.8,3.2,2.5)
What is the mean of the samples means?
## [1] 2.74
And the standard deviation?
## [1] 0.7600363
Point estimate: The point estimate of your confidence interval is the statistic you are computing. In this case, the point estimate will be the mean.
From our previous example: since the mean is 2.74 syllables, our point estimate is 2.74.
Standard error: The standard error helps us identify how much variation there is in our sampling distribution.
## [1] 0.1387564
## [1] 0.138763
From our previous example, the standard error is 0.14.
Our mean estimate plus and minus the standard error is 2.74 ± 0.14.
Confidence level: The probability that the confidence interval includes the true mean value within a population is called the confidence level of the confidence interval. You can decide your confidence level.
In short, setting your confidence level to 95% indicates that you are creating a range of values that you can be 95% confident contains the true mean of the population.
Critical value: Cut-off values from the normal distribution that define regions where the mean is unlikely to lie.
If we have a normal distribution, we can use z-scores. If our confidence level is 95%, we need to cut off 5% of the distribution (2.5% in each side).
We take the z-score of 1.96, which encompasses 95% of all sample means: there is a 95% probability of observing a z-score between −1.96 and 1.96.
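The 1.96 cut-off can be recovered from the normal distribution directly with `qnorm()`:

```r
# For a 95% confidence level we cut 2.5% from each tail of the
# standard normal distribution.
alpha  <- 0.05
z_crit <- qnorm(1 - alpha / 2)   # upper critical value
z_crit                            # 1.959964
```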
CI_low <- 2.74 - (1.96*0.14)
CI_high <- 2.74 + (1.96*0.14)
print(paste('95% CI :', CI_low, '-', CI_high, 'syllables'))
## [1] "95% CI : 2.4656 - 3.0144 syllables"
Note: Be careful because confidence levels are not the same as confidence intervals
The confidence level is the percentage of times you expect to get close to the same estimate if you run your experiment again or resample the population in the same way.
The confidence interval consists of the upper and lower bounds of the estimate you expect to find at a given level of confidence.
Null hypothesis: there is no difference between the values of the means in the populations from which the samples were drawn (the two samples belong to the same population)
\(H_0 : \mu = \mu_0\)
Alternative hypothesis: there is a difference between the values of the means in the populations from which the samples were drawn (the two samples belong to two populations)
\(H_A : \mu \neq \mu_0\)
Such a hypothesis test is called a two-sided test.
If the hypothesis test is about deciding whether a population mean, \(\mu\), is less than a specified value \(\mu_0\), the alternative hypothesis is expressed as
\(H_A : \mu < \mu_0\)
If the hypothesis test is about deciding whether a population mean, \(\mu\), is greater than a specified value \(\mu_0\), the alternative hypothesis is expressed as
\(H_A : \mu > \mu_0\)
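The three forms of the alternative hypothesis map directly onto the `alternative` argument of R's `t.test()`. As a sketch, with a made-up sample and a made-up reference value \(\mu_0 = 5\):

```r
# Sketch: one-sample t-tests for the three alternative hypotheses.
set.seed(7)
x   <- rnorm(25, mean = 5.4, sd = 1)   # hypothetical sample
mu0 <- 5                               # hypothetical reference value

t.test(x, mu = mu0, alternative = "two.sided")$p.value  # H_A: mu != mu0
t.test(x, mu = mu0, alternative = "less")$p.value       # H_A: mu <  mu0
t.test(x, mu = mu0, alternative = "greater")$p.value    # H_A: mu >  mu0
```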
Important: when we perform hypothesis testing, we are looking for evidence to reject the null hypothesis.
Calculate a test statistic to find the probability of the observed results under the assumption that the null hypothesis is true. The test statistic is a number calculated from the data set, which is obtained by measurements and observations or, more generally, by sampling.
\(H_0 : \mu = \mu_0\)
\(H_A : \mu < \mu_0\)
We have to accept that there will always be a chance that the differences we are observing are due to chance (sampling differences) and not to a true difference brought about by the independent variable.
We set a probability for our observed results to occur under the null hypothesis.
A significance level of 5% means that there is a 5% probability that our observed difference is a result of chance (different sampling). Researchers in the social sciences are normally comfortable with a 5% probability of having found their observed results by chance.
\(p < 0.05\)
A test statistic is a value describing the extent to which the research results differ from what is expected under the null hypothesis. The test statistic helps you determine whether to support or reject the null hypothesis in your study: you use it to calculate the p-value of your results.
Two-tailed test
Learning in an immersion setting will result in different scores than learning in a classroom setting (it could be higher or lower).
\(H_A : \mu \neq \mu_0\)
We set the confidence level to 95% (learning in an immersion setting will result in DIFFERENT scores than learning in a classroom setting 95% of the time)
\(\alpha\) (p-value) is the significant level 0.05 (5%) and cuts off the two tails of the distribution, because the test statistic could have either positive or negative values. The critical region cuts an area of \(\alpha\)/2
1.96 and −1.96 are our critical values. If our test statistic is below −1.96 or above 1.96, we REJECT the null hypothesis with 95% confidence: the difference between the two means is not likely to be due to chance at a significance level of 5%.
One-tailed test
Learning French in a classroom setting will result in lower scores than learning French in an immersion setting.
\(H_A : \mu < \mu_0\)
We set the confidence level to 95% (learning in a classroom setting will result in lower scores than learning in an immersion setting 95% of the time)
\(\alpha\) (the significance level, 0.05 or 5%) cuts off only one tail of the distribution, because the test statistic is expected to fall on one particular side. The critical region cuts an area of \(\alpha\)
The null hypothesis is rejected if the test statistic is too small.
When we reject or fail to reject the null hypothesis, we do so using a significance level (5%), which means that there is still some chance of rejecting the null hypothesis when it is actually true, or of accepting the null hypothesis when it is actually false.
Type I error: The null hypothesis is rejected when it is actually true (false positive)
Type II error: The null hypothesis is not rejected when it is actually false (false negative)
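The Type I error rate can be made concrete with a small simulation (a sketch; sample sizes and number of repetitions are arbitrary choices): when the null hypothesis is true, about 5% of tests will still come out "significant" at \(\alpha = 0.05\).

```r
# Sketch: simulate the Type I error rate. Both samples come from the SAME
# population, so H0 is true, yet ~5% of t-tests reject it at alpha = .05.
set.seed(123)
alpha <- 0.05
p_values <- replicate(5000, {
  a <- rnorm(20)                 # group 1, same population as group 2
  b <- rnorm(20)
  t.test(a, b)$p.value
})
mean(p_values < alpha)           # close to 0.05
```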
Factors to decide which test to use:
Level of measurement concerned
Characteristics of the frequency distribution
Type of design used for the study
Parametric tests: for ratio or interval levels of measurement.
Assumptions of parametric tests:
Data are normally distributed.
For some parametric tests, the populations must also have equal variances.
Example: Difference between the mean scores of a group of students that learned statistics for 10 hours a week during 2 weeks, and a group of students that learned statistics for 2 hours a week during 20 weeks.
Non-parametric tests: for ranking, ordinal variables, and numeric variables that are not normally distributed.
Non-parametric tests are less powerful than parametric tests.
Example: Participants rate whether speech produced with a mask is intelligible (Likert scale from 1 to 7), and this is compared to speech produced without a mask. Difference between the means of the intelligibility judgment task.
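For ordinal data like Likert ratings, a non-parametric test such as the Wilcoxon rank-sum test can be used instead of a t-test. A sketch with made-up ratings:

```r
# Sketch: hypothetical intelligibility ratings (Likert 1-7).
mask    <- c(3, 4, 2, 5, 3, 4, 3, 2, 4, 3)
no_mask <- c(5, 6, 5, 7, 6, 5, 6, 4, 6, 5)

# Wilcoxon rank-sum test; exact = FALSE because Likert data contain ties.
wilcox.test(mask, no_mask, exact = FALSE)$p.value   # small p: ratings differ
```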
Correlated samples: for repeated measures designs
Example: one group of speakers exposed to both speech produced with masks and speech produced without masks.
Independent samples: two unrelated populations
Example: one group of speakers exposed to speech produced with masks and another group of speakers exposed to speech produced without masks.
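The two designs map onto the `paired` argument of `t.test()`. A sketch with hypothetical scores (the paired test is typically more sensitive because it removes between-listener variation):

```r
# Correlated samples: the SAME listeners rate both conditions -> paired = TRUE.
listener_mask    <- c(62, 70, 58, 66, 71, 60, 64, 69)   # hypothetical scores
listener_no_mask <- c(68, 75, 61, 70, 78, 66, 70, 74)
t.test(listener_mask, listener_no_mask, paired = TRUE)

# Independent samples: two DIFFERENT groups -> paired = FALSE (the default).
group_a <- c(62, 70, 58, 66, 71, 60, 64, 69)
group_b <- c(68, 75, 61, 70, 78, 66, 70, 74)
t.test(group_a, group_b)
```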