Statistical precision

The size of a sample affects the quality of the research data, and deciding on it is not simply a question of applying some arbitrary percentage to a specific population. The sample size should reflect the basic characteristics of the population, the type of information required and the costs entailed. The larger the sample, the greater its precision or reliability, but practical constraints of time, staff and other costs intervene.

When computing the size of a sample, the expected non-response rate should be borne in mind. If, for example, a final sample of 3500 is planned and the non-response is estimated at 30%, then it would be advisable to increase the initial sample to 5000 (3500 divided by the 70% expected to respond). Such a correction will help to obtain the number of responses required, but it will not correct the bias that arises from non-response errors - the fact that those who do not respond may have significantly different opinions from those who do respond.
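As a rough sketch of that arithmetic (the 3500, 30% and 5000 figures are the ones used in the example; the Python helper and its name are ours), the initial sample is simply the target number of responses divided by the expected response rate:

```python
import math

def inflate_for_nonresponse(target_responses: int, nonresponse_rate: float) -> int:
    """Number of people to approach so that, after the expected non-response,
    roughly `target_responses` completed interviews remain (illustrative helper)."""
    return math.ceil(target_responses / (1.0 - nonresponse_rate))

# Example from the text: 3500 responses wanted, 30% non-response expected,
# so approach 3500 / 0.7 = 5000 people.
print(inflate_for_nonresponse(3500, 0.30))  # -> 5000
```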

The error of a sample varies inversely with the square root of the sample size: a sample of 9000, for example, is only about three times as precise as a sample of 1000, not nine times. After a certain sample size has been reached, further large increases in size do not significantly improve the statistical precision of a given sample. Costs, however, certainly do increase with larger samples.

The precision of the survey is a function of the sample size (although see Chisnall6). Precision is related to the square root of the number in the sample. That is, the precision of results increases in proportion to the square root of the sample size. If the sample size is doubled, say from 500 to 1000, precision rises only by the square root of 2, a factor of about 1.4, not by 2. While technically the rationale applies only to probability sampling, it is usually used as a guide in planning the size of non-probability samples as well. By sampling precision we do not mean how accurate the results are, for accuracy depends on too many factors: how good the questions are, whether or not the interviewer (if there is one) affects the results, and so on. What is referred to here is the degree to which the results of two simultaneous, identically conducted studies would parallel one another.
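The square-root relationship can be verified with a short calculation. The sketch below assumes simple random sampling and uses the standard error of a percentage, the square root of p(1 - p)/n; the sample sizes echo the examples above, and the 50% proportion is the worst case:

```python
import math

def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion p under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)

p = 0.5  # worst-case proportion
for n in (500, 1000, 9000):
    print(n, round(standard_error(p, n) * 100, 2), "percentage points")

# Output: 500 -> 2.24, 1000 -> 1.58, 9000 -> 0.53 percentage points.
# Doubling the sample from 500 to 1000 improves precision only by sqrt(2) ~ 1.4;
# going from 1000 to 9000 improves it by a factor of 3, not 9.
```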

Sampling error Sampling error is the difference between a survey result and the corresponding parameter (the true value in the universe) - that is, the difference between the true value of a population parameter and the value estimated from a sample. The error occurs because the value has been calculated from a sample rather than from the whole parent population.

Confidence interval The confidence interval is an interval within which a parameter of a parent population is calculated (on the basis of sample data) to have a stated probability of lying. Generally, the researcher settles for a 95% probability - meaning that there are 95 chances in 100 that the reported figure lies within a stated range of the true population value.

For simplicity, and because it answers almost all needs, the discussion here is limited to percentages; a confidence interval can, however, equally be expressed in terms of means. The confidence interval can be calculated only when the study has used probability sampling to obtain the results.
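A minimal sketch of the calculation, assuming a simple random sample and the usual normal approximation for a percentage (the 52% result and the sample of 1000 are illustrative figures, not from the text):

```python
import math

def confidence_interval(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% confidence interval for a sample proportion p (normal approximation,
    simple random sample of size n)."""
    se = math.sqrt(p * (1.0 - p) / n)
    return p - z * se, p + z * se

# Hypothetical result: 52% of 1000 respondents agree with a statement.
low, high = confidence_interval(0.52, 1000)
print(f"95% CI: {low:.1%} to {high:.1%}")  # roughly 48.9% to 55.1%
```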

Overall precision required Once the overall precision (range of the confidence interval) has been defined for the particular study, it is possible to make some tentative decisions on sample size. We say tentative only because there are still other aspects to be considered. In principle this approach can be used only with probability sampling, since there is no way of knowing whether a non-probability sample produces a sample similar to one based on selection probabilities. Practically, however, it is often used as a guide in determining the sample size of non-probability samples. As the sample size is quadrupled the confidence interval is cut in two. The confidence interval decreases as the expected percentage of results moves away from 50% (either plus or minus). This has some practical implications. First, increasing the required accuracy places a heavy premium on the price of the survey because of the effect it has on sample size requirements. Second, if the expected result cannot be predicted, 50% is the correct percentage to use. That is where the greatest error margin is experienced. So if results cannot be predicted in advance, sample size planning should be on the conservative side: in this case, the larger, more expensive side.
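These trade-offs can be made concrete with the standard sample-size formula for a proportion, n = z^2 p(1 - p) / e^2, where e is the required margin of error. The sketch below assumes simple random sampling and a 95% confidence level; the margins of plus or minus 4 and 2 percentage points are illustrative choices, not figures from the text:

```python
import math

def required_sample_size(margin: float, p: float = 0.5, z: float = 1.96) -> int:
    """Sample size needed for a 95% confidence interval of +/- `margin` around
    an expected proportion p, assuming simple random sampling.
    p = 0.5 is the conservative (worst-case) planning assumption."""
    return math.ceil(z ** 2 * p * (1.0 - p) / margin ** 2)

print(required_sample_size(0.04))          # +/- 4 points -> about 600
print(required_sample_size(0.02))          # +/- 2 points -> about 2400 (4x the sample)
print(required_sample_size(0.02, p=0.10))  # expected result near 10% -> about 865
```

Halving the margin of error roughly quadruples the required sample, and an expected result far from 50% needs fewer respondents for the same margin, which is why 50% is the conservative planning assumption when the outcome cannot be predicted.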
