Difference between numbers in a series of arrays

In this case a comparison is made between the answers given by people with different sets of characteristics to a particular question. Characteristics of the people include demographics, such as geography, household size, the presence of children, income, sex and age. These are cross-tabulated against such things as awareness or the use of a product category, brand and purchase intent.

EXAMPLE

Question: Are you aware of brand X washing machines? Table 10.1 shows the results.

| | Yes | No | Total |
|---|---|---|---|
| Men | 50 | 200 | 250 |
| Women | 150 | 300 | 450 |
| Total | 200 | 500 | 700 |

Simple inspection of the data would seem to suggest there is a difference between men and women in terms of awareness level. However, we really want to know if the difference observable in cross-tabulating the data is statistically significant: in other words, what the probability is that observed differences could have occurred by chance. In this particular instance we can use the chi-square test to see if this is the case (we should, however, consult standard statistical texts to appreciate the suitability of this test and to understand its theoretical underpinning).

The chi-square statistic is calculated by the formula:

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ = the observed value and $E_i$ = the theoretically expected value, assuming in this case that there is no difference between men and women in awareness of the brand (see Table 10.2).

We hypothesise that there is no relationship between sex and awareness (the null hypothesis). However, the critical value of $\chi^2$ at 1 degree of freedom at the 5% level is 3.84, and because the calculated value of 13.5 exceeds 3.84, the difference noted in the sample is statistically significant. We must reject the null hypothesis and conclude that the difference in levels of awareness between men and women is unlikely to have occurred by chance.

| | Yes | No | Total |
|---|---|---|---|
| Men (O = observed) | 50 | 200 | 250 |
| Men (E = expected) | (250 x 200)/700 = 71 | (250 x 500)/700 = 179 | |
| Women (O = observed) | 150 | 300 | 450 |
| Women (E = expected) | (450 x 200)/700 = 129 | (450 x 500)/700 = 321 | |
| Total | 200 | 500 | 700 |

Note: Expected values are rounded off to the nearest whole number.

$$\chi^2 = \frac{(50 - 71)^2}{71} + \frac{(200 - 179)^2}{179} + \frac{(150 - 129)^2}{129} + \frac{(300 - 321)^2}{321} = 13.5$$

with degrees of freedom $(r - 1)(k - 1) = 1$

where r = number of rows, k = number of columns
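The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The function and variable names are illustrative and not part of the original text. It computes the statistic twice, once with expected values rounded to whole numbers (as in the worked example) and once with unrounded expected values.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Rows: men, women. Columns: yes, no (the brand X awareness table).
observed = [[50, 200], [150, 300]]
expected = expected_counts(observed)
rounded_expected = [[round(e) for e in row] for row in expected]

chi2_rounded = chi_square(observed, rounded_expected)  # whole-number E, as in the text
chi2_exact = chi_square(observed, expected)            # unrounded E
dof = (len(observed) - 1) * (len(observed[0]) - 1)     # (r - 1)(k - 1)

CRITICAL_5PCT_1DF = 3.84  # tabulated chi-square value, 5% level, 1 degree of freedom

print(f"chi-square (rounded E) = {chi2_rounded:.1f}")
print(f"chi-square (exact E)   = {chi2_exact:.1f}")
print(f"degrees of freedom     = {dof}")
print("significant at 5%:", chi2_rounded > CRITICAL_5PCT_1DF)
```

Rounding the expected values shifts the statistic slightly (about 13.5 against 14.0 without rounding), but both values are well above 3.84, so the conclusion is unchanged.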

With a 2 x 2 table such as the one in the example, it is usually preferable to use Fisher's exact test or to include Yates' correction where there is a small number of cases. In the latter case the formula is modified to:

$$\chi^2 = \sum_i \frac{(|O_i - E_i| - 0.5)^2}{E_i}$$

Where the sample size is so small that an expected value is less than 5, chi-square should not be used.
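As an illustration of these alternatives, the sketch below applies Yates' correction to the brand X awareness table and runs Fisher's exact test on a small table. It rests on several assumptions. The small table `[[3, 1], [1, 3]]` is hypothetical and not data from the text. The function names are illustrative. The two-sided p-value follows one common convention (summing the probabilities of all tables no more likely than the observed one); other conventions exist.

```python
from math import comb


def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def yates_chi_square(observed):
    """Chi-square with Yates' correction: sum of (|O - E| - 0.5)^2 / E."""
    expected = expected_counts(observed)
    return sum(
        (abs(o - e) - 0.5) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


def smallest_expected(observed):
    """Smallest expected cell count, for the 'less than 5' rule of thumb."""
    return min(e for row in expected_counts(observed) for e in row)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(low, high + 1))
               if p <= p_observed * (1 + 1e-9))


awareness = [[50, 200], [150, 300]]  # men/women by yes/no, from the example
print(f"Yates-corrected chi-square = {yates_chi_square(awareness):.2f}")

small_table = [[3, 1], [1, 3]]  # hypothetical small sample, for illustration only
print(f"smallest expected count = {smallest_expected(small_table):.1f}")
print(f"Fisher exact p-value    = {fisher_exact_two_sided(3, 1, 1, 3):.4f}")
```

For the awareness table the correction makes little difference because the sample is large. In the hypothetical small table every expected count is 2, below the threshold of 5, so the exact test is the appropriate choice there.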

Chi-square is a statistical tool used to evaluate the statistical significance of differences between sets of data. Basically, it compares one or more frequency distributions of data to indicate whether there is a real difference. It compares an actual set of data against a theoretical one to show what would be expected by chance alone.

EXAMPLE

Chi-square

A market researcher has completed a study of soft drinks. The following table shows the brand purchased most often, broken down by male versus female. The researcher wants to know if there is a relationship between the gender of the purchaser and the brand purchased.

| Drink | Male | Female |
|---|---|---|
| Coke | 66 | 52 |
| Pepsi | 67 | 48 |
| 7-Up | 35 | 38 |
| Dr Pepper | 34 | 21 |
| Sprite | 32 | 41 |
| Lilt | 33 | 31 |
| Blackjack | 18 | 25 |
| Tango | 34 | 24 |

Chi-square 'male' 'female'

Expected counts are printed below observed counts:

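| | Male | Female | Total |
|---|---|---|---|
| 1 | 66 | 52 | 118 |
| | 62.84 | 55.16 | |
| 2 | 67 | 48 | 115 |
| | 61.24 | 53.76 | |
| 3 | 35 | 38 | 73 |
| | 38.88 | 34.12 | |
| 4 | 34 | 21 | 55 |
| | 29.29 | 25.71 | |
| 5 | 32 | 41 | 73 |
| | 38.88 | 34.12 | |
| 6 | 33 | 31 | 64 |
| | 34.08 | 29.92 | |
| 7 | 18 | 25 | 43 |
| | 22.90 | 20.10 | |
| 8 | 34 | 24 | 58 |
| | 30.89 | 27.11 | |
| Total | 319 | 280 | 599 |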

Use of similarities between numbers to show cause and effect

$$\chi^2 = 0.159 + 0.181 + 0.541 + 0.616 + 0.387 + 0.440 + 0.757 + 0.863 + 1.216 + 1.386 + 0.034 + 0.039 + 1.048 + 1.194 + 0.314 + 0.357 = 9.533$$

The tabular value of $\chi^2$ at the 0.05 level of significance and (8 - 1)(2 - 1) = 7 degrees of freedom is 14.07. The calculated value (9.533) is lower than the tabular value (14.07), so the null hypothesis of no relationship cannot be rejected. There would not seem to be a significant relationship between the gender of the purchaser and the brand purchased.
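The soft drinks calculation can be reproduced in the same way. The following is a minimal sketch in plain Python, with illustrative names, that recomputes the expected counts and the chi-square statistic from the observed purchases and compares the result with the tabular value quoted above.

```python
# Observed purchases (male, female) for each brand, from the soft drinks table.
observed = {
    "Coke": (66, 52),
    "Pepsi": (67, 48),
    "7-Up": (35, 38),
    "Dr Pepper": (34, 21),
    "Sprite": (32, 41),
    "Lilt": (33, 31),
    "Blackjack": (18, 25),
    "Tango": (34, 24),
}

male_total = sum(m for m, _ in observed.values())
female_total = sum(f for _, f in observed.values())
grand_total = male_total + female_total

chi2 = 0.0
for brand, (male, female) in observed.items():
    row_total = male + female
    # Expected count = row total * column total / grand total.
    exp_male = row_total * male_total / grand_total
    exp_female = row_total * female_total / grand_total
    chi2 += (male - exp_male) ** 2 / exp_male
    chi2 += (female - exp_female) ** 2 / exp_female
    print(f"{brand:<10} expected: {exp_male:6.2f} {exp_female:6.2f}")

dof = (len(observed) - 1) * (2 - 1)  # (rows - 1)(columns - 1)
CRITICAL_5PCT_7DF = 14.07            # tabular value quoted in the text

print(f"chi-square = {chi2:.3f} with {dof} degrees of freedom")
print("significant at 5%:", chi2 > CRITICAL_5PCT_7DF)
```

Because the statistic falls below the tabular value, the script reports no significant relationship, which matches the conclusion drawn in the text.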
