Increasingly, as database people, we are being asked to squeeze business insights out of survey data. Are surveys that easy? Heck, no. Are they accurate? It depends. Generally, you'd be better off staring at tea-leaves if you want to predict behaviour, attitudes and opinions.
Surveys are an aspect of the scientific method that looks easy, in much the same way that dressing up in a white coat, looking thoughtful, asking people to say 'Ah', and writing prescriptions illegibly seems like something that anyone could do.
Surveys aim to draw general conclusions about the entire population from just a sample of it. This is perfectly legitimate, but to do this, each person or thing must be selected randomly, so that the sample as a whole is an accurate representation of the population. The better you do it, the lower the sampling error, but however hard you try, a sample selected randomly from a population will almost never be exactly the same as the entire population.
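To make sampling error a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the made-up population of 100,000 people of whom 40% hold some opinion, the sample sizes, the number of trials, and the helper name `mean_sampling_error`. It draws repeated random samples and measures how far, on average, each sample's answer lands from the true figure.

```python
import random

def mean_sampling_error(population, sample_size, trials, rng):
    """Average absolute gap between the sample share and the true share."""
    true_share = sum(population) / len(population)
    total_error = 0.0
    for _ in range(trials):
        # Simple random sample without replacement
        sample = rng.sample(population, sample_size)
        total_error += abs(sum(sample) / sample_size - true_share)
    return total_error / trials

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = holds the opinion, 0 = does not (40% do)
population = [1] * 40_000 + [0] * 60_000

small = mean_sampling_error(population, 50, 200, rng)
large = mean_sampling_error(population, 5_000, 200, rng)

print(f"average error with samples of 50:    {small:.4f}")
print(f"average error with samples of 5,000: {large:.4f}")
```

The larger samples should land much closer to the true 40%, but the error does not reach zero, and this only holds because the sample really was drawn at random from the whole population.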
Even those who are trained in psychometrics and statistics blanch in terror at the prospect of conducting an accurate survey, because they are so easy to get wrong. Surveys also suffer from bias, caused by the way the sampling was done or the measurements taken. There are so many ways of unintentionally manipulating information from surveys that the unwary can get very badly misled by them.
Even if you are engaged in biological research or doing estimation, the gathering of data needs care and training. If it is about human behaviour or opinions, there is a minefield to get through. The idea that anything can be gained from randomly contacting people and asking questions, or even worse, trying to get them to fill in online forms, is wrong. Leave it to the Sunday magazines, along with horoscopes and fashion tips.
In a sense, a skewed sample distribution tells you more than no sample at all. For example, a survey conducted on Twitter will at least tell you whether people who use Twitter a lot like or dislike a policy or product. However, it would be foolish to extrapolate from those findings to make predictions about the general population, which is likely to contain many clusters of contrary opinion.
There are magic tricks we can try in order to 'clean' the data. We can combine several surveys, average them up, and apply formulas to correct consistent bias. In science, however, there is no way to heal bad data, and your colleagues will pillory you if you attempt it.
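As a rough illustration of why averaging does not rescue a skewed sample, here is a small Python sketch. The numbers are invented for the purpose: a hypothetical population in which 10% are heavy users of some platform and 70% of them favour a product, while only 30% of everyone else does. The helper `run_survey` and all the sizes are assumptions too. Every 'survey' reaches only the heavy users, and we then average fifty of them.

```python
import random

def run_survey(group, sample_size, rng):
    """Share of 'yes' answers in a random sample drawn from one group."""
    sample = rng.sample(group, sample_size)
    return sum(sample) / sample_size

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical groups: 1 = favours the product, 0 = does not
heavy_users = [1] * 7_000 + [0] * 3_000        # 70% in favour
everyone_else = [1] * 27_000 + [0] * 63_000    # 30% in favour
population = heavy_users + everyone_else
true_share = sum(population) / len(population)

# Fifty separate surveys, all drawn only from the heavy users
biased_results = [run_survey(heavy_users, 500, rng) for _ in range(50)]
pooled = sum(biased_results) / len(biased_results)

print(f"true share in the whole population: {true_share:.2f}")
print(f"average of fifty skewed surveys:    {pooled:.2f}")
```

Pooling the surveys smooths out the random wobble between them, but the result settles on the heavy users' opinion rather than the population's. Correcting it would need knowledge of who was missed and what they think, which is exactly what the surveys never collected.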
So, in the two buttons below, click 'yes' if you have faith in unscientific surveys, and 'no' if you don't. Yeah, just kidding.
Phil Factor.