The Missing Links

Current Gender Skew

As I mentioned in a previous post, I’ve been staggered by the success of the Big Running Survey so far. After only 11 days of data collection there are already well over a thousand responses, with more coming through by the hour. With each respondent answering about a hundred questions, the volume of data I will be working with is fantastic: well over 100,000 data points.

This bodes extremely well for making some very confident assertions about the nature of the British running community, the ways it can be segmented and the different ways in which social groups participate in the sport.

If the sample (the people who completed the survey) were a completely random selection of British runners, we could say with 99% confidence (in statistical terms) that it represents in miniature the entire population of British runners, accurate to within 3%. That would be very impressive indeed.
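For anyone curious where figures like "99% confidence, within 3%" come from, here is a minimal sketch of the standard margin-of-error calculation for a proportion. It assumes a simple random sample and the worst-case split of 50/50 on any given question. The function names are my own for illustration, and since the exact number of responses isn't fixed yet, the sample sizes used are only examples.

```python
import math

# Two-sided z-scores (standard normal quantiles) for common confidence levels.
Z_SCORES = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}


def margin_of_error(n, confidence=0.99, p=0.5):
    """Margin of error for a proportion estimated from a simple random sample.

    n          -- number of respondents
    confidence -- confidence level (must be a key of Z_SCORES)
    p          -- assumed true proportion; 0.5 is the worst case
    """
    z = Z_SCORES[confidence]
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin, confidence=0.99, p=0.5):
    """Smallest random sample that achieves the given margin of error."""
    z = Z_SCORES[confidence]
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # Illustrative sample sizes only, not the survey's actual response count.
    for n in (1000, 1500, 2000):
        print(n, round(margin_of_error(n) * 100, 1), "% margin at 99% confidence")
    print("Responses needed for 3% at 99%:", required_sample_size(0.03))
```

Note that the formula only holds for a genuinely random sample, and the margin shrinks with the square root of the sample size, so doubling the responses does not halve the margin.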

However, there is one fly in the ointment: sampling bias.

Because we weren’t selecting people at random, but were instead inviting runners through various social media channels, running clubs and so on, our coverage is uneven. There are, for example, many more women than men in the sample. Club members appear to be over-represented compared to non-members. There are very few runners over sixty. These discrepancies are biases introduced by the way the data was collected, for instance:

  • Social media users are on average younger than the general population
  • Women tend to be more keen to participate in surveys than men
  • Our tweets were picked up by an influential women’s running group, but not by its male equivalent
  • Email addresses for clubs are available on the internet, allowing us to reach club members, but no equivalent channel exists for lone runners or non-club members
  • Athletic (i.e. track and field) clubs appear to have a more formal structure than many casual running clubs, so they appear more bureaucratic and more resistant to requests for help

For these reasons and more, we cannot assume that the data accurately represents the running community as a whole. Then again, it is very rare that any survey can claim to do so with true confidence.

The good news is that this doesn’t hinder us significantly, especially given the volume of data. It may be that one group (let’s say sprinters) is under-represented, but there are still enough of them in the survey data to form a decent-sized group that we can use to describe their approach to the sport and their demographic profile compared to, say, road runners. The only thing we won’t be able to say with certainty is how big that group is across the UK.

Fortunately we can cross-reference this small group in our data with its counterpart in Sport England’s massive Active People Survey (APS) to find out how big the group really is. APS can give us the sheer size of the group, and our survey can provide rich detail about motivations, practices and opinions. In combination we can generate a pretty complete picture of the British running scene.
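One common way to combine a skewed sample with known population sizes is post-stratification weighting: each group's responses are scaled by the ratio of its population share to its sample share. This is a sketch of that general technique, not a description of the method we will necessarily use, and every number in it is made up for illustration. None of these figures come from our survey or from APS.

```python
def poststratification_weights(sample_counts, population_shares):
    """Return a weight per group: population share divided by sample share.

    sample_counts     -- dict of group name -> number of respondents
    population_shares -- dict of group name -> share of the real population
                         (shares should sum to 1)
    """
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        weights[group] = population_shares[group] / sample_share
    return weights


# Hypothetical figures for illustration only.
sample_counts = {"women": 650, "men": 350}
population_shares = {"women": 0.45, "men": 0.55}

weights = poststratification_weights(sample_counts, population_shares)

if __name__ == "__main__":
    for group, weight in weights.items():
        print(group, round(weight, 2))
```

Over-represented groups get a weight below 1 and under-represented groups a weight above 1, so weighted totals match the population split while the detail inside each group is left untouched.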

But before we get there I intend to fill in some of the gaps. In a couple of weeks we will do a preliminary analysis of the data to identify any blind spots. That will give us some key groups to focus on for further data collection, and we’ll follow up with precise, targeted efforts to build the data in those areas.

In the meantime, if you know any male runners, runners who are not involved in any kind of club, or members of track and field clubs, please let them know… we need them!

Link to survey:
