Why do we run? What the data says

Those of you who took part in the Big Running Survey may remember that it included a lot of questions about your motivations for running, and also a large section asking about the forms of running you took part in.

To non-runners I understand that the running community can look pretty homogeneous – lots of people running around in shorts and t-shirts whatever the weather with slightly uncomfortable expressions on their faces. But of course as runners we know that isn’t the case. Running is an activity that encompasses a range of different sub-cultures and practices. From track sprinting to mountain marathoning, about the only thing they all have in common is running itself.

People run for a wide range of reasons, and these motivations (along with other factors) help to dictate the form of running they choose. So I thought it might be interesting to look at how motivations and ways of running related to each other statistically to start building a more nuanced picture of why we run.

Identifying key motivations

First of all, in order to manage the large number of motivational variables I have combined the scores for motivations that are strongly related to each other and make intuitive sense as clusters. Doing this we end up with these motivational clusters:

Competitive motivation
A combination of scores on motivations such as ‘to get the best possible times’ or ‘to do well in races’.

Psychological motivation
Includes questions like ‘it’s good for my psychological well-being’, ‘to escape my worries’ and ‘to have time to think’.

Aesthetic motivation
Focuses on questions around improving appearance and losing weight.

Social motivation
Combines questions on being motivated by social and community aspects of running.

Environment motivation
Includes motivations around enjoying being outside and interacting with the environment.

These categories don’t take in all the motivations examined in the survey, but these appear to be the most important and distinct categories. Using the data I can now give each runner who participated in the survey a score in each of these five dimensions.
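As a rough illustration of how a cluster score like this could be built, here is a minimal sketch in Python. It assumes each question is scored from 0 to 2 and that a cluster score is simply the mean of the answered questions in that cluster; the question keys and the `cluster_scores` function are hypothetical names, not the survey's actual variables or necessarily the exact method used.

```python
from statistics import mean

# Hypothetical mapping from each motivational cluster to survey question keys.
# The key names are illustrative assumptions, not the survey's real variables.
CLUSTERS = {
    "competitive": ["best_times", "do_well_in_races"],
    "psychological": ["wellbeing", "escape_worries", "time_to_think"],
    "aesthetic": ["improve_appearance", "lose_weight"],
    "social": ["social_side", "community"],
    "environment": ["being_outside", "interact_with_environment"],
}


def cluster_scores(responses):
    """Average a runner's answers within each cluster.

    `responses` maps question keys to scores (assumed here to run 0-2).
    Unanswered questions are skipped; a cluster with no answers gets None.
    """
    scores = {}
    for cluster, questions in CLUSTERS.items():
        answered = [responses[q] for q in questions if q in responses]
        scores[cluster] = mean(answered) if answered else None
    return scores


# One made-up runner who answered only some of the questions.
runner = {
    "best_times": 2,
    "do_well_in_races": 1,
    "wellbeing": 2,
    "escape_worries": 1,
    "time_to_think": 0,
    "lose_weight": 0,
}
result = cluster_scores(runner)
print(result)
```

Taking the mean rather than the sum keeps every cluster on the same 0 to 2 scale, even though the clusters contain different numbers of questions.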

Just as a summary, here are the average motivation levels (score out of 2) for men and women. I won’t comment on the differences we see here for now!

Table 1: Running motivations by gender

Identifying different ways of running

There were a lot of questions about specific details of how runners participate in the sport in the survey, but for now we’re going to stick to looking at the kind of races they like to take part in.

I used a similar process to the one used for the motivations to identify clusters of related forms of running (types of race that were often attended by the same people) and to generate over-arching categories of participation in which each runner could be given a score. They turned out to be:

Track running
Including all track races – whatever the distance.

Road running up to half-marathon
Self-explanatory, I hope!

Marathon and Ultra-marathon
It was difficult to draw a line between this category and road running because there is a high level of correspondence between half marathon and marathon runners. However, this pairing has some distinct characteristics and makes intuitive sense as a separate category.

Fell and trail running
Again, self-explanatory. There is quite a big overlap between this category and ultra-marathon too, but on balance it made sense to keep this pair in their own category.

Obstacle and mud running
A relatively recent phenomenon, this category is very distinct in terms of participation base.

Non-racers
Those who run but do not participate in races.

Again applying the theme of gender differences (which are important in running) we can see the level of participation in the last 12 months for one example of a form of running from each of these categories below:

Table 2: Forms of running participation rates

The above table deserves a couple of comments. First, the number of non-racing runners is probably an underestimate because less engaged runners may have been less likely to participate in the survey. I will address this later in the analysis by drawing on data from a huge survey by Sport England that will help measure the extent of this effect.

And secondly, I think the number of fell-runners and ultra-marathoners is probably a significant overestimate for the general running population. This is because these groups have been especially helpful in disseminating the survey and filling it in (thank you!). So these participation rates should be seen as reflecting the survey sample, not the overall running population.

Putting it together

Now we have scores for each runner in terms of their key motivations and their level of engagement with different forms of running. The next step is to combine these two sets of data to see how the different motivations correspond to each form of running.

In statistical terms I am looking for the degree of correlation between each type of running and each motivational cluster. Conducting this analysis gives us the following results, which I have simplified by giving each motivation a score to show the strength and direction of the relationship:

Table 3: Running motivations by form of sport

A negative score (in red) indicates that the MORE someone is engaged in a particular form of running the LESS likely they are to be motivated by the relevant motivation.

A positive score indicates that MORE engagement with the form of running concerned is connected to a HIGHER level of the motivation.

The value of each score indicates the strength of the relationship.
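For readers who want to see what a correlation of this kind looks like in practice, here is a minimal sketch using the Pearson correlation coefficient. Pearson's r is an assumption on my part as one common choice of measure; the simplified scores in the table are not necessarily raw coefficients. The data below are invented purely to show the two extremes.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Invented data: five runners' engagement with one form of running,
# alongside their scores on two motivational clusters.
engagement = [0, 1, 2, 3, 4]
competitive = [0.0, 0.5, 1.0, 1.5, 2.0]  # rises with engagement
aesthetic = [2.0, 1.5, 1.0, 0.5, 0.0]    # falls with engagement

r_competitive = pearson(engagement, competitive)
r_aesthetic = pearson(engagement, aesthetic)
print(r_competitive, r_aesthetic)
```

A value near +1 means the motivation rises with engagement, a value near -1 means it falls, and a value near 0 means there is little linear relationship either way.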

What does this tell us?

I think the type of runner that stands out most clearly here is the ‘non-racer’. They really don’t seem to enjoy running much at all, and appear to be taking part mainly to lose weight – the stereotypical ‘jogger’.

Mud and obstacle racing (despite the hype that surrounds it) appears to attract those with fairly low levels of all of our motivational variables, and is by far the least competitive form of racing. Interestingly, given all the mud and mess, it’s also the only form of racing to be connected to the motivation to lose weight and look good.

Track racers and shorter distance road runners seem to have a lot in common in terms of their motivations. They are a competitive bunch who also enjoy the social side of the sport – being part of a community of runners.

The really long distance runners and the fell- and trail- runners also share a lot of characteristics, although the ultra/marathoners appear to value the inner experience of running whereas fell-runners favour the experience of engaging with the world around them.

A missing variable

Earlier I touched on the fact that men’s and women’s motivations are quite different (see table 1). This is really important, as it changes the picture a bit when we conduct the same analysis on each group in isolation.

In the next post (next week) I will break these results down to show how each form of running has a subtly different meaning to men and women.


Data Analysis / Perceptions of Talent

Data collection has finally come to an end for the Big Running Survey. I’m pleased to say that we’ve exceeded our targets, obtaining a total of almost 2,500 responses. Thank you so much to everyone who participated.

Over the coming weeks I’m going to be embarking on the data analysis and will publish updates on this blog to let all of those who took part have access to what we’ve found. From beginning to end it will take a few months as I’m simultaneously writing the PhD thesis of which this is a part, so expect a drip-drip of findings rather than a deluge!

I’ve already done a bit of exploratory analysis on the data just to get a feel for where interesting patterns and relationships are likely to appear. I’ve already discovered some quite surprising relationships, and I’m looking forward to exploring them a bit more when we move on to the interviewing stage of the project.

For instance (and in the spirit of International Women’s Day) the little table below tells us something very interesting about one of the differences between men’s and women’s relationship to running.

The table shows the mean scores for runners’ self-perceptions of their running talent. Scores were on a scale of 1 to 7. As you can see I have broken them down by respondents’ gender and by how often they won a medal for running in the last year (not simply for participation, but because they finished high up the field).

Running talent by gender

The medal winning frequency is a rough indicator of how good the respondents really are at running. You would expect frequent medalists to be amongst the most talented runners.

What I find really interesting here is that men are significantly more sure of their natural talent than women. Even those men that don’t medal at all rate themselves as being (on average) only marginally less talented than medalling women, and not much less talented than the top female runners. And men who win the occasional medal rate themselves considerably more talented than the top women’s group.

What does this tell us? Is it a reflection of women’s tendency to underestimate themselves or to want to avoid appearing boastful or competitive? (something that men, perhaps, are less reticent about!) Or is it something to do with the fact that in a mass participation race the average woman will usually finish a little way behind the average man, giving the man the impression he has done better because his finishing place appears to be above average, with the woman having the opposite experience?

At this stage I’m not in a position to say, but more data analysis and the interviews might help to shed some more light on this. It’s important, because whether or not we consider ourselves good at something is one of the factors that determines whether and how much we participate. Could women’s lower self-perceptions of talent be one of the factors that contribute to their lower levels of participation in running in general and in competitive running in particular? (more on this in later articles)

If you have any other thoughts or ideas on how this gender difference could be interpreted I would love to hear them. Please let me know using the comments section below.

This really is just the tip of a rather massive (and intimidating) iceberg in terms of what we’re going to be able to get from the data. Future posts will cover a lot more ground, but I thought it would be worth posting this to mark the start of the results coming through.

If you haven’t done so, please sign up (using the form at the top of the sidebar) to receive regular updates on the research.

The Missing Links

Current Gender Skew

As I mentioned in a previous post, I’ve been staggered at the success of the Big Running Survey so far. Only 11 days of data collection and there are already well over a thousand responses, with more coming through by the hour. With each respondent answering about a hundred questions the volume of data I will be working with is fantastic – far above 100,000 data points.

This bodes extremely well for making some very confident assertions about the nature of the British running community, the ways it can be segmented and the relative ways in which different social groups participate in the sport.

If the sample (the people who completed the survey) was a completely random selection of British runners we would be looking at being able to say we have 99% confidence (in statistical terms) that the sample we have represents in miniature the entire population of British runners, accurate to within 3%. That would be very impressive indeed.
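To show where figures like these come from, here is a minimal sketch of the standard margin-of-error calculation for a proportion. It assumes a simple random sample, uses the most cautious case (a 50/50 split, `p = 0.5`) and ignores any finite population correction; the function names are my own.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_value(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_error(n, confidence=0.99, p=0.5):
    """Margin of error for a proportion estimated from n random respondents."""
    return z_value(confidence) * sqrt(p * (1 - p) / n)


def required_sample_size(margin, confidence=0.99, p=0.5):
    """Smallest random sample that achieves the requested margin of error."""
    z = z_value(confidence)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Margin of error at 99% confidence for a few sample sizes.
for n in (1000, 2000, 2500):
    print(n, round(margin_of_error(n), 4))

# Random sample size needed for +/-3% at 99% confidence.
print(required_sample_size(0.03))
```

The margin shrinks with the square root of the sample size, so doubling the number of responses does not halve the uncertainty.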

However, there is one fly in the ointment: sampling bias.

Because we weren’t selecting people at random, but were instead inviting runners across various social media, via running clubs etc., we have a lot of unevenness in our coverage. There are, for example, many more women than men in the sample. Club members appear to be over-represented compared to non-members. There are very few runners over sixty in the sample. These discrepancies are sampling biases generated by the way the data was collected, for instance:

  • Social media users are on average younger than the general population
  • Women tend to be more keen to participate in surveys than men
  • Our tweets were picked up by an influential women’s running group, but not by its male equivalent
  • Email addresses for clubs are available on the internet, allowing us to reach club members, but no equivalent channel exists for lone runners or non-club members
  • Athletic (i.e. track and field) clubs appear to have a more formal structure than many more casual running clubs, so appear more bureaucratic and resistant to requests to help

For these and other reasons we cannot assume that the data accurately represents the running community as a whole. However, it is very rare that any survey can truly claim to be fully representative.

The good news is that this doesn’t hinder us significantly, especially given the volume of data. It may be that one group (let’s say sprinters) are under-represented, but there are enough of them for us to identify a decent sized group of them in the survey data that we can use to provide some detail of their approach to the sport and demographic profile compared to, say, road runners. The only thing we won’t be able to say with certainty is how big that group is across the UK.

Fortunately we can cross-reference this small group in our data with its counterpart in Sport England’s massive Active People Survey (APS) to find out how big the group really is. APS can give us the sheer size of the group, and our survey can provide rich detail about motivations, practices and opinions. In combination we can generate a pretty complete picture of the British running scene.
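One common way to combine a skewed sample with trusted population figures is post-stratification weighting, where each group's weight is its population share divided by its sample share. The sketch below is only an illustration of that idea, not a statement of the method this project will use; the group counts and population shares are invented, and `poststratification_weights` is a hypothetical name.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share.

    sample_counts: respondents per group in our survey.
    population_shares: each group's share of the real population
    (for example, taken from a large reference survey); should sum to 1.
    """
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no respondents in group {group!r}")
        sample_share = count / total
        weights[group] = population_shares[group] / sample_share
    return weights


# Invented figures: women are over-represented in the sample.
sample_counts = {"women": 600, "men": 400}
population_shares = {"women": 0.45, "men": 0.55}

weights = poststratification_weights(sample_counts, population_shares)
print(weights)

# After weighting, the groups add up to the original sample size.
weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
print(weighted_total)
```

Groups that are over-represented get a weight below 1 and under-represented groups a weight above 1, so weighted averages lean less heavily on whoever happened to answer most.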

But before we get there I intend to try to fill in some of the gaps. In a couple of weeks we will do a preliminary analysis of the data to identify where we have any blind spots. That will give us some key groups to really focus in on in terms of data collection. We’ll follow this up with some precise, targeted efforts to build the data in those areas.

In the meantime, if you know any male runners, runners who are not involved in any kind of club or members of track and field clubs please let them know… we need them!

Link to survey: www.bigrunningsurvey.co.uk

Sport and Social Class – The Rankings

This is a follow-up to an article I published earlier this week that looked at how people’s socioeconomic background (crudely, their ‘class’) was a great predictor of the kinds of sports they got involved in.

Based on a large scale survey of sports participants, running came in just below sports like sailing, yoga and windsurfing, but above cycling, basketball and football in terms of the socioeconomic status of its enthusiasts.

But the data I used for that study was from Belgium, and I only included a handful of sports. So this post is designed to provide an English perspective, as well as much wider coverage in terms of the sports included in the comparison.

I’ve taken data from Sport England’s massive ‘Active People Survey’, an annual sports participation survey of over 160,000 people, and done a bit of number crunching to compile a list of popular sports ranked by their relative popularity to high and low status groups.

More precisely, I generated a ratio of the rate of participation in each sport by high socioeconomic group people to the rate of participation for the low socioeconomic group. Sorry if that sounds confusing, but what it means is:

If a sport gets a score of 2 on the ranking that would mean it is twice as popular with the high status group as it is with the low status group. Or, if a sport gets 0.5 then the likelihood of a high status person participating in the sport is half that of a low status person.

So this is about comparing the appeal of each sport to the two groups, not comparing the total numbers in each group participating.
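Here is a minimal sketch of that ratio in Python. The counts are invented purely to illustrate the arithmetic, and the function name `rate_ratio` is my own.

```python
def rate_ratio(high_participants, high_size, low_participants, low_size):
    """Participation rate of the high status group divided by that of the low."""
    high_rate = high_participants / high_size
    low_rate = low_participants / low_size
    if low_rate == 0:
        raise ValueError("ratio is undefined when the low status rate is zero")
    return high_rate / low_rate


# Invented example: 60 of 1,000 high status respondents play a sport (6%),
# against 15 of 500 low status respondents (3%).
ratio = rate_ratio(60, 1000, 15, 500)
print(round(ratio, 2))
```

In this made-up case there are four times as many high status participants in raw numbers (60 against 15), but the score is 2 because the high status group is also twice as large. That is why the ranking compares rates rather than head counts.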

The two socioeconomic status categories are defined using the NS-SEC classification system used in the UK census. Here’s a list of those included in each group:

Higher Status

  1. Higher managerial and professional occupations
  2. Lower managerial and professional occupations
  3. Intermediate occupations (clerical, sales, service)
  4. Small employers and own account workers

Lower Status

  5. Lower supervisory and technical occupations
  6. Semi-routine occupations
  7. Routine occupations
  8. Never worked and long-term unemployed

‘Participation’ is defined as taking part in a sport at least once per week.

The UK’s ‘Poshest’ Sport Rankings



Rank  Sport  Participation Rate Ratio

1 Tennis 3.89
2 Squash 3.00
3 Keep-fit Classes 2.43
4 Golf 2.42
5 Mountaineering 2.40
6 Running 2.28
7 Road Cycling 2.25
8 Swimming – Outdoor 2.09
9 Athletics – Track & Field 2.08
10 Aerobics 2.05
11 Badminton – Indoor 1.87
12 Hockey 1.67
13 Swimming – Indoor 1.60
14 Netball 1.53
15 Fitness & Conditioning 1.50
16 Gym 1.49
17 Table Tennis 1.29
18 Boxing 1.20
19 Karate 1.06
20 Equestrian 1.05
21 Bowls 1.03
22 Shooting 1.00
23 Cricket 0.97
24 Football 0.94
25 Rugby Union – 15-a-side 0.73
26 Tenpin Bowling 0.71
27 Basketball 0.63
28 Snooker 0.60
29 Pool 0.56
30 Angling 0.54
31 Darts 0.40

Note: A high rank doesn’t mean better! The sports that are doing the best to encourage as wide a range of participants as possible are those towards the middle of the table with scores around 1. These are equally attractive to both ends of the social spectrum.

You can see that most sports are more popular with the higher status group than the lower status group (i.e. they have a participation ratio above 1). This reflects the fact that participation in sport as a whole is more common amongst the middle class than the working class.

Looking at the detail, there are quite a few surprising results. As with the Belgian results, running is pretty high up the list – only just behind golf and mountaineering. But I wouldn’t have guessed that rugby or cricket would be in the lower half of the table, or that equestrian sports and shooting would have such similar levels of appeal across the classes. But the data is from a very reliable source and from a huge sample, so we have to take it seriously.

I’d love to hear your interpretations for any of these figures.