Big Data and Decision Optimisation for Racing



Analytics and big data are the new buzzwords in sports. This week, learn how big data and decision optimisation are helping one rider in the Race Across America, and how they might eventually help you too.



Dave Haase is a Race Across America (RAAM) racer working with IBM to optimise his performance in the 3,000+ mile race from Oceanside, CA, to Annapolis, MD. There are no planned stops; racers stop when they want. The first to cross the finish line wins.


Racing 22 hours a day for nine days or so leaves a lot of room for optimisation. We aren’t talking about racing at 300W for 22 hours and trying to squeeze out seconds or minutes. A RAAM racer is potentially talking hundreds of miles and hours of advantage. There is a strategy to long-distance racing that has always been based on intuition. Until now…


Haase has partnered with IBM for the insight and foresight that using data and analytics can provide. IBM have married the sensor data from Haase’s bike, biometrics, and forward-looking weather conditions. Haase and his crew then have the best information to make decisions about racing—and about resting.


That’s an important point which I will touch on later so I’ll repeat it so it sticks in your mind – Haase and his crew then have the best information to make decisions about racing—and about resting.


The goal of this partnership is to combine the data from Dave and his bike with external data such as location, terrain and weather. Figuring out how to get the best performance out of Dave’s “engine” then becomes a prediction and optimisation problem.


Dave can predict and optimise when he expends energy, for what duration, when he rests and what food intake he needs to meet his milestones across the country.


How do you solve a problem like this, though?


Jean-François Puget, Chief Architect for Analytics Solutions at IBM, talks about using data for decision-making. Decisions for whom, though? We will get to that. As you can probably tell, Puget is the one who put the analytics side of things together.


He started with three questions when looking at this problem:

What data can we access?
What insights can we gain from that data?
What key decisions does Dave make during the race?


That last question is a great place to start when undertaking any big data analytics project: What decisions do we want to improve to get the outcomes we want?


What are the right questions for Dave to answer? To find out, they first had to get a complete understanding of the RAAM.


RAAM is a nonstop cycling race, more than 3,000 miles long. The route is fixed, but racers can stop when and where they want along that route.


Why not ask this about your discipline – What decisions do you want to improve to get the outcomes you want?


The most important decision for RAAM was when Dave should rest. Dave will race for eight or nine days, sleeping only two hours a day.


Rest is vital to restore energy and power but when Dave rests, he is not moving. And as time goes by, so can other riders! So it becomes a must to balance two competing goals:


  • Have Dave rest, restoring him to a higher level of power to increase his speed.
  • Keep Dave on the course as much as possible, increasing his distance ridden.


Making decisions while balancing conflicting goals is a great use case for decision optimisation, an analytics method focusing on computing the best options in a given situation. To help with this IBM developed a decision optimisation model that helps find optimal times for Dave to rest during RAAM. This model sees Dave as an engine whose power declines as he rides and increases when he rests.
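
To make the "engine" idea concrete, here is a minimal sketch of a rest-versus-ride optimisation. It is not IBM's model: the freshness levels, speeds and recovery rate are hypothetical numbers chosen for illustration, and dynamic programming over hourly blocks is simply one easy way to search the options.

```python
from functools import lru_cache

# Hypothetical numbers for illustration only, not Dave's measured data.
# Speed in mph at each freshness level: index 0 = exhausted, last = fully rested.
SPEED_BY_FRESHNESS = (6.0, 10.0, 13.0, 15.0, 16.0)
MAX_FRESHNESS = len(SPEED_BY_FRESHNESS) - 1
RECOVERY_PER_REST_HOUR = 2  # freshness levels regained per hour of rest


@lru_cache(maxsize=None)
def best_plan(hours_left, freshness):
    """Return (max miles, plan); the plan uses 'R' for a ride hour, 'z' for a rest hour."""
    if hours_left == 0:
        return 0.0, ""
    # Option 1: ride this hour at the current speed, then drop one freshness level.
    ride_miles, ride_plan = best_plan(hours_left - 1, max(freshness - 1, 0))
    ride_miles += SPEED_BY_FRESHNESS[freshness]
    # Option 2: rest this hour, covering no distance but regaining freshness.
    rested = min(freshness + RECOVERY_PER_REST_HOUR, MAX_FRESHNESS)
    rest_miles, rest_plan = best_plan(hours_left - 1, rested)
    if ride_miles >= rest_miles:
        return ride_miles, "R" + ride_plan
    return rest_miles, "z" + rest_plan


def never_rest(hours, freshness):
    """Miles covered if the rider simply rides every hour."""
    miles = 0.0
    for _ in range(hours):
        miles += SPEED_BY_FRESHNESS[freshness]
        freshness = max(freshness - 1, 0)
    return miles


if __name__ == "__main__":
    miles, plan = best_plan(24, MAX_FRESHNESS)
    print("plan:", plan)
    print("optimised miles:", miles, "never-rest miles:", never_rest(24, MAX_FRESHNESS))
```

With these made-up numbers the optimised plan includes rest hours and still covers more ground than riding every hour, which is the trade-off between the two goals above.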


To create the decision optimisation model, they needed data about how Dave’s power evolves when he rides and when he rests. This is where the “Internet of Dave” comes into play. The name is derived from the Internet of Things: increased machine-to-machine communication built on cloud computing and networks of data-gathering sensors, giving a mobile, virtual and instantaneous connection.


So Dave has more than the usual bike sensors recording and transmitting. He also wears a Bioharness from Zephyr, which takes multiple physiological measurements, such as:

Heart Rate
Respiration Rate
Estimated Core Temperature


Data from these sensors were collected over months during Dave’s training rides and then analysed, resulting in solid estimates of how Dave’s power evolves.

So, back to the questions asked by Jean-François Puget. They can close the loop with these answers:

  • What data can we access? Internet of Dave sensor data, including Dave’s medical condition and the power he is able to deliver.
  • What insights can we gain from that data? How Dave’s power evolves when Dave is riding and when he is resting.
  • What key decisions does Dave make during the race? When to rest to help him finish the race as early as possible.


How about you – what answers can help you make better decisions?


Let’s consider you don’t have a bioharness and just use the standard sensors available to cyclists. I’m going to make up some answers here…


  • What data can you access? Sensor data, HR, cadence, speed, power.
  • What insights can you gain from that data? How does your power evolve after certain efforts – like when you burn a match.
  • What key decisions do you make during the race? When to sit in to help you finish the race with the lead group.


The analytics that help Dave (and us) come in three kinds: descriptive analytics, which is the training data; predictive analytics, which is the power data or even the training load changes; and, most important in races, decision optimisation.


It’s not just Dave though, there are other factors at play. We aren’t talking about a velodrome here. The effects of wind are obvious to every cyclist. If the wind is a tailwind, you will ride faster than riding into a headwind.


The effects across a race of any distance can be dramatic, let alone the RAAM. The gradient of the road is also important.


To evaluate the importance of these effects, they modelled Dave’s speed precisely as a function of his power, his weight, his air penetration coefficient, his bike friction coefficient, the road’s slope, the wind’s strength and the wind’s direction. This model, incorporating physical laws, is quite precise as long as it is fed with accurate data.
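
The article does not give IBM's actual formula, but a common textbook power balance for cycling uses the same ingredients and shows how they fit together. The symbols are assumptions for illustration: P is power at the wheel, v is ground speed, w is the headwind component (negative for a tailwind), m is the mass of rider plus bike, θ is the road angle, C_rr is the rolling resistance ("bike friction") coefficient, C_dA is the drag area ("air penetration"), ρ is air density and g is gravity.

```latex
P = v \left[ m g \left( \sin\theta + C_{rr} \cos\theta \right)
      + \tfrac{1}{2}\, \rho\, C_d A\, (v + w)^2 \right]
```

Given a power output, you solve this for v. The form above assumes v + w is positive; if a tailwind is faster than the rider, the drag term changes sign. Because the air term grows with the square of the relative air speed, a headwind costs more time than the same tailwind gives back.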


The Internet of Dave provides some of that data. Beyond that, The Weather Company (TWC) provides data about current and forecast weather:

  • wind speed
  • wind direction
  • temperature


RAAM organisers also provide GPS data, which they use to compute slope and map out the elevation along the course.
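
As a rough sketch of how slope falls out of GPS data, the snippet below turns an elevation profile into a grade per segment. The function name and the (distance, elevation) input format are hypothetical; the article does not describe the organisers' data format.

```python
def segment_grades(profile):
    """Return the grade (rise over run) between consecutive route points.

    profile is a list of (distance_m, elevation_m) tuples ordered along the route.
    This input layout is an assumption made for the example.
    """
    grades = []
    for (d0, e0), (d1, e1) in zip(profile, profile[1:]):
        run = d1 - d0
        if run <= 0:
            raise ValueError("distances must be strictly increasing")
        grades.append((e1 - e0) / run)
    return grades


if __name__ == "__main__":
    # Made-up points: a 5% climb followed by a 2% descent.
    print(segment_grades([(0.0, 100.0), (1000.0, 150.0), (2000.0, 130.0)]))
```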

Their model, in combination with all these data, can predict when Dave will cross the finish line if he maintains constant power output and never rests. They ran the model with three scenarios at constant power: no wind, a constant headwind of 10 mph, and a constant tailwind of 10 mph.


In the windless scenario, Dave takes about 7.5 days to complete the race. With a headwind, he takes about one day longer. With a tailwind, he takes about one day less.
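
The three scenarios can be sketched with a standard cycling power balance solved for speed. This is not IBM's model: the course is treated as perfectly flat, and the mass, drag area, rolling resistance and 110 W power figure are hypothetical placeholders, so the numbers it prints will not match the ones above.

```python
import math

# All values below are hypothetical placeholders, not Dave's measured data.
MASS_KG = 85.0       # rider plus bike
CDA_M2 = 0.30        # drag area ("air penetration")
CRR = 0.005          # rolling resistance ("bike friction")
AIR_DENSITY = 1.2    # kg/m^3
G = 9.81             # m/s^2
MPH_TO_MS = 0.44704
MILE_M = 1609.344


def required_power(speed, headwind, grade=0.0):
    """Watts needed to hold `speed` (m/s) into `headwind` (m/s, negative = tailwind)."""
    angle = math.atan(grade)
    resist = MASS_KG * G * (math.sin(angle) + CRR * math.cos(angle))
    air = speed + headwind
    drag = 0.5 * AIR_DENSITY * CDA_M2 * air * abs(air)  # signed, so a strong tailwind pushes
    return speed * (resist + drag)


def speed_for_power(power, headwind, grade=0.0):
    """Bisection search for the speed (m/s) at which required power equals `power`."""
    lo, hi = 0.0, 30.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if required_power(mid, headwind, grade) < power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def days_to_finish(power, headwind_mph, distance_miles=3000.0):
    """Days to cover the distance at constant power on a flat road, never resting."""
    speed = speed_for_power(power, headwind_mph * MPH_TO_MS)
    return distance_miles * MILE_M / speed / 86400.0


if __name__ == "__main__":
    for label, wind in (("no wind", 0.0), ("10 mph headwind", 10.0), ("10 mph tailwind", -10.0)):
        print(label, round(days_to_finish(110.0, wind), 2), "days")
```

Even this toy version reproduces the ordering: the tailwind run finishes first, the headwind run last.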


This shows the importance of wind as a predictor of performance.


In 2006, Dave faced the worst weather in the history of RAAM when crosswinds gusted up to 80 mph across the plains of Kansas. Ever-changing and highly variable conditions make a clear case for using what foresight we can.

So they will use weather forecast data to compute expected headwinds and tailwinds along the route.
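
A forecast gives wind speed and direction, while the model needs the part of that wind blowing along the road. A minimal sketch of that conversion follows. It assumes the wind direction follows the usual meteorological convention (the direction the wind blows from) and that the rider's heading is known from the route; the function name is hypothetical.

```python
import math


def headwind_component(wind_speed, wind_from_deg, heading_deg):
    """Wind speed along the rider's line of travel.

    Positive means headwind, negative means tailwind. Both angles are in
    degrees clockwise from north; wind_from_deg is where the wind comes FROM.
    """
    return wind_speed * math.cos(math.radians(wind_from_deg - heading_deg))


if __name__ == "__main__":
    # Rider heading east (90 degrees) with a 10 mph wind from three directions.
    for wind_from in (90.0, 270.0, 0.0):
        print(wind_from, round(headwind_component(10.0, wind_from, 90.0), 2))
```

A wind from straight ahead counts in full, a wind from behind counts as a full tailwind, and a pure crosswind contributes nothing along the road in this simplified view.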


So the influence of wind on Dave’s performance must be computed as he goes. This can help in the short term.


Big data can help Dave and his crew make decisions, but ultimately it is they who will decide whether to follow the model’s rest and ride recommendations. Analytics simply assist them by providing a better way of evaluating all available options. And though it might look like a lot of work, Dave is the one that has to ride across America!


I have been thinking a lot lately about the application of big data – or at least making use of the data we already have.


We have seen pro cycling’s first attempts through Robby Ketchell and Slipstream Sports. Now that Ketchell has moved to Sky, no doubt this technology will take another leap. He has focused more energy on data capture, management and optimisation than anyone else in cycling. Add that to a big budget and it’s exciting to think what they are doing right now in some secret squirrel lab. He has already admitted to creating wearables from scratch.


To remind you of what he has already developed: it’s called Platypus, a database that contains information on every rider in the race it’s monitoring. “When riders get into a break, Platypus loads them into the app and you can click on the stats: the breaks they’ve been in, how often they succeed, and other historical data,” Ketchell says. Directors in the car have real-time, direct access via a companion app Ketchell created, which runs on the Apple devices the team uses.


Also, weather forecasts and real-time conditions are a part of this – but that’s easy.


More information about the riders in the race or breakaway is one thing, but linking performance data and predicted data of other riders could be the next step.


If you’re curious about the evolution of AI, there are plenty of directions it can go. Platypus is an example of Artificial Narrow Intelligence (ANI): AI that specialises in one area. A non-cycling example is Deep Blue, the AI that beat the world chess champion at chess, but that’s the only thing it does.


Beyond ANI is Artificial General Intelligence (AGI): a machine that can perform any intellectual task that a human being can. Finally, there is Artificial Superintelligence, which ranges from a computer that’s just a little smarter than a human to one that’s trillions of times smarter, across the board.


But by the time this rolls around, we will either be dead (from the machines) or onto some next level shit.


Right now it seems that decision optimisation is the main tool for taking the data from numbers to action. At the moment we have predictive analytics like Best Bike Split, but we don’t yet have something that helps us win real bike races: not time trials, but races with other riders.


Endless scenarios can play out in a race, and this is where the experience of you, your team, your DS and your manager all comes into play.


How can this problem be solved? How can we use data for racing outcomes rather than just performance outcomes?


What if there was some way to help you make decisions in a race, combining the physical and tactical elements? Knowing how much you have to give and the history of the riders around you. Who might go when, who is a threat: a database of scenarios with the most likely ones ranked in order.


This is when big data gets real: crunching real-time data and giving the best outcomes based on all of these different elements that can change at any moment. Seems impossible, but I guess Deep Blue did as well.



