Learning Air Traffic Controller's "Workload" from Speech

Written by Thinh Hoang x GPT-4o. All views are my own. These research projects are personal.
ATC is fundamentally a human business (at least for now)
If I had to bet on the weakest link holding Air Traffic Control back from becoming a precise, hard science, I’d put my $100 on one thing: human stress and workload. Everything else (flight planning, conflict detection, trajectory prediction) has matured into something robust, even beautiful. But when it comes to the human operator, we’re still in the dark. And our flashlight? A handful of indices we call "complexity metrics."
These metrics are meant to estimate how hard a situation feels to a controller. Some are obvious: how many aircraft are in the sector. Others are subtler: Are aircraft converging? Are they changing altitude? How often do handovers happen? How busy is the radio? Collectively, these indicators determine how much we dare to load a sector before someone, somewhere, starts yelling.
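To make one of these indicators concrete, here is a minimal sketch of "how busy is the radio", computed as the fraction of a time window during which someone is transmitting. The function name `radio_occupancy` and the `(start, end)` tuple format in seconds are my own assumptions for illustration, not a standard definition or an operational tool.

```python
from typing import List, Tuple


def radio_occupancy(
    transmissions: List[Tuple[float, float]],
    window_start: float,
    window_end: float,
) -> float:
    """Fraction of [window_start, window_end] during which the frequency is in use.

    `transmissions` is a hypothetical list of (start, end) times in seconds.
    Overlapping transmissions are merged so busy time is not counted twice.
    """
    if window_end <= window_start:
        raise ValueError("window_end must be greater than window_start")

    # Keep only transmissions that intersect the window, clipped to its bounds.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in transmissions
        if end > window_start and start < window_end
    )

    busy = 0.0
    cur_start, cur_end = None, None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged interval and open a new one.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start

    return busy / (window_end - window_start)


# Made-up timings: three transmissions inside a 60-second window.
occupancy = radio_occupancy([(0, 10), (5, 15), (50, 70)], 0, 60)
print(f"Radio occupancy: {occupancy:.2f}")
```

On its own, a number like this says nothing about how the situation feels to the controller. It is just one of the raw indicators such metrics are built from.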
Research has done a lot. We’ve learned, for example, that cognitive demand increases with traffic density, task switching, ambiguous intentions, and even things like weather reroutes or aircraft speed variations. But here’s the thing: most of these studies are carried out in simulation labs, under tidy, isolated conditions. In the real world, things blur. A single miscommunication, an unexpected sector opening, or a tweak in team structure can tip the balance.
And that’s the scary part. Introduce a new concept of operations, say, a different way to delegate roles across a team, and suddenly we’re guessing. We run the simulations. We show the slides. But we still don’t know how it’ll feel for the person in the hot seat. As the line in the Oppenheimer movie goes: "Theory will take you only so far."
All the models and algorithms in the world won’t save us if we can’t answer this simple question: when will a controller feel like it’s too much? It’s like designing the perfect race car, only to realize the driver can't steer it well because the wheel is too sensitive.
Until we understand that, until we can predict not just what’s happening in the sky, but what’s happening in someone’s mind, we’ll keep hitting the same invisible wall.
It’s not a tech problem. It’s a human one. And that makes it all the more worth solving. And publishing.
In this series, we will build a model to estimate stress and complexity from features learned from ATC recordings. I will share the code and the notebook so you can try it too. Let's see how far we can push this stupid idea.