A machine learning project to build an autonomous sailplane that remains aloft on thermal currents is impressive enough. But the work conducted by Microsoft researchers Andrey Kolobov and Iain Guilliard will also improve the decision making and trustworthiness of IoT devices, personal assistants and autonomous cars.
The airframe's limits on weight and space constrain the computing resources the sailplane can carry, which makes the project relevant to many new developments in ubiquitous computing. The autonomous sailplane is controlled by a battery-powered 160MHz Arm Cortex M4 with 256KB of RAM and 60KB of flash that monitors the sensors, runs the autopilot and controls the servo motors. To this, the researchers have added a machine learning model that continuously learns how to ride the thermal currents autonomously.
In these early days of platforms like digital assistants, IoT and autonomous vehicles, there are hundreds of open problems. These will be distilled into a handful of scientific questions that must first be answered to build products that match popular visions of them. When scientific questions start to emerge, the platform’s future becomes predictable: maybe not to an exact month, but within a year or three.
IoT following the same path as Web 2.0
The Web 2.0 platform followed a similar course. First talked about in 1999, it had attracted enough interest by 2004 for O’Reilly Media and MediaLive to host the first Web 2.0 conference. But it was not until later in the decade that companies such as Salesforce and Google implemented Web 2.0. This was a decade-long evolution: first a vision, then a collection of open problems distilled into scientific questions that university and industrial researchers answered.
As the technology shifts from research to development, product developers find the answers to the scientific questions in the work of researchers, and those answers enable them to build a product. On a Web 2.0 time scale, digital assistants, IoT and autonomous vehicles are much closer to 2004 than to the later part of that decade, when enough research had been translated into development that products could be built at scale.
Goals of Microsoft’s sailplane project
Inspired by a 2013 story in the Economist about autonomous sailplanes, the team of Microsoft researchers set the goals of this project at an all-hands meeting: to answer two scientific questions, namely how to build trusted AI systems and how to architect systems with AI and machine learning as a fundamental design principle. Kolobov said:
“The state of the art in AI development is not at the level where AI agents can reliably act fully autonomously, which is why we do not see many AI systems acting in the physical world with full autonomy. MSR is trying to build AI systems that are robust and can be trusted to act fully on their own, performing better than humans. The implications of this research apply to personal assistants, autonomous cars and IoT.
“We wanted to gain experience in designing systems where AI and machine learning are first-class citizens so we do not have to fundamentally modify the architecture of the systems post hoc.”
Kolobov and Guilliard are part of a multidisciplinary team with complementary skills. Kolobov has applied AI and machine learning to commercial products such as Windows and Bing. Guilliard, after more than a decade working on control systems for the Airbus A350 and the A4 Skyhawk, is a computer science Ph.D. candidate at the Australian National University and an intern at Microsoft. An odd pairing perhaps, unless the question that they are trying to answer is understood.
Many machine learning problems do not have ground truth. Ground truth is an accurate, labeled data set used to train and evaluate machine learning models. The models are programmed (trained) with data, not lines of code: they are taught with data to predict a correct answer. If a model is taught to recognize cats, labeled images of cats are ground truth. This method, called supervised learning, has two stages: training, typically run on beefy GPUs, in which the model learns from images of cats and not cats; and inference, in which it predicts the right answer, cat or not cat.
Because the physical world with which the machine learning model interacts is unpredictable, ground truth cannot be easily simulated with computers, and classified training data does not exist. How can every defensive maneuver of an autonomous vehicle in reaction to another vehicle be predicted? Or how can every response of an elder-care robot to a patient in distress be predicted?
These unstable systems cannot be predicted. The models need to learn by interacting with chaotic, unstable systems. Highways full of vehicles and rooms full of elderly patients are unsuitable places for an AI to learn to operate reliably and safely. A sailplane, however, is a suitable one.
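The two stages can be shown with a toy example. The sketch below is only an illustration of supervised learning in general, not anything from the Microsoft project: it uses a simple nearest-centroid classifier, and the two-number "features" standing in for images are invented for the example.

```python
# Toy supervised learning with a nearest-centroid classifier.
# Assumption: each "image" is reduced to two made-up feature values.

def train(examples):
    """Training stage: average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, features):
    """Inference stage: return the label whose centroid is closest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


# Labeled examples play the role of ground truth (hypothetical values).
labeled = [
    ([0.9, 0.8], "cat"),
    ([0.8, 0.9], "cat"),
    ([0.1, 0.2], "not cat"),
    ([0.2, 0.1], "not cat"),
]

model = train(labeled)
print(predict(model, [0.85, 0.7]))
```

Without the labeled list there is nothing for `train` to average, which is the situation the sailplane faces in the air.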
How the sailplane works
Ground truth for an autonomous sailplane is limited because it is impossible to measure the turbulent conditions: where the rising thermal column of air begins and ends, and what is happening inside it. A laptop on the ground performs high-level planning for the sailplane. Drawing on data from flights by manned sailplanes, together with the local terrain and wind conditions, it predicts thermals and sends the predictions to the autonomous sailplane via telemetry.
The sailplane uses an onboard Bayesian reinforcement learning algorithm to make decisions from the observations it gets from its sensors. Bayesian reinforcement learning was chosen because it lets the model plan its actions so that it both learns and exploits knowledge in an optimal manner. This approach also makes it easier to understand what decisions the agent is making and why.
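The Bayesian part of the idea can be sketched in a few lines. The example below is an assumption-laden illustration, not the researchers' model: it tracks a single belief ("is there a thermal here?") and updates it with Bayes' rule each time a hypothetical sensor does or does not report lift. The function name and the sensor probabilities are invented.

```python
def update_belief(prior, observed_lift,
                  p_lift_given_thermal=0.8, p_lift_given_none=0.2):
    """Return the posterior probability of a thermal after one observation.

    Assumption: the two sensor probabilities are made-up numbers that say
    how often lift is reported with and without a thermal present.
    """
    if observed_lift:
        like_thermal, like_none = p_lift_given_thermal, p_lift_given_none
    else:
        like_thermal, like_none = 1 - p_lift_given_thermal, 1 - p_lift_given_none
    evidence = like_thermal * prior + like_none * (1 - prior)
    return like_thermal * prior / evidence


belief = 0.5  # start undecided
for observed_lift in [True, True, False]:
    belief = update_belief(belief, observed_lift)
    print(f"lift={observed_lift} -> belief in thermal = {belief:.3f}")
```

Because the belief is an explicit probability, it is possible to inspect why the agent favors one patch of air over another, which is the kind of interpretability the paragraph above describes.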
The model uses Monte Carlo tree search to choose which detected areas of lift to exploit in order to gain altitude. It then instructs the autopilot to adjust the elevators, the horizontal control surfaces on the tail that are moved by servo motors and cause an airplane to climb or descend, to keep the sailplane soaring. Monte Carlo tree search has been applied to win games such as Go (in Google’s AlphaGo project) and poker. It gradually builds up a partial game tree of moves, then uses advanced strategies to find a balance between exploring new decision branches and exploiting the most promising branches.
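The balance between exploring and exploiting can be made concrete. The sketch below shows UCB1, the selection rule that the common UCT variant of Monte Carlo tree search applies at each node of the tree; it is not a full tree search and not the project's code. The three headings and their chances of finding rising air are hypothetical.

```python
import math
import random


def ucb1_select(stats, exploration=math.sqrt(2)):
    """Pick the action with the highest UCB1 score.

    stats maps action -> [visit_count, total_reward].
    """
    # Try every action once before comparing scores.
    for action, (visits, _) in stats.items():
        if visits == 0:
            return action
    total_visits = sum(visits for visits, _ in stats.values())

    def score(action):
        visits, total_reward = stats[action]
        mean = total_reward / visits                      # exploit
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)  # explore
        return mean + bonus

    return max(stats, key=score)


def run(true_lift, rounds=2000, seed=0):
    """Simulate repeated choices against made-up success probabilities."""
    rng = random.Random(seed)
    stats = {action: [0, 0.0] for action in true_lift}
    for _ in range(rounds):
        action = ucb1_select(stats)
        reward = 1.0 if rng.random() < true_lift[action] else 0.0
        stats[action][0] += 1
        stats[action][1] += reward
    return stats


# Hypothetical chance that each heading leads to rising air.
stats = run({"left": 0.2, "straight": 0.5, "right": 0.8})
print({action: visits for action, (visits, _) in stats.items()})
```

Rarely tried actions get a larger bonus, so they are still sampled now and then, while most of the visits go to the action that has paid off best so far.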
Running Bayesian reinforcement learning algorithms on a sailplane poses significant challenges compared to Go and poker. Remember the computational and battery constraints? The sailplane is controlled by the real-time, open-source ArduPilot software running on open-source Pixhawk hardware built around the Arm Cortex M4. Execution of the Bayesian reinforcement learning algorithm is interleaved with the real-time ArduPilot in short intervals of less than 100ms so that control of the sailplane’s sensors and servo motors is maintained and crashes are avoided.
The pairing of Kolobov and Guilliard is a careful match of a machine learning expert with an aeronautic control systems domain expert who is also a computer science Ph.D. candidate. But their pairing isn’t the only clever combination. This search for a deeper understanding was combined with off-the-shelf sailplane airframes, sensors, servo motors, open-source hardware and an open-source autopilot so that Kolobov and Guilliard could get right to implementing and tuning the Bayesian reinforcement learning and quickly iterate their designs, improving their results.
During these pioneering days of digital assistants, IoT and autonomous vehicles, product developers will find some of the most relevant answers to scientific questions, answers that can be translated into their products, in fields such as sailplanes that might appear at first to be orthogonal and unrelated.
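One way to picture this interleaving is a loop that always does the time-critical work first and then hands the planner a small, fixed time budget. The sketch below is a simplified illustration in Python, not ArduPilot's scheduler or its API; every name in it is hypothetical, and the 10ms budget is an arbitrary value under the 100ms mentioned above.

```python
import time


def planner_steps():
    """Hypothetical planner that does one small unit of work per step."""
    simulations = 0
    while True:
        simulations += 1  # stand-in for one planning simulation
        yield simulations


def run_planner_slice(planner, budget, clock=time.monotonic):
    """Advance the planner until this slice's time budget is used up."""
    deadline = clock() + budget
    done = None
    while clock() < deadline:
        done = next(planner)
    return done


def control_loop(cycles, budget=0.01, clock=time.monotonic):
    """Alternate time-critical control work with short planning slices."""
    planner = planner_steps()
    log = []
    for cycle in range(cycles):
        # 1. Time-critical work always runs first (placeholder here).
        log.append(("autopilot", cycle))
        # 2. The planner then resumes where it stopped, for a bounded time.
        log.append(("planner", run_planner_slice(planner, budget, clock)))
    return log


for entry in control_loop(cycles=3):
    print(entry)
```

Because the planner is a generator, its progress survives between slices, so a long search can be spread over many control cycles without ever holding up the sensors and servos for more than one budget.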
It may take another five to 10 years for digital assistants, IoT and autonomous vehicles to become as reliable as humans. Microsoft Research is working on one of the scientific questions that must first be answered before these pioneering product visions reach this point.