Microsoft announces expansion of Montreal research lab, new director

Geoffrey Gordon has been named Microsoft Research Montreal’s new research director. Photo by Nadia Zheng.

Microsoft plans to significantly expand its Montreal research lab and has hired a renowned artificial intelligence expert, Geoffrey Gordon, to be the lab’s new research director.

The company said Wednesday that it hopes to double the size of Microsoft Research Montreal within the next two years, to as many as 75 technical experts. The expansion comes as Montreal is becoming a worldwide hub for groundbreaking work in the fields of machine learning and deep learning, which are core to AI advances.

“Montreal is really one of the most exciting places in AI right now,” said Jennifer Chayes, a technical fellow and managing director of Microsoft Research New England, New York City and Montreal.

Chayes said Gordon, currently a professor of machine learning at Carnegie Mellon University, was a natural choice for the job in part because he’s interested in both foundational research that addresses fundamental AI challenges and applied work that can quickly find its way into mainstream use.

“We want to be doing the research that will be infusing AI into Microsoft products today and tomorrow, and Geoff’s research really spans that,” she said. “He’ll be able to help us improve our products and he’ll also be laying the foundation for AI to do much more than is possible today.”

Jennifer Chayes, technical fellow and managing director of Microsoft Research New England, New York City and Montreal.

Chayes also noted that Gordon’s broad and deep AI expertise will be a major asset to the lab. Gordon is an expert in reinforcement learning, in which systems learn through trial and error, and he has also done groundbreaking work in areas such as robotics and natural language processing, she said. The ability to combine all those areas of expertise will be key to developing sophisticated AI systems in the future.

“Given that we want a very broad AI lab, Geoff is the ideal person to lead it, and to create the fundamental research that underlies the next generation of AI,” she said.

Gordon said he’s especially interested in creating AI systems that have what we think of as long-term thinking: the ability to come up with a coherent plan to solve a problem or to take multiple actions based on clues they get along the way. That’s the kind of thing that comes easily to people but is currently rudimentary in most AI systems.

Over the last few years, AI systems have gotten very good at individual tasks, like recognizing images or comprehending words in a conversation, thanks to a combination of improved data, computing power and algorithms.

Now, researchers including Gordon are working on ways to combine those skills to create systems that can augment people’s work in more sophisticated ways. For example, a system that could accurately read clues based on what it is seeing and hearing to anticipate when it would be useful to step in and help would be much more valuable than one that requires a person to ask for help with a specific task when needed.

“We have, in some cases, superhuman performance in recognizing patterns, and in very restricted domains we get superhuman performance in planning ahead,” he said. “But it’s surprisingly difficult to put those two things together – to get an AI to learn a concept and then build a chain of reasoning based on that learned concept.”

Microsoft began developing its research presence in Montreal a year ago, when it acquired the deep learning startup Maluuba.

The Microsoft Research team in Montreal has already made groundbreaking advances in AI disciplines that are key to the type of systems Gordon imagines. That includes advances in machine reading comprehension – the ability to read a document and provide information about it in a plainspoken way – and in methods for teaching AI systems to do complex tasks, such as by dividing large tasks into small tasks that multiple AI agents can handle.

Gordon said he was drawn to the new position both because of the work the team in Montreal is doing and the opportunity to collaborate with the broader Montreal AI community.

“Research has always been about standing on the shoulders of giants, to borrow a phrase from a giant – and it’s even more so in the current age,” Gordon said.

The city has become a hotbed for AI advances thanks to a strong academic and research presence, as well as government funding commitments.

Yoshua Bengio, an AI pioneer who heads the Montreal Institute for Learning Algorithms, said Gordon’s presence and the Microsoft lab’s expansion will help continue to build the momentum that the Montreal AI community has seen in recent years. He noted that Gordon’s area of focus, on AI systems that can learn to do more complex tasks, is complementary to the work he and others in the community also are pursuing.

“It’s one of the strengths of Montreal,” said Bengio, who is also an AI advisor to Microsoft.

Joelle Pineau, an associate professor of computer science at McGill University and director of Montreal’s Facebook AI Research Lab, said she was thrilled to hear Gordon would be joining the Montreal AI ecosystem.

“There is no doubt that the Montreal AI community will be deeply enriched by his presence here,” Pineau said.

Navdeep Bains, Canada’s minister of innovation, science and economic development, said he was looking forward to seeing the work that Gordon and Microsoft Research Montreal will produce.

“I am pleased that our government’s investment in innovation and skills continues to position Canada as a world-leading destination for AI companies and impressive researchers like Geoff Gordon,” he said.

The expansion of the Montreal lab is part of Microsoft’s long history of investing in international research hubs, including labs in the U.S., Asia, India and Cambridge, United Kingdom. Chayes said the company’s international presence has helped it attract and retain some of the world’s best researchers in AI and other fields, and it also has helped ensure that the company’s AI systems reflect a diversity of experiences and cultures.

For example, Chayes said the fact that Montreal is a bilingual city could help inform the company’s work in areas such as translation and speech recognition.

“It’s a culture where you go back and forth between two languages. That’s a very interesting environment in which to develop tools for natural language understanding,” she said.

The French version of this blog post can be found on the Microsoft News Center Canada.


Allison Linn is a senior writer at Microsoft. Follow her on Twitter.

CodeTalk: Rethinking accessibility for IDEs

By Suresh Parthasarathy, Senior Research Developer; Gopal Srinivasa, Senior Research Software Development Engineer

CodeTalk team members from left to right include: Priyan Vaithilingam, Suresh Parthasarathy, Venkatesh Potluri, Manohar Swaminathan and Gopal Srinivasa from Microsoft Research India.

Software programming productivity tools known as integrated development environments, or IDEs, are supposed to be a game changer for Venkatesh Potluri, a research fellow in Microsoft’s India research lab. Potluri is a computer scientist who regularly needs to write code efficiently and accurately for his research in human-computer interaction and accessibility. Instead, IDEs are one more source of frustration for Potluri: he is blind and unable to see the features that make IDEs a boon to the productivity of sighted programmers, such as squiggly red lines that automatically appear beneath potential code errors.

Potluri uses a screen reader to hear the code that he types. He scrolls back and forth through the computer screen to maintain context. But a screen reader captures only part of what an IDE offers, since much of the information from these systems is conveyed visually. For example, code is syntax highlighted in bright colors, errors are automatically highlighted with squiggles and the debugger uses several windows to provide the full context of a running program. Performance analysis tools use charts and graphs to highlight bottlenecks and architecture analysis tools use graphical models to show code structure.

“IDEs provide a lot of relevant information while writing code; a lot of this information — such as the current state of the program being debugged, real-time error alerts and code refactoring suggestions, are not announced to screen reader users,” Potluri said. “As a developer using a screen reader, the augmentation IDEs provide is not of high value to me.”

Soon after Potluri joined Microsoft Research India in early 2017, he and his colleagues Priyan Vaithilingam and Saqib Shaikh launched Project CodeTalk to increase the value of IDEs for the community of blind and low vision users. According to a recent survey posted on the developer community website Stack Overflow, users who self-identify as blind or low vision make up one percent of the programmer population, which is higher than the 0.4 percent of people in the general population. Team members realized that while a lot of work had gone into making IDEs more accessible, the efforts had fallen short of meeting the needs of blind and low vision developers.

As a first step, the team explored their personal experiences with IDE technologies. Potluri, for example, detailed frustrations such as trying to fix one last bug before the end of a long day, listening carefully to the screen reader and concentrating hard to retain in his mind the structures of the code file only to have the screen reader go silent a few seconds after program execution. Uncertain if the program completed successfully or terminated with an exception, he has to take extra steps to recheck the program that keep him at work late into the night.

The CodeTalk team also drew insights from a survey of blind and low vision developers that was led by senior researcher Manohar Swaminathan. The effort generated ideas for the development of an extension that improves the experience of the blind and low vision community of developers who use Microsoft’s Visual Studio, a popular IDE that supports multiple programming languages and is customizable. The CodeTalk extension and source code are now available on GitHub.

Highlights of the extension include the ability to quickly access code constructs and functions that lead to faster coding, learn the context of where the cursor is in the code, navigate through chunks of code with simple keystrokes and hear auditory cues when the code has errors and while debugging. The extension also introduces a novel concept of Talkpoints, which can be thought of as audio-based breakpoints.

Together, these features make debugging and syntax checking, two critical features of IDEs, far more accessible to blind and low vision developers, according to a study the CodeTalk team conducted with blind and low vision programmers. Real-time error information and Talkpoints were particularly appreciated as significant productivity boosters. The team also began using the extension for their own development, and discovered that the features were useful for sighted users as well.

CodeTalk is one step in a long journey of exploring ways to make IDEs more accessible. Research is ongoing to define and meet the needs of blind and low vision developers. The source code is available on GitHub and contributors are invited. The Visual Studio extension is available for download.

You can read more about this story on Microsoft’s Research Blog.

CodeTalk team members include Suresh Parthasarathy, Gopal Srinivasa, Priyan Vaithilingam, Manohar Swaminathan and Venkatesh Potluri from Microsoft Research India and Saqib Shaikh from Microsoft Research Cambridge.

Debugging data: Microsoft researchers look at ways to train AI systems to reflect the real world

Hanna Wallach is a senior researcher in Microsoft’s New York City research lab. Photo by John Brecher.

Artificial intelligence is already helping people do things like type texts faster and take better pictures, and it’s increasingly being used to make even bigger decisions, such as who gets a new job and who goes to jail. That’s prompting researchers across Microsoft and throughout the machine learning community to ensure that the data used to develop AI systems reflect the real world, are safeguarded against unintended bias and handled in ways that are transparent and respectful of privacy and security.

Data is the food that fuels machine learning. It’s the representation of the world that is used to train machine learning models, explained Hanna Wallach, a senior researcher in Microsoft’s New York research lab. Wallach is a program co-chair of the Annual Conference on Neural Information Processing Systems, which runs from Dec. 4 to Dec. 9 in Long Beach, California. The conference, better known as “NIPS,” is expected to draw thousands of computer scientists from industry and academia to discuss machine learning – the branch of AI that focuses on systems that learn from data.

“We often talk about datasets as if they are these well-defined things with clear boundaries, but the reality is that as machine learning becomes more prevalent in society, datasets are increasingly taken from real-world scenarios, such as social processes, that don’t have clear boundaries,” said Wallach, who together with the other program co-chairs introduced a new subject area at NIPS on fairness, accountability and transparency. “When you are constructing or choosing a dataset, you have to ask, ‘Is this dataset representative of the population that I am trying to model?’”

Kate Crawford, a principal researcher at Microsoft’s New York research lab, calls it “the trouble with bias,” and it’s the central focus of an invited talk she will be giving at NIPS.

“The people who are collecting the datasets decide that, ‘Oh this represents what men and women do, or this represents all human actions or human faces.’ These are types of decisions that are made when we create what are called datasets,” she said. “What is interesting about training datasets is that they will always bear the marks of history, that history will be human, and it will always have the same kind of frailties and biases that humans have.”

Researchers are also looking at the separate but related issue of whether there is enough diversity among AI researchers. Research has shown that more diverse teams choose more diverse problems to work on and produce more innovative solutions. Two events co-located with NIPS will address this issue: The 12th Women in Machine Learning Workshop, where Wallach, who co-founded Women in Machine Learning, will give an invited talk on the merger of machine learning with the social sciences, and the Black in AI workshop, which was co-founded by Timnit Gebru, a post-doctoral researcher at Microsoft’s New York lab.

“In some types of scientific disciplines, it doesn’t matter who finds the truth, there is just a particular truth to be found. AI is not exactly like that,” said Gebru. “We define what kinds of problems we want to solve as researchers. If we don’t have diversity in our set of researchers, we are at risk of solving a narrow set of problems that a few homogeneous groups of people think are important, and we are at risk of not addressing the problems that are faced by many people in the world.”

Timnit Gebru is a post-doctoral researcher at Microsoft’s New York City research lab. Photo by Peter DaSilva.

Machine learning core

At its core, NIPS is an academic conference with hundreds of papers that describe the development of machine learning models and the data used to train them.

Microsoft researchers authored or co-authored 43 accepted conference papers. They describe everything from the latest advances in retrieving data stored in synthetic DNA to a method for repeatedly collecting telemetry data from user devices without compromising user privacy.

Nearly every paper presented at NIPS over the past three decades considers data in some way, noted Wallach. “The difference in recent years, though,” she added, “is that machine learning no longer exists in a purely academic context, where people use synthetic or standard datasets. Rather, it’s something that affects all kinds of aspects of our lives.”

The application of machine-learning models to real-world problems and challenges is, in turn, bringing into focus issues of fairness, accountability and transparency.

“People are becoming more aware of the influence that algorithms have on their lives, determining everything from what news they read to what products they buy to whether or not they get a loan. It’s natural that as people become more aware, they grow more concerned about what these algorithms are actually doing and where they get their data,” said Jenn Wortman Vaughan, a senior researcher at Microsoft’s New York lab.

The trouble with bias

Data is not something that exists in the world as an object that everyone can see and recognize, explained Crawford. Rather, data is made. When scientists first began to catalog the history of the natural world, they recognized types of information as data, she noted. Today, scientists also see data as a construct of human history.

Crawford’s invited talk at NIPS will highlight examples of machine learning bias such as news organization ProPublica’s investigation that exposed bias against African-Americans in an algorithm used by courts and law enforcement to predict the tendency of convicted criminals to reoffend, and then discuss how to address such bias.

“We can’t simply boost a signal or tweak a convolutional neural network to resolve this issue,” she said. “We need to have a deeper sense of what is the history of structural inequity and bias in these systems.”

One method to address bias, according to Crawford, is to take what she calls a social system analysis approach to the conception, design, deployment and regulation of AI systems to think through all the possible effects of AI systems. She recently described the approach in a commentary for the journal Nature.

Crawford noted that this isn’t a challenge that computer scientists will solve alone. She is also a co-founder of the AI Now Institute, a first-of-its-kind interdisciplinary research institute based at New York University that was launched in November to bring together social scientists, computer scientists, lawyers, economists and engineers to study the social implications of AI, machine learning and algorithmic decision making.

Jenn Wortman Vaughan is a senior researcher at Microsoft’s New York City research lab. Photo by John Brecher.

Interpretable machine learning

One way to address concerns about AI and machine learning is to prioritize transparency by making AI systems easier for humans to interpret. At NIPS, Vaughan, one of the New York lab’s researchers, will give a talk describing a large-scale experiment that she and colleagues are running to learn what factors make machine learning models interpretable and understandable for non-machine learning experts.

“The idea here is to add more transparency to algorithmic predictions so that decision makers understand why a particular prediction is made,” said Vaughan.

For example, does the number of features or inputs to a model impact a person’s ability to catch instances where the model makes a mistake? Do people trust a model more when they can see how a model makes its prediction as opposed to when the model is a black box?

The research, said Vaughan, is a first step toward the development of “tools aimed at helping decision makers understand the data used to train their models and the inherent uncertainty in their models’ predictions.”

Patrice Simard, a distinguished engineer at Microsoft’s Redmond, Washington, research lab who is a co-organizer of the symposium, said the field of interpretable machine learning should take a cue from computer programming, where the art of decomposing problems into smaller problems with simple, understandable steps has been learned. “But in machine learning, we are completely behind. We don’t have the infrastructure,” he said.

To catch up, Simard advocates a shift to what he calls machine teaching – giving machines features to look for when solving a problem, rather than looking for patterns in mountains of data. Instead of training a machine learning model for car buying with millions of images of cars labeled as good or bad, teach a model about features such as fuel economy and crash-test safety, he explained.

The teaching strategy is deliberate, he added, and results in an interpretable hierarchy of concepts used to train machine learning models.
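
As a rough illustration of the contrast Simard describes, the sketch below scores a car from a handful of named features rather than from raw images. The feature names, weights and threshold are invented for this example and are not taken from Simard’s work; the point is only that every step of the prediction can be read off by a person.

```python
from dataclasses import dataclass

# Hypothetical, hand-picked features and weights (illustrative assumptions only).
WEIGHTS = {
    "fuel_economy_mpg": 0.05,  # higher mileage raises the score
    "crash_test_stars": 0.40,  # safety rating on an assumed 0-5 scale
}
THRESHOLD = 3.0  # assumed cut-off between "good" and "bad"


@dataclass
class Car:
    fuel_economy_mpg: float
    crash_test_stars: float


def explain(car: Car) -> dict:
    """Return each feature's contribution to the score, so the result is inspectable."""
    return {name: weight * getattr(car, name) for name, weight in WEIGHTS.items()}


def is_good_buy(car: Car) -> bool:
    """Label a car as a good buy when the summed contributions reach the threshold."""
    return sum(explain(car).values()) >= THRESHOLD


if __name__ == "__main__":
    car = Car(fuel_economy_mpg=40, crash_test_stars=5)
    print(explain(car))      # per-feature contributions
    print(is_good_buy(car))  # overall label
```

Because the model is a sum of named contributions, a person reviewing a prediction can see which feature drove it, which is the kind of interpretability the teaching approach aims for.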

Researcher diversity

One step to safeguard against unintended bias creeping into AI systems is to encourage diversity in the field, noted Gebru, the co-organizer of the Black in AI workshop co-located with NIPS. “You want to make sure that the knowledge that people have of AI training is distributed around the world and across genders and ethnicities,” she said.

The importance of researcher diversity struck Wallach, the NIPS program co-chair, at her fourth NIPS conference in 2005. For the first time, she was sharing a hotel room with three roommates, all of them women. One of them was Vaughan, and the two of them, along with one of their roommates, co-founded the Women in Machine Learning group, which is now in its 12th year and has held a workshop co-located with NIPS since 2008. This year, more than 650 women are expected to attend.

Wallach will give an invited talk at the Women in Machine Learning Workshop about how she applies machine learning in the context of social science to measure unobservable theoretical constructs such as community membership or topics of discussion.

“Whenever you are working with data that is situated within society contexts,” she said, “necessarily it is important to think about questions of ethics, fairness, accountability, transparency and privacy.”


John Roach writes about Microsoft research and innovation. Follow him on Twitter.

Multiple Intel firmware vulnerabilities in Management Engine

New research has uncovered five Intel firmware vulnerabilities related to the controversial Management Engine, leading one expert to question why the Intel ME cannot be disabled.

The research that led to finding the Intel firmware vulnerabilities was undertaken “in response to issues identified by external researchers,” according to Intel. This likely refers to a flaw in Intel Active Management Technology — part of the Intel ME — found in May 2017 and a supposed Intel ME kill switch found in September. Due to issues like these, Intel “performed an in-depth comprehensive security review of our Intel Management Engine (ME), Intel Server Platform Services (SPS), and Intel Trusted Execution Engine (TXE) with the objective of enhancing firmware resilience.”

In a post detailing the Intel firmware vulnerabilities, Intel said the flaws could allow an attacker to gain unauthorized access to a system, impersonate the ME/SPS/TXE, execute arbitrary code or cause a system crash.

Mark Ermolov and Maxim Goryachy, researchers at Positive Technologies Research, an enterprise security company based in Framingham, Mass., were credited with finding three Intel firmware vulnerabilities, one in each of Intel ME, SPS and TXE.

“Intel ME is at the heart of a vast number of devices worldwide, which is why we felt it important to assess its security status. It sits deep below the OS and has visibility of a range of data, everything from information on the hard drive to the microphone and USB,” Goryachy told SearchSecurity. “Given this privileged level of access, a hacker with malicious intent could also use it to attack a target below the radar of traditional software-based countermeasures such as anti-virus.”

How dangerous are Intel ME vulnerabilities?

The Intel ME has been a controversial feature because of the highly privileged level of access it has and the fact that it can continue to run even when the system is powered off. Some have even suggested it could be used as a backdoor to any systems running on Intel hardware.

Tod Beardsley, research director at Rapid7, said that given Intel ME’s “uniquely sensitive position on the network,” he’s happy the security review was done, but he had reservations.

“It is frustrating that it’s difficult to impossible to completely disable this particular management application, even in sites where it’s entirely unused. The act of disabling it tends to require actually touching a keyboard connected to the affected machine,” Beardsley told SearchSecurity. “This doesn’t lend itself well to automation, which is a bummer for sites that have hundreds of affected devices whirring away in far-flung data centers. It’s also difficult to actually get a hold of firmware to fix these things for many affected IoT devices.”

James Maude, senior security engineer at Avecto Limited, an endpoint security software company based in the U.K., said that the Intel firmware vulnerabilities highlight the importance of controlling user privileges because some of the flaws require higher access to exploit.

“From hardware to software, admin accounts with wide-ranging privilege rights present a large attack surface. The fact that these critical security gaps have appeared in hardware that can be found in almost every organization globally demonstrates that all businesses need to bear this in mind,” Maude told SearchSecurity. “Controlling privilege isn’t difficult to do, but it is key to securing systems. It’s time for both enterprises and individual users to realize that they can’t rely solely on inbuilt security — they must also have robust security procedures in place.”

However, Beardsley noted all of the firmware vulnerabilities across the Intel products require physical access to the machine in order to exploit.

“For the majority of issues that require local access, the best advice is simply not to allow untrusted users physical access to the affected systems,” Beardsley said. “This is pretty easy for server farms, but can get trickier for things like point-of-sale systems, kiosks, and other computing objects where low-level employees or the public are expected to touch the machines. That said, it’s nothing a little epoxy in the USB port can’t solve.”

AI’s sharing economy: Microsoft creates publicly available datasets

From left, Adam Atkinson of Microsoft Research Maluuba, Yoshua Bengio of University of Montreal and Samira Ebrahimi Kahou of Microsoft Research Maluuba are among the AI experts who worked on the FigureQA dataset. Photo courtesy of Microsoft Research Maluuba.

Samira Ebrahimi Kahou and her colleagues at Microsoft Research Maluuba recently set out to solve an interesting research problem: How could they use artificial intelligence to correctly reason about information found in graphs and pie charts?

One big obstacle, they discovered, was that the research area was so new that there weren’t any existing datasets available for them to test their hypotheses.

So, they made one.

The FigureQA dataset, which the team released publicly earlier this fall, is one of a number of datasets, metrics and other tools for testing AI systems that Microsoft researchers and engineers have created and shared in recent years. Researchers all over the world use them to see how well their AI systems do at everything from translating conversational speech to predicting the next word a person may want to type.

The teams say these tools provide a codified way for everyone from academic researchers to industry experts to test their systems, compare their work and learn from each other.

“It clarifies our goals, and then others in the research community can say, ‘OK, I see where you’re going,’” said Rangan Majumder, a partner group program manager within Microsoft’s Bing division who also leads development of the MS MARCO machine reading comprehension dataset. The year-old dataset is getting an update in the next few months.

For people used to the technology industry’s more traditional way of doing things, that kind of information sharing can seem surprising. But in the field of AI, where academics and industry players are very much intertwined, researchers say this type of openness is becoming more common.

“Traditionally, companies have kept their research in-house. Now, we’re really seeing an industrywide impact where almost every company is publishing papers and trying to move the state of the art forward, rather than moving it into a walled garden,” said Rahul Mehrotra, a program manager at Microsoft’s Montreal-based Maluuba lab, which also has released two other datasets, NewsQA and Frames, in the past year.

Many AI experts say that this more collaborative culture is crucial to advancing the field of AI. They note that many of the early breakthroughs in the field were the result of researchers from competing institutions sharing knowledge and building on each other’s work.

“We can’t have all the ideas on the planet, so if someone else has a great idea and wants to try it out, we can give them a dataset to do that,” said Christian Federmann, a senior program manager with the Microsoft Translator team.

Federmann’s team developed the Microsoft Speech Language Translation Corpus so they and others could test bilingual conversational speech translation systems such as the Microsoft Translator live feature and Skype Translator. The corpus was recently updated with additional language pairs.

Federmann also notes that Microsoft is one of the few big players that has the budget and resources to create high-quality tools and datasets that allow the industry to compare its work.

That’s key to creating the kind of benchmarks that people can use to credibly showcase their achievements. For example, the recent milestones in conversational speech recognition are based on results on the Switchboard corpus.

Rangan Majumder, a partner group program manager within Microsoft’s Bing division, leads development of the MS MARCO machine reading comprehension dataset.

Paying it forward

Many of the teams that are developing datasets and other metrics say they are, in a sense, paying it forward because they also rely on datasets that others have created.

Mehrotra said that when Maluuba was a small startup, it relied heavily on a Microsoft dataset called MCTest. Now, as part of Microsoft, the team has been pleased to see that the datasets it is creating are being used by others in the field.

Devi Parikh, an assistant professor at Georgia Tech and research scientist at Facebook AI Research, said the FigureQA dataset Maluuba recently released is helpful because it allows researchers like herself to work on problems that require the use of multiple types of AI. To accurately read a graphic and answer a question about it requires both computer vision and natural language processing.

“From a research perspective, I think there’s more and more interest in working on problems that are at the intersection of subfields of AI,” she said.

Still, researchers and engineers working in the AI field say that while some information sharing is valuable, there are also times when competing researchers want to be able to compare their systems without revealing all the information about the data they are using.

Doug Orr, a senior software engineering lead with SwiftKey, which Microsoft acquired last year, said his team wanted to create a standard way for measuring how good a job a system does at predicting what a person will type next. That’s a key component of SwiftKey’s systems, which offer personalized predictions based on a person’s communications style.

Instead of sharing a dataset, the team created a set of metrics that researchers can use with any dataset. The metrics, which are available on GitHub, allow researchers to have standardized benchmarks with which they can measure their own improvement and compare their results to others, without having to share proprietary data.
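To make the idea concrete, here is a minimal sketch of one dataset-independent metric, top-k next-word accuracy. This is an illustrative assumption, not necessarily one of the metrics the SwiftKey team published; the names `top_k_accuracy` and `frequency_predictor` are hypothetical. The metric only needs a predictor and a list of (context, next word) pairs, so two teams could report the same number on private data without exchanging that data.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A predictor maps the words typed so far to a ranked list of guesses.
Predictor = Callable[[Sequence[str]], List[str]]


def top_k_accuracy(
    predict: Predictor,
    samples: Iterable[Tuple[Sequence[str], str]],
    k: int = 3,
) -> float:
    """Fraction of samples whose true next word is among the top k guesses."""
    hits = 0
    total = 0
    for context, target in samples:
        if target in predict(context)[:k]:
            hits += 1
        total += 1
    return hits / total if total else 0.0


def frequency_predictor(context: Sequence[str]) -> List[str]:
    """Hypothetical baseline: ignores the context and always guesses common words."""
    return ["the", "to", "and"]


if __name__ == "__main__":
    data = [
        (("i", "am", "going"), "to"),
        (("in",), "the"),
        (("hello",), "world"),
        (("you",), "and"),
    ]
    print(top_k_accuracy(frequency_predictor, data, k=3))
```

Because the function never inspects where the samples came from, the same code can score a system on public or proprietary text.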

Orr said the metrics have benefited the team internally because its members now have a better sense of how much their systems are improving over time, and the metrics have allowed everyone in the field to be more transparent about how they are performing against each other.

Majumder, from the Bing team, says his team sees value in testing their systems with any and all available benchmarks, including internal data they don’t share publicly, datasets they build for public use and ones that others create, such as the SQuAD dataset.

When people join his team from other areas of the company, he says they often have to get used to the fact that they are entering a hybrid area where the team is developing products while also making AI research breakthroughs.

In the field of AI, he says, “what we have is somewhere in between engineering and science.”


Allison Linn is a senior writer at Microsoft.

Tags: AI, Big Data

Neural fuzzing: applying DNN to software security testing

William Blum, Principal Research Engineering Lead. (Photography by Scott Eklund/Red Box Pictures)

Microsoft researchers have developed a new method for discovering software security vulnerabilities that uses machine learning and deep neural networks to help the system root out bugs better by learning from past experience. This new research project, called neural fuzzing, is designed to augment traditional fuzzing techniques, and early experiments have demonstrated promising results.

Software security testing is a hard task that is traditionally done by security experts through costly and targeted code audits, or by using very specialized and complex security tools to detect and assess vulnerabilities in code. We recently released a tool, called Microsoft Security Risk Detection, that significantly simplifies security testing and does not require you to be an expert in security in order to root out software bugs. The Azure-based tool is available to Windows users and in preview for Linux users.

Fuzz testing

The key technology underpinning Microsoft Security Risk Detection is fuzz testing, or fuzzing. It’s a program analysis technique that looks for inputs causing error conditions that have a high chance of being exploitable, such as buffer overflows, memory access violations and null pointer dereferences.

Fuzzers come in different categories:

  • Blackbox fuzzers, also called “dumb fuzzers,” rely solely on the sample input files to generate new inputs.
  • Whitebox fuzzers analyze the target program either statically or dynamically to guide the search for new inputs aimed at exploring as many code paths as possible.
  • Greybox fuzzers, just like blackbox fuzzers, don’t have any knowledge of the structure of the target program, but make use of a feedback loop to guide their search based on observed behavior from previous executions of the program.
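The feedback loop that distinguishes a greybox fuzzer can be sketched in a few lines. The code below is a toy illustration under stated assumptions, not AFL or Microsoft Security Risk Detection code: `toy_target` is a hypothetical program that reports which branches it executed and raises an exception to simulate a crash, and `mutate` overwrites one random byte. A blackbox fuzzer would run the same loop without the step that keeps inputs reaching new coverage.

```python
import random
from typing import Callable, List, Set, Tuple


def toy_target(data: bytes) -> Set[str]:
    """Hypothetical target: returns the branches it executed, raises on a 'crash'."""
    covered = {"start"}
    if data[:1] == b"F":
        covered.add("saw_F")
        if data[1:2] == b"U":
            covered.add("saw_FU")
            if data[2:3] == b"Z":
                raise RuntimeError("simulated crash")
    return covered


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    if not data:
        return bytes([rng.randrange(256)])
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def greybox_fuzz(
    target: Callable[[bytes], Set[str]],
    seeds: List[bytes],
    iterations: int,
    rng_seed: int = 0,
) -> Tuple[List[bytes], Set[str], List[bytes]]:
    """Coverage-guided loop; expects at least one seed input."""
    rng = random.Random(rng_seed)
    queue = list(seeds)
    seen: Set[str] = set()
    crashes: List[bytes] = []
    for seed in queue:  # record the coverage of the sample inputs first
        try:
            seen |= target(seed)
        except RuntimeError:
            crashes.append(seed)
    for _ in range(iterations):
        child = mutate(rng.choice(queue), rng)
        try:
            coverage = target(child)
        except RuntimeError:
            crashes.append(child)
            continue
        if not coverage <= seen:  # feedback: keep inputs that reach new code
            seen |= coverage
            queue.append(child)
    return queue, seen, crashes


if __name__ == "__main__":
    queue, seen, crashes = greybox_fuzz(toy_target, [b"AAAA"], 200_000)
    print(len(queue), sorted(seen), len(crashes))
```

Keeping the inputs that reach new branches is what lets the loop build up the three-byte prefix one byte at a time, rather than having to guess all three bytes at once.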

Figure 1 – Crashes reported by AFL. Experimental support in MSRD

Neural fuzzing

Earlier this year, Microsoft researchers, including Rishabh Singh, Mohit Rajpal and me, began a research project looking at ways to improve fuzzing techniques using machine learning and deep neural networks. Specifically, we wanted to see what a machine learning model could learn if we were to insert a deep neural network into the feedback loop of a greybox fuzzer.

For our initial experiment, we looked at whether a model could learn, over time, from observing past fuzzing iterations of an existing fuzzer.

We applied our methods to a type of greybox fuzzer called American fuzzy lop, or AFL.

We tried four different types of neural networks and ran the experiment on four target programs, using parsers for four different file formats: ELF, PDF, PNG, XML.

The results were very encouraging: we saw significant improvements over traditional AFL in terms of code coverage, unique code paths and crashes for most of the four input formats.

  • The AFL system using deep neural networks based on the long short-term memory (LSTM) neural network model gives around 10 percent improvement in code coverage over traditional AFL for two file parsers: ELF and PNG.
  • When looking at unique code paths, neural AFL discovered more unique paths than traditional AFL for all parsers except PDF. For the PNG parser, after 24 hours of fuzzing it found twice as many unique code paths as traditional AFL.

Figure 2 – Input gain over time (in hours) for the libpng file parser.

  • A good way to evaluate fuzzers is to compare the number of crashes reported. For the ELF file parser, neural AFL reported more than 20 crashes whereas traditional AFL did not report any. This is astonishing given that neural AFL was trained on AFL itself. We also observed more crashes being reported for text-based file formats like XML, where neural AFL could find 38 percent more crashes than traditional AFL. For PDF, traditional AFL performed better overall than neural AFL in terms of new code paths found. However, neither system reported any crashes.

Figure 3 – Reported crashes over time (in hours) for readelf (left) and libxml (right).

Overall, neural fuzzing outperformed traditional AFL in every instance except the PDF case, where we suspect the large size of the PDF files incurs noticeable overhead when querying the neural model.

In general, we believe our neural fuzzing approach yields a novel way to perform greybox fuzzing that is simple, efficient and generic.

  • Simple: The search is not based on sophisticated hand-crafted heuristics; instead, the system learns a strategy from an existing fuzzer. We just give it sequences of bytes and let it figure out all sorts of features and automatically generalize from them to predict which types of inputs are more important than others and where the fuzzer’s attention should be focused.
  • Efficient: In our AFL experiment, in the first 24 hours we explored significantly more unique code paths than traditional AFL. For some parsers we even report crashes not already reported by AFL.
  • Generic: Although we’ve tested it only on AFL, our approach could be applied to any fuzzer, including blackbox and random fuzzers.

We believe our neural fuzzing research project is just scratching the surface of what can be achieved using deep neural networks for fuzzing. Right now, our model only learns fuzzing locations, but we could also use it to learn other fuzzing parameters such as the type of mutation or strategy to apply. We are also considering online versions of our machine learning model, in which the fuzzer constantly learns from ongoing fuzzing iterations.
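As a rough illustration of what learning fuzzing locations could look like in code, the sketch below restricts mutations to byte positions that a scoring function rates highly. Everything here is an assumption made for illustration: the per-byte score interface, the `threshold` parameter, and `header_biased_scores`, which is a hand-written stand-in for a trained model. It is not the implementation used in our experiments.

```python
import random
from typing import Callable, List

# A scoring function rates each byte position of an input, in [0, 1].
ScoreFn = Callable[[bytes], List[float]]


def header_biased_scores(data: bytes) -> List[float]:
    """Hypothetical stand-in for a trained model: favors the first four bytes."""
    return [0.9 if i < 4 else 0.1 for i in range(len(data))]


def guided_mutate(
    data: bytes,
    score_fn: ScoreFn,
    rng: random.Random,
    threshold: float = 0.5,
) -> bytes:
    """Flip one bit at a position the scoring function considers promising."""
    if not data:
        return data
    scores = score_fn(data)
    candidates = [i for i, s in enumerate(scores) if s >= threshold]
    if not candidates:
        # No position clears the threshold: fall back to unguided mutation.
        candidates = list(range(len(data)))
    buf = bytearray(data)
    pos = rng.choice(candidates)
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = b"\x89PNG" + bytes(12)
    print(guided_mutate(sample, header_biased_scores, rng).hex())
```

With this stand-in, every mutation lands in the first four bytes and the rest of the input is left untouched. Learning other parameters, such as the mutation type, would mean replacing the fixed bit flip with a second learned choice.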

William Blum leads the engineering team for Microsoft Security Risk Detection.


Bad Rabbit ransomware data recovery may be possible

Two different security research firms uncovered important information about the Bad Rabbit ransomware attacks, including the motives and a possible way to recover data without paying.

A threat research team from FireEye found a connection between the Bad Rabbit ransomware and “Backswing,” which FireEye described as a “malicious JavaScript profiling framework.” According to the researchers, Backswing has been seen in use in the wild since September 2016 and recently some sites harboring the framework were redirecting to Bad Rabbit distribution URLs.

“Malicious profilers allow attackers to obtain more information about potential victims before deploying payloads (in this case, the Bad Rabbit ‘flash update’ dropper),” FireEye researchers wrote. “The distribution of sites compromised with Backswing suggest a motivation other than financial gain. FireEye observed this framework on compromised Turkish sites and Montenegrin sites over the past year. We observed a spike of Backswing instances on Ukrainian sites, with a significant increase in May 2017. While some sites hosting Backswing do not have a clear strategic link, the pattern of deployment raises the possibility of a strategic sponsor with specific regional interests.”

Researchers added that using Backswing to gather information on targets and the growing number of malicious websites containing the framework could point to “a considerable footprint the actors could leverage in future attacks.”

Bad Rabbit ransomware recovery

Meanwhile, researchers from Kaspersky Lab discovered flaws in the Bad Rabbit ransomware that could give victims a chance to recover encrypted data without paying the ransom.

The Kaspersky team wrote in a blog post that early reports of the Bad Rabbit ransomware leaking the encryption key were false. The team did, however, find a flaw in the code: the malware doesn’t wipe the generated password from memory, which leaves a slim chance to extract it before the process terminates.

However, the team also detailed an easier way to potentially recover files.

“We have discovered that Bad Rabbit does not delete shadow copies after encrypting the victim’s files,” Kaspersky researchers wrote. “It means that if the shadow copies had been enabled prior to infection and if the full disk encryption did not occur for some reason, then the victim can restore the original versions of the encrypted files by the means of the standard Windows mechanism or 3rd-party utilities.”