Tag Archives: research

From search to translation, AI research is improving Microsoft products

The evolution from research to product

It’s one thing for a Microsoft researcher to use all the available bells and whistles, plus Azure’s powerful computing infrastructure, to develop an AI-based machine translation model that can perform as well as a person on a narrow research benchmark with lots of data. It’s quite another to make that model work in a commercial product.

To tackle the human parity challenge, three research teams used deep neural networks and other cutting-edge training techniques that mimic the way people might approach a problem, with the aim of producing more fluent and accurate translations. Those included translating sentences back and forth between English and Chinese and comparing the results, as well as repeating the same translation over and over until its quality improved.

“In the beginning, we were not taking into account whether this technology was shippable as a product. We were just asking ourselves if we took everything in the kitchen sink and threw it at the problem, how good could it get?” Menezes said. “So we came up with this research system that was very big, very slow and very expensive just to push the limits of achieving human parity.”

“Since then, our goal has been to figure out how we can bring this level of quality — or as close to this level of quality as possible — into our production API,” Menezes said.

Someone using Microsoft Translator types in a sentence and expects a translation in milliseconds, Menezes said. So the team needed to figure out how to make its big, complicated research model much leaner and faster. But as they were working to shrink the research system algorithmically, they also had to broaden its reach exponentially — not just training it on news articles but on anything from handbooks and recipes to encyclopedia entries.

To accomplish this, the team employed a technique called knowledge distillation, which involves creating a lightweight “student” model that learns from translations generated by the “teacher” model with all the bells and whistles, rather than the massive amounts of raw parallel data that machine translation systems are generally trained on. The goal is to engineer the student model to be much faster and less complex than its teacher, while still retaining most of the quality.
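
As a rough illustration of that idea, the sketch below shows the basic shape of sequence-level distillation in Python. The names `teacher_translate` and `train_student` are hypothetical stand-ins, not Microsoft Translator internals: the expensive teacher is run once, offline, to generate translations, and the lightweight student is then trained on those outputs instead of the raw parallel data.

```python
# Minimal sketch of teacher-student distillation as described above.
# `teacher_translate` and `train_student` are hypothetical stand-ins,
# not the actual Translator code.

def teacher_translate(source_sentence):
    # Stand-in for the large, slow research ("teacher") model.
    return source_sentence[::-1]  # placeholder "translation"

def train_student(pairs):
    # Stand-in for gradient-based training of the small production ("student") model.
    print(f"training student on {len(pairs)} distilled sentence pairs")

# 1. Run the expensive teacher once, offline, over a source-language corpus.
source_corpus = ["machine translation is useful", "the weather is nice today"]
distilled_pairs = [(src, teacher_translate(src)) for src in source_corpus]

# 2. Train the lean student on the teacher's outputs rather than raw parallel data.
train_student(distilled_pairs)
```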

In one example, the team found that the student model could use a simplified decoding algorithm to select the best translated word at each step, rather than the usual method of searching through a huge space of possible translations.
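
A toy version of that simplified decoding step might look like the following sketch, where a hypothetical `score_fn` stands in for the student model: instead of searching a huge space of partial translations, the decoder keeps only the single highest-scoring token at each step.

```python
# A minimal sketch of greedy decoding, assuming a hypothetical `score_fn`
# that returns a score for every candidate next token given the tokens
# emitted so far. A full system would instead search many partial translations.

def greedy_decode(score_fn, max_len=10, eos="</s>"):
    output = []
    for _ in range(max_len):
        scores = score_fn(tuple(output))      # model call (stand-in)
        best = max(scores, key=scores.get)    # keep only the single best token
        if best == eos:
            break
        output.append(best)
    return output

# Toy scoring function standing in for the student translation model.
def toy_score_fn(prefix):
    canned = ["la", "traduction", "est", "rapide", "</s>"]
    next_tok = canned[len(prefix)] if len(prefix) < len(canned) else "</s>"
    return {tok: (1.0 if tok == next_tok else 0.0) for tok in canned}

print(greedy_decode(toy_score_fn))   # ['la', 'traduction', 'est', 'rapide']
```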

The researchers also developed a different approach to dual learning, which takes advantage of “round trip” translation checks. For example, if a person learning Japanese wants to check and see if a letter she wrote to an overseas friend is accurate, she might run the letter back through an English translator to see if it makes sense. Machine learning algorithms can also learn from this approach.

In the research model, the team used dual learning to improve the model’s output. In the production model, the team used dual learning to clean the data that the student learned from, essentially throwing out sentence pairs that represented inaccurate or confusing translations, Menezes said. That preserved a lot of the technique’s benefit without requiring as much computing.
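
A minimal sketch of that data-cleaning step, assuming a hypothetical back-translation function and a crude token-overlap score in place of a real quality metric, could look like this:

```python
# Sketch of round-trip filtering of training pairs, as described above.
# `backward` is a hypothetical target-to-source translator; token overlap
# stands in for a real translation-quality score.

def overlap(a, b):
    a_tok, b_tok = set(a.lower().split()), set(b.lower().split())
    return len(a_tok & b_tok) / max(len(a_tok | b_tok), 1)

def clean_corpus(pairs, backward, threshold=0.5):
    kept = []
    for src, tgt in pairs:
        back = backward(tgt)                   # translate the target back to the source language
        if overlap(src, back) >= threshold:    # keep pairs that survive the round trip
            kept.append((src, tgt))
    return kept

# Toy usage with a contrived stand-in back-translator.
pairs = [("the cat sat", "le chat s'est assis"), ("good morning", "bonjour")]
print(clean_corpus(pairs, backward=lambda t: "the cat sat" if "chat" in t else "hello"))
# -> keeps only the consistent pair
```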

With lots of trial and error and engineering, the team developed a recipe that allowed the machine translation student model — which is simple enough to operate in a cloud API — to deliver real-time results that are nearly as accurate as the more complex teacher, Menezes said.

Arul Menezes, Microsoft distinguished engineer and founder of Microsoft Translator. Photo by Dan DeLong.

Improving search with multi-task learning

In the rapidly evolving AI landscape, where new language understanding models are constantly introduced and improved upon by others in the research community, Bing’s search experts are always on the hunt for new and promising techniques. Unlike the old days, in which people might type in a keyword and click through a list of links to get to the information they’re looking for, users today increasingly search by asking a question — “How much would the Mona Lisa cost?” or “Which spider bites are dangerous?” — and expect the answer to bubble up to the top.

“This is really about giving the customers the right information and saving them time,” said Rangan Majumder, partner group program manager of search and AI in Bing. “We are expected to do the work on their behalf by picking the most authoritative websites and extracting the parts of the website that actually show the answer to their question.”

To do this, not only does an AI model have to pick the most trustworthy documents, but it also has to develop an understanding of the content within each document, which requires proficiency in any number of language understanding tasks.

Last June, Microsoft researchers were the first to develop a machine learning model that surpassed the estimate for human performance on the General Language Understanding Evaluation (GLUE) benchmark, which measures mastery of nine different language understanding tasks ranging from sentiment analysis to text similarity and question answering. Their Multi-Task Deep Neural Network (MT-DNN) solution employed both knowledge distillation and multi-task learning, which allows the same model to train on and learn from multiple tasks at once and to apply knowledge gained in one area to others.
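
The structural idea behind multi-task learning, a single shared encoder feeding several task-specific heads, can be sketched in a few lines of PyTorch. This is an illustrative toy model, not MT-DNN itself; the tasks, dimensions and GRU encoder are placeholders.

```python
import torch
import torch.nn as nn

# Toy multi-task model: one shared text encoder plus a small head per task.

class SharedEncoder(nn.Module):
    def __init__(self, vocab_size=10000, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)
        _, state = self.encoder(embedded)
        return state.squeeze(0)          # one vector per input sequence

class MultiTaskModel(nn.Module):
    def __init__(self, task_sizes):
        super().__init__()
        self.shared = SharedEncoder()
        # e.g. {"sentiment": 2, "similarity": 1, "answer_selection": 2}
        self.heads = nn.ModuleDict(
            {task: nn.Linear(128, out_dim) for task, out_dim in task_sizes.items()}
        )

    def forward(self, token_ids, task):
        return self.heads[task](self.shared(token_ids))

model = MultiTaskModel({"sentiment": 2, "similarity": 1, "answer_selection": 2})
batch = torch.randint(0, 10000, (4, 12))          # 4 toy sequences of 12 tokens
print(model(batch, task="sentiment").shape)       # torch.Size([4, 2])
```

Because every task shares the same encoder, gradients from each task update the common representation, which is the mechanism by which learning on one task can help the others.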

Bing’s experts this fall incorporated core principles from that research into their own machine learning model, which they estimate has improved answers in up to 26 percent of all questions sent to Bing in English markets. It also improved caption generation — or the links and descriptions lower down on the page — in 20 percent of those queries. Multi-task deep learning led to some of the largest improvements in Bing question answering and captions, which have traditionally been done independently, by using a single model to perform both.

For instance, the new model can answer the question “How much does the Mona Lisa cost?” with a bolded numerical estimate: $830 million. To do that, it first has to know that the word “cost” calls for a number, and it also has to understand the context within the answer well enough to pick today’s estimate over the older value of $100 million in 1962. Through multi-task training, the Bing team built a single model that selects the best answer, decides whether it should trigger at all and determines which exact words to bold.

This screenshot of Bing search results illustrates how natural language understanding research is improving the way Bing answers questions like “How much does the Mona Lisa cost?” A new AI model released this fall understands the language and context of the question well enough to distinguish between the two values in the answer — $100 million in 1962 and $830 million in 2018 — and highlight the more recent value in bold. Image by Microsoft.

Earlier this year, Bing engineers open-sourced their code to pretrain large language representations on Azure. Building on that same code, Bing engineers working on Project Turing developed their own neural language representation, a general language understanding model that is pretrained to understand key principles of language and is reusable for other downstream tasks. It masters these by learning how to fill in the blanks when words are removed from sentences, similar to the popular children’s game Mad Libs.

“You take a Wikipedia document, remove a phrase and the model has to learn to predict what phrase should go in the gap only by the words around it,” Majumder said. “And by doing that it’s learning about syntax, semantics and sometimes even knowledge. This approach blows other things out of the water because when you fine tune it for a specific task, it’s already learned a lot of the basic nuances about language.”
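
The pretraining objective described here can be illustrated with a small data-preparation sketch (hypothetical function and token names, not the Project Turing code): hide part of a sentence and keep the hidden span as the prediction target.

```python
import random

# Sketch of the fill-in-the-blank pretraining objective: mask a span of a
# sentence and ask the model to predict it from the surrounding context.
# Only the data-preparation step is shown; names are illustrative.

def make_cloze_example(sentence, span_len=1, mask_token="[MASK]"):
    tokens = sentence.split()
    start = random.randrange(0, len(tokens) - span_len + 1)
    target = tokens[start:start + span_len]
    masked = tokens[:start] + [mask_token] * span_len + tokens[start + span_len:]
    return " ".join(masked), " ".join(target)

random.seed(0)
masked, target = make_cloze_example("the Mona Lisa hangs in the Louvre in Paris")
print(masked)   # e.g. "the Mona Lisa hangs in the [MASK] in Paris"
print(target)   # the hidden word(s) the model must learn to predict
```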

To teach the pretrained model how to tackle question answering and caption generation, the Bing team applied the multi-task learning approach developed by Microsoft Research to fine tune the model on multiple tasks at once. When a model learns something useful from one task, it can apply those learnings to the other areas, said Jianfeng Gao, partner research manager in the Deep Learning Group at Microsoft Research.

For example, he said, when a person learns to ride a bike, she has to master balance, which is also a useful skill in skiing. Relying on those lessons from bicycling can make it easier and faster to learn how to ski, as compared with someone who hasn’t had that experience, he said.

“In some sense, we’re borrowing from the way human beings work. As you accumulate more and more experience in life, when you face a new task you can draw from all the information you’ve learned in other situations and apply them,” Gao said.

Like the Microsoft Translator team, the Bing team also used knowledge distillation to convert their large and complex model into a leaner model that is fast and cost-effective enough to work in a commercial product.

And now, that same AI model working in Microsoft Search in Bing is being used to improve question answering when people search for information within their own company. If an employee types a question like “Can I bring a dog to work?” into the company’s intranet, the new model can recognize that a dog is a pet and pull up the company’s pet policy for that employee — even if the word dog never appears in that text. And it can surface a direct answer to the question.

“Just like we can get answers for Bing searches from the public web, we can use that same model to understand a question you might have sitting at your desk at work and read through your enterprise documents and give you the answer,” Majumder said.
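
Conceptually, this works because questions and documents are compared in a learned semantic space rather than by keyword overlap. The toy sketch below uses tiny hand-made vectors in place of real learned embeddings, purely to illustrate why a pet-policy page can outrank an unrelated document for a dog question.

```python
import math

# Toy semantic matching: hand-made vectors stand in for learned sentence
# embeddings; the "pet policy" document is deliberately placed near the query.

toy_embeddings = {
    "can i bring a dog to work":        [0.9, 0.1, 0.0],
    "company pet policy":               [0.8, 0.2, 0.1],   # semantically close
    "expense reimbursement guidelines": [0.0, 0.1, 0.9],   # unrelated
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = "can i bring a dog to work"
docs = [k for k in toy_embeddings if k != query]
best = max(docs, key=lambda d: cosine(toy_embeddings[query], toy_embeddings[d]))
print(best)   # "company pet policy"
```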

Top image: Microsoft investments in natural language understanding research are improving the way Bing answers search questions like “How much does the Mona Lisa cost?” Image by Musée du Louvre/Wikimedia Commons. 


Jennifer Langston writes about Microsoft research and innovation. Follow her on Twitter.

Go to Original Article
Author: Microsoft News Center

Microsoft Open Data Project adopts new data use agreement for datasets


Last summer we announced Microsoft Research Open Data—an Azure-based repository-as-a-service for sharing datasets—to encourage the reproducibility of research and make research data assets readily available in the cloud. Among other things, the project started a conversation between the community and Microsoft’s legal team about dataset licensing. Inspired by these conversations, our legal team developed a set of brand new data use agreements and released them for public comment on GitHub earlier this year.

Today we’re excited to announce that Microsoft Research Open Data will be adopting these data use agreements for several datasets that we offer.

Diving a bit deeper on the new data use agreements

The Open Use of Data Agreement (O-UDA) is intended for use by an individual or organization that is able to distribute data for unrestricted uses, and for which there is no privacy or confidentiality concern. It is not appropriate for datasets that include materials subject to privacy laws (such as the GDPR or HIPAA) or other unlicensed third-party materials. The O-UDA meets the open definition: it does not impose any restriction with respect to the use or modification of data other than ensuring that attribution and limitation-of-liability information is passed downstream. In the research context, this implies that users of the data need to cite the corresponding publication with which the data is associated. This aids in the findability and reusability of data, an important tenet in the FAIR guiding principles for scientific data management and stewardship.

We also recognize that in certain cases, datasets useful for AI and research analysis may not be able to be fully “open” under the O-UDA. For example, they may contain third-party copyrighted materials, such as text snippets or images, from publicly available sources. The law permits their use for research, so following the principle that research data should be “as open as possible, as closed as necessary,” we developed the Computational Use of Data Agreement (C-UDA) to make data available for research while respecting other interests. We will prefer the O-UDA where possible, but we see the C-UDA as a useful tool for ensuring that researchers continue to have access to important and relevant datasets.

Datasets that reflect the goals of our project

The following examples reference datasets that have adopted the Open Use of Data Agreement (O-UDA).

Location data for geo-privacy research

Microsoft researcher John Krumm and collaborators collected GPS data from 21 people who carried a GPS receiver in the Seattle area. Users who provided their data agreed to it being shared as long as certain geographic regions were deleted. This work covers key research on privacy preservation of GPS data as evidenced in the corresponding paper, “Exploring End User Preferences for Location Obfuscation, Location-Based Services, and the Value of Location,” which was accepted at the Twelfth ACM International Conference on Ubiquitous Computing (UbiComp 2010). The paper has been cited 147 times, including for research that builds upon this work to further the field of preservation of geo-privacy for location-based services providers.

Hand gestures data for computer vision

Another example dataset is that of labeled hand images and video clips collected by researchers Eyal Krupka, Kfir Karmon, and others. The research addresses an important computer vision and machine learning problem that deals with developing a hand-gesture-based interface language. The data was recorded using depth cameras and has labels that cover joints and fingertips. The two datasets included are FingersData, which contains 3,500 labeled depth frames of various hand poses, and GestureClips, which contains 140 gesture clips (100 of these contain labeled hand gestures and 40 contain non-gesture activity). The research associated with this dataset is available in the paper “Toward Realistic Hands Gesture Interface: Keeping it Simple for Developers and Machines,” which was published in Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems.

Question-Answer data for machine reading comprehension

Finally, the FigureQA dataset, generated by researchers Samira Ebrahimi Kahou, Adam Atkinson, Adam Trischler, Yoshua Bengio and collaborators, introduces a visual reasoning task for research that is specific to graphical plots and figures. The dataset has 180,000 figures with 1.3 million question-answer pairs in the training set. More details about the dataset are available in the paper “FigureQA: An Annotated Figure Dataset for Visual Reasoning” and the corresponding Microsoft Research Blog post. The dataset is pivotal to developing more powerful visual question answering and reasoning models, which could improve the accuracy of AI systems involved in decision making based on charts and graphs.

The data agreements are a part of our larger goals

The Microsoft Research Open Data project was conceived from the start to reflect Microsoft Research’s commitment to fostering open science and research, and to achieve this without compromising the ethics of collecting and sharing data. Our goal is to make it easier for researchers to maintain the provenance of data while having the ability to reference and build upon it.

The addition of the new data agreements to Microsoft Research Open Data’s feature set is an exciting step in furthering our mission.

Acknowledgements: This work would not have been possible without the substantial team effort by Dave Green, Justin Colannino, Gretchen Deo, Sarah Kim, Emily McReynolds, Mario Madden, Emily Schlesinger, Elaine Peterson, Leila Stevenson, Dave Baskin, and Sergio Loscialo.

Go to Original Article
Author: Microsoft News Center

New research reveals a surprising link between the workplace and business success

To help businesses stay a step ahead in the digital age, Microsoft has released new research in partnership with Dr. Michael Parke of the London Business School. The research surveyed 9,000 workers and business leaders across 15 European markets and delved into company growth, employee engagement, leadership styles and technology.

According to the findings, change is the new normal as businesses race to adapt and better compete: 92% of European leaders say their organization has recently undergone a major transformation.

And, the number-one transformation challenge in leaders’ minds is company culture.

Our customers and partners across Europe tell us that keeping up with the pace of digital transformation and innovation is among their chief concerns. But based on our own internal cultural transformation at Microsoft over the past few years, I always encourage business leaders to give as much consideration to company culture as they do to deploying new technology. After all, it’s not just about having the best technology; it’s about how you and your teams react to, and adapt to, change. – Vahé Torossian, Corporate Vice President, Microsoft, and President, Microsoft Western Europe.

The study revealed that getting the workplace culture component right can benefit businesses in a significant way.

Companies that were assessed as having ‘innovative cultures’ – generally defined as cultures where new ideas are embraced and supported – were twice as likely to expect double-digit growth. These businesses also seem positioned to win the war for talent: the majority of workers within these organizations (86%) plan to stay in their jobs, as opposed to 57% of those employees working in less innovative cultures.

There are three key attributes that set these innovative companies apart:

I. Tearing down silos and building bridges

Companies with the most innovative cultures have leaders who are not only tearing down silos, they’re replacing them with partnerships and transparency. These leaders are more likely to see effective collaboration as vital for business growth – whether it’s within teams, across teams, or with customers and partners.

Among leaders of highly innovative cultures:

  • 86 percent said collaboration within their teams is very important for future business growth, compared to 70 percent in less innovative cultures.
  • 86 percent said internal collaboration across teams is very important to growth, compared to 72 percent of leaders in less innovative businesses.
  • 79 percent said collaborating externally with their partners is vital for growing their business, compared to just 54 percent of their counterparts in lower-innovation companies.

II. Empowering teams and creating a learning culture

The research shows that in the most innovative companies, leaders are focused on mobilizing their teams and empowering them.

In the most innovative companies, 73 percent of workers say their teams can choose how they approach the work – with only 45 percent of workers in low-innovation workplaces feeling that way. Further, approximately twice as many people in high-innovation workplaces feel empowered to make decisions without a manager’s approval, compared to employees in low-innovation companies.

Finally, nearly three in four employees say their leaders create a culture where it’s OK to make mistakes, compared to just half of the employees in lower-innovation companies.

Profound growth requires innovation, and to foster innovation you need people to feel trusted and supported to experiment and learn. There can be real returns for leaders who learn to let go and coach teams to constantly improve. – Dr. Parke.

III. Protect attention and promote flow

Workers report feeling like they waste 52% of their time each week due to things like unproductive meetings and emails, unnecessary interruptions, and time taken to track down information.

The study suggests that a combination of having the right physical environment, tech tools and a manager who supports diverse ways of working can cut this sub-optimal time in half.

However, the data from the study highlights a greater opportunity than simply helping people be more productive. There is also a significant opportunity to bolster employee engagement. When people are able to devote all of their attention and energy to a particular task, they can work in a flow state – sometimes known as being ‘in the zone.’ Employees who can work this way – at least some of the time – were three times more likely to say they were happy in their jobs.

A working culture that values empowerment and autonomy appears to have an advantage in terms of people being able to work in a flow state: 72 percent of employees who report that they are able to work in a flow state say their teams can choose how they approach work. In workplaces with low states of flow, only half of workers feel similarly.

In short: the business leaders who will succeed tomorrow are not thinking about how to make their workforce more productive – they are focused on helping their people be more innovative.

Any business leader knows that innovation is the key to growth or survival. The challenge, however, is how to establish a culture that consistently innovates, again and again, to avoid getting left behind. – Dr. Parke.

Go to Original Article
Author: Microsoft News Center

Bringing together deep bioscience and AI to help patients worldwide: Novartis and Microsoft work to reinvent treatment discovery and development   – The Official Microsoft Blog

In the world of commercial research and science, there’s probably no undertaking more daunting – or more expensive – than the process of bringing a new medicine to market. For a new compound to make it from initial discovery through development, testing and clinical trials to finally earn regulatory approval can take a decade or more. Nine out of 10 promising drug candidates fail somewhere along the way. As a result, on average, it costs life sciences companies $2.6 billion to introduce a single new prescription drug.

This is much more than just a challenge for life sciences companies. Streamlining drug development is an urgent issue for human health more broadly. From uncovering new ways to treat age-old diseases like malaria, which still kills hundreds of thousands of people every year, to finding new cancer treatments or developing new vaccines to prevent highly contagious diseases from turning into global pandemics, the impact in lives saved worldwide would be enormous if we could make the invention of new medicines faster.

As announced today, this is why Novartis and Microsoft are collaborating to explore how to take advantage of advanced Microsoft AI technology combined with Novartis’ deep life sciences expertise to find new ways to address the challenges underlying every phase of drug development – including research, clinical trials, manufacturing, operations and finance. In a recent interview, Novartis CEO Vas Narasimhan spoke about the potential for this alliance to unlock the power of AI to help Novartis accelerate research into new treatments for many of the thousands of diseases for which there is, as yet, no known cure.

In the biotech industry, there have been amazing scientific advances in recent years that have the potential to revolutionize the discovery of new, life-saving drugs. Because many of these advances are based on the ability to analyze huge amounts of data in new ways, developing new drugs has become as much an AI and data science problem as it is a biology and chemistry problem. This means companies like Novartis need to become data science companies to an extent never seen before. Central to our work together is a focus on empowering Novartis associates at each step of drug development to use AI to unlock the insights hidden in vast amounts of data, even if they aren’t data scientists. That’s because while the exponential increase in digital health information in recent years offers new opportunities to improve human health, making sense of all the data is a huge challenge.

The issue isn’t just one of overwhelming volume. Much of the information exists in the form of unstructured data, such as research lab notes, medical journal articles and clinical trial results, all of which is typically stored in disconnected systems. This makes bringing all that data together extremely difficult.

Our two companies have a dream. We want all Novartis associates – even those without special expertise in data science – to be able to use Microsoft AI solutions every day to analyze large amounts of information and discover new correlations and patterns critical to finding new medicines. The goal of this strategic collaboration is to make this dream a reality. It offers the potential to empower everyone from researchers exploring the potential of new compounds and scientists figuring out dosage levels, to clinical trial experts measuring results, operations managers seeking to run supply chains more efficiently, and business teams looking to make more effective decisions.

And as associates work on new problems and develop new AI models, they will continually build on each other’s work, creating a virtuous cycle of exploration and discovery. The result? Pervasive intelligence that spans the company and reaches across the entire drug discovery process, improving Novartis’ ability to find answers to some of the world’s most pressing health challenges.

As part of our work with Novartis, data scientists from Microsoft Research and research teams from Novartis will also work together to investigate how AI can help unlock transformational new approaches in three specific areas. The first is about personalized treatment for macular degeneration – a leading cause of irreversible blindness. The second will involve exploring ways to use AI to make manufacturing new gene and cell therapies more efficient, with an initial focus on acute lymphoblastic leukemia. And the third area will focus on using AI to shorten the time required to design new medicines, using pioneering neural networks developed by Microsoft to automatically generate, screen and select promising molecules. As our work together moves forward, we expect that the scope of our joint research will grow.

At Microsoft, we’re excited about the potential for this collaboration to transform R&D in life sciences. As Microsoft CEO Satya Nadella explained, putting the power of AI in the hands of Novartis employees will give the company unprecedented opportunities to explore new frontiers of medicine that will yield new life-saving treatments for patients around the world.

While we’re just at the beginning of a long process of exploration and discovery, this strategic alliance marks the start of an important collaborative effort that promises to have a profound impact on how breakthrough medicines and treatments are developed and delivered. With the depth and breadth of knowledge that Novartis offers in bioscience and Microsoft’s unmatched expertise in computer science and AI, we have a unique opportunity to reinvent the way new medicines are created. Through this process, we believe we can help lead the way forward toward a world where high-quality treatment and care is significantly more personal, more effective, more affordable and more accessible.


Go to Original Article
Author: Steve Clarke

Cute but vulnerable: Scientists to use drones, cloud, and AI to protect Australia’s Quokkas – Asia News Center

Microsoft AI for Earth boosts DNA research for species at risk of extinction

The quokkas of Rottnest Island hop about and raise their babies in pouches – just like mini kangaroos. They have chubby cheeks, pointy ears, big brown eyes, and tiny mouths that always seem to smile. As far as furry little critters go, they have real star power.

But being super-cute doesn’t mean being safe.

The International Union for Conservation of Nature (IUCN) has classified the quokka as “vulnerable” on its Red List of 28,000 species threatened with extinction.

Scientists want to know more about these animals and are turning to digital technologies to help find out. Their initial focus is on Rottnest, a small island just off the coast from Western Australia’s state capital, Perth.

It is one of the few places where quokkas are doing well. But unlike the hundreds of thousands of day-trippers who go there every year, the researchers aren’t taking selfies with the friendly, cat-sized marsupials.

Instead, they’re after quokka “scat” – a polite biological term for their droppings. More precisely, they want to study the DNA that those droppings contain.


Microsoft recently awarded an AI for Earth Compute Grant to the University of Western Australia (UWA) to study quokkas with new methods that could accelerate research into other threatened and endangered species around the world.

The UWA team is planning to trial a program to monitor at-risk species in faster and cheaper ways using specially designed “scat drones” along with high-powered cloud computing.

“A scat drone has a little probe attached to it for DNA analysis. It can go and look for scat samples around the island, and analyze them in real-time for us,” says UWA Associate Professor Parwinder Kaur who is leading the research. This initial information can then be sequenced and analyzed further in the cloud with the help of machine learning and artificial intelligence.

Jennifer Marsman, principal engineer on Microsoft’s AI for Earth program, and UWA Associate Professor Parwinder Kaur, director of the Australian DNA Zoo

The quokka project is part of an initiative by DNA Zoo, a global organization made up of more than 55 collaborators in eight countries. It aims to use new digital technologies and scientific rigor to facilitate conservation efforts to help slow, and perhaps one day halt, extinction rates around the world.

The United Nations estimates that around 1 million plant and animal species are now at risk of dying out. Scientists want to prevent that catastrophe by better understanding the complex forces that drive extinctions. To do that, they need lots of data. Just as importantly, they need ways to process and analyze that data on a massive scale.

“It’s a classic big data challenge,” explains Dr. Kaur, who is also a director of DNA Zoo’s Australian node. “The genome of a single mammal may run to 3.2 gigabytes (GB).

“To properly understand the genome, it needs to be read 50 times – creating a 172 GB data challenge for a single animal. Multiply that challenge across entire populations of threatened species and the scale of the computing and analysis problem is clear.

“By using supercomputing power and also Microsoft cloud, artificial intelligence, and machine learning, we hope to automate and accelerate genome assemblies and subsequent analyses.”

With their AI for Earth grant, DNA Zoo will use the cloud to democratize genome assemblies worldwide. It will also come up with insights to help protect and preserve species that are now at risk.

Dr. Kaur (center) with team members in her lab at the University of Western Australia.

Importantly, data collected through the DNA Zoo program is open-source. When it is shared with other open-source data collections, machine learning can search for patterns that, in turn, can reveal new insights into the health and condition of species populations.

This sort of comparative genomics means scientists can study the DNA of a threatened species or population alongside those which appear to thrive in the same or similar habitats. Ultimately, that will help researchers learn more about how to slow or reverse population decline.

Among other things, the researchers will be looking for genetic clues that might help explain why quokkas thrive on Rottnest but struggle on the Western Australian mainland, just 22 kilometers (13.6 miles) away.

Before Europeans started settling this part of Australia less than two centuries ago, quokkas were common across much of the bottom end of the state. But today’s mainland populations have dropped dramatically. The species now survives only in small, scattered mainland locations and on two offshore islands, including Rottnest, where it is out of reach of dangers such as introduced predators – wild cats, dogs and foxes – and habitat loss from urbanization and agriculture.

Michelle Reynolds, Executive Director of the Rottnest Island Authority, says the island’s quokka population is a much-loved conservation target. “We welcome the support from Microsoft, DNA Zoo, and UWA that will add to our knowledge of the quokka in ensuring its ongoing survival,” she says.

A species with star power: A quokka on Rottnest Island.

Studying the quokka is just the start for DNA Zoo Australia, which plans to focus its efforts on the country’s top 40 most threatened mammals.

“In the last 200 years, we’ve lost more than 30 species,” says Dr. Kaur. “It’s critical that we act now and join hands with the global initiatives where we can empower our genetically and developmentally unique Australian species with genomic resources.”

Jennifer Marsman, principal engineer on Microsoft’s AI for Earth program, argues that “preserving biodiversity is one of the most important challenges facing scientists today.”

“By putting AI in the hands of researchers and organizations, we can use important data insights to help solve important issues related to water, agriculture, biodiversity, and climate change,” she says.

“AI for Earth is more than just grants. Microsoft is helping to bring transformative solutions to commercial scale and offering open-source API solutions to help organizations everywhere boost their impact.”

Go to Original Article
Author: Microsoft News Center

Addressing the coming IoT talent shortage – Microsoft Industry Blogs

This blog is the third in a series highlighting our newest research, IoT Signals. Each week will feature a new top-of-mind topic to provide insights into the current state of IoT adoption across industries, how business leaders can develop their own IoT strategies, and why companies should use IoT to improve service to partners and customers.

As companies survey the possibilities of the Internet of Things (IoT), one of the challenges they face is a significant and growing talent shortage. Recent research from Microsoft, IoT Signals, drills down into senior leaders’ concerns and plans. Microsoft surveyed 3,000 decision-makers involved in IoT at companies across China, France, Germany, Japan, the United States, and the United Kingdom.

Exploring IoT skills needs at enterprises today

Most IoT challenges today relate to staffing and skills. Our research finds that only 33 percent of companies adopting IoT say they have enough workers and resources, 32 percent lack enough workers and resources, and 35 percent reported mixed results or were unsure about their resourcing. Worldwide, talent shortages are most acute in the United States (37 percent) and China (35 percent).

Of the top challenges that impede the 32 percent of companies struggling with IoT skills shortages, respondents cited a lack of knowledge (40 percent), technical challenges (39 percent), lack of budget (38 percent), an inability to find the right solutions (28 percent), and security (19 percent).


Companies will need to decide which capabilities they should buy, in the form of hiring new talent; build, in the form of developing staff competencies; or outsource, in the form of developing strategic partnerships. For example, most companies evaluating the IoT space aren’t software development or connectivity experts and will likely turn to partners for these services.

Adequate resourcing is a game-changer for IoT companies

Our research found that having the right team and talent was critical to IoT success on a number of measures. First, those with sufficient resources were more likely to say that IoT was very critical to their company’s future success: 51 percent versus 39 percent. Hardship created more ambivalence, with only 41 percent of IoT high performers saying IoT was somewhat critical to future success, whereas 48 percent of lower-performing companies agreed.

Similarly, companies with strong IoT teams viewed IoT as a more successful investment, attributing 28 percent of current ROI to IoT (inclusive of cost savings and efficiencies) versus 20 percent at less enabled companies. That’s likely why 89 percent of those who have the right team are planning to use IoT more in the future, versus 75 percent of those who lack adequate resources.

IoT talent shortage may cause higher failure rate

Getting IoT off the ground can be a challenge for any company, given its high learning curve, long-term commitment, and significant investment. It’s doubly so for companies that lack talent and resources. IoT Signals found that companies that lack adequate talent and resources have a higher failure rate in the proof of concept phase: 30 percent versus 25 percent for those with the right team. At companies with high IoT success, the initiative is led by a staffer in an IT role, such as a director of IT, a chief technology officer, or a chief information officer. With leadership support, a defined structure, and budget, these all-in IoT organizations are able to reach the production stage in an average of nine months, while those who lack skilled workers and resources take 12 months on average.

Despite initial challenges, company leaders are unlikely to call it quits. Business and technology executives realize that IoT is a strategic business imperative and will be increasingly required to compete in the marketplace. Setting up the right team, tools, and resources now can help prevent team frustration, business burnout, and leadership commitment issues.

Overcoming the skills issues with simpler platforms

Fortunately, industry trends like fully hosted SaaS platforms are reducing the complexity of building IoT programs: from connecting and managing devices to providing integrated tooling and security, to enabling analytics.

Azure IoT Central, a fully managed IoT platform, is designed to let anyone build an IoT initiative within hours, empowering business teams and other non-technical individuals to easily gain mastery and contribute. Azure includes IoT Plug and Play, which provides an open modeling language to connect IoT devices to the cloud seamlessly.

Additionally, Microsoft is working with its partner ecosystem to create industry-specific solutions to help companies overcome core IoT adoption blockers and investing in training tools like IoT School and AI Business School. Microsoft has one of the largest and fastest-growing partner ecosystems. Our more than 10,000 IoT partners provide domain expertise across industries and help address connectivity, security infrastructure, and application infrastructure requirements, allowing companies to drive to value faster. 

Learn more about how global companies are using IoT to drive value by downloading the IoT Signals report and reading our Transform Blog on the IoT projects that companies such as ThyssenKrupp, Bühler, Chevron, and Toyota Material Handling Group are driving.

Go to Original Article
Author: Microsoft News Center

WannaMine cryptojacker targets unpatched EternalBlue flaw

New research detailed successful cryptojacking attacks by WannaMine malware after almost one year of warnings about this specific cryptominer and more than a year and a half of warnings about the EternalBlue exploit.

The Cybereason Nocturnus research team and Amit Serper, head of security research for the Boston-based cybersecurity company, discovered a new outbreak of the WannaMine cryptojacker, which the researchers said gains access to computer systems “through an unpatched [Server Message Block, or SMB] service and gains code execution with high privileges” to spread to more systems.

Serper noted in a blog post that neither WannaMine nor the EternalBlue exploit are new, but they are still taking advantage of those unpatched SMB services, even though Microsoft patched against EternalBlue in March 2017.

“Until organizations patch and update their computers, they’ll continue to see attackers use these exploits for a simple reason: they lead to successful campaigns,” Serper wrote in the blog post. “Part of giving the defenders an advantage means making the attacker’s job more difficult by taking steps to boost an organization’s security. Patching vulnerabilities, especially the ones associated with EternalBlue, falls into this category.”
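
For defenders who want a very rough first check, the snippet below simply tests whether the SMB port is reachable on a host they own. It is a reachability probe only; it does not test for EternalBlue or confirm whether the MS17-010 patches are installed, and the address shown is a placeholder.

```python
import socket

# Minimal reachability check for the SMB port (445/tcp) on a host you own.
# This only shows whether SMB is exposed; it does NOT detect EternalBlue
# or verify patch status.

def smb_port_open(host, timeout=2.0):
    try:
        with socket.create_connection((host, 445), timeout=timeout):
            return True
    except OSError:
        return False

print(smb_port_open("192.0.2.10"))   # documentation/test address, illustrative only
```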


The EternalBlue exploit was famously part of the Shadow Brokers dump of National Security Agency cyberweapons in April 2017; less than one month later, the WannaCry ransomware was sweeping the globe and infecting unpatched systems. However, that was only the beginning for EternalBlue.

EternalBlue was added into other ransomware, like GandCrab, to help it spread faster. It was also used by the Petya variant NotPetya to spread. And there were constant warnings for IT to patch vulnerable systems.

WannaMine was first spotted in October 2017 by Panda Security. And in January 2018, Sophos warned users that WannaMine was still active and preying on unpatched systems. According to researchers at ESET, the EternalBlue exploit saw a spike in use in April 2018.

Jake Williams, founder and CEO of Rendition Infosec, based in Augusta, Ga., said there are many ways threat actors may use EternalBlue in attacks.

“It is fair to say that any unpatched system with SMB exposed to the internet has been compromised repeatedly and is definitely infected with one or more forms of malware,” Williams wrote via Twitter direct message. “Cryptojackers are certainly one risk for these systems. These systems don’t have much power for crypto-mining (most lack dedicated GPUs), but when compromised en-masse they can generate some profit for the attacker. More concerning in some cases are the use of these systems for malware command and control servers and launching points for other attacks.”

Putting the cloud under the sea with Ben Cutler – Microsoft Research


Ben Cutler from Microsoft Research. Photo by Maryatt Photography.

Episode 40, September 5, 2018

Data centers have a hard time keeping their cool. Literally. And with more and more data centers coming online all over the world, calls for innovative solutions to “cool the cloud” are getting loud. So, Ben Cutler and the Special Projects team at Microsoft Research decided to try to beat the heat by using one of the best natural venues for cooling off on the planet: the ocean. That led to Project Natick, Microsoft’s prototype plan to deploy a new class of eco-friendly data centers, under water, at scale, anywhere in the world, from decision to power-on, in 90 days. Because, presumably for Special Projects, go big or go home.

In today’s podcast we find out a bit about what else the Special Projects team is up to, and then we hear all about Project Natick and how Ben and his team conceived of, and delivered on, a novel idea to deal with the increasing challenges of keeping data centers cool, safe, green, and, now, dry as well!



Episode Transcript

Ben Cutler: In some sense we’re not really solving new problems. What we really have here is a marriage of these two mature industries. One is the IT industry, which Microsoft understands very well. And then the other is a marine technologies industry. So, we’re really trying to figure out how do we blend these things together in a way that creates something new and beneficial?

(music plays)

Host: You’re listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. I’m your host, Gretchen Huizinga.

Host: Data centers have a hard time keeping their cool. Literally. And with more and more data centers coming online all over the world, calls for innovative solutions to “cool the cloud” are getting loud. So, Ben Cutler and the Special Projects team at Microsoft Research decided to try to beat the heat by using one of the best natural venues for cooling off on the planet: the ocean. That led to Project Natick, Microsoft’s prototype plan to deploy a new class of eco-friendly data centers, under water, at scale, anywhere in the world, from decision to power-on, in 90 days. Because, presumably for Special Projects, go big or go home.

In today’s podcast we find out a bit about what else the Special Projects team is up to, and then we hear all about Project Natick, and how Ben and his team conceived of, and delivered on, a novel idea to deal with the increasing challenges of keeping data centers cool, safe, green, and, now, dry as well! That and much more on this episode of the Microsoft Research Podcast.

Host: Ben Cutler. Welcome to the podcast.

Ben Cutler: Thanks for having me.

Host: You’re a researcher in Special Projects at MSR. Give us a brief description of the work you do. In broad strokes, what gets you up in the morning?

Ben Cutler: Well, so I think Special Projects is a little unusual. Rather than have a group that always does the same thing persistently, it’s more based on this idea of projects. We find some new idea, something, in our case, that we think is materially important to the company, and go off and pursue it. And it’s a little different in that we aren’t limited by the capabilities of the current staff. We’ll actually go out and find partners, whether they be in academia or very often in industry, who can kind of help us grow and stretch in some new direction.

Host: How did Special Projects come about? Has it always been “a thing” within Microsoft Research, or is it a fairly new idea?

Ben Cutler: Special Projects is a relatively new idea. In early 2014, my manager, Norm Whitaker, who’s a managing scientist inside Microsoft Research, was recruited to come here. Norm had spent the last few years of his career at DARPA, the Defense Advanced Research Projects Agency, which has a very long history in the United States, and a lot of seminal technology achievements, not just on the defense side, where we see things like stealth, but also on the commercial or consumer side, had their origins in DARPA. And so, we’re trying to bring some of that culture here into Microsoft Research and a willingness to go out and pursue crazy things and a willingness not just to pursue new types of things, but things that are in areas that historically we would never have touched as a company, and just be willing to crash into some new thing and see if it has value for us.

Host: So, that seems like a bit of a shift from Microsoft, in general, to go in this direction. What do you think prompted it, within Microsoft Research to say, “Hey let’s do something similar to DARPA here?”

Ben Cutler: I think if you look more broadly at the company, with Satya, we have this very different perspective, right? Which is, not everything is based on what we’ve done before. And a willingness to really go out there and draw in things from outside Microsoft and new ideas and new concepts in ways that we’ve never done, I think, historically as a company. And this is in some sense a manifestation of this idea of, you know, what can we do to enable every person in every organization on the planet to achieve more? And a part of that is to go out there and look at the broader context of things and what kind of things can we do that might be new that might help solve problems for our customers?

Host: You’re working on at least two really cool projects right now, one of which was recently in the news and we’ll talk about that in a minute. But I’m intrigued by the work you’re doing in holoportation. Can you tell us more about that?

Ben Cutler: If you think about what we typically do with a camera, we’re capturing this two-dimensional information. One stage beyond that is what’s called a depth camera, which is, in addition to capturing color information, it captured the distance to each pixel. So now I’m getting a perspective and I can actually see the distance and see, for example, the shape of someone’s face. Holoportation takes that a step further where we’ll have a room that we outfit with, say, several cameras. And from that, now, I can reconstruct the full, 3-D content of the room. So, you can kind of think of this as, I’m building a holodeck. And so now you can imagine I’m doing a video conference, or, you know, something as simple as like Facetime, but rather than just sort of getting that 2-D, planar information, I can actually now wear a headset and be in some immersive space that might be two identical conferences rooms in two different locations and I see my local content, but I also see the remote content as holograms. And then of course we can think of other contexts like virtual environments, where we kind of share across different spaces, people in different locations. Or even, if you will, a broadcast version of this. So, you can imagine someone’s giving a concert. And now I can actually go be at that concert even if I’m not there. Or think about fashion. Imagine going to a fashion show and actually being able to sit in the front row even though I’m not there. Or, everybody gets the front row seats at the World Cup soccer.

Host: Wow. It’s democratizing event attendance.

Ben Cutler: It really is. And you can imagine I’m visiting the Colosseum and a virtual tour guide appears with me as I go through it and can tell me all about that. Or some, you know, awesome event happens at the World Cup again, and I want to actually be on the soccer field where that’s happening right now and be able to sort of review what happened to the action as though I was actually there rather than whatever I’m getting on television.

Host: So, you’re wearing a headset for this though, right?

Ben Cutler: You’d be wearing an AR headset. For some of the broadcast things you can imagine not wearing a headset. It might be I’ve got it on my phone and just by moving my phone around I can kind of change my perspective. So, there’s a bunch of different ways that this might be used. So, it’s this interesting new capture technology. Much as HoloLens is a display, or a viewing technology, this is the other end, capture, and there’s different ways we can kind of consume that content. One might be with a headset, the other might just be on a PC using a mouse to move around much as I would on a video game to change my perspective or just on a cell phone, because today, there’s a relatively small number of these AR/VR headsets but there are billions of cell phones.

Host: Right. Tell me what you’re specifically doing in this project?

Ben Cutler: In the holoportation?

Host: Yeah.

Ben Cutler: So, really what’s going on right now is, when this project first started to outfit a room, to do this sort of a thing, might’ve been a couple hundred thousand dollars of cost, and it might be 1 to 3 gigabits of data between sites. So, it’s just not really practical, even at an enterprise level. And so, what we’re working on is, with the HoloLens team and other groups inside the company, to really sort of dramatically bring down that cost. So now you can imagine you’re a grandparent and you want to kind of play with your grandkids who are in some other location in the world. So, this is something that we think, in the next couple years, actually might be at the level the consumers can have access to this technology and use it every day.

Host: This is very much in the research stage, though, right?

Ben Cutler: We have an email address and we hear from people every day, “How do I buy this? How can I get this?” And you know, it’s like, “Hey, here’s our website. It’s just research right now. It’s not available outside the company. But keep an eye on this because maybe that will change in the future.”

Host: Yeah. Yeah, and that is your kind of raison d’etre is to bring these impossibles into inevitables in the market. That should be a movie. The Inevitables.

Ben Cutler: I think there’s something similar to that, but anyway…

Host: I think a little, yeah. So just drilling a little bit on the holoportation, what’s really cool I noticed on the website, which is still research, is moving from a room-based hologram, or holoported individual, into mobile holoportation. And you’ve recently done this, at least in prototype, in a car, yes?

Ben Cutler: We have. So, we actually took an SUV. We took out the middle seat. And then we mounted cameras in various locations. Including, actually, the headrests of the first-row passengers. So that if you’re sitting in that back row we could holoport you somewhere. Now this is a little different than, say, that room-to-room scenario. You can imagine, for example, the CEO of our company can’t make a meeting in person, so he’ll take it from the car. And so, the people who are sitting in that conference room will wear an AR headset like a HoloLens. And then Satya would appear in that room as though he’s actually there. And then from Satya’s perspective, he’d wear a VR headset, right? So, he would not be sitting in his car anymore. He would be holoported into that conference room.

(music plays)

Host: Let’s talk about the other big project you’re doing: Project Natick. You basically gave yourself a crazy list of demands and then said, “Hey, let’s see if we can do it!” Tell us about Project Natick. Give us an overview. What it is, how did it come about, where it is now, what does it want to be when it grows up?

Ben Cutler: So, Project Natick is an exploration of manufactured data centers that we place underwater in the ocean. And so, the genesis of this is kind of interesting, because it also shows not just research trying to influence the rest of the company, but that if you’re working elsewhere inside Microsoft, you can influence Microsoft Research. So, in this case, go back to 2013, and a couple employees, Sean James and Todd Rawlings, wrote this paper that said we should put data centers in the ocean and the core idea was, the ocean is a place where you can get good cooling, and so maybe we should look at that for data centers. Historically, when you look at data centers, the dominant cost, besides the actual computers doing the work, is the air conditioning. And so, we have this ratio in the industry called PUE, or Power Utilization Effectiveness. And if you go back a long time ago to data centers, PUEs might be as high as 4 or 5. A PUE of 5 says that, for every watt of power for computers, there’s an additional 4 watts for the air conditioning, which is just kind of this crazy, crazy thing. And so, industry went through this phase where we said, “OK, now we’re going to do this thing called hot aisle/cold aisle. We line up all the computers in a row, and cold air comes in one side and hot air goes out the other.” Now, modern data centers that Microsoft builds have a PUE of about 1.125. And the PUE we see of what we have right now in the water is about 1.07. So, we have cut the cooling cost. But more importantly we’ve done it in a way that we’ve made the data center much colder. So, we’re about 10-degrees Celsius cooler than land data centers. And we’ve known, going back to the middle of the 20th century, that higher temperatures are a problem for components and in fact, a factor of 10-degree Celsius difference can be a factor of 2 difference of the life expectancy of equipment. So, we think that this is one way to bring reliability up a lot. So, this idea of reliability is really a proxy for server longevity and how do we make things last longer? In addition to cooling, there’s other things that we have here. One of which is the atmosphere inside this data center is dry nitrogen atmosphere. So, there’s no oxygen. And the humidity is low. And we think that helps get rid of corrosion. And then the other thing is, data centers we get stuff comes from outside. So, by having this sealed container, safe under the ocean we hopefully have this environment that will allow servers to last much longer.
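
(As an aside, the arithmetic behind those PUE figures is simple to sketch: PUE is total facility power divided by the power drawn by the IT equipment itself, so the cooling-and-overhead cost per watt of compute is just PUE minus one. The numbers below are the ones Cutler cites; the snippet is purely illustrative.)

```python
# Illustrative PUE arithmetic: PUE = total facility power / IT equipment power,
# so (PUE - 1) is the overhead spent on cooling and power delivery per watt
# of compute. Figures match those quoted in the conversation above.

def overhead_watts_per_it_watt(pue):
    return pue - 1.0

for label, pue in [("older data center", 5.0),
                   ("modern Microsoft land data center", 1.125),
                   ("Project Natick underwater vessel", 1.07)]:
    print(f"{label}: PUE {pue} -> {overhead_watts_per_it_watt(pue):.3f} W overhead per W of IT load")
```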

Host: How did data center technology and submarine technology come together so that you could put the cloud under water?

Ben Cutler: Natick is a little bit unusual as a research project because in some sense we’re not really solving new problems. What we really have here is a marriage of these two mature industries. One is the IT industry, which Microsoft understands very well. And then the other is a marine technologies industry. So, we’re really trying to figure out, how do we blend these things together in a way that creates something new and beneficial?

Host: And so, the submarine technology, making something watertight and drawing on the decades that people have done underwater things, how did you bring that together? Did you have a team of naval experts…?

Ben Cutler: So, the first time we did this, we just, sort of, crashed into it. We literally just built this can, and we just kind of dropped it in the water, and, OK, we can do this, it kind of works. And so, the second time around, we put out what we call a Request for Information: we’re thinking of doing this thing, and we sent it to government and to academia and to industry, just to see who’s interested in playing in this space. What do they think about it? What kind of approaches would they take? And you know, we’re Microsoft. We don’t really know anything about the ocean, so we identified a bunch of folks we think do know about it. On the industry side we really looked at three different groups: we looked to shipbuilders, we looked to people who were doing renewable energy in the ocean, which we should come back to, and then we looked to the oil and gas services industry. And so, we got their responses and, on the basis of that, we crafted a Request for Proposal to actually go off and do something with us. That identified what kind of equipment we’d put inside it, what our requirements were in terms of how we thought this would work, how cool it had to be, the operating environment that needed to be provided for the servers, and also some more mundane stuff like, when you’re shipping it, what’s the maximum temperature things can get to when it’s, like, sitting in the sun on a dock somewhere? On the basis of that, we got a couple dozen proposals from four different continents. And so, we chose a partner and then set forward. In part, we were working with the University of Washington Applied Physics Lab, which is one of three centers of excellence for ocean sciences in the United States, along with Woods Hole and Scripps. And so, we leveraged that capability to help us go through the selection process. And then the company we chose to work with is a company called Naval Group, which is a French company. Among other things, they do naval nuclear submarines and surface ships, but they also do renewable energies, and in particular renewable energies in the ocean: offshore wind, tidal energy, which is to say gaining energy from the motion of the tides, as well as something called OTEC, which is Ocean Thermal Energy Conversion. So, they have a lot of expertise in renewable energy, which is very interesting to us, because another aspect of this that we like is this idea of co-location with offshore renewable energies. The idea is, rather than connecting to the grid, I might connect to renewable energy sources that get placed in the same location where we put this. That’s actually not a new idea for Microsoft. We have data centers that are built near hydroelectric dams or near windfarms in Texas. So, we like this idea of renewable energy. And as we think about data centers in the ocean, it’s kind of a normal thing, in some sense, that this idea of renewables would go with us.

Host: You mentioned the groups that you reached out to. Did you have any conversations with environmental groups about how this might impact sea life or the ocean itself?

Ben Cutler: So, we care a lot about that. We like the idea of co-location with the offshore renewables, not just for the sustainability aspects, but also for the fact that a lot of those things are going up near large population centers. So, it’s a way to get close to customers. We’re also interested in other aspects of sustainability, and those include things like artificial reefs. We’ve actually filed a patent application having to do with using undersea data centers, potentially, as artificial reefs.

Host: So, as you look to maybe scaling up… Say this thing, in your 5-year experiment, does really well, and you say, “Hey, we’re going to deploy more of these.” Are you looking, then, with the sustainability goggles on, so to speak, at Natick staying green both for customers and for the environment itself?

Ben Cutler: We are. And I think one thing people should understand, too, is that you look out at the ocean and it looks like this big, vast open space, but in reality it’s actually very carefully regulated. So anywhere we go, there are always authorities and rules as to what you can do and how you do it, so there’s that oversight. And there are also things that we look at directly, ourselves. One of the things that we like about these, from a recyclability standpoint, is that it’s a pretty simple structure. Every five years, we bring the thing back to shore, we put a new set of servers in, refresh it, send it back down, and then when we’re all done we bring it back up, we recycle it, and the idea is you leave the seabed as you found it. On the government side, there’s a lot of oversight, and the first thing to understand there is that, as I look at the data center that’s there now, the seawater we eject back into the ocean is about eight tenths of a degree Celsius warmer than the water that came in. It’s a very rapid jet, so it very quickly mixes with the other seawater. In our case, the first time we did this, the water a few meters downstream was only a few thousandths of a degree warmer.

Host: So, it dissipates very quickly.

Ben Cutler: Water takes an immense amount of energy to heat. If you took all of the energy used by all the data centers in the world and pushed all of it into the ocean, you’d raise the temperature a few millionths of a degree per year. So, in net, we don’t really worry about it. The place where we worry is this idea of local warming. And one of the things that’s nice about the ocean is that, because there are these persistent currents, we don’t have a buildup of temperature anywhere. So, this question of local heating really just comes down to making sure your density is modest, and then the impact is negligible. An efficient data center in the water actually has less impact on the oceans than an inefficient data center on land does.
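
As a back-of-envelope check on that “millionths of a degree” figure, here is a rough sketch. The ocean mass, seawater heat capacity, and annual data-center energy values below are approximate assumptions for illustration, not numbers from the interview.

```python
# Rough, order-of-magnitude sketch only; all constants are approximate assumptions.
OCEAN_MASS_KG = 1.4e21                  # approximate mass of the world's oceans
SEAWATER_HEAT_CAPACITY = 3990           # J per kg per degree C, approximate
DATACENTER_ENERGY_PER_YEAR = 1.4e18     # J, roughly 400 TWh/yr of global data-center use

delta_t_per_year = DATACENTER_ENERGY_PER_YEAR / (OCEAN_MASS_KG * SEAWATER_HEAT_CAPACITY)
print(f"Bulk ocean warming: about {delta_t_per_year:.1e} degrees C per year")
# Prints a value on the order of 1e-7, i.e. a fraction of a millionth of a degree,
# the same order of magnitude as the figure quoted above.
```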

Host: Let’s talk about latency for a second. One of your big drivers in putting these in the water, but near population centers, is so that data moves fairly quickly. Talk about the general problems of latency with data centers and how Natick is different.

Ben Cutler: So, there are some things you do where latency really doesn’t matter. But I think latency gets you in all sorts of ways, and sometimes in surprising ways. The thing to remember is, even if you’re just browsing the web, when a webpage gets painted, there’s all of this back-and-forth traffic. So, say I’ve got a data center that’s 1,000 kilometers away; that’s going to be about 10 milliseconds round trip per communication. But I might have a couple hundred of those just to paint one webpage, and now all of a sudden it takes me about 2 seconds to paint that webpage, whereas it would be almost instantaneous if that data center were nearby. And think about factories and automation, where I’ve got to control things: I need really tight bounds on latency to do that effectively. Or imagine a future where autonomous vehicles become real and they’re interacting with data centers for some aspect of their navigation or other critical functions. So, this notion of latency really matters in a lot of ways that will become, I think, more present as this idea of the intelligent edge grows over time.
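
The arithmetic behind that webpage example is simple to sketch. The speed-of-light-in-fiber figure and the helper names below are illustrative assumptions.

```python
SPEED_IN_FIBER_KM_PER_S = 200_000  # roughly two-thirds of c; assumed figure

def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation delay over fiber, ignoring processing time."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_S * 1000

def page_load_ms(distance_km: float, sequential_round_trips: int) -> float:
    """Lower bound on page-load time when round trips happen one after another."""
    return round_trip_ms(distance_km) * sequential_round_trips

print(round_trip_ms(1_000))       # ~10 ms per round trip at 1,000 km
print(page_load_ms(1_000, 200))   # ~2,000 ms, i.e. about 2 seconds
print(page_load_ms(50, 200))      # ~100 ms when the data center is close by
```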

Host: Right. And so, what’s Natick’s position there?

Ben Cutler: So, Natick’s benefit here is that more than half the world’s population lives within a couple hundred kilometers of the ocean. So, in some sense, you’re finding a way to put data centers very close to a good percentage of the population, and you’re doing it in a way that’s very low impact. We’re not taking land, because think about what it would cost to put a data center in San Francisco or New York City; it turns out land’s expensive around big cities. Imagine that. So, this is a way to go somewhere where we don’t have some of those high costs, potentially with this offshore renewable energy, and without, as we talked about before, having any impact on the water supply.

Host: So, it could solve a lot of problems all at once.

Ben Cutler: It could solve a lot of problems in this very, sort of, environmentally sustainable way, as well as, in some sense, adding these socially sustainable factors as well.

Host: Yeah. Talk a little bit about the phases of this project. I know there’s been more than one; you alluded to that a little bit earlier. But what have you done, stage-wise and phase-wise? What have you learned?

Ben Cutler: So, Phase 1 was a Proof of Concept: literally, we built a can, and that can had a single computer rack in it, and that rack only had 24 servers, which took up about one-third of the space of the rack. It was a standard, what we call, 42U rack, which reflects the size of the rack, fairly standard for data centers. The other two thirds were filled with what we call load trays. Think of them as big resistors that generate heat, like hairdryers. They’re actually used today to commission new data centers, to test the cooling system. In our case, we just wanted to generate heat. Could we put these things in the water? Could we cool it? What would that look like? What would be the thermal properties? So, that was a Proof of Concept just to see, could we do this? Could we just, sort of, understand the basics? Were our intuitions right about this? What sort of problems might we encounter? And just, you know, I hate to use the expression, but, you know, get our feet wet. Learning how to interact…

Host: You had to go there.

Ben Cutler: It is astonishing the number of expressions that relate to water that we use.

Host: Oh gosh, the puns are…

Ben Cutler: It’s tough to avoid. So, we just really wanted to get some sense of what it was like to work with the marine industry. Every company, and to some degree every industry, has ways in which it works. And so, this was really an opportunity for us to learn some of those and become informed before we got to this next stage that we’re at now, which is more of a prototype stage. So, the vessel we built this time is about the size of a shipping container, and that’s by intent, because then we’ve got something of a size that lets us use standard logistics to ship things around, whether on the back of a truck or on a container ship. Again, keeping with this idea that, if something like this is successful, we have to think about the economics of it. So, it’s got 12 racks this time. It’s got 864 servers. It’s got FPGAs, which is something we use for certain types of acceleration. And each of those 864 servers has 32 terabytes of disks. So, this is a substantial amount of capability. It’s actually located in the open ocean in realistic operating conditions. In fact, where we are, in the winter the waves will be up to 10 meters. We’re at 36 meters depth, so that means the water above us will vary between 26 and 46 meters deep. So, it’s a really robust test area. We want to understand, can this really work? And what sort of challenges might come up in this realistic operating environment?
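
For a sense of scale, the numbers quoted above multiply out as follows (a quick sketch; “raw” storage here ignores redundancy and formatting overhead):

```python
racks = 12
servers = 864
disk_tb_per_server = 32

print(servers // racks)                 # 72 servers per rack
print(servers * disk_tb_per_server)     # 27,648 TB of raw disk, roughly 27.6 PB
```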

Host: So, this is Phase 2 right now.

Ben Cutler: This is Phase 2. And so now we’re in the process of learning and collecting data from this. Just going through the process of designing and building it, we learned all sorts of interesting things. It turns out, when you’re building these things to go under the ocean, one source of stress cycling is just the waves going by. So, as you design these things, you have to think about how many waves go by over the lifetime, what the frequency of those waves is, what the amplitude of those waves is. This all impacts your design and what you need to do, based on where you’re going to put it and how long it will be there. So, we learned a whole bunch of stuff from this. And we expect everything will be great and grand over the next few years here, but we’ll obviously be watching, and we’ll be learning. If there is a next phase, it would be a pilot, and now we’re talking about building something at larger scale. So, it might be multiple vessels. There might be a different deployment technology than what we used this time, to get greater efficiency. I think those are things we’re starting to think about, but mostly, right now, we’ve got this great thing in the water and we’re starting to learn.

Host: Yeah. And you’re going to leave it alone for 5 years, right?

Ben Cutler: This thing will just be down there. Nothing will happen to it. There will be no maintenance until it’s time to retire the servers, which, in a commercial setting, might be every 5 years or longer. And then we’ll bring it back. So, it really is the idea of a lights-out thing: you put it there, it just does its thing, and then we go and pull it back later. In an actual commercial deployment, we’d probably be deeper than 36 meters. The reason we’re at 36 meters is that, it turns out, 40 meters is a safe depth for human divers to reach without a whole lot of special equipment, and we just wanted that flexibility in case we did need some sort of maintenance or some sort of help during this time. But in a real commercial deployment, we’d go deeper, and one of the reasons for that is just that it will be harder for people to get to it. So, people worry about physical security. We, in some sense, have a simpler challenge than a submarine, because a submarine is typically trying to hide from its adversaries. We’re not trying to hide. If we deploy these things, we’d always be within the coastal waters of a country and governed by the laws of that country. But we do also think about making this thing safe, and one of the safety aspects is not just the ability to detect when things are going on around you, but also to put it in a place where it’s not easy for people to go and mess with it.

Host: Who’s using this right now? I mean this is an actual test case, so, it’s a data center that somebody’s accessing. Is it an internal data center or what’s the deal on that?

Ben Cutler: So, this data center is actually on our global network. Right now, it’s being used by people internally. We have a number of different teams using it for their own production projects. One group working with it is an organization inside Microsoft called AI for Earth. We have video cameras, and one of the things they do is watch the different fish going by, and other types of much more bizarre creatures that we see, and characterize and count those, so we can kind of see how things evolve over time. One of the things we’re looking to do, potentially, is to work with other parties that do these more general assessments and provide some of those AI technologies to them for their general research of the marine environment: how, when you put different things in the water, that affects things, either positively or negatively. Not just what we’re doing, but other types of things that go in the water, which might be things as simple as cables or marine energy devices or other types of infrastructure.

Host: I would imagine, when you deploy something in a brand-new environment, that you have unintended consequences or unexpected results. Is there anything interesting that’s come out of this deployment that you’d like to share?

Ben Cutler: So, I think when people think of the ocean, they think it’s a really hostile and dangerous place to put things, because we’re all used to seeing big storms, hurricanes and everything that happens. And to be sure, right at that interface between land and water is a really dangerous place to be. But what you find is that, deep under the waves on the seabed, it’s a pretty quiet and calm place. So, one of the benefits we see is that even during things like 100-year hurricanes, you will hear, acoustically, what’s going on at the surface or near the land, waves crashing and all this stuff going on, but it’s pretty calm down there. The idea that we have this thing deep under the water that would be immune to these types of events is appealing. So, you can imagine this data center down there when something like that hits. The only connectivity back to land is fiber, and that fiber is largely glass with some insulating shell, so it might break off. But the data center will keep operating. Your data will still be safe, even though there might be problems on land. So, this diversity of risk is another thing that’s interesting to people when we talk about Natick.

Host: What about deployment sites? How have you gone about selecting where you put Project Natick and what do you think about other possibilities in the future?

Ben Cutler: So, for this Phase 2, we’re in Europe. And Europe, today, is the leader in offshore renewable energies. Twenty-nine of the thirty largest offshore windfarms are located in Europe. We’re deployed at the European Marine Energy Center in the Orkney Islands of Scotland. The grid up there is 100% renewable energy: a mix of solar and wind, as well as the offshore energies that people are testing at the European Marine Energy Center, or EMEC, namely tidal energy and wave energy. One of the things that’s nice about EMEC is that people are testing these devices, so, in the future, we have the option to go completely off the grid. It’s a 100% renewable grid, but we can go off and directly connect to one of those devices and test out this idea of co-location with renewable energies.

Host: Did you look at other sites and say, hey, this one’s the best?

Ben Cutler: We looked at a number of sites, both test sites for these offshore renewables as well as commercial sites, for example going into a commercial windfarm right off the bat. And we just decided that, at this research phase, we had better support and better capabilities at a site that was actually designed for that. One of the things is, as I mentioned, the waves there get very, very large in the winter. So, we wanted someplace with very aggressive waters, so that we know that if we survive in this space, we’ll be good pretty much anywhere we might choose to deploy.

Host: Like New York. If you can make it there…

Ben Cutler: Like New York, exactly.

Host: You can make it anywhere.

Ben Cutler: That’s right.

(music plays)

Host: What was your path to Microsoft Research?

Ben Cutler: So, my career… I would say that there’s been very little commonality in what I’ve done, but the one thing that has been common is this idea of taking things from early innovation to market introduction. A lot of my early career was in startup companies, either as a founder or as a principal. I was in supercomputers, computer storage, video conferencing, different types of semiconductors, and then I was actually here at Microsoft earlier, working in a group exploring new operating system technologies. After that, I went to DARPA, where I spent a few years working on different types of information technology. And then I came back here. And, truthfully, when I first heard about this idea that they were thinking about doing these underwater data centers, it just sounded like the dumbest idea to me. But you know, I was willing to go and, sort of, try and think it through: OK, on the surface it sounds ridiculous, but a lot of things start that way. And you have to be willing to go in, understand the economics, understand the science and the technology involved, and then draw some conclusion about whether you think it can actually go somewhere reasonable.

Host: As we close, Ben, I’m really interested in what kinds of people you have on your team, what kinds of people might be interested in working on Special Projects here. Who’s a good fit for a Special Projects research career?

Ben Cutler: I think we’re looking for people who are excited about the idea of doing something new and don’t have a fear of doing something new. In some sense, it’s a lot like people who’d go into a startup. What I mean by that is, you’re taking a lot more risk: I’m not in a large organization, I have to figure a lot of things out myself, I don’t have a team that will know all these things, and a lot of things may fall on the floor just because we don’t have enough people to get everything done. It’s kind of like driving down the highway lashed to the front bumper of the car: you’re fully exposed to all the risk and all the challenges of what you’re doing, and you’re wide open. There’s no end of things to do, and you have to figure out what’s important, what to prioritize, because not everything can get done. But you have the flexibility to understand that, even though I can’t get everything done, I’m going to pick and choose the things that are most important and really drive in new directions without a whole lot of constraints. So, I think that’s kind of what we look for. I have only two people who actually directly report to me on this project. That’s the team. But then I have other people who are core members, who worked on it, who report to other people, and across the whole company, more than two hundred people touched this Phase 2, in ways large and small, everything from helping us design the data center to refurbishing the servers that went into it. So, it’s really a “One Microsoft” effort. And I think there are always opportunities to engage, not just by being on a team, but by interacting and providing your expertise and your knowledge base to help us be successful, because only in that way can we take these big leaps. In some sense, we’re trying to make sure that Microsoft Research is staying true to this idea of pursuing new things, not just five years out in known fields, but looking at these new fields, because the world is changing. And so, we’re always looking for people who are open to these new ideas and, frankly, are willing to bring new ideas with them as to where they think we should go and why. That’s how we as a company grow and see new markets and are successful.

(music plays)

Host: Ben Cutler, it’s been a pleasure. Thanks for coming on the podcast today.

Ben Cutler: My pleasure as well.

To learn more about Ben Cutler, Project Natick, and the future of submersible data centers, visit natick.research.microsoft.com.

Skip User Research Unless You’re Doing It Right — Seriously

Is your research timeless? It’s time to put disposable research behind us

Focus on creating timeless research. (Photo: Aron on Unsplash)

“We need to ship soon. How quickly can you get us user feedback?”

What user researcher hasn’t heard a question like that? We implement new tools and leaner processes, but try as we might, we inevitably meet the terminal velocity of our user research — the point at which it cannot be completed any faster while still maintaining its rigor and validity.

And, you know what? That’s okay! While the need for speed is valuable in some contexts, we also realize that if an insight we uncover is only useful in one place and at one time, it becomes disposable. Our goal should never be disposable research. We want timeless research.

Speed has its place

Now, don’t get me wrong. I get it. I live in this world, too. First to market, first to patent, first to copyright obviously requires an awareness of speed. Speed of delivery can also be the actual mechanism by which you get rapid feedback from customers.

I recently participated in a Global ResOps workshop. One thing I heard loud and clear was the struggle for our discipline to connect into design and engineering cycles. There were questions about how to address the “unreasonable expectations” of what we can do in short time frames. I also heard that researchers struggle with long and slow timelines: Anyone ever had a brilliant, generative insight ignored because “We can’t put that into the product for another 6 months”?

The good news is that there are methodologies such as “Lean” and “Agile” that can help us. Our goal as researchers is to use knowledge to develop customer-focused solutions. I personally love that these methodologies, when implemented fully, incorporate customers as core constituents in collaborative and iterative development processes.

In fact, my team has created an entire usability and experimentation engine using “Lean” and “Agile” methods. However, this team recognizes that letting speed dictate user research is a huge risk. If you cut corners on quality, customer involvement, and adaptive planning, your research could become disposable.

Do research right, or don’t do it at all

I know, that’s a bold statement. But here’s why: When time constraints force us to drop the rigor and process that incorporates customer feedback, the user research you conduct loses its validity and ultimately its value.

The data we gather out of exercises that over-index on speed are decontextualized and disconnected from other relevant insights we’ve collected over time and across studies. We need to pause and question whether this one-off research adds real value and contributes to an organization’s growing understanding of customers when we know it may skip steps critical to identifying insights that transcend time and context.

User research that takes time to get right has value beyond the moment for which it was intended. I’m betting you sometimes forgo conducting research if you think your stakeholders believe it’s too slow. But, if your research uncovered an insight after v1 shipped, you could still leverage that insight on v1+x.

For example, think of the last time a product team asked you, “We’re shipping v1 next week. Can you figure out if our customers want or need this?” As a researcher, you know you need more time to answer this question in a valid way. So, do you skip this research? No. Do you rush through your research, compromising its rigor? No. You investigate anyway and apply your learnings to v2.

To help keep track of these insights, we should build systems that capture our knowledge and enable us to resurface it across development cycles and projects. Imagine this: “Hey Judy, remember that thing we learned 6 months ago? Research just reminded me that it is applicable in our next launch!”

That’s what we’re looking for: timeless user insights that help our product teams again and again and contribute to a curated body of knowledge about our customers’ needs, beliefs, and behaviors. Ideally, we house these insights in databases, so they can be accessed and retrieved easily by anyone for future use (but that’s another story for another time). If we only focus on speed, we lose sight of that goal.
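
What might such a system look like in practice? Below is a minimal, hypothetical sketch of an insight record that can be tagged and resurfaced later; the schema and the search helper are illustrative assumptions, not a description of any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Insight:
    """A single reusable research insight (hypothetical schema)."""
    summary: str
    evidence: str                       # study, method, sample size, and so on
    recorded_on: date
    tags: List[str] = field(default_factory=list)

def find_insights(repository: List[Insight], tag: str) -> List[Insight]:
    """Resurface past insights relevant to a new project by tag."""
    return [insight for insight in repository if tag in insight.tags]

repository = [
    Insight("Participants abandoned setup when asked to create an account first",
            "Usability study, 12 participants", date(2018, 6, 15), ["onboarding"]),
]
print(find_insights(repository, "onboarding")[0].summary)
```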

Creating timeless research

Here’s my point: we’ll always have to deal with requests to make our research faster, but once you or your user research team has achieved terminal velocity with any given method, stop trying to speed it up. Instead, focus on capturing each insight, leveling it up to organizational knowledge, and applying that learning in the future. Yes, that means when an important insight doesn’t make v1, go ahead and bring it back up to apply to v2. Timeless research is really about building long-term organizational knowledge and curating what you’ve already learned.

Disposable research is the stuff you throw away after you ship. To be truly lean, get rid of that wasteful process. Instead, focus your research team’s time on making connections between past insights, then reusing and remixing them in new contexts. That way, you’re consistently providing timeless research that overcomes the need for speed.

Have you ever felt pressure to bypass good research for the sake of speed? Tell me about it in the comments, or tweet @insightsmunko.


To stay in the know with what’s new at Microsoft Research + Insight, follow us on Twitter and Facebook. And if you are interested in becoming a user researcher at Microsoft, head over to careers.microsoft.com.

WhatsApp vulnerabilities let hackers alter messages

Attackers are able to intercept and manipulate messages in the encrypted messaging app WhatsApp.

According to new research from Check Point, there are WhatsApp vulnerabilities that enable attackers to manipulate and modify messages in both public and private conversations. This type of manipulation could make it easy to continue the spread of misinformation.

WhatsApp, which is owned by Facebook, has over 1.5 billion users who send approximately 65 billion messages daily. With a user base that large, the Check Point researchers warned of online scams, rumors and the spread of fake news, and WhatsApp has already been used for a number of these types of scams.

The new WhatsApp vulnerabilities that Check Point outlined in its blog post involve social engineering techniques that can be used to deceive users in three ways: by changing the identity of the sender of a message in a group, by changing the text of someone else’s reply, and by sending a private message to a group member whose reply then becomes visible to the whole group.

“We believe these vulnerabilities to be of the utmost importance and require attention,” the researchers wrote.

The WhatsApp vulnerabilities have to do with the communications between the mobile version of the application and the desktop version. Check Point was able to discover them by decrypting the communications between the two versions.

“By decrypting the WhatsApp communication, we were able to see all the parameters that are actually sent between the mobile version of WhatsApp and the Web version. This allowed us to then be able to manipulate them and start looking for security issues,” the researchers wrote in their blog post detailing the WhatsApp vulnerabilities.

In the first attack outlined by Check Point’s Dikla Barda, Roman Zaikin and Oded Vanunu, hackers can change the identity of a sender in a group message, even if they are not part of the group. The researchers were also able to change the text of the message to something completely different.

In the second attack, a hacker can change someone’s reply to a message. In doing this, “it would be possible to incriminate a person, or close a fraudulent deal,” the Check Point team explained.

In the final attack disclosed, “it is possible to send a message in a group chat that only a specific person will see, though if he replies to this message, the entire group will see his reply.” This means that the person who responds could reveal information to the group that he did not intend to.

Check Point said it disclosed these vulnerabilities to WhatsApp before making them public.

In other news

  • Computers at the office of PGA America have reportedly been infected with ransomware. According to a report from Golfweek, employees of the golf organization noticed the infection earlier this week when a ransom note appeared on their screens when they tried to access the affected files. “Your network has been penetrated. All files on each host in the network have been encrypted with a strong algorythm (sic),” the note said, according to Golfweek. The files contained information for the PGA Championship at Bellerive and the Ryder Cup in France, including “extensive” promotional materials. According to the Golfweek report, no specific ransom amount was demanded, though the hacker included a bitcoin wallet number.
  • Microsoft may be adding a new security feature to Windows 10 called “InPrivate Desktop.” According to a report from Bleeping Computer, the feature acts like a “throwaway sandbox for secure, one-time execution of untrusted software” and will only be available on Windows 10 Enterprise. Bleeping Computer became aware of this previously undisclosed feature through a Windows 10 Insider Feedback Hub quest and said that it will enable “administrators to run untrusted executables in a secure sandbox without fear that it can make any changes to the operating system or system’s files.” The Feedback Hub said it is an “in-box, speedy VM that is recycled when you close” the application, according to the report. There are no details yet about when this feature may be rolled out.
  • Comcast Xfinity reportedly exposed personal data of over 26.5 million of its customers. Security researcher Ryan Stevenson discovered two previously unreported vulnerabilities in Comcast Xfinity’s customer portals and through those vulnerabilities, partial home addresses and Social Security numbers of Comcast customers were exposed. The first vulnerability could be exploited by refreshing an in-home authentication page that lets users pay their bills without signing into their accounts. Through this, hackers could have figured out the customer’s IP address and partial home address. The second vulnerability was on a sign-up page for Comcast’s Authorized Dealer and revealed the last four digits of a customer’s SSN. There is no evidence yet that the information was actually stolen, and Comcast patched the vulnerabilities after Stevenson reported them.