Tag Archives: learning

AWS, NFL machine learning partnership looks at player safety

The NFL will use AWS’ AI and machine learning products and services to better simulate and predict player injuries, with the goal of ultimately improving player health and safety.

The new NFL machine learning and AWS partnership, announced during a press event Thursday with AWS CEO Andy Jassy and NFL Commissioner Roger Goodell at AWS re:Invent 2019, will change the game of football, Goodell said.

“It will be changing the way it’s played, it will [change] the way it’s coached, the way we prepare athletes for the game,” he said.

The NFL machine learning journey

The partnership builds off Next Gen Stats, an existing NFL and AWS agreement that has helped the NFL capture and process data on its players. That partnership, revealed back in 2017, introduced new sensors on player equipment and the football to capture real-time location, speed and acceleration data.

That data is then fed into AWS data analytics and machine learning tools to provide fans, broadcasters and NFL Clubs with live and on-screen stats and predictions, including expected catch rates and pass completion probabilities.
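The article doesn’t detail the pipeline, but a rough sketch shows how positional tracking samples of the kind described above could be turned into speed and acceleration stats. The sample data, units and schema below are invented for illustration and are not the actual Next Gen Stats feed.

```python
import numpy as np

# Hypothetical tracking samples for one player: (time s, x yards, y yards).
samples = np.array([
    [0.0, 10.0, 20.0],
    [0.1, 10.8, 20.3],
    [0.2, 11.7, 20.7],
    [0.3, 12.7, 21.2],
])

t, x, y = samples[:, 0], samples[:, 1], samples[:, 2]
dt = np.diff(t)
speed = np.hypot(np.diff(x), np.diff(y)) / dt   # yards per second
accel = np.diff(speed) / dt[1:]                 # yards per second^2

print(f"peak speed: {speed.max():.2f} yd/s, peak accel: {accel.max():.2f} yd/s^2")
```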

Drawing on that data, as well as on other sources, including video feeds, equipment choice, playing surfaces, player injury information, play type, impact type and environmental factors, the new NFL machine learning and AWS partnership will create digital twins of players.

AWS CEO Andy Jassy, left, and NFL Commissioner Roger Goodell announced a new AI and machine learning partnership at AWS re:Invent 2019.

The NFL began the project with a collection of different data sets from which to gather information, said Jeff Crandall, chairman of the NFL Engineering Committee, during the press event.

It wasn’t just passing data, but also “the equipment that players were wearing, the frequency of those impacts, the speeds the players were traveling, the angles that they hit one another,” he continued.

Typically used in manufacturing to predict machine outputs and potential breakdowns, a digital twin is essentially a complex virtual replica of a machine or person formed out of a host of real-time and historical data. Using machine learning and predictive analytics, a digital twin can be fed into countless virtual scenarios, enabling engineers and data scientists to see how its real-life counterpart would react.

The new AWS and NFL partnership will create digital athletes, or digital twins of a scalable sampling of players, that can be fed into infinite scenarios without risking the health and safety of real players. Data collected from these scenarios is expected to provide insights into changes to game rules, player equipment and other factors that could make football a safer game.
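As a purely illustrative sketch of the digital twin idea, the toy model below runs a hypothetical digital athlete through many simulated collision scenarios and averages the predicted risk. Every name, parameter and formula here is invented; the real partnership would use learned models over the data sources listed above.

```python
import random
from dataclasses import dataclass

@dataclass
class DigitalAthlete:
    # Hypothetical twin state; the real feature set is far richer.
    mass_kg: float
    top_speed_ms: float
    helmet_model: str

def injury_risk(athlete: DigitalAthlete, impact_speed_ms: float) -> float:
    # Toy stand-in for a learned risk model: risk grows with kinetic energy.
    energy = 0.5 * athlete.mass_kg * impact_speed_ms ** 2
    return min(1.0, energy / 50_000)

def simulate(athlete: DigitalAthlete, n_scenarios: int = 10_000) -> float:
    # Feed the twin into many virtual scenarios; no real player is at risk.
    risks = [injury_risk(athlete, random.uniform(0, athlete.top_speed_ms))
             for _ in range(n_scenarios)]
    return sum(risks) / n_scenarios

player = DigitalAthlete(mass_kg=110.0, top_speed_ms=9.5, helmet_model="X1")
print(f"mean simulated injury risk: {simulate(player):.3f}")
```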

“For us, what we see the power here is to be able to take the data that we’ve created over the last decade or so” and use it, Goodell said. “I think the possibilities are enormous.”

Partnership’s latest move to enhance safety


New research in recent years has highlighted the extreme health risks of playing football. In 2017, researchers from the VA Boston Healthcare System and the Boston University School of Medicine published a study in the Journal of the American Medical Association that indicated football players are at a high risk for developing long-term neurological conditions.

The study, which did not include a control group, looked at the brains of high school, college and professional-level football players. Of the 111 NFL-level football players the researchers looked at, 110 of them had some form of degenerative brain disease.

The new partnership is just one of the changes the NFL has made over the last few years in an attempt to make football safer for its players. Other recent efforts include new helmet rules, and a recent $3 million challenge to create safer helmets.

The AWS and NFL partnership “really has a chance to transform player health and safety,” Jassy said.

AWS re:Invent, the annual flagship conference of AWS, was held this week in Las Vegas.


How to achieve explainability in AI models

When machine learning models deliver problematic results, it can often happen in ways that humans can’t make sense of, and this becomes dangerous when the model’s limitations aren’t understood, particularly for high-stakes decisions. Without straightforward and simple tools that highlight explainability in AI models, organizations will continue to struggle in implementing AI algorithms. Explainable AI refers to the process of making it easier for humans to understand how a given model generates the results it does and planning for cases when the results should be second-guessed.

AI developers need to incorporate explainability techniques into their workflows as part of their overall modeling operations. AI explainability can refer to the process of creating algorithms for teasing apart how black box models deliver results or the process of translating these results to different types of people. Data science managers working on explainable AI should keep tabs on the data used in models, strike a balance between accuracy and explainability, and focus on the end user.

Opening the black box

Traditional rule-based AI systems included explainability as part of the model, since humans typically handcrafted the rules that map inputs to outputs. But deep learning techniques using semi-autonomous neural network models can’t show how a model’s results map to an intended goal.

Researchers are working to build learning algorithms that generate explainable AI systems from data. Currently, however, most of the dominant learning algorithms do not yield interpretable AI systems, said Ankur Taly, head of data science at Fiddler Labs, an explainable AI tools provider.

“This results in black box ML techniques, which may generate accurate AI systems, but it’s harder to trust them since we don’t know how these systems’ outputs are generated,” he said. 

AI explainability often describes post-hoc processes that attempt to explain the behavior of AI systems, rather than alter their structure. Other machine learning model properties like accuracy are straightforward to measure, but there are no corresponding simple metrics for explainability. Thus, the quality of an explanation or interpretation of an AI system needs to be assessed in an application-specific manner. It’s also important for practitioners to understand the assumptions and limitations of the techniques they use for implementing explainability.
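One widely used post-hoc technique with well-understood assumptions is permutation importance: shuffle one feature at a time and measure how much the model’s score degrades. A minimal scikit-learn sketch, with the dataset and model chosen arbitrarily for illustration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Shuffle each feature on held-out data and record the drop in accuracy.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{X.columns[idx]}: {result.importances_mean[idx]:.4f}")
```

Note the assumption baked in: permuting a feature breaks its correlations with other features, so correlated features can share or hide importance, which is exactly the kind of limitation practitioners should understand before trusting the output.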

“While it is better to have some transparency rather than none, we’ve seen teams fool themselves into a false sense of security by wiring an off-the-shelf technique without understanding how the technique works,” Taly said. 

Start with the data

The results of a machine learning model could be explained by the training data itself, or how a neural network interprets a dataset. Machine learning models often start with data labeled by humans. Data scientists can sometimes explain the way a model is behaving by looking at the data it was trained on.

“What a particular neural network derives from a dataset are patterns that it finds that may or may not be obvious to humans,” said Aaron Edell, director of applied AI at AI platform Veritone.

But it can be hard to understand what good data looks like. Biased training data can show up in a variety of ways. A machine learning model trained to identify sheep might learn only from pictures of farms, causing the model to miss sheep in other settings or to misinterpret white clouds in farm pictures as sheep. Facial recognition software can be trained on a company’s employee photos, but if those faces are mostly male or white, the data is biased.

One good practice is to train machine learning models on data that is indistinguishable from the data the model will be expected to run on. For example, a face recognition model that identifies how long Jennifer Aniston appears in every episode of Friends should be trained on frames of actual episodes rather than Google image search results for ‘Jennifer Aniston.’ In a similar vein, it’s OK to train models on publicly available datasets, but generic pre-trained models as a service will be harder to explain and change if necessary.
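One inexpensive way to check that training data resembles the data a model will run on is to compare feature distributions statistically. A sketch using a two-sample Kolmogorov-Smirnov test; the synthetic arrays stand in for a real feature column in training versus production:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)
prod_feature = rng.normal(loc=0.4, scale=1.0, size=5_000)  # shifted: drift

# A low p-value means the two samples are unlikely to share a distribution.
stat, p_value = ks_2samp(train_feature, prod_feature)
if p_value < 0.01:
    print(f"distributions differ (KS={stat:.3f}); training data may not match production")
```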

Balancing explainability, accuracy and risk

The real problem with implementing explainability in AI is that there are major trade-offs between accuracy, transparency and risk in different types of AI models, said Matthew Nolan, senior director of decision sciences at Pegasystems. More opaque models may be more accurate, but fail the explainability test. Other types of models like decision trees and Bayesian networks are considered more transparent but are less powerful and complex.

“These models are critical today as businesses deal with regulations such as GDPR that require explainability in AI-based systems, but this sometimes will sacrifice performance,” said Nolan.

Focusing on transparency can cost a business, but turning to more opaque models can leave a model unchecked and might expose the consumer, customer and the business to additional risks or breaches.

To address this gap, platform vendors are starting to embed transparency settings into their AI tool sets. This can make it easier for companies to adjust the acceptable opaqueness or transparency thresholds used in their AI models. It gives enterprises the control to tune models to their needs or to corporate governance policy, so they can manage risk, maintain regulatory compliance and ensure customers a differentiated experience in a responsible way.

Data scientists should also identify when the complexity of new models is getting in the way of explainability. Yifei Huang, data science manager at sales engagement platform Outreach, said there are often simpler models available for attaining the same performance, but machine learning practitioners have a tendency toward fancier, more advanced models.
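A quick benchmark makes that judgment concrete: fit a transparent model and an opaque one on the same data and compare cross-validated scores before paying the explainability cost. A sketch with arbitrarily chosen models and data:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# A shallow decision tree is easy to inspect; boosted trees are more opaque.
candidates = [
    ("decision tree (transparent)", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("gradient boosting (opaque)", GradientBoostingClassifier(random_state=0)),
]

for name, model in candidates:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {score:.3f}")
```

If the gap is small, the transparent model may be the better trade, especially under regulations that demand explanations.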

Focus on the user

Explainability means different things to a highly skilled data scientist compared to a call center worker who may need to make decisions based on an explanation. The task of implementing explainable AI is not just to foster trust in explanations but also to help end users make decisions, said Ankkur Teredesai, CTO and co-founder at KenSci, an AI healthcare platform.

Often data scientists make the mistake of thinking about explanations from the perspective of a computer scientist, when the end user is a domain expert who may need just enough information to make a decision. For a model that predicts the risk of a patient being readmitted, a physician may want an explanation of the underlying medical reasons, while a discharge planner may want to know the likelihood of readmission to plan accordingly.

Teredesai said there is still no general guideline for explainability, particularly for different types of users. It’s also challenging to integrate these explanations into machine learning and end-user workflows. End users typically need explanations framed as possible actions to take based on a prediction, rather than just as reasons, and this requires striking the right balance between prediction and explanation fidelity.

There are a variety of tools for implementing explainability on top of machine learning models which generate visualizations and technical descriptions, but these can be difficult for end users to understand, said Jen Underwood, vice president of product management at Aible, an automated machine learning platform. Supplementing visualizations with natural language explanations is a way to partially bridge the data science literacy gap.

Another good practice is to directly use humans in the loop to evaluate your explanations to see if they make sense to a human, said Daniel Fagnan, director of applied science on the Zillow Offers Analytics team. This can help lead to more accurate models through key improvements including model selection and feature engineering.
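A minimal sketch of the natural-language idea: turn a model’s top feature importances into a sentence a non-specialist can act on. The feature names and weights below are invented for illustration:

```python
def narrate_importances(feature_importances, top_k=3):
    # Rank features by importance and phrase the top ones in plain language.
    ranked = sorted(feature_importances.items(), key=lambda kv: kv[1], reverse=True)
    parts = [f"{name} (weight {score:.0%})" for name, score in ranked[:top_k]]
    return "This prediction was driven mainly by: " + ", ".join(parts) + "."

print(narrate_importances({
    "prior admissions": 0.42,
    "medication count": 0.27,
    "age": 0.13,
    "distance to clinic": 0.05,
}))
```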

KPIs for AI risks

Enterprises should consider the specific reasons that explainable AI is important when deciding how to measure explainability and accessibility. Teams should first and foremost establish a set of criteria for key AI risks, including robustness, data privacy, bias, fairness, explainability and compliance, said Dr. Joydeep Ghosh, chief scientific officer at AI vendor CognitiveScale. It’s also useful to generate appropriate metrics for key stakeholders relevant to their needs.

External organizations like AI Global can help establish measurement targets that determine acceptable operating values. AI Global is a nonprofit organization that has established the AI Trust Index, a scoring benchmark for explainable AI that works like a FICO score. This enables firms not only to establish their own best practices, but also to compare themselves against industry benchmarks.


Vendors are starting to automate this process with tools for automatically scoring, measuring and reporting on risk factors across the AI operations lifecycle based on the AI Trust Index. Although the tools for explainable AI are getting better, the technology is at an early research stage with proof-of-concept prototypes, cautioned Mark Stefik, a research fellow at PARC, a Xerox Company. There are substantial technology risks and gaps in machine learning and in AI explanations, depending on the application.

“When someone offers you a silver bullet explainable AI technology or solution, check whether you can have a common-grounded conversation with the AI that goes deep and scales to the needs of the application,” Stefik said.


Salesforce Trailhead app makes learning more convenient

SAN FRANCISCO — Salesforce customers see the value in the Trailhead learning platform and its new mobile app.

Trailhead Go for iOS is one of two new mobile apps that Salesforce announced here at Dreamforce 2019. Trailhead Go is a mobile extension of Trailhead, Salesforce’s free customer success learning platform that enables Salesforce users and nonusers alike to follow different paths to learn Salesforce skills. It now also offers Amazon Partner Connect, with content for learning AWS and how to build Amazon Alexa skills. By the end of the year, Trailhead plans to roll out live and on-demand training videos.

Salesforce provides customer success tools to users before they even become customers. For most businesses, this model is flipped, providing these tools to users after they sign contracts, said Gerry Murray, a research director at IDC.

“It’s not only about how the product works, it’s about teaching the line-of-business people to elevate their skills or further their careers in and out of their companies,” Murray said. “Trailhead Go makes it all that more convenient.”

Making education accessible

A skills gap costs companies $1.3 trillion each year, said Sarah Franklin, general manager of Trailhead, in a keynote. While many workers think they can fill that gap with education, it has become more and more inaccessible. Over the last 20 years, student tuition has increased by 200%, and student debt has increased by 163%.

Anyone who has access to the Trailhead Go app can learn, said Ray Wang, principal analyst and founder at Constellation Research.

“You don’t have to go to school; you don’t need a computer; you just need a phone,” he said.

Customers see benefits

This personalized homepage of the Trailhead Go app shows what trails a user is working on with a quick navigation bar at the bottom.

Supermums, based in London, equips moms with Salesforce skills through a combination of training, mentoring, work experience and job search support to get them into the Salesforce ecosystem. Trainees go through a customized six-month program where they earn 50 to 100 Trailhead badges. Trainees can benefit from the Trailhead app because they’ll be able to learn on the go, making it easier to fit into their schedules, said Heather Black, a certified Salesforce administrator and CEO of Supermums.

“[Trailhead Go] will help me complete more trails and fit it into my life while I’m busy supporting a team and juggling kids,” she said. “Trailhead Go makes this accessible to more people.”

Trailhead has also branched out beyond technical skills and into functional skills, Black said.

“It helps you develop as a person, as well as help you be successful in a Salesforce career,” she said.

Trailhead is great for helping learn the basics when people are entering the CRM world, said Sayantani Mitra, a data scientist at Goby Inc., a company that specializes in accounts payable automation.

“Read them, learn them, ask the community, ask people questions, do them multiple times,” Mitra said.


But just getting a Salesforce certification won’t get someone a job, Mitra said. They have to know what they’re doing.

“The best way to learn anything is practice, practice and practice more,” Mitra said.

Mitra plans to use the Trailhead Go app particularly on long-haul flights.

“When I go home to India … you cannot watch movies for 20 hours or sleep for 20 hours; you need something more,” she said.

Trailhead Go is generally available now for free on the Apple App Store.


SwiftStack 7 storage upgrade targets AI, machine learning use cases

SwiftStack turned its focus to artificial intelligence, machine learning and big data analytics with a major update to its object- and file-based storage and data management software.

The San Francisco software vendor’s roots lie in the storage, backup and archive of massive amounts of unstructured data on commodity servers running a commercially supported version of OpenStack Swift. But SwiftStack has steadily expanded its reach over the last eight years, and its 7.0 update takes aim at the new scale-out storage and data management architecture the company claims is necessary for AI, machine learning and analytics workloads.

SwiftStack said it worked with customers to design clusters that scale linearly to handle multiple petabytes of data and support throughput of more than 100 GB per second. That allows it to handle workloads such as autonomous vehicle applications that feed data into GPU-based servers.

Marc Staimer, president of Dragon Slayer Consulting, said throughput of 100 GB per second is “really fast” for any type of storage and “incredible” for an object-based system. He said the fastest NVMe system tests at 120 GB per second, but it can scale only to about a petabyte.

“It’s not big enough, and NVMe flash is extremely costly. That doesn’t fit the AI [or machine learning] market,” Staimer said.

This is the second object storage product launched this week with speed not normally associated with object storage. NetApp unveiled an all-flash StorageGrid array Tuesday at its Insight user conference.

Staimer said SwiftStack’s high-throughput “parallel object system” would put the company into competition with parallel file system vendors such as DataDirect Networks, IBM Spectrum Scale and Panasas, but at a much lower cost.

New ProxyFS Edge

SwiftStack plans to introduce a new ProxyFS Edge containerized software component next year to give remote applications a local file system mount for data, rather than having to connect through a network file-serving protocol such as NFS or SMB. SwiftStack spent about 18 months creating a new API and software stack to extend its ProxyFS to the edge.

Founder and chief product officer Joe Arnold said SwiftStack wanted to utilize the scale-out nature of its storage back end and enable a high number of concurrent connections to go in and out of the system to send data. ProxyFS Edge will allow each cluster node to be relatively stateless and cache data at the edge to minimize latency and improve performance.

SwiftStack 7 will also add 1space File Connector software in November to enable customers that build applications using the S3 or OpenStack Swift object API to access data in their existing file systems. The new File Connector is an extension to the 1space technology that SwiftStack introduced in 2018 to ease data access, migration and searches across public and private clouds. Customers will be able to apply 1space policies to file data to move and protect it.

Arnold said the 1space File Connector could be especially helpful for media companies and customers building software-as-a-service applications that are transitioning from NAS systems to object-based storage.

“Most sources of data produce files today and the ability to store files in object storage, with its greater scalability and cost value, makes the [product] more valuable,” said Randy Kerns, a senior strategist and analyst at Evaluator Group.

Kerns added that SwiftStack’s focus on the developing AI area is a good move. “They have been associated with OpenStack, and that is not perceived to be a positive and colors its use in larger enterprise markets,” he said.

AI architecture

A new SwiftStack AI architecture white paper offers guidance to customers building out systems that use popular AI, machine learning and deep learning frameworks, GPU servers, 100 Gigabit Ethernet networking, and SwiftStack storage software.

“They’ve had a fair amount of success partnering with Nvidia on a lot of the machine learning projects, and their software has always been pretty good at performance — almost like a best-kept secret — especially at scale, with parallel I/O,” said George Crump, president and founder of Storage Switzerland. “The ability to ratchet performance up another level and get the 100 GBs of bandwidth at scale fits perfectly into the machine learning model where you’ve got a lot of nodes and you’re trying to drive a lot of data to the GPUs.”

SwiftStack noted distinct differences between the architectural approaches that customers take with archive use cases versus newer AI or machine learning workloads. An archive customer might use 4U or 5U servers, each equipped with 60 to 90 drives, and 10 Gigabit Ethernet networking. By contrast, one machine learning client clustered a larger number of lower-horsepower 1U servers, each with fewer drives and a 100 Gigabit Ethernet network interface card, for high bandwidth, Arnold said.

An optional new SwiftStack Professional Remote Operations (PRO) paid service is now available to help customers monitor and manage SwiftStack production clusters. SwiftStack PRO combines software and professional services.


Countdown to Microsoft Global Learning Connection 2019: Two weeks to go—join us on Nov 5-6 to celebrate global learning and open students’ hearts and minds | Microsoft EDU

The Microsoft Global Learning Connection (formerly Skype-a-Thon) event is almost here. Thousands of educators from more than 110 countries are preparing to connect their students with experts and classrooms around the world to share stories and cultural traditions, play games, and collaborate on projects. The goal is to empower young people to become more engaged global citizens and expand their horizons.

Our global community will count the virtual miles traveled after each connection. Ultimately, these will all contribute to our global goal of traveling 17 million virtual miles and connecting nearly a half-million students via Skype, Teams and Flipgrid.

This 48-hour annual event is a true celebration of the power of global learning and an opportunity to shift perspectives and foster greater empathy and compassion for our planet and each other. If you have arranged a connection, make sure to share your plans with us on social @SkypeClassroom with #MSFTGlobalConnect and #MicrosoftEDU.

And if you haven’t arranged a connection for the two days of the event, there is still time to join us.

Head to msftglobalclassroom.com to learn more about the event. We hope you will join us to connect and inspire your students on November 5 and 6.

To help you get started and plan your participation, we have gathered below all the necessary resources:

  • Download a step-by-step activity plan to help you organize your connections for the two-day event.
  • Access the teacher toolkit, which is full of resources for you and your students. This includes maps, stickers, digital passports, activity sheets, a letter to parents and more.
  • Are you interested in making the Global Learning Connection the starting point for an event at your school or getting ideas on how to tie the event with a global cause? Check out educators’ tips here.
  • Find out how to schedule connections via Skype, Teams and Flipgrid here.
  • Explore the event’s social toolkit and download ready-made templates to share your participation on social channels with our global community @SkypeClassroom with #MSFTGlobalConnect #MicrosoftEDU.

Happy Traveling!



Azure Sentinel—the cloud-native SIEM that empowers defenders is now generally available

Machine learning enhanced with artificial intelligence (AI) holds great promise in addressing many of the global cyber challenges we see today. These technologies give our cyber defenders the ability to identify, detect, and block malware almost instantaneously. And together they give security admins the ability to deconflict tasks, separating the signal from the noise, allowing them to prioritize the most critical tasks. It is why today, I’m pleased to announce that Azure Sentinel, a cloud-native SIEM that provides intelligent security analytics at cloud scale for enterprises of all sizes and workloads, is now generally available.

Our goal has remained the same since we first launched Microsoft Azure Sentinel in February: empower security operations teams to help enhance the security posture of our customers. Traditional Security Information and Event Management (SIEM) solutions have not kept pace with digital change. I commonly hear from customers that they’re spending more time on deployment and maintenance of SIEM solutions, which leaves them unable to properly handle the volume of data or the agility of adversaries.

Recent research tells us that 70 percent of organizations continue to anchor their security analytics and operations with SIEM systems,1 and 82 percent are committed to moving large volumes of applications and workloads to the public cloud.2 Security analytics and operations technologies must lean in and help security analysts deal with the complexity, pace, and scale of their responsibilities. To accomplish this, 65 percent of organizations are leveraging new technologies for process automation/orchestration, while 51 percent are adopting security analytics tools featuring machine learning algorithms.3 This is exactly why we developed Azure Sentinel: a SIEM re-invented in the cloud to address the modern challenges of security analytics.

Learning together

When we kicked off the public preview for Azure Sentinel, we were excited to learn and gain insight into the unique ways Azure Sentinel was helping organizations and defenders on a daily basis. We worked with our partners all along the way, listening, learning, and fine-tuning as we went. With feedback from 12,000 customers and more than two petabytes of data analyzed, we were able to examine and dive deep into a large, complex, and diverse set of data, all of which had one thing in common: a need to empower defenders to be more nimble and efficient when it comes to cybersecurity.

Our work with RapidDeploy offers one compelling example of how Azure Sentinel is accomplishing this complex task. RapidDeploy creates cloud-based dispatch systems that help first responders act quickly to protect the public. There’s a lot at stake, and the company’s cloud-native platform must be secure against an array of serious cyberthreats. So when RapidDeploy implemented a SIEM system, it chose Azure Sentinel, one of the world’s first cloud-native SIEMs.

Microsoft recently sat down with Alex Kreilein, Chief Information Security Officer at RapidDeploy. Here’s what he shared: “We build a platform that helps save lives. It does that by reducing incident response times and improving first responder safety by increasing their situational awareness.”

Now RapidDeploy uses the complete visibility, automated responses, fast deployment, and low total cost of ownership in Azure Sentinel to help it safeguard public safety systems. “With many SIEMs, deployment can take months,” says Kreilein. “Deploying Azure Sentinel took us minutes—we just clicked the deployment button and we were done.”

Learn even more about our work with RapidDeploy by checking out the full story.

Another great example of a company finding results with Azure Sentinel is ASOS. As one of the world’s largest online fashion retailers, ASOS knows they’re a prime target for cybercrime. The company has a large security function spread across five teams and two sites—but in the past, it was difficult for ASOS to gain a comprehensive view of cyberthreat activity. Now, using Azure Sentinel, ASOS has created a bird’s-eye view of everything it needs to spot threats early, allowing it to proactively safeguard its business and its customers. And as a result, it has cut issue resolution times in half.

“There are a lot of threats out there,” says Stuart Gregg, Cyber Security Operations Lead at ASOS. “You’ve got insider threats, account compromise, threats to our website and customer data, even physical security threats. We’re constantly trying to defend ourselves and be more proactive in everything we do.”

Already using a range of Azure services, ASOS identified Azure Sentinel as a platform that could help it quickly and easily unite its data. This includes security data from Azure Security Center and Azure Active Directory (Azure AD), along with data from Microsoft 365. The result is a comprehensive view of its entire threat landscape.

“We found Azure Sentinel easy to set up, and now we don’t have to move data across separate systems,” says Gregg. “We can literally click a few buttons and all our security solutions feed data into Azure Sentinel.”

Learn more about how ASOS has benefitted from Azure Sentinel.

RapidDeploy and ASOS are just two examples of how Azure Sentinel is helping businesses process data and telemetry into actionable security alerts for investigation and response. We have an active GitHub community of preview participants, partners, and even Microsoft’s own security experts who are sharing new connectors, detections, hunting queries, and automation playbooks.

With these design partners, we’ve continued our innovation in Azure Sentinel. It starts with the ability to connect to any data source, whether in Azure, on-premises, or even other clouds. We continue to add new connectors to different sources and more machine learning-based detections. Azure Sentinel will also integrate with the Azure Lighthouse service, which will give service providers and enterprise customers the ability to view Azure Sentinel instances across different tenants in Azure.

Secure your organization

Now that Azure Sentinel has moved out of public preview and is generally available, there’s never been a better time to see how it can help your business. Traditional on-premises SIEMs require a combination of infrastructure costs and software costs, all paired with annual commitments or inflexible contracts. We are removing those pain points, since Azure Sentinel is a cost-effective, cloud-native SIEM with predictable billing and flexible commitments.

Infrastructure costs are reduced since you automatically scale resources as you need, and you only pay for what you use. Or you can save up to 60 percent compared to pay-as-you-go pricing by taking advantage of capacity reservation tiers. You receive predictable monthly bills and the flexibility to change capacity tier commitments every 31 days. On top of that, bringing in data from Office 365 audit logs, Azure activity logs and alerts from Microsoft Threat Protection solutions doesn’t require any additional payments.

Please join me for the Azure Security Expert Series where we will focus on Azure Sentinel on Thursday, September 26, 2019, 10–11 AM Pacific Time. You’ll learn more about these innovations and see real use cases on how Azure Sentinel helped detect previously undiscovered threats. We’ll also discuss how Accenture and RapidDeploy are using Azure Sentinel to empower their security operations team.

Get started today with Azure Sentinel!

1 Source: ESG Research Survey, Security Analytics and Operations: Industry Trends in the Era of Cloud Computing, September 2019
2 Source: ESG Research Survey, Security Analytics and Operations: Industry Trends in the Era of Cloud Computing, September 2019
3 Source: ESG Research Survey, Security Analytics and Operations: Industry Trends in the Era of Cloud Computing, September 2019


Deep learning rises: New methods for detecting malicious PowerShell – Microsoft Security

Scientific and technological advancements in deep learning, a category of algorithms within the larger framework of machine learning, provide new opportunities for development of state-of-the-art protection technologies. Deep learning methods are impressively outperforming traditional methods on tasks such as image and text classification. With these developments, there’s great potential for building novel threat detection methods using deep learning.

Machine learning algorithms work with numbers, so objects like images, documents, or emails are converted into numerical form through a step called feature engineering, which, in traditional machine learning methods, requires a significant amount of human effort. With deep learning, algorithms can operate on relatively raw data and extract features without human intervention.

At Microsoft, we make significant investments in pioneering machine learning that informs our security solutions with actionable knowledge through data, helping deliver intelligent, accurate, and real-time protection against a wide range of threats. In this blog, we present an example of a deep learning technique that was initially developed for natural language processing (NLP) and is now adopted and applied to expand our coverage of detecting malicious PowerShell scripts, which continue to be a critical attack vector. These deep learning-based detections add to the industry-leading endpoint detection and response capabilities in Microsoft Defender Advanced Threat Protection (Microsoft Defender ATP).

Word embedding in natural language processing

Keeping in mind that our goal is to classify PowerShell scripts, we briefly look at how text classification is approached in the domain of natural language processing. An important step is to convert words to vectors (tuples of numbers) that can be consumed by machine learning algorithms. A basic approach, known as one-hot encoding, first assigns a unique integer to each word in the vocabulary, then represents each word as a vector of 0s, with 1 at the integer index corresponding to that word. Although useful in many cases, the one-hot encoding has significant flaws. A major issue is that all words are equidistant from each other, and semantic relations between words are not reflected in geometric relations between the corresponding vectors.
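A tiny sketch of the equidistance flaw, with an invented three-word vocabulary:

```python
import numpy as np

# One-hot encoding: each word is a vector of 0s with a single 1.
vocab = ["cat", "dog", "car"]
one_hot = np.eye(len(vocab))

# Every pair of distinct words is the same distance apart, so "cat" is no
# closer to "dog" than to "car"; semantics are absent from the geometry.
for i in range(len(vocab)):
    for j in range(i + 1, len(vocab)):
        dist = np.linalg.norm(one_hot[i] - one_hot[j])
        print(f"d({vocab[i]}, {vocab[j]}) = {dist:.3f}")
```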

Contextual embedding is a more recent approach that overcomes these limitations by learning compact representations of words from data under the assumption that words that frequently appear in similar context tend to bear similar meaning. The embedding is trained on large textual datasets like Wikipedia. The Word2vec algorithm, an implementation of this technique, is famous not only for translating semantic similarity of words to geometric similarity of vectors, but also for preserving polarity relations between words. For example, in Word2vec representation:

Madrid – Spain + Italy ≈ Rome

Embedding of PowerShell scripts

Since training a good embedding requires a significant amount of data, we used a large and diverse corpus of 386K distinct unlabeled PowerShell scripts. The Word2vec algorithm, which is typically used with human languages, provides similarly meaningful results when applied to PowerShell language. To accomplish this, we split the PowerShell scripts into tokens, which then allowed us to use the Word2vec algorithm to assign a vectorial representation to each token.
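A minimal sketch of this flow using the open source gensim implementation of Word2vec. The three scripts and the crude tokenizer below are placeholders, not Microsoft’s corpus or tokenizer:

```python
from gensim.models import Word2Vec

scripts = [
    "Invoke-WebRequest -Uri $url -OutFile $file",
    "IEX (New-Object Net.WebClient).DownloadString($url)",
    "Get-ChildItem -Path C:\\ -Recurse",
]

# Crude tokenization: lowercase, strip parentheses, split on whitespace.
tokenized = [s.lower().replace("(", " ").replace(")", " ").split() for s in scripts]

model = Word2Vec(sentences=tokenized, vector_size=100, window=5, min_count=1)
print(model.wv["$url"][:5])  # first entries of the learned vector for "$url"
```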

Figure 1 shows a 2-dimensional visualization of the vector representations of 5,000 randomly selected tokens, with some tokens of interest highlighted. Note how semantically similar tokens are placed near each other. For example, the vectors representing -eq, -ne and -gt, which in PowerShell are aliases for “equal”, “not-equal” and “greater-than”, respectively, are clustered together. Similarly, the vectors representing the allSigned, remoteSigned, bypass, and unrestricted tokens, all of which are valid values for the execution policy setting in PowerShell, are clustered together.

Figure 1. 2D visualization of 5,000 tokens using Word2vec

Examining the vector representations of the tokens, we found a few additional interesting relationships.

Token similarity: Using the Word2vec representation of tokens, we can identify commands in PowerShell that have an alias. In many cases, the token closest to a given command is its alias. For example, the representations of the token Invoke-Expression and its alias IEX are closest to each other. Two additional examples of this phenomenon are the Invoke-WebRequest and its alias IWR, and the Get-ChildItem command and its alias GCI.

We also measured distances within sets of several tokens. Consider, for example, the four tokens $i, $j, $k and $true (see the right side of Figure 2). The first three are usually used to represent a numeric variable and the last naturally represents a Boolean constant. As expected, the $true token mismatched the others – it was the farthest (using the Euclidean distance) from the center of mass of the group.

More specific to the semantics of PowerShell in cybersecurity, we checked the representations of the tokens: bypass, normal, minimized, maximized, and hidden (see the left side of Figure 2). While the first token is a legal value for the ExecutionPolicy flag in PowerShell, the rest are legal values for the WindowStyle flag. As expected, the vector representation of bypass was the farthest from the center of mass of the vectors representing all other four tokens.

Figure 2. 3D visualization of selected tokens

Linear Relationships: Since Word2vec preserves linear relationships, computing linear combinations of the vectorial representations results in semantically meaningful results. Below are a few interesting relationships we found:

high – $false + $true ≈ low
-eq – $false + $true ≈ -ne
DownloadFile – $destfile + $str ≈ DownloadString
Export-CSV – $csv + $html ≈ ConvertTo-html
Get-Process – $processes + $services ≈ Get-Service

In each of the above expressions, the sign ≈ signifies that the vector on the right side is the closest (among all the vectors representing tokens in the vocabulary) to the vector that is the result of the computation on the left side.
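With a trained gensim model like the sketch earlier, such analogy queries map directly onto most_similar, which adds and subtracts token vectors and returns the nearest remaining token. The guard below is needed because the tokens must exist in the training vocabulary, which requires a realistic corpus rather than a toy one:

```python
# Query: DownloadFile - $destfile + $str, expected to land near DownloadString.
wanted = ["downloadfile", "$str", "$destfile"]
if all(token in model.wv for token in wanted):
    result = model.wv.most_similar(positive=["downloadfile", "$str"],
                                   negative=["$destfile"], topn=1)
    print(result)  # nearest token and its cosine similarity
```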

Detection of malicious PowerShell scripts with deep learning

We used the Word2vec embedding of the PowerShell language presented in the previous section to train deep learning models capable of detecting malicious PowerShell scripts. The classification model is trained and validated using a large dataset of PowerShell scripts that are labeled “clean” or “malicious,” while the embeddings are trained on unlabeled data. The flow is presented in Figure 3.

Figure 3. High-level overview of our model generation process

Using GPU computing in Microsoft Azure, we experimented with a variety of deep learning and traditional ML models. The best performing deep learning model increases the coverage (for a fixed low FP rate of 0.1%) by 22 percentage points compared to traditional ML models. This model, presented in Figure 4, combines several deep learning building blocks such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN). Neural networks are ML algorithms inspired by biological neural systems like the human brain. In addition to the pretrained embedding described here, the model is provided with character-level embedding of the script.
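The exact topology is not published, but a toy Keras sketch shows how the named building blocks (a token embedding, a CNN, and an LSTM) compose into a binary script classifier. All layer sizes below are invented:

```python
from tensorflow.keras import layers, models

vocab_size, embed_dim, max_tokens = 50_000, 100, 2_000

inputs = layers.Input(shape=(max_tokens,))
# In practice this embedding would be initialized from the pretrained Word2vec.
x = layers.Embedding(vocab_size, embed_dim)(inputs)
x = layers.Conv1D(filters=128, kernel_size=5, activation="relu")(x)  # CNN block
x = layers.MaxPooling1D(pool_size=4)(x)
x = layers.LSTM(64)(x)                                               # LSTM block
outputs = layers.Dense(1, activation="sigmoid")(x)  # P(script is malicious)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```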

Figure 4. Network architecture of the best-performing model

Real-world application of deep learning to detecting malicious PowerShell

The best-performing deep learning model is applied at scale to the PowerShell scripts observed by Microsoft Defender ATP through the AMSI interface, using Microsoft ML.NET technology and the ONNX format for deep neural networks. This model augments the suite of ML models and heuristics used by Microsoft Defender ATP to protect against malicious usage of scripting languages.
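Scoring an ONNX-exported model is a thin layer at inference time. A sketch with the Python ONNX Runtime bindings; Microsoft’s production path runs through ML.NET, and the file name, input shape and output layout here are placeholders:

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("powershell_model.onnx")  # placeholder file name
input_name = session.get_inputs()[0].name

features = np.zeros((1, 2000), dtype=np.float32)  # one featurized script
outputs = session.run(None, {input_name: features})
print("model outputs:", outputs[0])  # e.g., a malicious-probability score
```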

Since its first deployment, this deep learning model detected with high precision many cases of malicious and red team PowerShell activities, some undiscovered by other methods. The signal obtained through PowerShell is combined with a wide range of ML models and signals of Microsoft Defender ATP to detect cyberattacks.

The following are examples of malicious PowerShell scripts that deep learning can confidently detect but can be challenging for other detection methods:

Figure 5. Heavily obfuscated malicious script

Figure 6. Obfuscated script that downloads and runs payload

Figure 7. Script that decrypts and executes malicious code

Enhancing Microsoft Defender ATP with deep learning

Deep learning methods significantly improve detection of threats. In this blog, we discussed a concrete application of deep learning to a particularly evasive class of threats: malicious PowerShell scripts. We have and will continue to develop deep learning-based protections across multiple capabilities in Microsoft Defender ATP.

Development and productization of deep learning systems for cyber defense require large volumes of data, computations, resources, and engineering effort. Microsoft Defender ATP combines data collected from millions of endpoints with Microsoft computational resources and algorithms to provide industry-leading protection against attacks.

Stronger detection of malicious PowerShell scripts and other threats on endpoints using deep learning mean richer and better-informed security through Microsoft Threat Protection, which provides comprehensive security for identities, endpoints, email and data, apps, and infrastructure.

Shay Kels and Amir Rubin
Microsoft Defender ATP team



IT pros look to VMware’s GPU acceleration projects to kick-start AI

SAN FRANCISCO — IT pros who need to support emerging AI and machine learning workloads see promise in a pair of developments VMware previewed this week to bolster support for GPU-accelerated computing in vSphere.

GPUs are uniquely suited to handle the massive processing demands of AI and machine learning workloads, and chipmakers like Nvidia Corp. are now developing and promoting GPUs specifically designed for this purpose.

A previous partnership with Nvidia introduced capabilities that allowed VMware customers to assign GPUs to VMs, but not more than one GPU per VM. The latest development, which Nvidia calls its Virtual Compute Server, allows customers to assign multiple virtual GPUs to a VM.

Nvidia’s Virtual Compute Server also works with VMware’s vMotion capability, allowing IT pros to live migrate a GPU-accelerated VM to another physical host. The companies have also extended this partnership to VMware Cloud on AWS, allowing customers to access Amazon Elastic Compute Cloud bare-metal instances with Nvidia T4 GPUs.

VMware gave the Nvidia partnership prime time this week at VMworld 2019, playing a prerecorded video of Nvidia CEO Jensen Huang talking up the companies’ combined efforts during Monday’s general session. However, another GPU acceleration project also caught the eye of some IT pros who came to learn more about VMware’s recent acquisition of Bitfusion.io Inc.

VMware acquired Bitfusion earlier this year and announced its intent to integrate the startup’s GPU virtualization capabilities into vSphere. Bitfusion’s FlexDirect connects GPU-accelerated servers over the network and provides the ability to assign GPUs to workloads in real time. The company compares its GPU virtualization approach to network-attached storage because it disaggregates GPU resources and makes them accessible to any server on the network as a pool of resources.

The software’s unique approach also allows customers to assign just portions of a GPU to different workloads. For example, an IT pro might assign 50% of a GPU’s capacity to one VM and 50% to another VM. This approach can allow companies to more efficiently use their investments in expensive GPU hardware, company executives said. FlexDirect also offers extensions to support field-programmable gate arrays and application-specific integrated circuits.

“I was really happy to see they’re doing this at the network level,” said Kevin Wilcox, principal virtualization architect at Fiserv, a financial services company. “We’ve struggled with figuring out how to handle the power and cooling requirements for GPUs. This looks like it’ll allow us to place our GPUs in a segmented section of our data center that can handle those power and cooling needs.”

AI demand surging

Many companies are only beginning to research and invest in AI capabilities, but interest is growing rapidly, said Gartner analyst Chirag Dekate.

“By end of this year, we anticipate that one in two organizations will have some sort of AI initiative, either in the [proof-of-concept] stage or the deployed stage,” Dekate said.

In many cases, IT operations professionals are being asked to move quickly on a variety of AI-focused projects, a trend echoed by multiple VMworld attendees this week.

“We’re just starting with AI, and looking at GPUs as an accelerator,” said Martin Lafontaine, a systems architect at Netgovern, a software company that helps customers comply with data locality compliance laws.

“When they get a subpoena and have to prove where [their data is located], our solution uses machine learning to find that data. We’re starting to look at what we can do with GPUs,” Lafontaine said.

Is GPU virtualization the answer?

Recent efforts to virtualize GPU resources could open the door to broader use of GPUs for AI workloads, but potential customers should pay close attention to benchmark testing, compared to bare-metal deployments, in the coming years, Gartner’s Dekate said.

So far, he has not encountered a customer using these GPU virtualization tactics for deep learning workloads at scale. Today, most organizations still run these deep learning workloads on bare-metal hardware.

 “The future of this technology that Bitfusion is bringing will be decided by the kind of overheads imposed on the workloads,” Dekate said, referring to the additional compute cycles often required to implement a virtualization layer. “The deep learning workloads we have run into are extremely compute-bound and memory-intensive, and in our prior experience, what we’ve seen is that any kind of virtualization tends to impose overheads. … If the overheads are within acceptable parameters, then this technology could very well be applied to AI.”


New machine learning model sifts through the good to unearth the bad in evasive malware – Microsoft Security

We continuously harden machine learning protections against evasion and adversarial attacks. One of the latest innovations in our protection technology is the addition of a class of hardened malware detection machine learning models called monotonic models to Microsoft Defender ATP‘s Antivirus.

Historically, detection evasion has followed a common pattern: attackers would build new versions of their malware and test them offline against antivirus solutions. They’d keep making adjustments until the malware can evade antivirus products. Attackers then carry out their campaign knowing that the malware won’t initially be blocked by AV solutions, which are then forced to catch up by adding detections for the malware. In the cybercriminal underground, antivirus evasion services are available to make this process easier for attackers.

Microsoft Defender ATP’s Antivirus has significantly advanced in becoming resistant to attacker tactics like this. A sizeable portion of the protection we deliver is powered by machine learning models hosted in the cloud. The cloud protection service breaks attackers’ ability to test and adapt to our defenses in an offline environment, because attackers must either forgo testing, or test against our defenses in the cloud, where we can observe them and react even before they begin.

Hardening our defenses against adversarial attacks doesn’t end there. In this blog we’ll discuss a new class of cloud-based ML models that further harden our protections against detection evasion.

Most machine learning models are trained on a mix of malicious and clean features. Attackers routinely try to throw these models off balance by stuffing clean features into malware.

Monotonic models are resistant against adversarial attacks because they are trained differently: they only look for malicious features. The magic is this: Attackers can’t evade a monotonic model by adding clean features. To evade a monotonic model, an attacker would have to remove malicious features.

Monotonic models explained

Last summer, researchers from UC Berkeley (Incer, Inigo, et al., “Adversarially robust malware detection using monotonic classification,” Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics, ACM, 2018) proposed adding monotonic constraints to malware detection machine learning models to make them robust against adversaries. Simply put, this technique only allows the machine learning model to leverage malicious features when considering a file; it’s not allowed to use any clean features.
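The models Microsoft describes deploying are monotonic logistic regressions, but gradient-boosted trees expose the same idea through a monotone-constraints parameter and make for an accessible sketch. In the XGBoost example below, on synthetic data, every feature is constrained to only push the score toward malicious:

```python
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1_000, 3)).astype(float)  # 3 malicious-feature flags
y = (X.sum(axis=1) >= 2).astype(int)                   # toy label

# "+1" per feature: each feature may only increase the predicted score.
model = XGBClassifier(monotone_constraints="(1,1,1)", n_estimators=50)
model.fit(X, y)

# Because no feature can lower the score, stuffing a file with extra clean
# features cannot reduce its maliciousness; evasion would require removing
# malicious features instead.
print(model.predict_proba(X[:3])[:, 1])
```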

Figure 1. Features used by a baseline versus a monotonic constrained logistic regression classifier. The monotonic classifier does not use cleanly-weighted features so that it’s more robust to adversaries.

Inspired by the academic research, we deployed our first monotonic logistic regression models to Microsoft Defender ATP cloud protection service in late 2018. Since then, they’ve played an important part in protecting against attacks.

Figure 2 below illustrates the production performance of the monotonic classifiers versus the baseline unconstrained model. As expected, monotonic-constrained models detect less malware overall than classic models. However, they can detect malware attacks that would otherwise have been missed because of clean features.

Figure 2. Malware detection machine learning classifiers comparing the unconstrained baseline classifier versus the monotonic constrained classifier in customer protection.

The monotonic classifiers don’t replace baseline classifiers; they run in addition to the baseline and provide additional protection. We combine all our classifiers using stacked classifier ensembles, where monotonic classifiers add significant value because of the unique classification they provide.
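A minimal sketch of the stacking idea with scikit-learn; the member models are arbitrary stand-ins, not Microsoft’s classifiers:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)

# A baseline classifier and a second model run side by side; a final
# estimator learns how to combine their predictions.
stack = StackingClassifier(
    estimators=[
        ("baseline", RandomForestClassifier(random_state=0)),
        ("monotonic", LogisticRegression()),  # stand-in for a monotonic model
    ],
    final_estimator=LogisticRegression(),
)
stack.fit(X, y)
print(f"training accuracy of the ensemble: {stack.score(X, y):.3f}")
```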

How Microsoft Defender ATP uses monotonic models to stop adversarial attacks

One common way for attackers to add clean features to malware is to digitally code-sign malware with trusted certificates. Malware families like ShadowHammer, Kovter, and Balamid are known to abuse certificates to evade detection. In many of these cases, the attackers impersonate legitimate registered businesses to defraud certificate authorities into issuing them trusted code-signing certificates.

LockerGoga, a strain of ransomware that’s known for being used in targeted attacks, is another example of malware that uses digital certificates. LockerGoga emerged in early 2019 and has been used by attackers in high-profile campaigns that targeted organizations in the industrial sector. Once attackers are able to breach a target network, they use LockerGoga to encrypt enterprise data en masse and demand ransom.

Figure 3. LockerGoga variant digitally code-signed with a trusted CA

When Microsoft Defender ATP encounters a new threat like LockerGoga, the client sends a featurized description of the file to the cloud protection service for real-time classification. An array of machine learning classifiers processes the features describing the content, including whether attackers had digitally code-signed the malware with a trusted code-signing certificate that chains to a trusted CA. By ignoring certificates and other clean features, monotonic models in Microsoft Defender ATP can correctly identify attacks that otherwise would have slipped through defenses.

Very recently, researchers demonstrated an adversarial attack that appends a large volume of clean strings from a computer game executable to several well-known malware and credential dumping tools – essentially adding clean features to the malicious files – to evade detection. The researchers showed how this technique can successfully impact machine learning prediction scores so that the malware files are not classified as malware. The monotonic model hardening that we’ve deployed in Microsoft Defender ATP is key to preventing this type of attack, because, for a monotonic classifier, adding features to a file can only increase the malicious score.

Given how they significantly harden defenses, monotonic models are now standard components of machine learning protections in Microsoft Defender ATP‘s Antivirus. One of our monotonic models uniquely blocks malware on an average of 200,000 distinct devices every month. We now have three different monotonic classifiers deployed, protecting against different attack scenarios.

Monotonic models are just the latest enhancements to Microsoft Defender ATP’s Antivirus. We continue to evolve machine learning-based protections to be more resilient to adversarial attacks. More effective protections against malware and other threats on endpoints increases defense across the entire Microsoft Threat Protection. By unifying and enabling signal-sharing across Microsoft’s security services, Microsoft Threat Protection secures identities, endpoints, email and data, apps, and infrastructure.

Geoff McDonald (@glmcdona),Microsoft Defender ATP Research team
with Taylor Spangler, Windows Data Science team


Talk to us

Questions, concerns, or insights on this story? Join discussions at the Microsoft Defender ATP community.

Follow us on Twitter @MsftSecIntel.


At HR Technology Conference, Walmart says virtual reality works

LAS VEGAS — Learning technology appears to be heading for a major upgrade. Walmart is using virtual reality, or VR, to train its employees, and many other companies may soon do the same.

VR adoption is part of a larger tech shift in employee learning. For example, companies such as Wendy’s are using simulation or gamification to help employees learn about food preparation.

Deploying VR technology is expensive, with cost estimates ranging from tens of thousands of dollars to millions, attendees at the HR Technology Conference learned. But headset prices are declining rapidly, and libraries of VR training tools for dealing with common HR situations — such as how to fire an employee — may make this tool affordable to firms of all sizes.

For Walmart, a payoff of using virtual reality comes from higher job certification test scores. Meanwhile, Wendy’s has been using computer simulations to help employees learn their jobs. It is also adapting its training to the expectations of its workers, and its efforts have led to a turnover reduction. Based on presentations and interviews at the HR Technology Conference, users deploying these technologies are enthusiastic about them.

Walmart employees experience VR’s 3D

“It truly becomes an experience,” said Andy Trainor, senior director of Walmart Academies, in an interview about the impact of VR and augmented reality on training. It’s unlike a typical classroom lesson. “Employees actually feel like they experience it,” he said.

Walmart’s training and virtual reality team, from left to right: Brock McKeel, senior director of digital operations at Walmart, and Andy Trainor, senior director of Walmart Academies.

Walmart employees go to “academies” for training, testing and certification on certain processes, such as taking care of the store’s produce section, interacting with customers or preparing for Black Friday. As one person in a class wears the VR headset or goggles, what that person sees and experiences is displayed on a monitor for the class to follow.

Walmart has been using VR training from startup STRIVR for just over a year. In classes using VR, the company is seeing test scores as much as 15% higher than with traditional methods of instruction, Trainor said. His team members are convinced that VR, with its ability to create 3D simulations, is here to stay as a training tool.

“Life isn’t 2D,” said Brock McKeel, senior director of digital operations at Walmart. For problems ranging from customer service issues to emergency weather planning, “we want our associates to be the best prepared that we can get them to be.”

Walmart has also created a simulation-type game that helps employees understand store management. The company plans to soon release its simulation as an app for anyone to experience, Trainor said.

The old ways of training are broken

The need to do things differently in learning was a theme at the HR Technology Conference.


The idea that employees will take time out of their day to watch a training video or read material that may not be connected to their task at hand is not effective, said David Mallon, a vice president and chief analyst at Bersin, Deloitte Consulting, based in Oakland, Calif.

The traditional methods of learning “have fallen apart,” Mallon said. Employees “want to engage with content on their terms, when they need it, where they need it and in ways that make more sense.”

Mallon’s point is something Wendy’s realized about its restaurant workers, who understand technology and have expectations about content, said Coley O’Brien, chief people officer at the restaurant chain. Employees want the content to be quick, they want the ability to swipe, and videos should be 30 seconds or less, he said.

“We really had to think about how we evolve our training approach and our content to really meet their expectations,” said O’Brien, who presented at the conference.

Wendy’s also created simulations that reproduce some of the time pressures faced with certain food-preparation processes. Employees must make choices in simulations, and mistakes are tracked. The company uses Cornerstone OnDemand’s platform.

Restaurants in which employees received a certain level of certification saw sales increases of 1% to 2%, higher customer satisfaction scores and turnover reductions as high as 20%, O’Brien said.