
How to win in the AI era? For now, it’s all about the data

Artificial intelligence is the new electricity, said deep learning pioneer Andrew Ng. Just as electricity transformed every major industry a century ago, AI will give the world a major jolt. Eventually.

For now, 99% of the economic value created by AI comes from supervised learning systems, according to Ng. These algorithms require human teachers and tremendous amounts of data to learn. It’s a laborious, but proven process.

AI algorithms can now recognize images of cats, for example, but they required thousands of labeled cat images to do so; and they can understand what someone is saying, but leading speech recognition systems needed 50,000 hours of speech, along with transcripts, to get there.

Ng’s point is that data is the competitive differentiator for what AI can do today — not algorithms, which, once trained, can be copied.

“There’s so much open source, word gets out quickly, and it’s not that hard for most organizations to figure out what algorithms organizations are using,” said Ng, an AI thought leader and an adjunct professor of computer science at Stanford University, at the recent EmTech conference in Cambridge, Mass.

His presentation gave attendees a look at the state of the AI era, as well as the four characteristics he believes will be part of every AI company, including a revamp of job descriptions.

Positive feedback loop

So data is vital in today’s AI era, but companies don’t need to be a Google or a Facebook to reap the benefits of AI. All they need is enough data upfront to get a project off the ground, Ng said. That starter data will attract customers who, in turn, will create more data for the product.

“This results in a positive feedback loop. So, after a period of time, you might have enough data yourself to have a defensible business,” said Ng.

Andrew Ng on stage at EmTech

A couple of his students at Stanford did just that when they launched Blue River Technology, an ag-tech startup that combines computer vision, robotics and machine learning for field management. The co-founders started with lettuce, collecting images and putting together enough data to get lettuce farmers on board, according to Ng. Today, he speculated, they likely have the biggest data asset of lettuce in the world.

“And this actually makes their business, in my opinion, pretty defensible because even the global giant tech companies, as far as I know, do not have this particular data asset, which makes their business at least challenging for the very large tech companies to enter,” he said.

Turns out, that data asset is actually worth hundreds of millions: John Deere acquired Blue River for $300 million in September.

“Data accumulation is one example of how I think corporate strategy is changing in the AI era, and in the deep learning era,” he said.

Four characteristics of an AI company

While it’s too soon to tell what successful AI companies will look like, Ng suggested another corporate disruptor might provide some insight: the internet.

One of the lessons Ng learned with the rise of the internet was that companies need more than a website to be an internet company. The same, he argued, holds true for AI companies.

“If you take a traditional tech company and add a bunch of deep learning or machine learning or neural networks to it, that does not make it an AI company,” he said.

Internet companies are architected to take advantage of internet capabilities, such as A/B testing, short cycle times to ship products, and decision-making that’s pushed down to the engineer and product level, according to Ng.

AI companies will need to be architected to do the same in relation to AI. What A/B testing’s equivalent will be for AI companies is still unknown, but Ng shared four thoughts on characteristics he expects AI companies will share.

  1. Strategic data acquisition. This is a complex process, requiring companies to play what Ng called multiyear chess games, acquiring important data from one resource that’s monetized elsewhere. “When I decide to launch a product, one of the criteria I use is, can we plan a path for data acquisition that results in a defensible business?” Ng said.
  2. Unified data warehouse. This likely comes as no surprise to CIOs, who have been advocates of the centralized data warehouse for years. But for AI companies that need to combine data from multiple sources, data silos — and the bureaucracy that comes with them — can be an AI project killer. Companies should get to work on this now, as “this is often a multiyear exercise for companies to implement,” Ng said.
  3. New job descriptions. AI products like chatbots can’t be sketched out the way apps can, and so product managers will have to communicate differently with engineers. Ng, for one, is training product managers to give product specifications.
  4. Centralized AI team. AI talent is scarce, so companies should consider building a single AI team that can then support business units across the organization. “We’ve seen this pattern before with the rise of mobile,” Ng said. “Maybe around 2011, none of us could hire enough mobile engineers.” Once the talent numbers caught up with demand, companies embedded mobile talent into individual business units. The same will likely play out in the AI era, Ng said.

The uphill battle of beating back weaponized AI

Artificial intelligence isn’t just for the law-abiding. Machine learning algorithms are as freely available to cybercriminals and state-sponsored actors as they are to financial institutions, retailers and insurance companies.

“When we look especially at terrorist groups who are exploiting social media, [and] when we look at state-sponsored efforts to influence and manipulate, they’re using really powerful algorithms that are at everyone’s disposal,” said Yasmin Green, director of research and development at Jigsaw, a technology incubator launched by Google to try to solve geopolitical problems.

Criminals need not develop new algorithms or new AI, Green said at the recent EmTech conference in Cambridge, Mass. They can, and do, exploit what is already out there to manipulate public opinion.

The good news about weaponized AI? The tools to combat these nefarious efforts are also advancing. One promising lead, according to Green, is bad actors don’t exhibit the same kinds of online behavior that typical users do. And security experts are hoping to exploit the behavioral “tells” they’re seeing — with the help of machines, of course.

Variations on weaponized AI

Cybercriminals and internet trolls are adept at using AI to simulate human behavior and trick systems or peddle propaganda. The online test used to tell humans from machines, CAPTCHA, is continuously bombarded by bad guys trying to trick it.

In an effort to stay ahead of cybercriminals, CAPTCHA, which stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart, has had to evolve, creating some unanticipated consequences, according to Shuman Ghosemajumder, CTO at Shape Security in Mountain View, Calif. Recent data from Google shows that humans solve CAPTCHAs just 33% of the time. That’s compared to state-of-the-art machine learning optical character recognition technology that has a solve rate of 99.8%.

“This is doing exactly the opposite of what CAPTCHA was originally intended to do,” Ghosemajumder said. “And that has now been weaponized.”

He said advances in computer vision technology have led to weaponized AI services such as Death By CAPTCHA, an API plug-in that promises to solve 1,000 CAPTCHAs for $1.39. “And there are, of course, discounts for gold members of the service.”

A more aggressive attack is credential stuffing, where cybercriminals use stolen usernames and passwords from third-party sources to gain access to accounts.

Sony was the victim of a credential-stuffing attack in 2011. Cybercriminals culled a list of 15 million credentials stolen from other sites and then used a botnet to test whether they worked on Sony’s login page. Today, an outfit with the good-guy-sounding name of Sentry MBA (the MBA stands for Modded By Artists) provides cybercriminals with a user interface and automation technology, making it easy to test the validity of stolen usernames and passwords and even to bypass security features like CAPTCHAs.

“We see these types of attacks responsible for tremendous amounts of traffic on some of the world’s largest websites,” Ghosemajumder said. In the case of one Fortune 100 company, credential-stuffing attacks made up more than 90% of its login activity.
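The detection logic behind numbers like that can be approximated in a few lines. The sketch below is hypothetical, not Shape Security’s method: it flags sources that attempt many logins with an implausibly low success rate, the signature of a bot working through a stolen credential list (the log format and thresholds are invented for the example).

```python
# Hypothetical credential-stuffing detector: flag sources with high volume
# and a near-zero login success rate. Illustrative thresholds only.
from collections import defaultdict

def flag_stuffing_sources(login_events, min_attempts=100, max_success_rate=0.02):
    """login_events: iterable of (source_ip, succeeded) tuples."""
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for ip, ok in login_events:
        attempts[ip] += 1
        if ok:
            successes[ip] += 1
    # Stuffing campaigns test huge stolen lists, so almost every attempt fails.
    return {ip for ip, n in attempts.items()
            if n >= min_attempts and successes[ip] / n <= max_success_rate}

events = [("10.0.0.9", False)] * 500 + [("10.0.0.9", True)] * 5 \
       + [("192.0.2.1", True), ("192.0.2.1", False)]
print(flag_stuffing_sources(events))  # only the bot IP is flagged
```

A real system would weigh many more signals (IP reputation, device fingerprints, timing), but the core ratio test is the same idea.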

Shuman Ghosemajumder shares a snippet of traffic from a Fortune 100 retailer. ‘We see that on a 24/7 basis, more than 90% of the login activity was coming from credential-stuffing attacks,’ he said.

Behavioral tells in weaponized AI

Ghosemajumder’s firm Shape Security is now using AI to detect credential-stuffing efforts. One method is to use machine learning to identify behavioral characteristics that are typical of cybercriminal exploits.

When cybercriminals simulate human interactions, they will, for example, move the mouse from the username field to the password field quickly and efficiently — in an unhumanlike manner. “Human beings are not capable of doing things like moving a mouse in a straight line — no matter how hard they try,” Ghosemajumder said.
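That observation lends itself to a simple heuristic. The sketch below is illustrative only, not Shape Security’s actual model: it scores a mouse path by comparing the straight-line distance between its endpoints to the total distance traveled, so a bot moving with perfect efficiency scores near 1.0 while human wobble pulls the score down.

```python
# Illustrative "straightness" score for a mouse path: 1.0 means perfectly
# straight (bot-like); human paths wander and score lower.
import math

def path_straightness(points):
    """points: list of (x, y) samples along a mouse movement."""
    total = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    direct = math.dist(points[0], points[-1])
    return direct / total if total else 1.0

bot_path = [(0, 0), (50, 25), (100, 50)]               # perfectly collinear
human_path = [(0, 0), (30, 22), (61, 31), (100, 50)]   # slight wobble

print(round(path_straightness(bot_path), 6))   # 1.0
print(path_straightness(human_path) < 1.0)     # True
```

In practice such a score would be one feature among many fed to a classifier, not a standalone test.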

Jigsaw’s Green said her team is also looking for “technical markers” that can distinguish truly organic campaigns from coordinated ones. She described state-sponsored actors who peddle propaganda and attempt to spread misinformation through what she called “seed-and-fertilizer campaigns.”


“The goal of these state-sponsored campaigns is to plant a seed in social conversations and to have the unwitting masses fertilize that seed for it to actually become an organic conversation,” she said.

“There are a few dimensions that we think are promising to look at. One is the temporal dimension,” she said.

Looking across the internet, Jigsaw began to understand that coordinated attacks tend to move together, last longer than organic campaigns and pause as state-sponsored actors wait for instructions on what to do. “You’ll see a little delay before they act,” she said.

Other dimensions include network shape and semantics. State-sponsored actors tend to be more tightly linked together than communities within organic campaigns, and they tend to use “irregularly similar” language in their messaging.
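A toy version of the temporal idea: if coordinated accounts act in lockstep, the overlap in when they post should be unusually high. The metric and data below are illustrative assumptions, not Jigsaw’s methodology.

```python
# Illustrative synchrony metric: mean pairwise Jaccard similarity of the
# hours in which accounts post. Coordinated accounts move together;
# organic accounts are scattered across the day.
from itertools import combinations

def activity_synchrony(accounts):
    """accounts: list of lists of posting hours (0-23), one list per account."""
    hour_sets = [set(h) for h in accounts]
    pairs = list(combinations(hour_sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

coordinated = [[9, 10, 11], [9, 10, 11], [9, 10, 12]]  # near-identical schedules
organic = [[1, 5], [8, 14, 20], [3, 22]]               # no overlap at all

print(activity_synchrony(coordinated))  # high (about 0.67)
print(activity_synchrony(organic))      # 0.0
```

Network shape and semantic similarity would add further dimensions, but each reduces to the same question: is this cluster more alike than organic behavior should be?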

The big question: Can behavioral tells, identified by machines and combined with automated detection, be used to effectively identify state-sponsored campaigns? No doubt, time will tell.

CIOs should lean on AI ‘giants’ for machine learning strategy

NEW YORK — Machine learning and deep learning will be part of every data science organization, according to Edd Wilder-James, former vice president of technology strategy at Silicon Valley Data Science and now an open source strategist at Google’s TensorFlow.

Wilder-James, who spoke at the Strata Data Conference, pointed to recent advancements in image and speech recognition algorithms as examples of why machine learning and deep learning are going mainstream. He believes image and speech recognition software has evolved to the point where it can see and understand some things as well as — and in some use cases better than — humans. That makes it ripe to become part of the internal workings of applications and the driver of new and better services to internal and external customers, he said.

But what investments in AI should CIOs make to provide these capabilities to their companies? When building a machine learning strategy, choice abounds, Wilder-James said.

Machine learning vs. deep learning

Deep learning is a subset of machine learning, but it’s different enough to be discussed separately, according to Wilder-James. Examples of machine learning models include optimization, fraud detection and preventive maintenance. “We use machine learning to identify patterns,” Wilder-James said. “Here’s a pattern. Now, what do we know? What can we do as a result of identifying this pattern? Can we take action?”

Deep learning models perform tasks that more closely resemble human intelligence such as image processing and recognition. “With a massive amount of compute power, we’re able to look at a massively large number of input signals,” Wilder-James said. “And, so what a computer is able to do starts to look like human cognitive abilities.”
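The “identify a pattern, then act” loop Wilder-James describes for classical machine learning can be as simple as statistical outlier detection. The fraud-flagging sketch below is a deliberately minimal illustration, not a production technique.

```python
# Minimal illustration of pattern identification: flag transactions whose
# amount deviates strongly from the norm. A real fraud model would use
# many more signals; this just shows the identify-then-act shape.
import statistics

def flag_anomalies(amounts, z_threshold=2.0):
    """Return amounts more than z_threshold standard deviations from the mean."""
    mean = statistics.mean(amounts)
    stdev = statistics.stdev(amounts)
    return [x for x in amounts if abs(x - mean) / stdev > z_threshold]

charges = [12.5, 9.9, 11.2, 10.7, 13.1, 12.0, 980.0]
print(flag_anomalies(charges))  # [980.0]
```

Once the pattern is identified, the “what can we do as a result” step follows: hold the transaction, alert an analyst, or feed the label back into training.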

Some of the terrain for machine learning will look familiar to CIOs. Statistical programming languages such as SAS, SPSS and Matlab are known territory for IT departments. Open source counterparts such as R, Python and Spark are also machine-learning ready. “Open source is probably a better guarantee of stability and a good choice to make in terms of avoiding lock-in and ensuring you have support,” Wilder-James said.

Unlike other tech rollouts

The rollout of machine learning and deep learning models, however, is a different process than most technology rollouts. After getting a handle on the problem, CIOs will need to investigate if machine learning is even an appropriate solution.

“It may not be true that you can solve it with machine learning,” Wilder-James said. “This is one important difference from other technical rollouts. You don’t know if you’ll be successful or not. You have to enter into this on the pilot, proof-of-concept ladder.”

The most time-consuming step in deploying a machine learning model is feature engineering, or finding features in the data that will help the algorithms self-tune. Deep learning models skip the tedious feature engineering step and go right to training. Tuning a deep learning model correctly requires immense data sets, graphics processing units or tensor processing units, and time. Wilder-James said it could take weeks or even months to train a deep learning model.
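As a concrete, hypothetical illustration of feature engineering, the sketch below turns raw login timestamps into a handful of summary features a classical model could learn from. The feature names and inputs are invented for the example.

```python
# Hypothetical feature engineering: derive model-ready features from raw
# event timestamps. Feature names are illustrative only.
def engineer_features(timestamps):
    """timestamps: sorted login times (seconds) for one account."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return {
        "login_count": len(timestamps),
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else 0.0,
        "min_gap_s": min(gaps) if gaps else 0.0,  # very small gaps can suggest automation
    }

print(engineer_features([0, 2, 4, 6]))
```

This hand-crafting of inputs is exactly the step a deep learning model skips, at the cost of needing far more data and compute to learn equivalent representations on its own.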

One more thing to note: Building deep learning models is hard and won’t be a part of most companies’ machine learning strategy.

“You have to be aware that a lot of what’s coming out is the closest to research IT has ever been,” he said. “These things are being published in papers and deployed in production in very short cycles.”

CIOs whose companies are not inclined to invest heavily in AI research and development should instead rely on prebuilt, reusable machine and deep learning models rather than reinvent the wheel. Image recognition models, such as Inception, and natural language models, such as SyntaxNet and Parsey McParseface, are examples of models that are ready and available for use.

“You can stand on the shoulders of giants, I guess that’s what I’m trying to say,” Wilder-James said. “It doesn’t have to be from scratch.”

Machine learning tech

The good news for CIOs is that vendors have set the stage to start building a machine learning strategy now. TensorFlow, a machine learning software library, is one of the best known toolkits out there. “It’s got the buzz because it’s an open source project out of Google,” Wilder-James said. “It runs fast and is ubiquitous.”

TensorFlow itself is not terribly developer-friendly, but a simplified interface called Keras eases the burden and can handle the majority of use cases. And TensorFlow isn’t the only deep learning library or framework option, either. Others include MXNet, PyTorch, CNTK and Deeplearning4j.

For CIOs who want AI to live on premises, technologies such as Nvidia’s DGX-1 box, which retails for $129,000, are available.

But CIOs can also use the cloud as a computing resource, at a cost of anywhere between $5 and $15 an hour, according to Wilder-James. “I worked it out, and the cloud cost is roughly the same as running the physical machine continuously for about a year,” he said.
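His break-even estimate roughly checks out against the figures quoted here, assuming the DGX-1’s $129,000 list price and continuous 24/7 use:

```python
# Rough break-even arithmetic using the figures quoted in the article.
dgx1_price = 129_000        # Nvidia DGX-1 list price, USD
hours_per_year = 24 * 365   # 8,760 hours of continuous use

for rate in (5, 15):        # quoted cloud range, USD per hour
    years = dgx1_price / (rate * hours_per_year)
    print(f"${rate}/hr -> breaks even after {years:.1f} years of 24/7 use")
```

At the top of the quoted range ($15 an hour), a year of continuous cloud compute costs about what the box does; at $5 an hour, the crossover stretches to roughly three years.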

Or they can choose to go the hosted platform route, where a service provider will run trained models for a company. And other tools, such as domain-specific proprietary tools like the personalization platform from Nara Logics, can fill out the AI infrastructure.

“It’s the same kind of range we have with plenty of other services out there,” he said. “Do you rent an EC2 instance to run a database or do you subscribe to Amazon Redshift? You can pick the level of abstraction that you want for these services.”

Still, before investments in technology and talent are made, a machine learning strategy should start with the basics: “The single best thing you can do to prepare with AI in the future is to develop a competency with your own data, whether it’s getting access to data, integrating data out of silos, providing data results readily to employees,” Wilder-James said. “Understanding how to get at your data is going to be the thing to prepare you best.”

AWS and Microsoft announce Gluon, making deep learning accessible to all developers

New open source deep learning interface allows developers to more easily and quickly build machine learning models without compromising training performance. Jointly developed reference specification makes it possible for Gluon to work with any deep learning engine; support for Apache MXNet available today and support for Microsoft Cognitive Toolkit coming soon.

SEATTLE and REDMOND, Wash. — Oct. 12, 2017 — On Thursday, Amazon Web Services Inc. (AWS), an Amazon.com company (NASDAQ: AMZN), and Microsoft Corp. (NASDAQ: MSFT) announced a new deep learning library, called Gluon, that allows developers of all skill levels to prototype, build, train and deploy sophisticated machine learning models for the cloud, devices at the edge and mobile apps. The Gluon interface currently works with Apache MXNet and will support Microsoft Cognitive Toolkit (CNTK) in an upcoming release. With the Gluon interface, developers can build machine learning models using a simple Python API and a range of prebuilt, optimized neural network components. This makes it easier for developers of all skill levels to build neural networks using simple, concise code, without sacrificing performance. AWS and Microsoft published Gluon’s reference specification so other deep learning engines can be integrated with the interface. To get started with the Gluon interface, visit https://github.com/gluon-api/gluon-api/.

Developers build neural networks using three components: training data, a model and an algorithm. The algorithm trains the model to understand patterns in the data. Because the volume of data is large and the models and algorithms are complex, training a model often takes days or even weeks. Deep learning engines like Apache MXNet, Microsoft Cognitive Toolkit and TensorFlow have emerged to help optimize and speed the training process. However, these engines require developers to define the models and algorithms up front using lengthy, complex code that is difficult to change. Other deep learning tools make model-building easier, but this simplicity can come at the cost of slower training performance.

The Gluon interface gives developers the best of both worlds — a concise, easy-to-understand programming interface that enables developers to quickly prototype and experiment with neural network models, and a training method that has minimal impact on the speed of the underlying engine. Developers can use the Gluon interface to create neural networks on the fly, and to change their size and shape dynamically. In addition, because the Gluon interface brings together the training algorithm and the neural network model, developers can perform model training one step at a time. This means it is much easier to debug, update and reuse neural networks.
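The “one step at a time” training style the release describes can be illustrated without any framework at all. The loop below is a toy, pure-Python stand-in, not the Gluon API: it fits a one-parameter model by gradient descent, with every update visible and debuggable.

```python
# Toy illustration of step-at-a-time training (plain Python, NOT the Gluon
# API): fit y = w*x by gradient descent, inspecting state after every step.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples from y = 2x

w, lr = 0.0, 0.05
for step in range(20):
    # Gradient of mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # one debuggable update at a time
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)

print(round(w, 3))  # converges close to 2.0
```

Gluon’s pitch is that this kind of imperative, stepwise loop can coexist with the performance of a compiled engine, rather than forcing developers to define the whole model and algorithm up front.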

“The potential of machine learning can only be realized if it is accessible to all developers. Today’s reality is that building and training machine learning models require a great deal of heavy lifting and specialized expertise,” said Swami Sivasubramanian, VP of Amazon AI. “We created the Gluon interface so building neural networks and training models can be as easy as building an app. We look forward to our collaboration with Microsoft on continuing to evolve the Gluon interface for developers interested in making machine learning easier to use.”

“We believe it is important for the industry to work together and pool resources to build technology that benefits the broader community,” said Eric Boyd, corporate vice president of Microsoft AI and Research. “This is why Microsoft has collaborated with AWS to create the Gluon interface and enable an open AI ecosystem where developers have freedom of choice. Machine learning has the ability to transform the way we work, interact and communicate. To make this happen we need to put the right tools in the right hands, and the Gluon interface is a step in this direction.”

“FINRA is using deep learning tools to process the vast amount of data we collect in our data lake,” said Saman Michael Far, senior vice president and CTO, FINRA. “We are excited about the new Gluon interface, which makes it easier to leverage the capabilities of Apache MXNet, an open source framework that aligns with FINRA’s strategy of embracing open source and cloud for machine learning on big data.”

“I rarely see software engineering abstraction principles and numerical machine learning playing well together — and something that may look good in a tutorial could be hundreds of lines of code,” said Andrew Moore, dean of the School of Computer Science at Carnegie Mellon University. “I really appreciate how the Gluon interface is able to keep the code complexity at the same level as the concept; it’s a welcome addition to the machine learning community.”

“The Gluon interface solves the age old problem of having to choose between ease of use and performance, and I know it will resonate with my students,” said Nikolaos Vasiloglou, adjunct professor of Electrical Engineering and Computer Science at Georgia Institute of Technology. “The Gluon interface dramatically accelerates the pace at which students can pick up, apply and innovate on new applications of machine learning. The documentation is great, and I’m looking forward to teaching it as part of my computer science course and in seminars that focus on teaching cutting-edge machine learning concepts across different cities in the U.S.”

“We think the Gluon interface will be an important addition to our machine learning toolkit because it makes it easy to prototype machine learning models,” said Takero Ibuki, senior research engineer at DOCOMO Innovations. “The efficiency and flexibility this interface provides will enable our teams to be more agile and experiment in ways that would have required a prohibitive time investment in the past.”

The Gluon interface is open source and available today in Apache MXNet 0.11, with support for CNTK in an upcoming release. Developers can learn how to get started using Gluon with MXNet by viewing tutorials for both beginners and experts available by visiting https://mxnet.incubator.apache.org/gluon/.

About Amazon Web Services

For 11 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud platform. AWS offers over 90 fully featured services for compute, storage, networking, database, analytics, application services, deployment, management, developer, mobile, Internet of Things (IoT), Artificial Intelligence (AI), security, hybrid and enterprise applications, from 44 Availability Zones (AZs) across 16 geographic regions in the U.S., Australia, Brazil, Canada, China, Germany, India, Ireland, Japan, Korea, Singapore, and the UK. AWS services are trusted by millions of active customers around the world — including the fastest-growing startups, largest enterprises, and leading government agencies — to power their infrastructure, make them more agile, and lower costs. To learn more about AWS, visit https://aws.amazon.com.

About Amazon

Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Fire tablets, Fire TV, Amazon Echo, and Alexa are some of the products and services pioneered by Amazon. For more information, visit www.amazon.com/about and follow @AmazonNews.

About Microsoft

Microsoft (Nasdaq “MSFT” @microsoft) is the leading platform and productivity company for the mobile-first, cloud-first world, and its mission is to empower every person and every organization on the planet to achieve more.

For more information, press only:

Microsoft Media Relations, WE Communications for Microsoft, (425) 638-7777, rrt@we-worldwide.com

Note to editors: For more information, news and perspectives from Microsoft, please visit the Microsoft News Center at http://news.microsoft.com. Web links, telephone numbers and titles were correct at time of publication, but may have changed. For additional assistance, journalists and analysts may contact Microsoft’s Rapid Response Team or other appropriate contacts listed at http://news.microsoft.com/microsoft-public-relations-contacts.

Oracle cloud security beefed up amid unprotected data worries

SAN FRANCISCO — Oracle last week expanded the data handling and machine learning capabilities of its cloud security and management product lines.

The rollout came along with some warnings about the dangers of unprotected data, and a few brickbats for upstart rival Splunk, which has made headway in the field of security information and event management (SIEM).

Oracle’s updates appear amid a whirl of headlines about a massive data breach at Equifax, the credit reporting agency that this year put millions of Americans’ private data at risk. Some observers suggest the breach was the work of state-sponsored hackers.

Among them is Oracle founder and CTO Larry Ellison, who chose Oracle’s OpenWorld 2017 event to roll out updates to its Oracle Management Cloud and Oracle Security Monitoring and Analytics Cloud Service. State-sponsored hackers up the ante in cybersecurity, he said.

“Companies have to defend themselves against nation-states. And, some of these guys are very good at what they do,” Ellison warned. “This is really a very bad situation.”

Looking for bad patterns

Oracle database security has been a strong selling point for the company over many years, although its overall security came in for continual criticism after its purchase of Sun Microsystems, completed in 2010, which included Java and the J2EE framework.

Larry Ellison, founder and CTO, Oracle

Now, Oracle cloud security is gaining special focus. Oracle buttressed its cloud security efforts in 2016 with acquisitions, including DNS services provider Dyn and cloud access security broker Palerra. In the new releases, those acquired services are further strengthened by data management and machine learning advances developed within Oracle.

As described by Ellison and others, the essence of the updates to Oracle Management Cloud and Oracle Security Monitoring and Analytics Cloud Service relies on a well-curated, unified data store for massive amounts of log and other activity data.

Add to that a heaping helping of machine learning algorithms that look for good and bad patterns of activity. Finally, runbook-style automation will be employed to fix more and more security flaws without human intervention.

Splunk-y rival attracts wrath of Larry

Oracle OpenWorld sometimes serves as a stage for leader Ellison’s zest for heated competition. Last year, with cloud database technology being showcased, he berated Amazon Web Services. This year, with Oracle’s enhanced data, cloud and security management software on the docket, Ellison’s targets expanded to include Splunk, a San Francisco-based software company that has made a mark in log analysis in addition to SIEM.

Ellison faulted Splunk for its lack of an entity model for unified data handling, its difficult-to-use machine learning and its lack of remediation capabilities. In his view, not surprisingly, the Oracle offering is better.

“It is not simply an analytical system, like Splunk. It is a security monitoring and management system designed to detect and remediate the problem,” he told the OpenWorld gathering.

Splunk — again, not surprisingly — responded. In a blog post entitled “Splunk Fires Back at Ludicrous Larry,” CEO Doug Merritt contended that there are drawbacks to single, unified repositories for threat and contextual data. Merritt dismissed Ellison’s assertion that Splunk is purely an analytical system, without remediation capabilities, citing hooks, for example, to ServiceNow operations automation. And, while Splunk does provide an SDK for data scientists, its capabilities are within reach of “anyone in IT, security or the business, no data science degree required,” he said.

“It was flattering that Oracle finally woke up to the power of machine data and the importance of security,” Merritt wrote. The blog post concludes with a photo of a capsized Oracle America’s Cup series catamaran.

Threats to Oracle cloud security

Oracle will find some favor with its security monitoring and analytics cloud services because they’re logical add-ons for its growing number of cloud-based offerings, according to Eric Parizo, a senior analyst at GlobalData Technology. The new services also have the potential to be a disruptive force among security offerings, Parizo said, if the company provides a cloud-based alternative that’s truly easier to use.

“Oracle sees Splunk succeeding with a security-centric approach that mirrors a lot of what Oracle does in the data management realm, so Oracle believes it is recapturing an opportunity it should have pursued earlier,” he said.

Still, Parizo continued, “it’s impossible to ignore Oracle’s poor track record on cybersecurity.” Over many years, Oracle has “released products rife with security flaws, and ignored those flaws for months or in some cases years after they’ve been widely known,” he said. “The company has a lot of work to do to prove its cybersecurity solutions are effective, and that its approach toward security has evolved enough to justify an investment.”

Meanwhile, Oracle may have found an out for at least some portion of its bad security press. The company recently ceded great portions of its Java software assets to the open source community, putting future revisions largely in the hands of the Eclipse Foundation.

The move could mean that Java flaws, many of which Oracle inherited along with its purchase of Java originator Sun, will become the responsibility of a wider group of software developers.

Flash technology accelerates predictive analytics software

Predictive analytics software combines artificial intelligence, machine learning, data mining and modeling to parse big data resources and create highly accurate and insightful forecasts, but companies need flash technology to support it.

Thanks to its impressive speed, flash technology accelerates predictive analytics software. With flash’s sub-millisecond latency, business, engineering and other verticals can perform more complex analyses in less time than with conventional hard disk drive technology.

“Flash storage is a key technology that enables analysis at larger scales of data in faster time frames,” said Mike Matchett, an analyst with storage industry research firm Taneja Group Inc. in Hopkinton, Mass.

Meeting requirements

According to Vincent Hsu, IBM’s CTO for storage, there are three basic requirements storage must meet to effectively support analytical workloads: compelling data economics, enterprise resiliency and easy infrastructure integration.

“Put simply, faster response times can yield more business agility and quicker time to value from analytics, and more data analyzed at once means more potential value streams,” Hsu said.

There’s a competitive race today to use predictive analytics software in many forms, including machine learning and deep learning applied to operational optimization.

“By becoming predictive at increasing operational speeds, organizations can not only find marked improvement in existing business processes, but exploit disruptive new approaches to their markets,” Matchett said. “We’ve seen predictive analytics evolve from offline, small data scoring into massive web-scale, big data, real-time decision-making.”

Predictive analytics software is not just about analysis, but gaining the ability to respond — rather than react — to rapidly changing market conditions.

“Since actions based on the results are the whole point, faster, smarter and more relevant results win the day and, as a result, flash wins out,” said Donna Taylor, head of consulting firm Taylor & Associates and former Gartner analyst.

Matchett noted that organizations can add flash technology to almost any modern array in the form of cache or as a fast storage tier.

“We also see some innovation in having storage architectures ‘link up’ server-side flash as a virtual local performance tier of persistence,” he said.

Turning data into insights

The key obstacle many predictive analytics software users face is limited file-access speed.

“While the raw storage capacity of legacy [or] traditional storage has increased dramatically in recent years, the rate at which the data can be accessed and served has remained relatively flat,” said Sam Ruchlewicz, director of digital strategy and data analytics at Warschawski, a marketing communications agency based in Baltimore that uses predictive analytics software to study consumer trends and behaviors.

“As the sheer volume of customer data continues to grow, more predictive analytics applications are moving to flash storage to efficiently and effectively access actionable information,” he said.

Ruchlewicz noted that one of the biggest challenges in his field is making sense of terabytes — or even petabytes — of customer data in real time, then using that insight to deliver a better customer experience at relevant touch points.

“To accomplish that goal, the predictive analytics application [or] algorithm must query the database for the requisite information, process it and provide the result to the next component of your marketing technology stack,” he said. Flash technology is the key to making this process fast and efficient.

As they look to accelerate their predictive analytics capabilities, organizations must carefully examine where a flash technology investment can make the most sense.

“Storage-side flash tends to be shared widely, but is probably the most expensive,” Matchett said. “Server-side flash, such as NVMe, can provide a huge boost to applications that can make use of it locally, but might be quite a large investment to make across a large big data cluster.”

Matchett noted that flash storage prices will continue to fall, even as capacities increase.

“What is interesting is that we also see some possible new tiers of faster persistence coming with ReRAM, MRAM and the like,” he said.

For now, many predictive analytics software users rely on a combination of storage media types, including HDDs, tape and flash technology.

“This is nothing new; however, companies looking to squeeze additional value from dense data sets will increasingly adopt flash technology in order to reap the benefits of faster seek and processing speeds,” Taylor said.

The essential attribute most flash customers are looking for, according to Hsu, is data agility — the automated, policy-driven reallocation of data to and from a storage medium without a lot of human intervention or time-consuming, expensive steps.
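As a hypothetical illustration of what policy-driven reallocation can look like, a simple policy might promote frequently accessed objects to flash and leave cold ones on disk. The threshold, tier names and `retier` function are all invented for this sketch:

```python
from collections import Counter

# Toy tiering policy: objects accessed at least HOT_THRESHOLD times in
# the current window are placed on flash; everything else stays on HDD.
HOT_THRESHOLD = 3

def retier(access_log: list[str]) -> dict[str, str]:
    counts = Counter(access_log)
    return {obj: ("flash" if n >= HOT_THRESHOLD else "hdd")
            for obj, n in counts.items()}

placement = retier(["a", "b", "a", "a", "c", "b", "a"])
print(placement)  # {'a': 'flash', 'b': 'hdd', 'c': 'hdd'}
```

A production system would add hysteresis, capacity limits and demotion, but the decision loop is the same shape: measure access patterns, then move data without human intervention.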

“It is in this state of data agility where flash really shines and paves the way for artificial intelligence and machine learning,” Hsu said.

Taylor urged predictive analytics software users who are planning a full or partial transition to flash technology to thoroughly research the market. “Otherwise, they risk being at the mercy of a salesman’s skewed sales pitch,” she said.

Flash forward

Ruchlewicz said he would advise any organization considering an infrastructure investment designed to support an analytics initiative to seriously think about using flash storage, noting that most predictive models request data faster than a legacy system can provide it.

“Even if the organization’s data set is within more reasonable bounds, flash is the superior alternative and the system of the future,” he noted.

Hsu concurred. “Data is the most valuable commodity organizations can lay claim to, and any organization that considers speed and insight as a competitive advantage can benefit from flash storage for predictive analytics,” he said.

Microsoft’s self-soaring sailplane improves IoT, digital assistants

A machine learning project to build an autonomous sailplane that remains aloft on thermal currents is impressive enough. But the work conducted by Microsoft researchers Andrey Kolobov and Iain Guilliard will also improve the decision making and trustworthiness of IoT devices, personal assistants and autonomous cars.

The computational constraints imposed by the sailplane’s airframe, which strictly limits weight and space, make the work relevant to many new developments in ubiquitous computing. The autonomous sailplane is controlled by a battery-powered 160MHz Arm Cortex M4 with 256KB of RAM and 60KB of flash that monitors the sensors, runs the autopilot and controls the servo motors. To this, the researchers added a machine learning model that continuously learns to ride thermal currents autonomously.

+ Also on Network World: The inextricable link between IoT and machine learning +

In these early days of platforms like digital assistants, IoT and autonomous vehicles, there are hundreds of open problems that will be distilled into a handful of scientific questions that must be answered before products matching the popular visions of them can be built. Once those scientific questions start to emerge, the platform’s future becomes predictable; maybe not to an exact month, but within a year or three.

IoT following the same path as Web 2.0

The Web 2.0 platform followed a similar course. First discussed in 1999, it had generated enough interest by 2004 for O’Reilly Media and MediaLive to host the first Web 2.0 conference that year. But it was not until later in the decade that companies such as Salesforce and Google implemented Web 2.0 at scale. It was a decade-long evolution: first a vision, then a collection of open problems distilled into scientific questions that university and industrial researchers answered.

As a technology shifts from research to development, product developers find in the researchers’ work the answers to the scientific questions that enable them to build a product. On a Web 2.0 timescale, digital assistants, IoT and autonomous vehicles are much closer to 2004 than to the later part of that decade, when enough research had been translated into development that products could be built at scale.

Goals of Microsoft’s sailplane project

At an all-hands meeting, inspired by a 2013 story in the Economist about autonomous sailplanes, the team of Microsoft researchers set the goals of this project to answer two scientific questions: how to build trusted AI systems, and how to architect systems with AI and machine learning as a fundamental design principle. Kolobov said:

“The state of the art in AI development is not at the level where AI agents can reliably act fully autonomously, which is why we do not see many AI systems acting in the physical world with full autonomy. MSR is trying to build AI systems that are robust and can be trusted to act fully on their own, performing better than humans. The implications of this research apply to personal assistants, autonomous cars and IoT.

“We wanted to gain experience in designing systems where AI and machine learning are first-class citizens so we do not have to fundamentally modify the architecture of the systems post hoc.”

Kolobov and Guilliard are part of a multidisciplinary team with complementary skills. Kolobov has applied AI and machine learning to commercial products such as Windows and Bing. Guilliard, after more than a decade working on control systems for aircraft such as the Airbus A350 and the A-4 Skyhawk, is a computer science Ph.D. candidate at the Australian National University and an intern at Microsoft. An odd pairing, perhaps, unless the question they are trying to answer is understood.

There are many machine learning problems that do not have ground truth. Ground truth is an accurate, labeled data set for machine learning classification and training. Machine learning models are programmed (trained) with data, not lines of code: the models are taught to predict a correct answer from examples. If a model is taught to recognize cats, labeled images of cats are the ground truth. This method, called supervised training, has two stages: training, typically run on beefy GPUs, in which the model learns from images of cats and not-cats; and inference, in which it predicts the right answer: cat or not cat.
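The two stages can be sketched with a deliberately tiny stand-in for the image models described above. This nearest-centroid classifier on 2-D points is an illustration only, not anything from the Microsoft project:

```python
# Stage 1 ("training"): fit a model to labeled examples.
# Stage 2 ("inference"): predict a label for new input.

def train(examples):  # examples: list of ((x, y), label)
    sums, counts = {}, {}
    for (x, y), label in examples:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    # The "model" is just one centroid (mean point) per label.
    return {label: (sx / counts[label], sy / counts[label])
            for label, (sx, sy) in sums.items()}

def infer(model, point):
    x, y = point  # predict the label whose centroid is closest
    return min(model, key=lambda lbl: (model[lbl][0] - x) ** 2
                                      + (model[lbl][1] - y) ** 2)

labeled = [((0, 0), "not_cat"), ((1, 0), "not_cat"),
           ((5, 5), "cat"), ((6, 5), "cat")]
model = train(labeled)           # stage 1: learn from labeled data
print(infer(model, (5.5, 4.8)))  # stage 2: predict -> cat
```

Real image models replace the centroid with millions of learned parameters, but the split between a data-hungry training stage and a cheap inference stage is the same.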

Because the physical world with which a machine learning model interacts is unpredictable, ground truth cannot be easily simulated with computers, and classified training data does not exist. How can every defensive maneuver of an autonomous vehicle in reaction to another vehicle be predicted? Or every response of an elderly-care robot to a patient in distress?

These unstable systems cannot be predicted in advance; the models need to learn by interacting with chaotic, unstable environments. Vehicles on highways and elderly patients are unsuitable settings for an AI to learn to operate reliably and safely in. A sailplane, however, is a suitable one.

How the sailplane works

Ground truth for an autonomous sailplane is limited because it is impossible to measure the turbulent conditions: where the rising thermal column of air begins and ends, and what is happening inside it. A laptop on the ground performs high-level planning for the sailplane, using data from manned sailplane flights, plus local terrain and wind conditions, to predict thermals; the predictions are sent to the autonomous sailplane via telemetry.

The sailplane uses an onboard Bayesian reinforcement learning algorithm to make decisions based on the observations it gets from its sensors. Bayesian reinforcement learning was chosen for the model’s ability to plan its actions so as to learn and exploit knowledge in an optimal manner. The approach also makes it easier to understand what decisions the agent is making and why.
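The article doesn’t disclose the team’s exact algorithm, but the Bayesian explore/exploit idea can be illustrated with Thompson sampling on a toy multi-armed bandit, where a Beta posterior is kept over each option’s payoff. The “thermal spot” payoffs and all constants below are invented for the sketch:

```python
import random

# Toy illustration (not the Microsoft algorithm): the agent keeps a
# Beta posterior over each spot's chance of giving lift and acts on
# samples from those posteriors, so it explores uncertain options
# while increasingly exploiting what it has learned.
random.seed(0)
true_lift = [0.2, 0.7, 0.4]           # hidden payoff of each spot
alpha = [1.0] * 3                     # Beta(1, 1) = uniform priors
beta = [1.0] * 3

for _ in range(2000):
    samples = [random.betavariate(alpha[i], beta[i]) for i in range(3)]
    arm = samples.index(max(samples))  # act on the sampled belief
    reward = 1 if random.random() < true_lift[arm] else 0
    alpha[arm] += reward               # Bayesian posterior update
    beta[arm] += 1 - reward

pulls = [alpha[i] + beta[i] - 2 for i in range(3)]
print(pulls.index(max(pulls)))  # most-pulled arm: 1, the best spot
```

The interpretability claim in the article maps onto this: the posterior parameters say exactly what the agent currently believes about each option and why it chose as it did.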

The model uses Monte Carlo tree search to choose among the detected areas of lift and exploit them for altitude, instructing the autopilot to use servo motors to adjust the elevators, the horizontal control surfaces on the tail that cause an airplane to climb or descend, and keep the sailplane soaring. Monte Carlo tree search has been applied to win non-deterministic games such as Go, in Google’s AlphaGo project, and poker. It gradually builds up a partial game tree of moves, then uses advanced strategies to balance exploring new decision branches against exploiting the most promising ones.
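A minimal sketch of the four Monte Carlo tree search phases just described (selection via UCB1, expansion, random rollout, backpropagation) on a toy three-move game. The payoff function and all constants are invented, and the sailplane’s real search space is far richer:

```python
import math
import random

random.seed(1)
DEPTH = 3

def payoff(path):                  # hidden reward; (1, 1, 1) is best
    return sum(path) + (1.0 if path == (1, 1, 1) else 0.0)

stats = {(): [0, 0.0]}             # path -> [visits, total_reward]

def ucb1(parent, child):           # balance exploitation vs exploration
    visits, total = stats[child]
    return total / visits + math.sqrt(2 * math.log(stats[parent][0]) / visits)

def iterate():
    # selection: descend while both children have been visited
    path = ()
    while len(path) < DEPTH and all(path + (m,) in stats for m in (0, 1)):
        parent = path
        path = max((parent + (m,) for m in (0, 1)),
                   key=lambda child: ucb1(parent, child))
    # expansion: create one unvisited child
    if len(path) < DEPTH:
        fresh = [path + (m,) for m in (0, 1) if path + (m,) not in stats]
        path = random.choice(fresh)
        stats[path] = [0, 0.0]
    # rollout: finish the game with random moves
    tail = tuple(random.randint(0, 1) for _ in range(DEPTH - len(path)))
    reward = payoff(path + tail)
    # backpropagation: credit every node on the path
    for i in range(len(path) + 1):
        stats[path[:i]][0] += 1
        stats[path[:i]][1] += reward

for _ in range(500):
    iterate()
best_first = max((0, 1), key=lambda m: stats[(m,)][0])
print(best_first)  # most-visited first move: 1
```

The tree of visit statistics plays the role of the partial game tree in the article; the most-visited move at the root is the one the search recommends.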

Running Bayesian reinforcement learning algorithms on a sailplane poses significant challenges compared to Go and poker. Remember the computational and battery constraints? The sailplane is controlled by the real-time, open source ArduPilot software running on open source Pixhawk Arm Cortex M4 hardware. Execution of the Bayesian reinforcement learning algorithm is interleaved with the real-time ArduPilot code in short intervals of less than 100 ms, so that control of the sailplane’s sensors and servo motors is maintained and crashes are avoided.
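A hypothetical sketch of that interleaving idea, in Python rather than the embedded code the real system uses: the planner runs only within a time slice well under the 100 ms window, then yields so the control loop stays serviced. Every name here is invented for the illustration:

```python
import time

BUDGET_S = 0.05  # stay safely under the 100 ms window

def planner_step(state):
    state["iterations"] += 1        # one cheap unit of search work

def control_loop_tick():
    pass                            # stand-in for sensor/servo servicing

def run_slice(state):
    deadline = time.monotonic() + BUDGET_S
    while time.monotonic() < deadline:
        planner_step(state)         # plan as long as the slice allows
    control_loop_tick()             # then hand control back

state = {"iterations": 0}
for _ in range(3):                  # three interleaved slices
    run_slice(state)
print(state["iterations"] > 0)  # True
```

The design point is that the planner is an anytime algorithm: however little work fits in a slice, the control loop always gets the processor back on schedule.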

The pairing of Kolobov and Guilliard is a careful match of a machine learning expert with an aeronautic control systems domain expert who is also a computer science Ph.D. candidate. But theirs isn’t the only clever combination. This search for deeper understanding was combined with off-the-shelf sailplane airframes, sensors, servo motors, open source hardware and an open source autopilot, so that Kolobov and Guilliard could get right to implementing and tuning the Bayesian reinforcement learning, quickly iterating their designs and improving their results.

During these pioneering days of digital assistants, IoT and autonomous vehicles, product developers will find some of the most relevant answers to their scientific questions in fields, such as autonomous sailplanes, that might at first appear orthogonal and unrelated to their products.

It may take another five to 10 years for digital assistants, IoT and autonomous vehicles to become as reliable as humans. Microsoft Research is working on one of the scientific questions that must first be answered before these pioneering product visions reach this point.

Join the Network World communities on Facebook and LinkedIn to comment on topics that are top of mind.

Experts share their shortcuts to creativity with STEM: Join #MSFTEduChat on Sept. 19

Learning has no limits. When educators and students truly connect in the classroom, they see endless learning opportunities in the world around them, today and tomorrow, and inspire creativity within each other. On September 19th, we’re bringing in the experts to help you inspire student creativity and curiosity with STEM during our #MSFTEduChat TweetMeet at 10:00am PDT.

Whether you’re a full-blown STEM expert or just trying to understand what teaching with Science, Technology, Engineering and Math (STEM) means in the context of your classroom, our community of global educators and MIE Experts are here to share their experiences and advice. They’ll answer your questions in a live #MSFTEduChat TweetMeet event on Twitter. (Wait … what’s a TweetMeet?)

And in prep for the TweetMeet, you can check out our Microsoft in Education STEM Resource Collection in the Microsoft Educator Community here. During the #MSFTEduChat we’ll also discuss how we can use STEM to help achieve the UN Sustainable Development Goals.

Why join the #MSFTEduChat TweetMeets?

We’ll let our eager hosts explain with the #MSFTEduChat Flipgrid videos they’ve created especially for this month’s event. We warmly welcome your video response to this grid, so go ahead and submit yours:

When can I join?

Join us Tuesday, September 19th at 10am PDT on Twitter, using the hashtags #MSFTEduChat and #MicrosoftEDU (which you can always use to stay in touch with us).

To prepare for the #MSFTEduChat TweetMeet, have a look at the questions we have lined up this time. We also highly recommend that you set up a Twitter dashboard – through TweetDeck, for example – to monitor incoming tweets containing the #MSFTEduChat hashtag and other relevant search queries. Watch this TweetDeck Basics tutorial on how you can do this.

TweetMeet questions

An animated GIF showing this week's TweetMeet questions.

  1. What excites you most about #STEM in the classroom?
  2. How can we spark creativity in our students with #STEM education?
  3. What are some low-tech or no-tech ways to get creative with STEM?
  4. What role does #STEM play in 21st-Century Learning skills?
  5. How can we use STEM to achieve UN Sustainable Development Goals?
  6. What’s your best tip, resource or person to improve #STEM learning?

The UN's Sustainable Development Goals.


We have an incredible line-up of passionate and knowledgeable hosts, so be sure to follow each of them on Twitter:


We’ll also be joined by a couple of hosts from Microsoft Education:

STEM Saturdays are now available in the US

STEM Saturdays are free drop-in sessions that give people a chance to engage in fun and interesting Science, Technology, Engineering and Math projects hosted by their local Microsoft Store. This month’s STEM Saturdays will test your engineering and data science skills, as The Education Workshop has partnered with Mattel Hot Wheels® Speedometry™ to challenge you with a new Forces and Motion project and lesson plan. Register and learn more about the online workshops and demos we have lined up for you on the Microsoft STEM Saturdays page.

The Hot Wheels Speedometry track.

What are #MSFTEduChat TweetMeets?

Every month Microsoft Education organizes social events on Twitter targeted at educators globally. The hashtag we use is #MSFTEduChat. A team of topic specialists and international MIE Expert teachers prepare and host these TweetMeets together. Our team of educator hosts first crafts several questions around a certain topic. Then, before the event, they share these questions on social media. Combined with a range of resources, a blog post and background information about the events, this allows all participants to prepare fully. Afterwards, we make available an archive of the most notable tweets and resources shared during the event.

Learn more about TweetMeets and earn a badge in the TweetMeet Course on the Microsoft Education Community.

The #MSFTEduChat event time is 10:00am PT. If this time isn’t convenient for you, please follow your local channel or even consider hosting your own #MSFTEduChat in your country and language. Please connect with TweetMeet organizer Marjolein Hoekstra @OneNoteC on Twitter for more info on hosting in your language and at a time that works best for the educators and MIE Experts in your country.

IBM cracks the code for speeding up its deep learning platform

Graphics processing units are a natural fit for deep learning because they can crunch through large amounts of data quickly, which is important when training data-hungry models.

But GPUs have one catch. Adding more GPUs to a deep learning platform doesn’t necessarily lead to faster results. While individual GPUs process data quickly, they can be slow to communicate their computations to other GPUs, which has limited the degree to which users can take advantage of multiple servers to parallelize jobs and put a cap on the scalability of deep learning models.

IBM recently took on this problem to improve scalability in deep learning and wrote code for its deep learning platform to improve communication between GPUs.

“The rate at which [GPUs] update each other significantly affects your ability to scale deep learning,” said Hillery Hunter, director of systems acceleration and memory at IBM. “We feel like deep learning has been held back because of these long wait times.”

Hunter’s team wrote new software and algorithms to optimize communication between GPUs spread across multiple servers. The team used the algorithm to train an image-recognition neural network on 7.5 million images from the ImageNet-22k data set in seven hours. This is a new speed record for training neural networks on the image data set, breaking the previous mark of 10 days, which was held by Microsoft, IBM said.
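The article doesn’t detail IBM’s algorithms, but a standard technique for exactly this GPU-to-GPU update problem is ring all-reduce, in which each of n workers passes chunks of its gradient around a ring so that every worker ends up with the summed gradient, while per-worker bandwidth stays flat as n grows. A synchronous simulation in plain Python (n workers, each gradient split into n single-number chunks):

```python
def ring_allreduce(grads):
    n = len(grads)                     # n workers; each gradient has n chunks
    buf = [list(g) for g in grads]     # buf[worker][chunk], local copies
    # reduce-scatter: after n-1 steps, worker i holds the full sum of
    # chunk (i + 1) % n
    for step in range(n - 1):
        sends = [(i, (i - step) % n, buf[i][(i - step) % n]) for i in range(n)]
        for i, chunk, value in sends:  # "transmit" to the next ring neighbor
            buf[(i + 1) % n][chunk] += value
    # all-gather: circulate the finished chunks so everyone has every sum
    for step in range(n - 1):
        sends = [(i, (i + 1 - step) % n, buf[i][(i + 1 - step) % n])
                 for i in range(n)]
        for i, chunk, value in sends:
            buf[(i + 1) % n][chunk] = value
    return buf

grads = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]  # 3 workers
summed = ring_allreduce(grads)
print(summed[0])  # [12.0, 15.0, 18.0], identical on every worker
```

Real systems run the sends in parallel over high-speed links, which is why better communication scheduling, rather than faster arithmetic, is what unlocks scaling across many GPUs.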

Hunter said it’s essential to speed up training times for deep learning projects. Unlike jobs in virtually every other area of computing today, training a deep learning model can take days, which might discourage more casual users.

“We feel it’s necessary to bring the wait times down,” Hunter said.

IBM is rolling out the new functionality in its PowerAI software, a deep learning platform that pulls together and configures popular open source machine learning software, including Caffe, Torch and TensorFlow. PowerAI is available on IBM’s Power Systems line of servers.

But the main reason to take note of the news, according to Forrester analyst Mike Gualtieri, is the GPU optimization software might bring new functionality to existing tools — namely Watson.

“I think the main significance of this is that IBM can bring deep learning to Watson,” he said.

Watson currently has API connectors for users to do deep learning in specific areas, including translation, speech to text and text to speech. But its deep learning offerings are prescribed. By opening up Watson to open source deep learning platforms, its strength in answering natural-language queries could be applied to deeper questions.

Cybersecurity machine learning moves ahead with vendor push

Cybersecurity machine learning is growing in popularity, according to Jon Oltsik, an analyst with Enterprise Strategy Group Inc. in Milford, Mass. Oltsik attended the recent Black Hat conference, where technology vendors were abuzz with talk of cybersecurity machine learning.

In an ESG research survey, only 30% of 412 respondents said they were very knowledgeable about artificial intelligence (AI) and cybersecurity machine learning, and only 12% said their organizations had deployed these systems widely.

According to Oltsik, the cybersecurity industry sees an opportunity, because only 6% of respondents said their organizations were not considering AI or machine learning deployments. He said companies will need to educate the market, identify use cases, work with existing technologies and provide good support.

“I find machine learning [and] AI technology extremely cool but no one is buying technology for technology sake. The best tools will help CISOs improve security efficacy, operational efficiency, and business enablement,” Oltsik wrote.

Read more of Oltsik’s thoughts on cybersecurity machine learning.

Microsoft leverages Kubernetes backing for containers

Microsoft is positioning itself to fight back against the success of Amazon Web Services, according to Charlotte Dunlap, an analyst with Current Analysis in Sterling, Va.

The company launched a new container service and joined the Cloud Native Computing Foundation (CNCF) amidst earnings reports indicating that its Azure platform is outcompeting Salesforce and other providers. Microsoft unveiled a preview of its Azure Container Instances service in a bid to support developers who want to avoid the complexities of virtual machine management.

Dunlap said the announcement is significant because companies are still reluctant to deploy next-generation technologies incorporating containers and microservices, despite their advantages. In particular, Dunlap said providers should focus on explaining the cost-benefit ratios associated with refactoring departmental apps into containers.

By joining CNCF, meanwhile, Microsoft is “shunning” Amazon in the enterprise cloud market. “Expect to see a lot more platform service rollouts involving containers, microservices, etc., later this year during fall conferences in which cloud rivals continue to attempt to one-up one another,” Dunlap wrote.

Dig deeper into Dunlap’s thoughts on Microsoft’s support for containers.

SIEM for threat detection

Anton Chuvakin, an analyst with Gartner, said security information and event management, or SIEM, is not the best threat detection technology on its own. Based on conversations through Twitter, Chuvakin learned that many network professionals view SIEM as a compliance technology. Chuvakin said he sees these individuals as taking a viewpoint nearly 10 years out of date or perhaps struggling with bad experiences from failed SIEM implementations in the past.

Chuvakin said he uses SIEM for much of his threat detection work, but relies almost equally on log and traffic analysis, as well as endpoint visibility tools. In his view, threat detection that focuses too heavily on the network and endpoints suffers serious security gaps unless it is coupled with log monitoring.

“Based on this logic, log analysis (perhaps using SIEM … or not) is indeed ‘best’ beginner threat detection. On top of this, SIEM will help you centralize and organize your other alerts,” Chuvakin wrote.

Explore more of Chuvakin’s thoughts on SIEM.