Building a data science pipeline: Benefits, cautions

Enterprises are adopting data science pipelines for artificial intelligence, machine learning and plain old statistics. A data science pipeline — a sequence of actions for processing data — will help companies be more competitive in a digital, fast-moving economy. 

Before CIOs take this approach, however, it’s important to consider some of the key differences between data science development workflows and traditional application development workflows.

Data science development pipelines used for building predictive and data science models are inherently experimental and don’t always pan out in the same way as other software development processes, such as Agile and DevOps. Because data science models break and lose accuracy in different ways than traditional IT apps do, a data science pipeline needs to be scrutinized to ensure the model reflects what the business is hoping to achieve.

At the recent Rev Data Science Leaders Summit in San Francisco, leading experts explored some of these important distinctions, and elaborated on ways that IT leaders can responsibly implement a data science pipeline. Most significantly, data science development pipelines need accountability, transparency and auditability. In addition, CIOs need to implement mechanisms for addressing the degradation of a model over time, or “model drift.” Having the right teams in place in the data science pipeline is also critical: Data science generalists work best in the early stages, while specialists add value to more mature data science processes.

Data science at Moody’s

CIOs might want to take note from Moody’s, the financial analytics giant, which was an early pioneer in using predictive modeling to assess the risks of bonds and investment portfolios. Jacob Grotta, managing director at Moody’s Analytics, said the company has streamlined the data science pipeline it uses to create models in order to be able to quickly adapt to changing business and economic conditions.

“As soon as a new model is built, it is at its peak performance, and over time, they get worse,” Grotta said. Declining model performance can have significant impacts. For example, in the finance industry, a model that doesn’t accurately predict mortgage default rates puts a bank in jeopardy. 

Watch out for assumptions

Grotta said it is important to keep in mind that data science models are created by and represent the assumptions of the data scientists behind them. Before the 2008 financial crisis, a firm approached Grotta with a new model for predicting the value of mortgage-backed derivatives, he said. When he asked what would happen if the prices of houses went down, the firm responded that the model predicted the market would be fine. But it didn’t have any data to support this. Mistakes like these cost the economy almost $14 trillion by some estimates.

The expectation among companies often is that someone understands what the model does and its inherent risks. But these unverified assumptions can create blind spots for even the most accurate models. Grotta said it is a good practice to create lines of defense against these sorts of blind spots.

The first line of defense is to encourage the data modelers to be honest about what they do and don’t know and to be clear on the questions they are being asked to solve. “It is not an easy thing for people to do,” Grotta said.

A second line of defense is verification and validation. Model verification involves checking to see that someone implemented the model correctly, and whether mistakes were made while coding it. Model validation, in contrast, is an independent challenge process to help a person developing a model to identify what assumptions went into the data. Ultimately, Grotta said, the only way to know if the modeler’s assumptions are accurate or not is to wait for the future.

A third line of defense is an internal audit or governance process. This involves making the results of these models explainable to front-line business managers. Grotta said he recently worked with a bank that protested that its managers would not use a model if they didn’t understand what was driving its results. The managers were right to take that position, he said. Having a governance process and ensuring information flows up and down the organization is extremely important, Grotta said.

Baking in accountability

Models degrade or “drift” over time, which is part of the reason organizations need to streamline their model development processes. It can take years to craft a new model. “By that time, you might have to go back and rebuild it,” Grotta said. Critical models must be revalidated every year.

To address this challenge, CIOs should think about creating a data science pipeline with an auditable, repeatable and transparent process. This promises to allow organizations to bring the same kind of iterative agility to model development that Agile and DevOps have brought to software development.

Transparent means that upstream and downstream people understand the model drivers. It is repeatable in that someone can repeat the process around creating it. It is auditable in the sense that there is a program in place to think about how to manage the process, take in new information, and get the model through the monitoring process. There are varying levels of this kind of agility today, but Grotta believes it is important for organizations to make it easy to update data science models in order to stay competitive.

How to keep up with model drift

Nick Elprin, CEO and co-founder of Domino Data Lab, a data science platform vendor, agreed that model drift is a problem that must be addressed head on when building a data science development pipeline. In some cases, the drift might be due to changes in the environment, like changing customer preferences or behavior. In other cases, drift could be caused by more adversarial factors. For example, criminals might adopt new strategies for defeating a new fraud detection model.

In order to keep up with this drift, CIOs need to include a process for monitoring the effectiveness of their data models over time and establishing thresholds for replacing these models when performance degrades.

With traditional software monitoring, IT service management teams track metrics related to CPU, network and memory usage. With data science, CIOs also need to capture metrics related to the accuracy of model results. “Software for [data science] production models needs to look at the output they are getting from those models, and if drift has occurred, that should raise an alarm to retrain it,” Elprin said.
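The monitoring idea Elprin describes can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's product: the `DriftMonitor` class, its parameter names and the numbers used are all hypothetical, and it assumes the true outcome for each prediction eventually becomes available so accuracy can be measured.

```python
from collections import deque


class DriftMonitor:
    """Tracks rolling accuracy of a deployed model and flags drift.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """

    def __init__(self, baseline_accuracy, max_drop=0.05, window=100):
        self.baseline_accuracy = baseline_accuracy  # accuracy measured at deployment
        self.max_drop = max_drop                    # tolerated drop before raising an alarm
        self.outcomes = deque(maxlen=window)        # most recent hit/miss results

    def record(self, predicted, actual):
        """Store whether one prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def rolling_accuracy(self):
        """Return accuracy over the current window, or None if it is empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Raise the alarm only once a full window falls below the threshold."""
        accuracy = self.rolling_accuracy()
        if accuracy is None or len(self.outcomes) < self.outcomes.maxlen:
            return False
        return accuracy < self.baseline_accuracy - self.max_drop


monitor = DriftMonitor(baseline_accuracy=0.90, max_drop=0.05, window=10)
for predicted, actual in [(1, 1)] * 5 + [(1, 0)] * 5:
    monitor.record(predicted, actual)
print(monitor.rolling_accuracy(), monitor.needs_retraining())
```

Waiting for a full window before alarming is one possible design choice; it trades slower detection for fewer false alarms on small samples.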

Fashion-forward data science

At Stitch Fix, a personal shopping service, the company’s data science pipeline allows it to sell clothes online at full price. Using data science in various ways allows the company to find new ways to add value against deep-discount giants like Amazon, said Eric Colson, chief algorithms officer at Stitch Fix.

For example, the data science team has used natural language processing to improve its recommendation engines and buy inventory. Stitch Fix also uses genetic algorithms, which are designed to mimic evolution and iteratively select the best results following a set of randomized changes. These streamline the process of designing clothes by generating countless iterations, which fashion designers then vet.
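To make the genetic algorithm idea concrete, here is a toy sketch in Python. It is not Stitch Fix's method: the candidate "designs" are just lists of bits, the `fitness` function simply counts the ones, and every name and parameter is an assumption chosen for illustration. For brevity it uses only mutation and selection and omits the crossover step that many genetic algorithms include.

```python
import random


def fitness(candidate):
    """Toy score: the more 1 bits, the better the candidate."""
    return sum(candidate)


def mutate(candidate, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in candidate]


def evolve(length=20, population_size=30, generations=40, rate=0.05, seed=0):
    """Repeatedly keep the best half and refill with mutated copies."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]       # selection
        children = [mutate(rng.choice(survivors), rate, rng)  # randomized changes
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

Because the surviving half is carried over unchanged, the best score never gets worse from one generation to the next. In a setting like the one described, a human review step would follow the final selection.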

This kind of digital innovation, however, was possible only because the company created an efficient data science pipeline, he said. He added that it was also critical that the data science team is considered a top-level department at Stitch Fix and reports directly to the CEO.

Specialists or generalists?

One important consideration for CIOs in constructing the data science development pipeline is whether to recruit data science specialists or generalists. Specialists are good at optimizing one step in a complex data science pipeline. Generalists can execute all the different tasks in a data science pipeline. In the early stages of a data science initiative, generalists can adapt to changes in the workflow more easily, Colson said.

These tasks include feature engineering, model training, extract, transform and load (ETL) work, API integration, and application development. It is tempting to staff each of these tasks with specialists to improve individual performance. “This may be true of assembly lines, but with data science, you don’t know what you are building, and you need to iterate,” Colson said. The process of iteration requires fluidity, and if the different roles are staffed with different people, there will be longer wait times when a change is made.

In the beginning at least, companies will benefit more from generalists. But once data science processes are established, typically after a few years, specialists may be more efficient.

Align data science with business

Today a lot of data science models are built in silos that are disconnected from normal business operations, Domino’s Elprin said. To make data science effective, it must be integrated into existing business processes. This comes from aligning data science projects with business initiatives. This might involve things like reducing the cost of fraudulent claims or improving customer engagement.

In less effective organizations, management tends to start with the data the company has collected and wonder what a data science team can do with it. In more effective organizations, data science is driven by business objectives.

“Getting to digital transformation requires top down buy-in to say this is important,” Elprin said. “The most successful organizations find ways to get quick wins to get political capital. Instead of twelve-month projects, quick wins will demonstrate value, and get more concrete engagement.”

Databricks platform additions unify machine learning frameworks

SAN FRANCISCO: Open source machine learning frameworks have multiplied in recent years, as enterprises pursue operational gains through AI. The result has been a jumble of competing tools, creating a nightmare for development teams tasked with supporting them all.

Databricks, which offers managed versions of the Spark compute platform in the cloud, is making a play for enterprises that are struggling to keep pace with this environment. At Spark + AI Summit 2018, which was hosted by Databricks here this week, the company announced updates to its platform and to Spark that it said will help bring the diverse array of machine learning frameworks under one roof.

Unifying machine learning frameworks

MLflow is a new open source framework on the Databricks platform that integrates with Spark, SciKit-Learn, TensorFlow and other open source machine learning tools. It allows data scientists to package machine learning code into reproducible modules, conduct and compare parallel experiments, and deploy models that are production-ready.

Databricks also introduced a new product on its platform, called Runtime for ML. This is a preconfigured Spark cluster that comes loaded with distributed machine learning frameworks commonly used for deep learning, including Keras, Horovod and TensorFlow, eliminating the integration work data scientists typically have to do when adopting a new tool.

Databricks’ other announcement, a tool called Delta, is aimed at improving data quality for machine learning modeling. Delta sits on top of data lakes, which typically contain large amounts of unstructured data. Data scientists can specify a schema they want their training data to match, and Delta will pull in all the data in the data lake that fits the specified schema, leaving out data that doesn’t fit.
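The schema-matching behavior described above can be illustrated with plain Python. This sketch does not use or imitate the Delta API; `matches_schema`, `filter_by_schema` and the sample loan records are hypothetical names and data invented for the example. It only shows the general idea of keeping records that fit a declared schema and leaving out the rest.

```python
def matches_schema(record, schema):
    """True if the record has exactly the schema's fields with the expected types."""
    if set(record) != set(schema):
        return False
    return all(isinstance(record[name], expected_type)
               for name, expected_type in schema.items())


def filter_by_schema(records, schema):
    """Keep only the records that fit the schema; silently leave out the rest."""
    return [record for record in records if matches_schema(record, schema)]


# Hypothetical schema and raw records, for illustration only.
schema = {"loan_id": int, "balance": float, "state": str}
raw_records = [
    {"loan_id": 1, "balance": 2500.0, "state": "CA"},
    {"loan_id": "2", "balance": 900.0, "state": "NY"},  # loan_id has the wrong type
    {"loan_id": 3, "balance": 120.5},                    # state field is missing
    {"loan_id": 4, "balance": 75.25, "state": "TX"},
]

training_records = filter_by_schema(raw_records, schema)
print([record["loan_id"] for record in training_records])  # prints [1, 4]
```

A production system would likely also log or quarantine the rejected records so that data quality problems can be investigated rather than silently dropped.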

MLflow includes a tracking interface for logging the results of machine learning jobs.

Users want everything under one roof

Each of the new tools is either in a public preview or alpha test stage, so few users have had a chance to get their hands on them. But attendees at the conference were broadly happy about the approach of stitching together disparate frameworks more tightly.

Saman Michael Far, senior vice president of technology at the Financial Industry Regulatory Authority (FINRA) in Washington, D.C., said in a keynote presentation that he brought in the Databricks platform largely because it already supports several languages, including R, Python and SQL. Integrating these tools more closely with machine learning frameworks will help FINRA use more machine learning in pursuit of its goal of spotting potentially illegal financial trades.

“It’s removed a lot of the obstacles that seemed inherent to doing machine learning in a business environment,” Far said.

John Gole, senior director of business analysis and product management at Capital One, based in McLean, Va., said the financial services company has implemented Spark throughout its operational departments, including marketing, accounts management and business reporting. The platform is being used for tasks that range from extract, transform and load jobs to SQL querying for ad hoc analysis and machine learning. It’s this unified nature of Spark that made it attractive, Gole said.

Going forward, he said he expects this kind of unified platform to become even more valuable as enterprises bring more machine learning to the center of their operations.

“You have to take a unified approach,” Gole said. “Pick technologies that help you unify your data and operations.”

Bringing together a range of tools

Engineers at ride-sharing platform Uber have already built integrations similar to what Databricks unveiled at the conference. In a presentation, Atul Gupte, a product manager at Uber, based in San Francisco, described a data science workbench his team created that brings together a range of tools — including Jupyter, R and Python — into a web-based environment that’s powered by Spark on the back end. The platform is used for all the company’s machine learning jobs, like training models to cluster rider pickups in Uber Pool or forecast rider demand so the app can encourage more drivers to get out on the roads.

As the company grew from a startup to a large enterprise, Gupte said, the old way of doing things, in which everyone worked in their own silo using their own tool of choice, didn’t scale. That is why it was important to take this more standardized approach to data analysis and machine learning.

“The power is that everyone is now working together,” Gupte said. “You don’t have to keep switching tools. It’s a pretty foundational change in the way teams are working.”

Call center chatbots draw skepticism from leaders

ORLANDO, Fla. — Artificial intelligence chatbot vendors may hype machine learning tools to enhance customer service, but call center leaders aren’t necessarily ready to trust them in the real world.

Part of the reason is call centers are judged by hard-to-achieve performance metrics based on volume, efficiency and customer satisfaction. Once a call center performs successfully against those expectations set by management, it’s hard to convince leaders to entrust call center chatbots with the hard-fought, quality customer relations programs they’ve built with humans.

“I don’t anticipate them having any kind of utility here,” said Jason Baker, senior vice president of operations for Entertainment Benefits Group (EBG), which manages discount tickets and other promotions for 61 million employees at 40,000 client companies. Baker oversees EBG customer service spanning multiple call centers.

“We strive for creating personalized and memorable experiences,” Baker said. “A chatbot — I understand the reason behind it, and, depending upon the type of environment, it might make sense — but in the travel and entertainment industry, you have to have the personalized touch with all interactions.”

Artificial intelligence chatbots were the most-talked-about technology at the ICMI Contact Center Expo, with a mix of trepidation and interest.

Navy Federal Credit Union employs “a few bots” for fielding very basic customer questions, such as balance inquiries, said Georgia Adams, social care supervisor at the credit union, based in Vienna, Va.

Her active social media team publishes tens of thousands of posts and comments annually on Twitter, Instagram and Facebook without the help of artificial intelligence chatbots, but “they’re on the horizon.” She stressed that call center chatbots must be transparent — identifying themselves as a bot — and be empowered to transfer customers to human customer service agents quickly to be effective.

“It’s coming, whether you want it or not,” Adams said. “We’re strategizing [and] looking at it. I certainly think they have a lot of value, especially when it comes to things that are basically self-service … but if I’m talking to a bot, I want to know I’m talking to a bot.”

Call center chatbots not gunning for humans’ jobs — yet

Another reason call center personnel might be wary of chatbots, fair or not, is the fear that robotic automation will eventually take humans’ jobs. This idea was dismissed by neutral industry experts such as ICMI founding partner Brad Cleveland, who said alternative customer service channels such as interactive voice response (IVR), email, social media and live chat each caused similar panic in the call center world when they were new. But none of them significantly affected call volumes.

“We hear predictions that artificial intelligence will replace all the jobs out there,” Cleveland said, not just in customer service. “If it does, we’re definitely going to be the last ones standing in customer service. But I don’t think it’s going to happen that way at all.”

Cleveland said he believes artificial intelligence chatbots will likely have utility in the near future, as technology advances and call centers find appropriate uses for them. Machine learning tools that aren’t chatbots, too, will make a difference, he said.

One example on display was an AI tool that can be trained to find, and adapt on the fly, pre-worded answers to common, or complex and time-consuming, customer queries. A human agent can paste an answer into a chat window after a quick edit for sense and perhaps personalization. The idea is that such tools get smarter and more on point over months of use.

But even live chat channels have their limits when they’re run by humans, let alone artificial intelligence chatbots. Frankie Littleford, vice president of customer support at JetBlue, based in Long Island City, N.Y., said during a breakout session here that her agents have to develop a sixth sense about when to stop typing and pick up the phone.

“You know in your gut when to take it out of email or whatever channel that isn’t person-to-person,” Littleford said. “You just continue to make someone angrier when you’re going back and forth — and let’s face it, a lot of people are really brave when they’re not face-to-face or on the phone … If your agents are skilled to speak with those customers, you can allow them to climb their mountain of anger and then de-escalate.”

ICMI commissioned whiteboard artist Heather Klar to illustrate select Call Center Expo sessions, including the high points of a pro-AI chatbot lecture.

Vendors hold out hope

ICMI attendees weren’t fully buying into the promise of AI chatbots, but undeterred software vendors kept up the full-court press, attempting to sell the benefits of automation and allay fears that chatbots will eventually replace attendees’ jobs.

“We don’t use [AI] to replace human work,” said Mark Bloom, Salesforce Service Cloud senior director of strategy and operations, during his keynote, adding that organizations that attempt to replace people with AI tools haven’t been successful. “We want to augment the work our people are doing and make them more intelligent. That is how we are moving forward.”

Setting up call center chatbots will require extensive training in test environments, just as human agents require. Once trained, the bots need maintenance and updating, but they will solve another vexing problem for call center managers: employee turnover, said Kaye Chapman, content and client training manager for chatbot vendor Comm100, based in Vancouver, B.C.

“You could train a new employee, and they could leave tomorrow,” Chapman said. “A bot is not going to give up and leave, it’s not going to get sick, and it’s so scalable.”

Bob Furniss, vice president at Bluewolf, an IBM subsidiary known for Salesforce automation integrations that’s based in New York, said he believes artificial intelligence chatbots are coming, and AI in general will change both our personal and work lives. He said the potential is there for AI to help ease call center agents’ workload — up to 30% of the simplest customer queries — similar to the promises of IVR and the other channels when they came online in the industry.

Furniss warned that, just like all other call center systems, anything AI-powered will require attention and maintenance to fine-tune its actions and keep abreast of changing workflows and updated customer relations strategies.

“This is just like any other technology we have in the contact center,” Furniss said. “You don’t set it and leave it, just like workforce management [applications]. There’s an art and a skill to it.”

DJI and Microsoft partner to bring advanced drone technology to the enterprise

New developer tools for Windows and Azure IoT Edge Services enable real-time AI and machine learning for drones

REDMOND, Wash. — May 7, 2018 — DJI, the world’s leader in civilian drones and aerial imaging technology, and Microsoft Corp. have announced a strategic partnership to bring advanced AI and machine learning capabilities to DJI drones, helping businesses harness the power of commercial drone technology and edge cloud computing.

Through this partnership, DJI is releasing a software development kit (SDK) for Windows that extends the power of commercial drone technology to the largest enterprise developer community in the world. Using applications written for Windows 10 PCs, DJI drones can be customized and controlled for a wide variety of industrial uses, with full flight control and real-time data transfer capabilities, making drone technology accessible to Windows 10 customers numbering nearly 700 million globally.

DJI has also selected Microsoft Azure as its preferred cloud computing partner, taking advantage of Azure’s industry-leading AI and machine learning capabilities to help turn vast quantities of aerial imagery and video data into actionable insights for thousands of businesses across the globe.

“As computing becomes ubiquitous, the intelligent edge is emerging as the next technology frontier,” said Scott Guthrie, executive vice president, Cloud and Enterprise Group, Microsoft. “DJI is the leader in commercial drone technology, and Microsoft Azure is the preferred cloud for commercial businesses. Together, we are bringing unparalleled intelligent cloud and Azure IoT capabilities to devices on the edge, creating the potential to change the game for multiple industries spanning agriculture, public safety, construction and more.”

DJI’s new SDK for Windows empowers developers to build native Windows applications that can remotely control DJI drones including autonomous flight and real-time data streaming. The SDK will also allow the Windows developer community to integrate and control third-party payloads like multispectral sensors, robotic components like custom actuators, and more, exponentially increasing the ways drones can be used in the enterprise.

“DJI is excited to form this unique partnership with Microsoft to bring the power of DJI aerial platforms to the Microsoft developer ecosystem,” said Roger Luo, president at DJI. “Using our new SDK, Windows developers will soon be able to employ drones, AI and machine learning technologies to create intelligent flying robots that will save businesses time and money, and help make drone technology a mainstay in the workplace.”

In addition to the SDK for Windows, Microsoft and DJI are collaborating to develop commercial drone solutions using Azure IoT Edge and AI technologies for customers in key vertical segments such as agriculture, construction and public safety. Windows developers will be able to use DJI drones alongside Azure’s extensive cloud and IoT toolset to build AI solutions that are trained in the cloud and deployed down to drones in the field in real time, allowing businesses to quickly take advantage of learnings at one individual site and rapidly apply them across the organization.

DJI and Microsoft are already working together to advance technology for precision farming with Microsoft’s FarmBeats solution, which aggregates and analyzes data from aerial and ground sensors using AI models running on Azure IoT Edge. With DJI drones, the Microsoft FarmBeats solution can take advantage of advanced sensors to detect heat, light, moisture and more to provide unique visual insights into crops, animals and soil on the farm. Microsoft FarmBeats integrates DJI’s PC Ground Station Pro software and mapping algorithm to create real-time heatmaps on Azure IoT Edge, which enable farmers to quickly identify crop stress and disease, pest infestation, or other issues that may reduce yield.

With this partnership, DJI will have access to the Azure IP Advantage program, which provides industry protection for intellectual property risks in the cloud. For Microsoft, the partnership is an example of the important role IP plays in ensuring a healthy and vibrant technology ecosystem and builds upon existing partnerships in emerging sectors such as connected cars and personal wearables.

Availability

DJI’s SDK for Windows is available as a beta preview to attendees of the Microsoft Build conference today and will be broadly available in fall 2018. For more information on the Windows SDK and DJI’s full suite of developer solutions, visit: developer.dji.com.

About DJI

DJI, the world’s leader in civilian drones and aerial imaging technology, was founded and is run by people with a passion for remote-controlled helicopters and experts in flight-control technology and camera stabilization. The company is dedicated to making aerial photography and filmmaking equipment and platforms more accessible, reliable and easier to use for creators and innovators around the world. DJI’s global operations currently span across the Americas, Europe and Asia, and its revolutionary products and solutions have been chosen by customers in over 100 countries for applications in filmmaking, construction, inspection, emergency response, agriculture, conservation and other industries.

About Microsoft

Microsoft (Nasdaq “MSFT” @microsoft) enables digital transformation for the era of an intelligent cloud and an intelligent edge. Its mission is to empower every person and every organization on the planet to achieve more.

For additional information, please contact:

Michael Oldenburg, DJI Senior Communication Manager, North America – michael.oldenburg@dji.com

Chelsea Pohl, Microsoft Commercial Communications Manager – chelp@microsoft.com

Note to editors: For more information, news and perspectives from Microsoft, please visit the Microsoft News Center at http://news.microsoft.com. Web links, telephone numbers and titles were correct at time of publication, but may have changed. For additional assistance, journalists and analysts may contact Microsoft’s Rapid Response Team or other appropriate contacts listed at http://news.microsoft.com/microsoft-public-relations-contacts.

The post DJI and Microsoft partner to bring advanced drone technology to the enterprise appeared first on Stories.