Tag Archives: data

Uber breach affected 57 million users, covered up for a year

Malicious actors stole personal data on hundreds of thousands of Uber drivers and millions of Uber users and the company allegedly covered up the breach for one year, including reportedly paying the attackers to keep quiet.

According to new CEO Dara Khosrowshahi, the Uber breach was due to two malicious actors accessing “a third-party cloud-based service” — reportedly GitHub and Amazon Web Services (AWS) — in late 2016 and downloading files containing names and driver’s license information on 600,000 U.S. Uber drivers and personal information — names, email addresses and phone numbers — for 57 million Uber customers from around the world. According to Bloomberg, which was first to report the Uber breach, the incident was covered up by two members of the company’s infosec team.

“None of this should have happened, and I will not make excuses for it. While I can’t erase the past, I can commit on behalf of every Uber employee that we will learn from our mistakes,” Khosrowshahi wrote in a blog post. “We are changing the way we do business, putting integrity at the core of every decision we make and working hard to earn the trust of our customers.”

Khosrowshahi said the “failure to notify affected individuals or regulators last year” prompted a number of actions, including firing the two individuals responsible for the Uber breach response — Joe Sullivan, former federal prosecutor and now ex-CSO at Uber, and Craig Clark, one of Sullivan’s deputies — notifying and offering ID and credit monitoring to the affected drivers, notifying regulators and monitoring the affected customer accounts.

Details of the Uber data breach

According to Bloomberg, the attackers accessed a private GitHub repository used by Uber in October 2016 and used stolen credentials from GitHub to access an archive of information stored on an AWS account.

Terry Ray, CTO of Imperva, said the use of GitHub “appears to be a prime example of good intentions gone bad.”

“Using an online collaboration and coding platform isn’t necessarily wrong, and it isn’t clear if getting your accounts hacked on these platforms is even uncommon. The problem begins with why live production data was used in an online platform where credentials were available in GitHub,” Ray told SearchSecurity. “Sadly, it’s all too common that developers are allowed to copy live production data for use in development, testing and QA. This data is almost never monitored or secured, and as we can see here, it is often stored in various locations and is often easily accessed by nefarious actors.”

Sullivan reportedly took the lead in the Uber breach response and, along with Clark, worked to keep the incident under wraps, including paying the attackers $100,000 to delete the stolen personal data and keep quiet.

Khosrowshahi mentioned communication with the attackers in his blog post, but did not admit to any payment being made.

“At the time of the incident, we took immediate steps to secure the data and shut down further unauthorized access by the individuals. We subsequently identified the individuals and obtained assurances that the downloaded data had been destroyed,” Khosrowshahi wrote. “We also implemented security measures to restrict access to and strengthen controls on our cloud-based storage accounts.”

Jeremiah Grossman, chief of security strategy at SentinelOne, said it can be “difficult, if not impossible, for an organization to lock down” a vector like GitHub.

“Developers accidentally, and often unknowingly, share credentials over GitHub all the time where they become exposed,” Grossman told SearchSecurity. “While traditional security controls remain crucial to organizational security, it’s no good if individuals with access to private information expose their account credentials in a place where they can be obtained and misused by others.”
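Teams that want to catch this failure mode early often scan their repositories for key-shaped strings before attackers do. The following is a minimal, hypothetical Python sketch of that idea; it looks only for strings shaped like AWS access key IDs and is not a substitute for purpose-built secret-scanning tools.

```python
import os
import re

# AWS access key IDs are 20-character strings beginning with "AKIA" (long-term keys)
# or "ASIA" (temporary keys); this pattern is a rough heuristic, not an exhaustive check.
ACCESS_KEY_PATTERN = re.compile(r"\b(AKIA|ASIA)[0-9A-Z]{16}\b")

def scan_repository(root="."):
    """Walk a checked-out repository and flag files containing key-shaped strings."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip git internals
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for lineno, line in enumerate(handle, 1):
                        if ACCESS_KEY_PATTERN.search(line):
                            findings.append((path, lineno))
            except OSError:
                continue  # unreadable file; skip it
    return findings

if __name__ == "__main__":
    for path, lineno in scan_repository():
        print(f"possible AWS access key in {path}:{lineno}")
```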

Willy Leichter, vice president of marketing at Virsec Systems, Inc., said that if the details of this Uber breach cover-up are verified, it could be extremely damaging for the company.

“This is a staggering breach of customer trust, ethical behavior, common sense and legal requirements for breach notification. Paying hackers to conceal their crimes is as short-sighted as it is stupid,” Leichter told SearchSecurity. “If this had happened after the EU GDPR kicks in, Uber would cease to exist. That may be the outcome anyway.”

Uber breach ramifications

The 2016 breach is the latest in a long line of issues for Uber. At the time of the incident, Uber was already under investigation for separate privacy violations. The company is also battling various lawsuits from cities and users.

Jim Kennedy, vice president North America at Certes Networks, said Uber’s already questionable reputation should take a big hit.

“Most likely the Uber C-suite, seeing the repercussions of cyber-attacks on similar household names, were keen to avoid the reputational damage — a massive error of judgement,” Kennedy told SearchSecurity. “The reality is that customer distrust of the brand will be amplified by the company’s attempts to hide the facts from them and points to the need for change in the industry.”

Adam Levin, cyber security expert and co-founder and chairman for CyberScout, said the Uber breach is another example of the company “placing stock value over and above privacy at the expense of drivers and consumers.”


“Uber did a hit and run on our privacy and created a completely avoidable extinction or near-extinction event, and further damaged an already tarnished brand,” Levin told SearchSecurity. “As ever, the goal for a company faced with a breach or compromise should be urgency, transparency and above all else, empathy for those affected.”

Ken Spinner, vice president of field engineering at Varonis, said the Uber data breach will likely “fire up already angry consumers, who are going to demand action and protection.”

“Every state attorney general is going to be salivating at the prospect of suing Uber. While there’s no overarching federal regulations in place in the U.S., there’s a patchwork of state regulations that dictate when disclosures must be made — often it’s when a set number of users have been affected,” Spinner told SearchSecurity. “No doubt Uber has surpassed this threshold and violated many of them by not disclosing the breach for over a year. This is the latest example of how hiding a breach rarely benefits a company and almost surely will backfire.”

DoD exposed data stored in massive AWS buckets

Once again, Department of Defense data was found publicly exposed in cloud storage, but it is unclear how sensitive the data may be.

Chris Vickery, cyber risk analyst at UpGuard, based in Mountain View, Calif., found the exposed data in publicly accessible Amazon Web Services (AWS) S3 buckets. This is the second time Vickery found exposed data from the Department of Defense (DoD) on AWS. The previous exposure was blamed on government contractor Booz Allen Hamilton; UpGuard said a now defunct private-sector government contractor named VendorX appeared to be responsible for building this database. However, it is unclear if VendorX was responsible for exposing the data. Vickery also previously found exposed data in AWS buckets from the Republican National Committee, World Wrestling Entertainment, Verizon and Dow Jones & Co.

According to Dan O’Sullivan, cyber resilience analyst at UpGuard, Vickery found three publicly accessible DoD buckets on Sept. 6, 2017.

“The buckets’ AWS subdomain names — ‘centcom-backup,’ ‘centcom-archive’ and ‘pacom-archive’ — provide an immediate indication of the data repositories’ significance,” O’Sullivan wrote in a blog post. “CENTCOM refers to the U.S. Central Command, based in Tampa, Fla. and responsible for U.S. military operations from East Africa to Central Asia, including the Iraq and Afghan Wars. PACOM is the U.S. Pacific Command, headquartered in Aiea, Hawaii and covering East, South and Southeast Asia, as well as Australia and Pacific Oceania.”

UpGuard estimated the total exposed data in the AWS buckets amounted to “at least 1.8 billion posts of scraped internet content over the past eight years.” The exposed data was all scraped from public sources including news sites, comment sections, web forums and social media.
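For organizations wondering whether their own storage is similarly exposed, a basic audit is straightforward. Below is a minimal sketch, assuming the boto3 library and a placeholder bucket name, that checks whether a bucket's ACL grants access to Amazon's public "AllUsers" or "AuthenticatedUsers" groups.

```python
import boto3

# URIs Amazon uses to denote public and any-authenticated-user grantees in bucket ACLs.
PUBLIC_GRANTEES = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def bucket_is_public(bucket_name):
    """Return True if the bucket's ACL grants any permission to a public group."""
    s3 = boto3.client("s3")
    acl = s3.get_bucket_acl(Bucket=bucket_name)
    for grant in acl["Grants"]:
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GRANTEES:
            return True
    return False

if __name__ == "__main__":
    name = "example-bucket-to-audit"  # placeholder bucket name
    print(f"{name} public: {bucket_is_public(name)}")
```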

“While a cursory examination of the data reveals loose correlations of some of the scraped data to regional U.S. security concerns, such as with posts concerning Iraqi and Pakistani politics, the apparently benign nature of the vast number of captured global posts, as well as the origination of many of them from within the U.S., raises serious concerns about the extent and legality of known Pentagon surveillance against U.S. citizens,” O’Sullivan wrote. “In addition, it remains unclear why and for what reasons the data was accumulated, presenting the overwhelming likelihood that the majority of posts captured originate from law-abiding civilians across the world.”

Importance of the exposed DoD data

Vickery found references in the exposed data to the U.S. Army “Coral Reef” intelligence analysis program, which is designed “to better understand relationships between persons of interest,” but UpGuard ultimately would not speculate on why the DoD gathered the data.

Ben Johnson, CTO at Obsidian Security, said such a massive data store could be very valuable if processed properly.

“Data often provides more intelligence than initially accessed, so while this information was previously publicly available, adversaries may be able to ascertain various insights they didn’t previously have,” Johnson told SearchSecurity. “What’s more of a problem than the data itself in this case is that this is occurring at all — showcasing that there’s plenty of work to do in safeguarding our information.”


Rebecca Herold, president of Privacy Professor, noted that just because the DoD collected public data doesn’t necessarily mean the exposed data includes accurate information.

“Sources of, and reliability for, the information matters greatly. Ease of modifying even a few small details within a large amount of data can completely change the reality of the topic being discussed. Those finding this information need to take great caution to not simply assume the information is all valid and accurate,” Herold told SearchSecurity. “Much of this data could have been manufactured and used for testing, and much of it may have been used to lure attention, as a type of honeypot, and may contain a great amount of false information.”

Herold added that the exposed data had worrying privacy implications.

“Just because the information was publicly available does not mean that it should have been publicly available. Perhaps some of this information also ended up being mistakenly being made publicly available because of errors in configurations of storage servers, or of website errors,” Herold said. “When we have organizations purposefully taking actions to collect and inappropriately (though legally in many instances) use, share and sell personal information, and then that information is combined with all this freely available huge repositories of data, it can provide deep insights and revelations for specific groups and individuals that could dramatically harm a wide range of aspects within their lives.”

Commvault GO: Vendor ‘HyperScales’ data management strategy

The Commvault HyperScale appliance is the latest — and largest — example of how the data protection company has changed in recent years.

The vendor put those changes on display at its Commvault GO user conference in early November. Besides showing off its integrated appliance, Commvault emphasized its software’s role in data management and analytics across on-premises and cloud storage.

Commvault CEO Bob Hammer said the type of scale-out storage HyperScale represents will soon become common. The key is to have all the software pieces in place.

“Everybody and their brother is going to do some scale-out stuff,” Hammer said in an interview at Commvault GO. “But that doesn’t mean, from a customer use case standpoint, it solves their data management problem, their data protection problem, their DR problems, and still highlights data movement, compliance and analytics.”

Commvault long resisted the notion of selling its backup software on a branded Commvault-sold appliance. Hammer maintained Commvault should concentrate on software and let disk appliance vendors handle the backup target.


“We don’t want to be in the hardware business,” Hammer said after its largest software rival, Symantec — now Veritas — put its flagship NetBackup application on an integrated appliance in 2010.

But if Veritas couldn’t nudge Commvault into the hardware business, a pair of newcomers could. Startups Cohesity and Rubrik — both with leadership roots from hyper-converged pioneer Nutanix — emerged in 2015 with integrated appliances that went beyond backup. The upstarts called their products converged secondary storage, because they handled data for backup, archiving, test and development, and disaster recovery, and they pulled in the cloud as well as disk for targets. Both have gained traction rapidly with their converged strategy.

Commvault was already headed in a new direction with its software, changing the name from Simpana to the Commvault Data Platform in 2015. Commvault always mixed data management with protection, but critics and even customers found all that functionality difficult to learn and use.

“Commvault was not known as the least expensive solution, or the easiest to use,” said Jon Walton, CIO of San Mateo County in California, and a longtime Commvault customer. “But it was definitely the most flexible. Its challenge was it was seen as a good tool, but not the cheapest. And in government, cheap wins bids. But we were trying to introduce a single tool to back up everything.”

Walton said he took the plunge with Commvault and made sure his staff received the training it needed. “I don’t lose any sleep using this platform for my data,” he said.

Around early 2016, Hammer said it became clear that secondary storage, and some primary storage, was moving to a “cloud-like infrastructure.” Customers were looking for a more unified way to protect and manage their data, both on premises and in public clouds.

“Going way back, I didn’t want to go into the hardware business, but it was clear as day the market was going to be driven by an integrated device,” Hammer said. “We said, ‘OK, we can supply that device,’ and just needed to put partnerships together.”

HyperScale involves hardware, software partners

Commvault HyperScale appliances run on 1U servers from Fujitsu. HyperScale software provides data services on the appliance. Commvault also partners with Red Hat, using Red Hat’s Gluster file system as a foundation for the HyperScale scale-out storage.

Commvault also lined up server vendors Cisco, Dell EMC, Hewlett Packard Enterprise, Huawei, Lenovo and Super Micro as partners on reference architectures that run HyperScale software and the Commvault Data Platform stack on top.

Cisco became an OEM partner, rebranding HyperScale as ScaleProtect on Cisco Unified Computing System. Commvault sees the 2U UCS server — 4U blades are also planned — as a good fit for the enterprise, while its 1U HyperScale blades handle all secondary data needs for SMBs, remote offices and departments.

Commvault showed off its HyperScale integrated appliance at the Commvault GO user conference.

Wrapping all of its features — plus cloud support — on an integrated appliance could help Commvault solve its complexity problems. The vendor already moved to simplify pricing and management in recent years by changing its licensing and selling a targeted bundle for use cases such as cloud storage, endpoint backup and virtual machine protection.

Commvault uses capacity-based licensing for HyperScale, with free hardware refreshes at the end of a three-year subscription.

“I think Commvault recognized the cost challenges and has probably risen to the challenges of meeting those as well as everybody,” San Mateo’s Walton said in an interview at Commvault GO.

Other customers at Commvault GO agreed with Walton that Commvault’s complexity is at least partly the result of its comprehensive feature set, and that its broad functionality is a selling point.

“It’s a single tool to help us protect structured data, unstructured data, virtual and physical machines,” said John Hoover, IT manager of the database and infrastructure team at the Iowa Judicial Branch. “It’s one pane of glass, one index, one tool to know.”

Hoover said his team includes five people for infrastructure and two database administrators to manage more than 100 million digital court documents.

“We’re busy people. Trying to keep up with multiple tools to protect all that data taxes our time,” he said. “And we have to protect it. An electronic file is the official file of the state. There’s no paper trail anymore.”

Commvault HyperScale fights old foes

Despite moving to an integrated model to take on the likes of Cohesity and Rubrik, Commvault still battles old backup software competitors — mainly Veritas and Dell EMC. Hammer referred to Veritas NetBackup as a “legacy scale-up appliance,” the kind that customers are moving to scale-out models to avoid.

Hammer also challenged Michael Dell during his Commvault GO keynote. Dell EMC is one of Commvault’s HyperScale server partners, but also sells backup and data management software. Hammer pointed to the Dell CEO’s claim that he would pump $1 billion over three years into research and development for an internet of things (IoT) division.


“I have news for him,” Hammer said. “We’re going to innovate faster than you are, Michael. Game on.”

Off-stage, Hammer elaborated on Commvault’s relationship with Dell.

“Obviously, they’re a major player with HyperScale. Many customers are going to buy HyperScale with Dell servers,” he said. “That’s where we’re aligned. But it’s a whole different story putting a platform together for IoT and analytics, and that’s where I say, ‘Game on,’ to Michael Dell. You can’t do it with piece parts. It’s not so simple. I’m sure he’ll be in the game, but it’s not an easy thing.”

He should know, because Commvault has already gone down that path.

Quobyte preps 2.0 Data Center File System software update

Quobyte’s updated Data Center File System software adds volume-mirroring capabilities for disaster recovery, support for Mac and Windows clients, and shared access control lists.

The startup, based in Santa Clara, Calif., this week released the 2.0 version of its distributed POSIX-compliant parallel file system to beta testers and expects to make the updated product generally available in January.

The Quobyte software supports file, block and object storage, and it’s designed to scale out IOPS, throughput and capacity linearly on commodity hardware ranging from four to thousands of servers. Policy-based data placement lets users earmark high-performance workloads to flash drives, including faster new NVMe-based PCIe solid-state drives.

Software-defined storage startups face challenges

Despite the additions, industry analysts question whether Quobyte has done enough to stand out in a crowded field of file-system vendors.

Marc Staimer, president of Dragon Slayer Consulting, said Quobyte faces significant hurdles against competition ranging from established giants, such as Dell EMC, to startups, including Elastifile, Qumulo, Rozo Systems, StorOne and WekaIO.

Staimer called features such as shared access control lists (ACLs) and volume mirroring table stakes in the software-defined storage market. He said mirroring — a technology that was hot 20 years ago — protects against hardware failures, but doesn’t go far enough for disaster recovery. He said Quobyte must consider adding versioning and time stamping to protect against software corruption, malware, accidental deletion and problems of that nature.

Steven Hill, a senior storage analyst at 451 Research, said it takes more than features to gain share in the enterprise storage market. He said Quobyte would do well to forge closer hardware partnerships to provide better integration, optimization, support and services.

“Even though software-delivered storage appears to be the trend, many storage customers still seem more interested in the fully supported hardware [and] software appliance model, rather than taking a roll-your-own approach to enterprise storage,  especially when there can be so many different production requirements in play at the same time,” Hill wrote in an email.

Quobyte CEO Bjorn Kolbeck and CTO Felix Hupfeld worked in storage engineering at Google before starting Quobyte in 2013. And Kolbeck claimed the “Google-style operations” that the Quobyte architecture enables would allow users to grow the system and run 24/7 without the need for additional manpower.

According to Kolbeck, fault tolerance is the most important enabler for Google-style operations. He said Quobyte achieves fault tolerance through automated replication, erasure coding, disaster recovery and end-to-end checksums that ensure data integrity. With those capabilities, users can fix broken hardware on their own schedules, he said.

“That’s the key to managing very large installations with hundreds of petabytes with a small team,” Kolbeck said.
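Of the capabilities Kolbeck lists, end-to-end checksums are the easiest to illustrate. The sketch below is generic Python, not Quobyte's implementation: a digest computed when data is written is re-verified when it is read back, so corruption anywhere along the path is caught rather than silently returned.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return a SHA-256 digest of the payload."""
    return hashlib.sha256(data).hexdigest()

def write_block(store: dict, key: str, data: bytes) -> None:
    """Store the payload together with the digest computed at the client."""
    store[key] = (data, checksum(data))

def read_block(store: dict, key: str) -> bytes:
    """Recompute the digest on read and fail loudly if it no longer matches."""
    data, expected = store[key]
    if checksum(data) != expected:
        raise IOError(f"checksum mismatch for {key}: data corrupted in flight or at rest")
    return data

if __name__ == "__main__":
    store = {}
    write_block(store, "block-0001", b"replicated payload")
    assert read_block(store, "block-0001") == b"replicated payload"
```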

Quobyte 2.0

Kolbeck said Quobyte made volume mirroring a priority following requests from commercial customers. The software uses continuous asynchronous replication across geographic regions and clouds to facilitate disaster recovery. Kolbeck said customers would be able to replicate the primary site and use erasure coding with remote sites to lower the storage footprint, if they choose.

To expand data sharing across platforms and interfaces, Quobyte 2.0 finalized native drivers for Mac and Windows clients. Its previous version supported Linux, Hadoop and Amazon Simple Storage Service (S3) options for users to read, write and access files.

Kolbeck said adding access control lists will allow users to read and modify them from all interfaces now that Mac and Windows ACLs and S3 permissions map to Quobyte’s internal NFSv4 ACLs.
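The article doesn't spell out Quobyte's mapping rules, but the general shape of such a translation can be sketched. The snippet below is purely hypothetical: it maps a few S3 canned ACLs onto NFSv4-style access control entries in the "type:flags:principal:permissions" notation used by nfs4_setfacl; the actual mapping Quobyte performs may differ.

```python
# Hypothetical illustration only: translating S3 canned ACLs into NFSv4-style ACEs.
CANNED_ACL_TO_NFS4 = {
    "private":           ["A::OWNER@:rwaDxtTnNcCy"],
    "public-read":       ["A::OWNER@:rwaDxtTnNcCy", "A::EVERYONE@:rxtncy"],
    "public-read-write": ["A::OWNER@:rwaDxtTnNcCy", "A::EVERYONE@:rwaDxtncy"],
}

def nfs4_aces_for(canned_acl: str) -> list:
    """Return the NFSv4 ACEs a given S3 canned ACL would map to in this sketch."""
    return CANNED_ACL_TO_NFS4.get(canned_acl, CANNED_ACL_TO_NFS4["private"])

print(nfs4_aces_for("public-read"))
```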

Quobyte also moved to simplify installation and management through the creation of a cloud-based service to assist with domain name system service configuration. Kolbeck said the company “moved as far away from the command line as possible,” and the system now can walk customers through the installation process.

Kolbeck said Quobyte currently has about 25 customers running the software in production. He said the company targets commercial high-performance computing and “all markets that are challenged by data growth,” including financial services, life sciences, exploratory data analysis and chip design, media and entertainment, and manufacturing and internet of things.

Quobyte’s subscription pricing model, based on usable capacity, will remain unchanged with the 2.0 product release.

Push for public, private sector cybersecurity cooperation continues

Recent events such as the Equifax data breach and allegations regarding Russian interference with the 2016 presidential election are sobering reminders of cybersecurity holes in both the public and private sectors.

Cooperation between government and businesses has long been heralded as vital to protect digital assets and improve U.S. cybersecurity, which is why such cooperation is becoming part of U.S. cybersecurity strategy, said acting FBI Director Andrew McCabe.

“There is no law enforcement or exclusive intelligence answer to these questions,” McCabe said about cybersecurity strategy during the Cambridge Cyber Summit hosted by CNBC and the Aspen Institute earlier this month. “We’ve got to work together with the private sector to get there.”

Achieving this goal was the main topic presented at the annual conference, which examines how the public and private sectors can work together to safeguard economic, financial and government assets, while also maintaining convenience and protecting online privacy.

Regulations are usually anathema to a tech industry that worries cybersecurity mandates hinder the innovation upon which their industry thrives. There has been headway of late, however: In response to claims that Russian agents bought social media advertisements designed to sow discord in American politics, Facebook CEO Mark Zuckerberg announced policy changes to “protect election integrity.”

McCabe admitted that the relationship between the federal government and the private sector has had its ups and downs through the years. Edward Snowden’s disclosures about U.S. digital surveillance practices and law enforcement’s confrontation with Apple over the San Bernardino, Calif., shooter’s iPhone, for example, have hindered public and private sector cybersecurity cooperation.

“I see things like this and I hope that we are now edging back into a warmer space … to actually work on solutions,” McCabe said.

The public sector is doing its part to help facilitate these partnerships: The New Democrat Coalition has established a Cybersecurity Task Force that promotes “public-private sector cooperation and innovation” designed to protect against cyberattacks. The U.S. House of Representatives recently passed the National Institute of Standards and Technology (NIST) Small Business Cybersecurity Act, which sets “guidelines,” as opposed to mandatory requirements, for small businesses.


Incentives are a big part of these types of efforts. Last month, senators introduced a cybersecurity bill that would establish a reward program designed to incentivize private researchers to identify security flaws in U.S. election systems.

These types of partnerships are beneficial for both sides, said Rod Rosenstein, deputy attorney general at the Department of Justice, at the Cambridge Cyber Summit. Law enforcement investigations can help a company understand what happened, share context and information about related incidents, and even provide advice to shore up defenses if the hackers act again, he said.

“We can inform regulators about your cooperation, and we are uniquely situated to pursue the perpetrators through criminal investigation and prosecution,” Rosenstein said. “In appropriate cases that involve overseas actors, we can also pursue economic sanctions, diplomatic pressure and intelligence operations ourselves.”

International efforts, global companies

The “overseas” variable doesn’t end with nefarious foreign actors hacking U.S. companies. Public and private sector cybersecurity cooperation is further complicated in the global economy with enterprises that have customers, headquarters and employees stationed all over the world. This makes it difficult to incorporate cybersecurity best practices as digital information moves across borders.

Different countries have different rules when it comes to handling digital information, leaving international organizations to navigate conflicting international laws.

“They have different threats to their systems, to their data, to their employees in many different places,” McCabe said. “I think we have a clear and important role in helping them address those threats and those challenges.”

McCabe was quick to add, however, that U.S.-based security professionals and law enforcement prioritize U.S. cybersecurity standards.

“Although we acknowledge that [global companies] have responsibilities in other parts of the world, we expect them to live up to our norms of behavior and in compliance with U.S. law and all the ways that that’s required here in the United States,” McCabe said.

The power of voluntary enforcement

When it comes to cybersecurity, White House Cybersecurity Coordinator Rob Joyce said he is a fan of “voluntary enforcement” among industry. If industry groups can rise up to identify unique risks and push best cybersecurity practices, it could create a sort of peer pressure for other organizations to step up their cybersecurity game, he said at the summit.

The goal is to give consumers the opportunity to choose companies that have voluntarily implemented well-planned cybersecurity best practices and compliance standards, as opposed to security protocols that are slapped together just so new products can be put on the market quickly, he said.

“We would expect industry groups to start labeling themselves as compliant and then consumers to make smart choices about what they’re buying,” Joyce said.

Forcing cybersecurity standards on the technology industry through government regulation poses problems, Joyce said, mostly because the industry evolves so fast. A cybersecurity standard that provides effective data protection and enforcement today could quickly become obsolete when the next iteration of technology is introduced.

“The problem with forcing it through government regulation is you snap a chalk line today, and this industry moves fast,” Joyce said. “You impede good security because people have to do the thing to regulate it instead of doing the thing that’s right.”

The trick is to find that balance between innovation and cybersecurity protection, Joyce added.

“If you try to put too much constraint and mandatory check boxes on the security of a device, you will find that the manufacturers are going to be slowed in their ability to innovate and give us that next better product,” Joyce said. “But we’ve got to have the ability to drive that next better product to have some base security.”

Druva Cloud Platform expands with Apollo

Druva moved to help manage data protection in the cloud with its latest Apollo software as a service, which helps protect workloads in Amazon Web Services through the Druva Cloud Platform.

The company’s new service provides a single control plane to manage infrastructure-as-a-service and platform-as-a-service cloud workloads.

Druva, based in Sunnyvale, Calif., sells two cloud backup products, Druva InSync and Druva Phoenix, for its Druva Cloud Platform. The enterprise-level Druva InSync backs up endpoint data across physical and public cloud storage. The Druva Phoenix agent backs up and restores data sets in the cloud for distributed physical and virtual servers. Phoenix applies global deduplication at the source and points archived server backups to the cloud target.


Apollo enables data management of Druva Cloud Platform workloads under a single control plane so administrators can do snapshot management for backup, recovery and replication of Amazon Web Services instances. It automates service-level agreements with global orchestration that includes file-level recovery. It also protects Amazon Elastic Compute Cloud instances.

Druva Apollo is part of an industrywide trend among data protection vendors to bring all secondary data under global management across on-premises and cloud storage.

“There is a big change going on throughout the industry in how data is being managed,” said Steven Hill, senior storage analyst for 451 Research. “The growth is shifting toward secondary data. Now, secondary data is growing faster than structured data, and that is where companies are running into a challenge.”

“Apollo will apply snapshot policies,” said Dave Packer, Druva’s vice president of product and alliance marketing. “It will automate many of the lifecycles of the snapshots. That is the first feature of Apollo.”
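Druva hasn't published Apollo's interface here, but the kind of snapshot lifecycle Packer describes can be illustrated with a generic boto3 sketch: take a new EBS snapshot, then prune that volume's snapshots that have aged past a retention window. The volume ID and retention period are placeholders, not Druva defaults.

```python
import datetime
import boto3

RETENTION_DAYS = 14  # placeholder retention window

def snapshot_and_prune(volume_id: str) -> None:
    """Create a new EBS snapshot and delete this volume's snapshots past retention."""
    ec2 = boto3.client("ec2")
    ec2.create_snapshot(VolumeId=volume_id,
                        Description="nightly policy-driven snapshot")

    cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=RETENTION_DAYS)
    snapshots = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )["Snapshots"]

    for snap in snapshots:
        if snap["StartTime"] < cutoff:
            ec2.delete_snapshot(SnapshotId=snap["SnapshotId"])

if __name__ == "__main__":
    snapshot_and_prune("vol-0123456789abcdef0")  # hypothetical volume ID
```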

Automation for discovery, analysis and information governance is on the Druva cloud roadmap, Packer said.

Druva last August pulled in $80 million in funding, bringing total investments into the range of $200 million for the fast-growing vendor. Druva claims to have more than 4,000 worldwide customers that include NASA, Pfizer, NBCUniversal, Marriott Hotels, Stanford University and Lockheed Martin.

Druva has positioned its data management software to go up against traditional backup vendors Commvault and Veritas Technologies, which also are transitioning into broad-based data management players. It’s also competing with startups Rubrik, which has raised a total of $292 million in funding since 2015 for cloud data management, and Cohesity, which has raised $160 million.

End-user monitoring gets a leg up with AI, analytics

As more endpoints and applications make their way into the workplace and generate more data, traditional monitoring tools — and the IT professionals who use them — struggle to keep up.

It’s harder to identify the causes of application and infrastructure performance problems and respond to them in a timely manner. In response to these challenges, new end-user computing (EUC) monitoring tools that rely on AI, machine learning and big data analytics have emerged. These technologies are popular buzzwords throughout the IT industry, and they may conjure up visions of a future in which sentient robots rule the world. In the EUC market, however, they’re real, and their time is now.

“There is a lot of hype around it,” said Jarian Gibson, an independent EUC consultant. “But I understand why, because it’s where these tools need to go.”

Three end-user monitoring challenges

Infrastructure and application monitoring software typically looks for deviations in normal behavior, such as spikes in an application’s CPU or bandwidth use. It then alerts IT of a potential problem, and some products even attempt to identify the cause.
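As a simplified picture of that deviation-based approach, the sketch below flags CPU samples that sit several standard deviations above a rolling baseline; real monitoring products apply far richer models, but the underlying check looks roughly like this.

```python
from statistics import mean, stdev

def find_anomalies(samples, window=20, threshold=3.0):
    """Flag samples more than `threshold` standard deviations above
    the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and (samples[i] - mu) / sigma > threshold:
            anomalies.append((i, samples[i]))
    return anomalies

if __name__ == "__main__":
    cpu = [22, 25, 24, 23, 26, 24, 25, 23, 22, 24,
           25, 23, 24, 26, 25, 24, 23, 25, 24, 23, 97]  # final spike should be flagged
    print(find_anomalies(cpu))
```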

This approach has worked well enough at a basic level, but the rise of mobility and new ways of delivering applications to users has exposed some flaws. IT professionals face three major challenges when monitoring modern endpoints and their support infrastructures:

It’s a reactive approach. By the time monitoring software issues an alert, the problem is already happening, and it’s potentially affecting user productivity.

“We’re not doing enough on a proactive basis,” Gibson said.

It’s hard to pinpoint the causes of problems. When physical desktops were users’ only devices, and they all ran IT-approved applications and accessed the same corporate network, there was a fairly limited number of potential root causes. Chief among those were the network infrastructure, the desktop hardware and operating system, and the applications themselves.

But these days, the average consumer owns more than three connected devices, according to a 2016 survey by GlobalWebIndex. People are using more of these devices for work, and they also have a mix of employer-provided physical desktops, virtual desktops and mobile devices at their disposal to access applications. To further complicate matters, they connect these devices to a variety of public, private, secure and unsecure networks. An issue with any of these components can cause a problem, and there are so many that it can take significant time and effort for IT to manually sort through them all.

“Trying to analyze it and digest it into something consumable and meaningful is the biggest thing,” said Patrick McGraw, virtual infrastructure engineer at Western Carolina University in Cullowhee, N.C.

The “problems” may not be problems at all. The thresholds that trigger end-user monitoring alerts don’t always align with issues that are perceptible to users. A computer’s network speed may suddenly drop, but if the user isn’t working in a bandwidth-intensive application and doesn’t notice any lag, it’s not necessarily something that IT needs to address. Conversely, users may experience problems when everything looks like it’s running smoothly on the monitoring dashboard.

IT professionals sometimes don’t even know what data they need from end-user monitoring to ensure a positive user experience, McGraw said.

Analyze this

The two major EUC platform vendors, Citrix and VMware, both launched new analytics-focused end-user monitoring offerings this year, following in the footsteps of several third-party providers.

Citrix Analytics Service monitors the behavior of users, devices, applications and data running on the company’s EUC products, including XenApp, XenDesktop, XenMobile and ShareFile. It then uses machine learning algorithms to analyze this behavior and identify potential threats, either from attackers or from workers who aren’t following security best practices. The service can also respond automatically to these threats by temporarily blocking unsecure applications or restricting compromised users’ access to corporate resources.

Citrix Analytics relies heavily on contextual data, such as what devices workers are using and where they’re logging in from. It’s hard to make sense of that information manually, so the service’s machine learning capabilities will be a big help, Gibson said.


VMware’s offering, Workspace One Intelligence, focuses more on automating responses to threats and user experience problems across physical and virtual desktops, mobile devices and their applications. The end-user monitoring service relies on technology from Apteligent, a vendor VMware acquired in May. Using those in-app analytics features, Workspace One Intelligence can automatically identify a UX problem caused by a bug, alert developers and roll back to the prior release.

Western Carolina University is an all-VMware shop using the vendor’s vRealize Operations for Horizon to monitor its virtual desktops. The product is good, but it’s huge — vRealize Operations also monitors data center infrastructure and applications — which makes it hard to customize, McGraw said. Workspace One Intelligence is appealing because it’s focused on EUC and its analytics capabilities could help the university get better insights into the hundreds of applications it manages, he said.

Other companies that have brought data analytics to end-user monitoring tools include ControlUp and UberAgent. ControlUp Real-time collects and analyzes data from across all the vendor’s customers and uses that information to proactively alert IT about upcoming physical and virtual desktop problems that could have a negative effect on UX. And UberAgent uses Splunk’s big data analysis of physical and virtual desktop performance to create dashboards that illustrate the metrics that matter to users, such as logon times and network latency.

Data security and trust issues linger

Despite the benefits of AI, machine learning and data analytics for EUC monitoring, IT professionals still have some concerns.

On a practical level, there’s the issue of getting these new technologies to work. Most virtual desktop infrastructures and their traditional monitoring tools live on premises, but these emerging technologies rely primarily on the cloud for heavy-duty data processing. Proper integration with existing in-house systems will be key to their success, Gibson said.

Because these new end-user monitoring tools collect and analyze so much data, security will be paramount.

“My biggest concern is the data part of it,” Gibson said. “What’s being stored, where’s it being stored and who has access to it?”

The trend toward more automated incident responses could also be problematic. It’s important to respond to problems quickly, or even stop them before they happen, but organizations shouldn’t depend solely on AI to make these decisions for them, McGraw said.

“I’d want to see how well that works and make sure we’re not getting false positives,” he said. “It’d probably take a while for us to trust it.”

Quorum OnQ solves Amvac Chemical’s recovery problem

Using a mix of data protection software, hardware and cloud services from different vendors, Amvac Chemical Corp. found itself in a cycle of frustration. Backups failed at night, then had to be rerun during the day, and that brought the network to a crawl.

The Los Angeles-based company found its answer with Quorum’s one-stop backup and disaster recovery appliances. Quorum OnQ’s disaster recovery as a service (DRaaS) combines appliances that replicate across sites with cloud services.

The hardware appliances are configured in a hub-and-spoke model with an offsite data center colocation site. The appliances perform full replication to a cloud service that backs up data after hours.

“It might be overkill, but it works for us,” said Rainier Laxamana, Amvac’s director of information technology.

Quorum OnQ may be overkill, but Amvac’s previous system underwhelmed. Previously, Amvac’s strategy consisted of disk backup to early cloud services to tape. But the core problem remained: failed backups. The culprit was the Veritas Backup Exec application, whose failures the Veritas support team, while still part of Symantec, could not explain. A big part of the Backup Exec problem was application support.

“The challenge was that we had different versions of an operating system,” Laxamana said. “We had legacy versions of Windows servers so they said [the backup application] didn’t work well with other versions.

“We were repeating backups throughout the day and people were complaining [that the network] was slow. We repeated backups because they failed at night. That slowed down the network during the day.”


Quorum OnQ provides local and remote instant recovery for servers, applications and data. The Quorum DRaaS setup combines backup, deduplication, replication, one-click recovery, automated disaster recovery testing and archiving. Quorum claims OnQ is “military-grade” because it was developed for U.S. Naval combat systems and introduced into the commercial market in 2010.

Amvac develops crop protection chemicals for agricultural and commercial purposes. The company has a worldwide workforce of more than 400 employees in eight locations, including a recently opened site in the Netherlands. Quorum OnQ protects six sites, moving data to the main data center. Backups are done during the day on local appliances. After hours, the data is replicated to a DR site and then to another DR site hosted by Quorum.

“After the data is replicated to the DR site, the data is replicated again to our secondary DR site, which is our biggest site,” Laxamana said. “Then the data is replicated to the cloud. So the first DR location is our co-located data center and the secondary DR our largest location. The third is the cloud because we use Quorum’s DRaaS.”

Amvac’s previous data protection configuration included managing eight physical tape libraries.

“It was not fun managing it,” Laxamana said. “And when we had legal discovery, we had to go through 10 years of data. We kept tapes at Iron Mountain, but it became very expensive so we brought it on premises.”

Laxamana said he looked for a better data protection system for two years before finding Quorum. Amvac looked at Commvault but found it too expensive and not user-friendly enough. Laxamana and his team also looked at Unitrends. At the time, Veeam Software only supported virtual machines, and Amvac needed to protect physical servers. Laxamana said Unitrends was the closest that he found to Quorum OnQ.

“The biggest (plus) with Quorum was that the interface was much more user-friendly,” he said. “It’s more integrated. With Unitrends, you need a third party to integrate the Microsoft Exchange.”

Explosion in unstructured data storage drives modernization

Digital transformation is the key IT trend driving enterprise data center modernization. Businesses today rapidly deploy web-scale applications, file sharing services, online content repositories, sensors for internet of things implementations and big data analytics. While these digital advancements facilitate new insights, streamline processes and enable better collaboration, they also increase unstructured data at an alarming rate.

Managing unstructured data and its massive growth can quickly strain legacy file storage systems that are poorly suited for managing vast amounts of this data. Taneja Group investigated the most common of these file storage limitations in a recent survey. The study found the top challenges IT faces with traditional file storage are lack of flexibility, poor storage utilization, inability to scale to petabyte levels and failure to support distributed data. These obstacles often lead to high storage costs, complex storage management and limited flexibility in unstructured data storage.

So how are companies addressing the unstructured data management challenge? As with all things IT, it’s essential to have the right architecture. For unstructured data storage, this means a highly scalable, resilient, flexible, economical and accessible secondary storage environment.

Let’s take a closer look at modern unstructured data storage requirements and examine why distributed file systems and a scale-out object storage design, or scale-out storage, are becoming a key part of modern secondary storage management.

Scalability and resiliency

Given the huge amounts of unstructured data, scalability is undeniably the most critical aspect of modern secondary storage. This is where scale-out storage shines. It’s ideal for managing huge amounts of unstructured data because it easily scales to hundreds of petabytes simply by adding storage nodes. This inherent advantage over scale-up file storage appliances that become bottlenecked by single or dual controllers has prompted several data protection vendors to offer scale-out secondary storage platforms. Notable vendors with scale-out secondary storage offerings are Cohesity, Rubrik and — most recently — Commvault.

Attaining storage resiliency is another important requirement of modern secondary storage. Two key factors are required to achieve storage resiliency. The first is high fault tolerance. Scale-out storage is ideal in this area because it uses space-efficient erasure coding and flexible replication policies to tolerate site, multiple node and disk failures.
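The space-efficiency case for erasure coding over plain replication is easy to quantify. In general terms, a scheme with k data fragments and m parity fragments consumes (k + m) / k units of raw capacity per unit of usable data, versus N units for N-way replication; the short sketch below works through an 8+3 example.

```python
def erasure_overhead(k: int, m: int) -> float:
    """Raw capacity consumed per unit of usable data for a k data + m parity scheme."""
    return (k + m) / k

def replication_overhead(copies: int) -> float:
    """Raw capacity consumed per unit of usable data for N-way replication."""
    return float(copies)

if __name__ == "__main__":
    # An 8+3 erasure code tolerates three simultaneous fragment losses
    # at 1.375x raw capacity, versus 3x for triple replication.
    print(erasure_overhead(8, 3))    # 1.375
    print(replication_overhead(3))   # 3.0
```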

Rapid data recovery is the second key factor for storage resiliency. For near-instantaneous recovery times, IT managers should look for secondary storage products that provision clones from backup snapshots to recover applications in minutes or even seconds. Secondary storage products should allow administrators to run recovered applications directly on secondary storage until data is copied back to primary storage and be able to orchestrate the recovery of multi-tier applications.

Flexibility and cost

To handle multiple, unstructured data storage use cases, modern secondary storage must also be flexible. Central to flexibility is multiprotocol support. Scale-out storage should support both file and object protocols, such as NFS for Linux, SMB or CIFS for Windows and Amazon Simple Storage Service for web-scale applications. True system flexibility also requires modularity, or composable architecture, which enables multidimensional scalability and I/O flexibility. Admins must be able to quickly vary computing, network and storage resources to accommodate IOPS-, throughput- and capacity-intensive workloads.

Good economics is another requirement for modern secondary storage. Scale-out storage reduces hardware costs by enabling software-defined storage that uses standard, off-the-shelf servers. It’s also simple to maintain. Administrators can easily upgrade or replace computing nodes without having to migrate data among systems, reducing administration time and operating costs. Scale-out secondary storage also provides the option to store data in cost-effective public cloud services, such as Amazon Web Services, Google Cloud and Microsoft Azure.

Moreover, scale-out storage reduces administration time by eliminating storage silos and the rigid, hierarchical structure used in file storage appliances. It instead places all data in a flat address space or single storage pool. Scale-out secondary storage also provides built-in metadata file search capabilities that help users quickly locate the data they need.

Some vendors, such as Cohesity, offer full-text search that facilitates compliance activities by letting companies quickly find files containing sensitive data, such as passwords and Social Security numbers. Add to this support for geographically distributed environments, and it’s easy to see why scale-out storage is essential for cost-effectively managing large-scale storage environments.

Data management

The final important ingredient of modern secondary storage environments is providing easy access to services required to manage secondary data. As the amount of unstructured data grows, IT can make things easier for storage administrators and improve organizational agility by giving application owners self-service tools that automate the full data lifecycle. This means providing a portal or marketplace and predefined service-level agreement templates that establish the proper data storage parameters. These parameters include recovery points, retention periods and workload placement based on a company’s standard data policies. Secondary storage should also integrate with database management tools, such as Oracle Recovery Manager.
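As a purely illustrative example of what such a template might capture (the field names below are hypothetical, not any vendor's schema):

```python
# Hypothetical SLA template for self-service secondary storage provisioning.
GOLD_SLA_TEMPLATE = {
    "name": "gold",
    "recovery_point_objective_minutes": 15,   # how much data loss is tolerable
    "retention_days": 365,                    # how long copies are kept
    "workload_placement": "flash-tier",       # where the data should land
    "replicate_to_cloud": True,               # copy to a public cloud target
}

def provision(app_name: str, template: dict) -> dict:
    """Attach a policy template to an application owner's self-service request."""
    return {"application": app_name, "policy": dict(template)}

print(provision("oracle-erp", GOLD_SLA_TEMPLATE))
```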

Clearly, distributed file systems and scale-out object storage architectures are a key part of modern secondary storage offerings. Secondary storage product portfolios are evolving to address the immense unstructured data storage needs of modern organizations in the digital era. So stay tuned, as I expect nearly all major data protection vendors will introduce scale-out secondary storage products over the next 12 to 18 months.

How IT orgs make container management platform choices

One IT organization runs 50 data centers, while another started natively on the cloud and never looked down. Unsurprisingly, they have different expectations of container management software.

Every company and team has different goals and requirements to deploy containers. Technological differentiation is not the only — or even the biggest — factor when they select a container management platform.

Expertise on staff, tool cost, implementation decisions and the existing ecosystem and underlying infrastructure play a large role in the vendor, tools and technology that’s the right fit to scale containers.

“Some like to stick with the Docker product vertical due to lifecycle UX and a focus on simplicity and security,” said Bret Fisher, an independent DevOps and Docker consultant, trainer and speaker involved in open source communities, at the O’Reilly Velocity Conference 2017 in New York. “Some choose Kubernetes because it seems the current winner of orchestrators, and others choose Mesos and [Mesosphere] DC/OS due to flexibility and maturity.”

The management tool marketplace reflects the maturing nature of containers. “We’re just now standardizing what it means to be a container runtime and a container image,” Fisher said. The difference between container management platforms such as Kubernetes and Docker Enterprise Edition (EE) represents an ecosystem war reminiscent of the iPhone vs. Droid phone wars, he said. Orchestrators and schedulers share roughly 75% of the same features, so it often comes down to which one people know and feel comfortable with.

Dealer Tire, a Cleveland, Ohio-based automotive industry distributor, modernized from physical machines to virtual ones a few years ago, and now its web platform operations team is six months into container adoption on private servers in two data centers with VMware virtualization as the host layer. The container management tool evaluation covered Docker, Kubernetes, Mesos and Rancher.

Mesos and Kubernetes seemed complicated, and the team didn’t want to manage native Docker via the command-line interface, said Andrew Maurer, IT manager of web platform ops.

“Rancher seemed to make sense. It was low level of barrier to entry; to get it up and running was extremely simple,” he said.

Dealer Tire also wanted guidance through not just container adoption but also a shift to treat servers not as pets but as cattle, Maurer said.

Other companies’ IT teams have started to branch out from native Docker tooling.

Cox Automotive’s inventory solutions group is evaluating Kubernetes and Mesos technologies for container management as its Docker deployment grows, said Jason Riggins, the group’s director of production engineering, who discussed his company’s DevOps and cloud adoption at Delivery of Things World USA in San Diego.

The primary requirement of a container management platform — and any other tool they select — is production stability. “We know how to move stuff really fast,” Riggins said, “[but that’s not a good thing if] even bad stuff moves fast.” And his group also wants a more dynamic tool than the native Docker options, with a particular focus on the container registry. The tiebreaker for container management platform selection will be how much effort goes into maintenance and upkeep.

Container management platform choices often fall along data center vs. cloud lines. “People going with Google Cloud [Platform] tend to prefer Kubernetes. People with complex private data centers tend to consider Mesos, though that’s changing as data center vendors have started to support Kubernetes and Docker EE,” Fisher said. Cox Automotive is consolidating data centers and adopting public cloud, so a container management product must work with on-premises infrastructure and public cloud deployments.

Part of Cox’s evaluation of Kubernetes and Mesos is to examine the “scar tissue” from difficult previous container deployment attempts, Riggins said. Peers who have already implemented each technology are also valuable information sources, he said.

Containers are just one piece of the puzzle when it comes to delivering highly available, scalable containerized applications.

When to orchestrate a change

Most companies stick with their container management platform from pilot to large-scale production, and only change course when they hit a limitation. One popular goal for container orchestration is more flexible integration between components, but the market isn’t that mature yet, Fisher said.

Social Tables, a cloud-native 100% Amazon Web Services customer, bucked the comfort zone trend when it chucked its initial choice of the Amazon EC2 Container Service (ECS).


“We switched from ECS to Rancher because we wanted to move away from ELB [Elastic Load Balancing] and run our own global load balancing service for better control over our traffic,” said Michael Dumont, lead systems engineer in DevOps at the Washington, D.C., firm which provides social event planning and management SaaS. The company also required persistent storage for a Cassandra cluster, an Elasticsearch cluster, Redis, and Prometheus, and with Rancher it also gets DNS-based service discovery, Docker-Compose support, and GitHub OAuth integration for authentication and authorization.

While companies are unlikely to switch container orchestrators, sometimes they don’t have a choice. In this emergent space, container orchestrators, schedulers and related tools for storage and network management change constantly. For example, Rancher Labs brought in Kubernetes for Rancher 2.0. Both Maurer and Dumont hope Rancher will keep Kubernetes under the hood to preserve the familiar interface while enriching its management capabilities.

Support matters in emerging technologies

In the rapid modernization climate for an IT organization, any new tool has to do more than provide necessary technology — it has to be supported.

Cox Automotive will select a supported version of Kubernetes or Mesos, not pure upstream open source, because it encountered difficulties getting container deployments up and running at enterprise scale, Riggins said, adding that his team is familiar with taking the unsupported open source route but that it isn’t the right fit in this case.

During Dealer Tire’s container management platform evaluation period, Rancher’s support engineers worked through a problem. “This was before we spent a dime with them,” Maurer said. Today, his group relies on enterprise support, and he believes commercial versions of open source technologies are the best option for IT organizations that want to safely move into new areas, and avoid the time and money to get a platform running only to find out support falls flat.

“My biggest challenge with purchased software is it’s really hard to [simulate real use] when you’re limited to a two-week trial,” he said. “It’s nice to be able to deploy something, configure something significant and then decide, ‘I’ve invested quite a bit of my business into the software — I need to buy support to make sure my business continues to succeed.'”

Work with what you have

Social Tables’ cloud-native, startup pedigree is the tailor-made case for containerization, but enterprise IT pros can suit up their traditional apps with containers, too.

At Dealer Tire, Maurer’s team started with a simple app that was not customer-facing as the lowest-risk entry point to containers. The team communicates with application owners about which apps are a good fit on containers, and which are not. A 100% move to containers is not going to happen at Dealer Tire, but Maurer expects to convert all the web apps. At the same time, the company puts new software development in containers — a natural fit, in his estimation.

Dealer Tire also decided to stay on premises during its ramp up of containers. It was too much change at once, and changing responsibilities, to go to a cloud model, and some of the company’s diverse supported apps are not conducive to cloud ops, Maurer said. However, a future phase of cloud migration would be easier with these workloads encapsulated in Docker containers, he said.

“There’s a learning curve, and because the system’s new you have to set new expectations on every facet,” he said. “What directory are you using? How do you log things? … How do you communicate your errors and metrics?” Whereas before everything lived on the server, now systems are volatile and ephemeral. “It’s not just moving to containers — you’re changing everything about your environment,” he said.