As International Financial Data Services (IFDS) started containerizing more and more environments, it needed better Kubernetes backup.
IFDS first dipped its toes into containers in 2017. The company writes its own software for the financial industry, and the first container deployment was for its application development environment. With the success of the large containerized quality assurance testing environment, the company started using containers in production as well.
Headquartered in Toronto, IFDS provides outsourcing and technology for financial companies, such as investment funds’ record keeping and back-office support. It has around 2,400 employees, a clientele of about 240 financial organizations and $3.6 trillion CAD ($2.65 trillion US) in assets under administration.
Kent Pollard, senior infrastructure architect at IFDS, has been with the company for 25 years of its 33-year history and said containerizing production opened up a need for backup. One of the use cases for containers is to quickly bring up applications or services with little resource overhead and without the need to store anything. However, Pollard said IFDS' container environment was no longer about simply spinning up and spinning down.
“We’re not a typical container deployment. We have a lot of persistent storage,” Pollard said.
Zerto recently unveiled its Zerto for Kubernetes backup product at ZertoCon 2020, but Pollard has been working with an alpha build of it for the past month. He said it is still in early stages, and he’s been giving feedback to Zerto, but he has a positive impression so far. Pollard said not having to turn to another vendor such as IBM, Asigra or Trilio for Kubernetes backup will be a huge benefit.
Pollard's current container backup method uses Zerto to restore containers in a roundabout way. His container environment is built on Red Hat OpenShift and runs in a virtualized environment. Zerto is built to replicate VMs, so Pollard can use it to restore the entire VM housing OpenShift. The drawback is that this reverts the entire VM to an earlier state when all he wanted was to restore a single container.
Pollard said that, at the least, Zerto for Kubernetes allows him to restore at the container level instead. He understood the early nature of what he's been testing and said he is looking forward to other Zerto capabilities being added, such as ordered recovery and automated workflows for failover and testing. From his limited experience, Pollard said he believes Zerto for Kubernetes has the potential to fill his container backup needs.
Pollard said Zerto for Kubernetes will give him incentive to containerize more of IFDS’ environment. The number of containers IFDS currently has in production is still relatively small, and part of the reason Pollard won’t put more critical workloads in containers is because he can’t protect them yet.
He said there were many reasons IFDS moved to containers three years ago. With containers, IFDS can use its underlying hardware resources more efficiently, enabling faster responses to application load changes. Pollard also said containers improved IFDS' security, support the company's future move to the cloud and helped it build out a hybrid infrastructure. Zerto provided Pollard with an AWS environment to test Zerto for Kubernetes, but IFDS currently has no cloud footprint whatsoever.
IFDS first deployed Zerto in late 2014. It started as a small production environment deployment on a couple of VMs but became the company's standard tool for disaster recovery. IFDS now uses Zerto to protect 190 VMs and 200 TB of storage. Pollard said he was sold after the first annual DR test, which Zerto completed in 30 minutes.
“We never had anything that fast. It was always hours and hours for a DR test,” he said.
A pioneer in virtual data protection is preparing for its now-virtual user conference.
Veeam Software has run virtual events before, but VeeamON 2020 was scheduled for Las Vegas in May until the coronavirus pandemic forced the data protection and management vendor to shift gears. The free virtual edition runs June 17-18.
The backup vendor wants to keep a live atmosphere for its virtual show. Sessions will include live Q&As, and VeeamON 2020 will even include a Veeam party, featuring a performance by Keith Urban.
“It can’t be a death-by-PowerPoint,” Veeam chief marketing officer Jim Kruger said. “You’ve got to mix it up.”
‘New Veeam’ takes on new type of user conference
Veeam has previously hosted a virtual VeeamON in addition to its in-person user conference. Kruger said the company learned from that prior experience and seeks a live component to the virtual show. Along with Q&As, the agenda includes a “VeeamathON” collection of 10 sessions — with live back-and-forth — that highlight different functionality of Veeam products.
“We want to make it as interactive as possible,” Kruger said.
That interactivity is one of the key elements that Christophe Bertrand, senior analyst at Enterprise Strategy Group, will watch for as he attends VeeamON 2020. He said he’ll also look for the quality of content, how it’s presented and accessibility to the content after the show is over.
Though the physical face-to-face interaction is lost, one benefit to a virtual show is being able to view sessions on your own time. Users can easily consume content such as product demos on-demand, Bertrand said. And the vendor can keep users coming back.
Kruger said the sessions will be available on-demand.
“You’re changing the pace you can consume the information,” Bertrand said.
Another benefit is the volume of attendees. While a typical Veeam conference might draw about 2,000 people, 12,000 from 148 countries had registered for VeeamON 2020 as of last week, Kruger said. The vendor is hoping for 15,000.
“It gives us a much broader audience to go after,” Kruger said.
Conference speakers include Veeam CEO William Largent and former Microsoft CFO and CIO John Connors. Session topics range from ransomware resiliency to Office 365 backup to Kubernetes protection.
Judging by the VeeamON 2020 tagline, “Elevate Your Cloud Data Strategy,” the cloud will play a significant role over the two days. Sessions include “AWS and Azure Backup Best Practices,” “Cloud Mobility and Data Portability” and “Current Global Backup Trends & the Future State of Cloud Data Management.”
Veeam plans to include product news at the conference. In recent years, Veeam has expanded its data protection from the virtual focus to physical and cloud support. In May, the vendor launched a partnership with Kasten in container data protection. In February, Veeam launched version 10 of its Availability Suite, featuring enhanced NAS backup and ransomware protection.
VeeamON 2020 comes five months after Insight Partners bought the data protection vendor at a $5 billion valuation. At that time, Largent returned to the role of CEO, replacing co-founder Andrei Baronov. Ratmir Timashev, the other founder and also a former Veeam CEO, left his executive vice president position.
“It’s kind of a new Veeam, in a sense,” Bertrand said.
Bertrand listed the cloud, automation and ransomware as key focus areas. He said he’d like to see more from Veeam in intelligent data management — that is, how organizations can reuse data for other purposes such as analytics or DevOps.
Amid the pandemic, vendors should be looking to deliver data protection in a way that uses remote management and automation, Bertrand said. That plays to a lot of vendors’ strengths.
“Veeam has some interesting cards to play,” Bertrand said.
Data protection report cites availability, staff challenges
One topic of discussion at VeeamON 2020 will be the vendor's recently released "2020 Data Protection Trends" report. It will be part of the keynote, and a couple of sessions will discuss its results.
Dave Russell, Veeam vice president of enterprise strategy, said one of the key takeaways from the report is that the “availability gap” still exists. Seventy-three percent of respondents said they either agreed or strongly agreed that there’s a gap between how fast they need applications recovered and how fast they actually recover, according to the report.
“The vast majority aren’t going to be able to meet their companies’ [service-level agreements],” said Russell, co-author of the report with Jason Buffington, vice president of solutions strategy at Veeam.
Ninety-five percent of organizations suffer unexpected outages and an average outage lasts nearly two hours, according to the report.
Lack of staff to work on new initiatives and lack of budget for new initiatives were the top two data protection challenges, with 42% and 40% of respondents choosing them, respectively.
Veeam commissioned Vanson Bourne to conduct the online survey of 1,550 enterprises from about 20 countries in early 2020. The agency sourced predominantly non-Veeam customers and respondents didn’t know Veeam was behind the report, Russell said.
The research concluded in January, before the pandemic had struck most regions. The staff and funding shortages will undoubtedly intensify, as layoffs, furloughs and budget cuts have already hit. As a result, simplicity in IT products will be important.
“[Organizations] need a solution that’s going to be very intuitive,” Russell said.
For example, if workers are furloughed or laid off, an organization should be wary of products that require days or weeks of training.
Russell said he thinks cloud use will rise. After the 2008 economic downturn, Russell said, he saw organizations holding onto their gear longer, with fewer refresh cycles, and the cloud was not what it is today.
“I think we will see people embrace cloud and all its various forms,” such as infrastructure and platform as a service, Russell said.
According to the report, 43% of organizations plan to use cloud-based backup managed by a backup-as-a-service provider within the next two years. Thirty-four percent said they anticipate self-managed backup using cloud services as their organization’s primary method of backing up data.
Russell said he hopes IT administrators go to their bosses with the info in this report and show how their organizations can adapt.
“I hope people can use this as ammunition to say, hey, we’re not so different,” Russell said.
While operators behind Maze ransomware have been exposing victims’ data through a public-facing website since November 2019, new information suggests ransomware gangs are now teaming up to share resources and extort their victims.
On June 5, information and files for an international architectural firm were posted to Maze's data leak site; however, the data wasn't stolen in a Maze ransomware attack. It came from another ransomware operation known as LockBit.
Bleeping Computer first reported the story and later received confirmation from the Maze operators that they are working with LockBit and allowed the group to share victim data on Maze’s “news site.” Maze operators also stated that another ransomware operation would be featured on the news site in the coming days.
Three days later, Maze added the data for a victim of another competing ransomware group named Ragnar Locker. The post on Maze’s website references “Maze Cartel provided by Ragnar.”
Maze operators were the first to popularize the tactic of stealing data and combining traditional extortion with the deployment of ransomware. Not only do they exfiltrate victims' data, but they also created the public-facing website to pressure victims into paying the ransom.
Data exposure along with victim shaming is a growing trend, according to Brian Hussey, Trustwave’s vice president of cyber threat detection & response. Threat actors exfiltrate all corporate data prior to encrypting it and then initiate a slow release of the data to the public, he said.
“Certainly, we’ve seen an increase in the threat — the actual carrying out of the threat not as much from what I’ve seen,” Hussey said. “But a lot of times, it does incentivize the victim to pay more often.”
There are dozens of victims listed by name on the Maze site, but only 10 “full dump” postings for the group’s ransomware victims; the implication is most organizations struck by Maze have paid the ransom demand in order to prevent the publication of their confidential data.
Rapid7 principal security researcher Wade Woolwine has also observed an increase in these shaming tactics. Both Woolwine and Hussey believe the shift in tactics for ransomware groups is a response to organizations investing more time and effort into backups.
“My impression is that few victims were paying the ransom because organizations have stepped up their ability to recover infected assets and restore data from backups quickly in response to ransomware,” Woolwine said in an email to SearchSecurity.
One of the primary things Trustwave advises as a managed security services provider is to have intelligent, well-designed backup procedures, Hussey said.
“These new tactics are a response to companies that are mitigating ransomware risk by properly applying the backups. It has been effective. A lot of companies invested in backup solutions and design backup solutions to kind of protect from this ongoing scourge of ransomware. Now the response is even with backup data, if threat actors exfiltrate first and then threaten to release the private information, this is a new element of the threat,” Hussey said.
When threat actors make it past the perimeter to the endpoint and have access to the data, it makes sense to steal it as further incentive for organizations to pay to unencrypt the data, Woolwine said. And the threat actors pay particular attention to the most sensitive types of data inside a corporate network.
"Initially, we were seeing exploit kits like Cobalt Strike used by the attackers to look for specific files of interest manually. I say 'look,' but the Windows search function, especially if the endpoint is connected to a corporate file server, is largely sufficient to identify documents that say things like 'NDA,' 'contract' and 'confidential,'" Woolwine said. "More recently, we've seen these searches scripted so they can execute more quickly."
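The kind of keyword sweep Woolwine describes takes only a few lines of code, which is also why defenders can run the same sweep themselves to audit what an intruder browsing a file share would find. Below is a minimal sketch in Python; the keyword list and any paths are illustrative assumptions, not details from the attacks described.

```python
import os

# Keywords of the sort Woolwine cites; a defender can run the same sweep
# to see what an intruder with file-share access would turn up.
KEYWORDS = ("nda", "contract", "confidential")

def find_sensitive_files(root):
    """Walk a directory tree and return paths whose filenames contain a keyword."""
    hits = []
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            lowered = name.lower()
            if any(kw in lowered for kw in KEYWORDS):
                hits.append(os.path.join(dirpath, name))
    return hits
```

Pointing `find_sensitive_files()` at a mounted share (a hypothetical `/mnt/fileshare`, say) lists every match in seconds, which is exactly why exfiltration-before-encryption scales so easily.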
According to Woolwine, phishing and drive-by downloads continue to be the preferred delivery vectors for most ransomware attacks, but those techniques are shifting too.
“We also see attackers target specific internet-facing systems that have been unpatched, as well as targeting RDP servers with brute-force authentication attempts. In either case, once the vulnerability is exploited or the credentials guessed, the attackers will install ransomware before disconnecting,” Woolwine said. “The rise in tactics is very likely due to the shift from ransom to data exposure. It’s no longer about how many machines you can infect but infecting the machines that have access to the most data.”
Hussey said that while these new tactics were unexpected at the time, they are the next logical step in the ransomware progression, and he expects more threat actors to adopt them in the future.
After months of speculation, fast-growing cloud data warehouse vendor Snowflake has filed for an IPO.
“All our sources have confirmed that they filed using the JOBS Act approach,” said R “Ray” Wang, founder and CEO of Constellation Research.
The Jumpstart Our Business Startups (JOBS) Act was signed into law by President Barack Obama in 2012 and is intended to help fund small businesses by easing securities regulations, including allowing smaller firms to file for IPOs confidentially while testing the market.
“They have ramped up their sales and marketing to match the IPO projections and they’ve made substantial customer progress,” Wang added.
Snowflake, meanwhile, has not yet confirmed that its IPO is now officially in the works.
“No comment” was the official response from the vendor when reached for comment.
Snowflake, founded in 2012 and based in San Mateo, Calif., has appeared to be aiming at an IPO for more than a year.
The vendor is in a competitive market that includes Amazon Redshift, Google BigQuery, Microsoft Azure SQL Data Warehouse and SAP Data Warehouse, among others. Snowflake, however, has established a niche in the market and been able to grow from 80 customers when it released its first platform in 2015 to more than 3,400.
“Unlike other cloud data warehouses, Snowflake uses a SQL database engine designed for the cloud, and scales storage and compute independently,” said Noel Yuhanna, analyst at Forrester Research. “Customers like its ease of use, lower cost, scalability and performance capabilities.”
He added that unlike other cloud data warehouses, Snowflake can help customers avoid vendor lock-in by running on multiple cloud providers.
“If the IPO comes through, it will definitely put pressure on the big cloud vendors Amazon, Google and Microsoft who have been expanding their data warehouse solutions in the cloud,” Yuhanna said.
Snowflake has been able to increase its valuation from under $100 million when it emerged from stealth to more than $12 billion by growing its customer base and raising investor capital through eight funding rounds. An IPO has the potential to infuse the company with even more capital, and fundraising is often the chief reason a company goes public.
Other advantages include an exit opportunity for investors, publicity and credibility, a reduced overall cost of capital since private companies often pay higher interest rates to receive bank loans, and the ability to use stock as a means of payment.
Speculation that Snowflake was on a path toward going public gained momentum when Bob Muglia, who took over as CEO of Snowflake in 2014 just before it emerged from stealth, abruptly left the company in April 2019 and was replaced by Frank Slootman.
Before joining Snowflake, Slootman had led ServiceNow and Data Domain through their IPOs, and in October 2019 told an audience in London that Snowflake could pursue an IPO as early as summer 2020.
Four months later, in February 2020, Snowflake raised $479 million in venture capital funding led by Dragoneer Investment Group and Salesforce Ventures, which marked the vendor's eighth fundraising round and raised its valuation to more than $12.4 billion.
Eight funding rounds is rare, and to increase valuation beyond venture capital investments, companies are generally left with the option of either going public or being acquired.
Meanwhile, at its virtual user conference last week, Snowflake revealed expanded cloud data warehouse capabilities, including a new integration with Salesforce that will enable Snowflake to connect more easily to different data sources. The more capabilities Snowflake has, the more attractive it will be to potential investors in an IPO.
"Snowflake, I believe, has been looking at an IPO for a few years now," Yuhanna said. "They have had a steady revenue stream for a while, and many large Fortune companies have been using it for critical analytical deployments. Based on our inquiries, it's the top data warehouse that customers have been asking about besides Amazon Redshift."
Organizations are creating and consuming more data than ever before, spawning enterprise data management system challenges and opportunities.
A key challenge is volume. With enterprises creating more data, they need to manage and store more data. Organizations are now also increasingly relying on the cloud for enterprise data management system storage needs because of the cloud’s scalability and low cost.
IDC's Global DataSphere Forecast currently estimates that in 2020, enterprises will create and capture 6.4 zettabytes of new data. In terms of what types of new data are being created, productivity data — or operational, customer and sales data and embedded data — is the fastest-growing category, according to IDC.
"Productivity data encompasses most of the data we create on our PCs, in enterprise servers or on scientific computers," said John Rydning, research vice president for IDC's Global DataSphere.
Productivity data also includes data captured by sensors embedded in industrial devices and endpoints, which can be leveraged by an organization to reduce costs or increase revenue.
Rydning also noted that IDC is seeing growth in productivity-related metadata, which provides additional data about the captured or created data that can be used to enable deeper analysis.
Enterprise data management system challenges in a world of data growth
Looking ahead, Rydning sees challenges for enterprise data management.
Perhaps the biggest is dealing with the growing volume of archived data. With archival data, organizations will need to decide whether that data is best kept on relatively accessible storage systems for artificial intelligence analysis, or if it is more economical to move the data to lower-cost media such as tape, which is less readily available for analysis.
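The trade-off Rydning describes can be roughed out with back-of-the-envelope math: accessible storage costs more per terabyte, while cheaper media such as tape shifts the cost to each retrieval. The sketch below models that comparison; every price in it is an illustrative assumption, not a vendor quote.

```python
def annual_archive_cost(tb, price_per_tb_month, restores_per_year=0,
                        restore_cost_per_tb=0.0, tb_per_restore=0.0):
    """Rough yearly cost of an archive tier: storage plus retrieval.

    All inputs are illustrative assumptions for comparison, not real pricing.
    """
    storage = tb * price_per_tb_month * 12
    retrieval = restores_per_year * tb_per_restore * restore_cost_per_tb
    return storage + retrieval

# A 500 TB archive: always-accessible disk vs. cheaper tape with costly restores.
disk = annual_archive_cost(500, price_per_tb_month=20.0)
tape = annual_archive_cost(500, price_per_tb_month=4.0,
                           restores_per_year=12, restore_cost_per_tb=50.0,
                           tb_per_restore=10.0)
print(f"disk: ${disk:,.0f}/yr, tape: ${tape:,.0f}/yr")
```

Under these assumed numbers tape wins comfortably, but if the archive must feed frequent AI analysis, the retrieval term grows until the accessible tier becomes the cheaper option, which is exactly the decision organizations face.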
Another challenge is handling data from the edge of the network, which is expected to grow in the coming years. There too the question will be where organizations should store reference data for rapid analysis.
“Organizations will increasingly need to be prepared to keep up with the growth of data being generated across a wider variety of endpoint devices feeding workflows and business processes,” Rydning said.
The data management challenge in the cloud
In 2019, 34% of enterprise data was stored in the cloud. By 2024, IDC expects that 51% of enterprise data will be stored in the cloud.
While the cloud offers organizations a more scalable and often easier way to store data than on-premises approaches, not all that data has the same value.
"Companies are continuing to dump data into storage without thinking about the applications that need to consume it," said Monte Zweben, co-founder and CEO of Splice Machine. "They just substituted cheap cloud storage, and they continue to not curate it or transform it to be useful. It is now a cloud data swamp."
The San Francisco-based vendor develops a distributed SQL relational database management system with integrated machine learning capabilities. While simply dumping data into the cloud isn't a good idea, that doesn't mean Zweben is opposed to the idea of cloud storage.
Indeed, Zweben suggested that organizations use the cloud, since cloud storage is relatively cheap. The key is to make sure that instead of just dumping data, enterprises find ways to use that data effectively.
“You may later realize you need to train ML [machine learning] models on data that you previously did not think was useful,” Zweben said.
Enterprise data management system lessons from data innovators
"Without a doubt, some companies are storing a lot of low-value data in the cloud," said Andi Mann, chief technology advocate at Splunk, a security information and event management vendor. "But it is tough to say any specific dataset is unnecessary for any given business."
In his view, the problem isn’t necessarily storing data that isn’t needed, but rather storing data that isn’t being used effectively.
Splunk sponsored a March 2019 study conducted by Enterprise Strategy Group (ESG) about the value of data. The report, based on responses from 1,350 business and IT decision-makers, segments users by data maturity levels, with "data innovators" being the top category.
“While many organizations do have vast amounts of data — and that might put them in the data innovator category — the real difference between data innovators and the rest is not how much data they have, but how well they enable their business to access and use it,” Mann said.
Among the findings in the report is that 88% of data innovators employ highly skilled data investigators. However, even skilled people are not enough, so 85% of these innovative enterprises use best-of-breed analytics tools and make sure to provide easy access to them.
“Instead of considering any data unnecessary, look at how to store even low-value data in a way that is both cost-effective, while allowing you to surface important insights if or when you need to,” Mann suggested. “The key is to treat data according to its potential value, while always being ready to reevaluate that value.”
WANdisco has integrated its big data migration with the Microsoft Azure cloud.
WANdisco LiveData Platform for Azure — in customer preview — is designed to make it easier to move petabytes of data to Azure. Customers can discover LiveData through Azure Marketplace and access its services directly through the Azure portal and Azure command-line interface (CLI). With LiveData, customers can perform large-scale migrations of Hadoop data to Azure, and enable backup and disaster recovery (DR) in the cloud as well as cloud bursting. As a native service, LiveData Platform for Azure will show up on the same bill as other Azure services.
WANdisco also launched LiveData Migrator and LiveData Plane for the new Azure-based platform. These two work together to allow consistency between an on-premises Hadoop environment and Azure Data Lake Storage. LiveData Migrator performs a one-time scan of the on-premises data and feeds it to LiveData Plane, which captures any changes after that point.
LiveData can scan through petabyte-scale data and generate a copy in the cloud while ensuring both copies are the same. It is powered by WANdisco Fusion, a consensus engine that keeps data consistent and available across multiple environments. Because it is a single scan and data migration is continuous, nothing needs to be shut down. This integration with Azure makes it easier for Azure customers to discover and deploy LiveData.
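WANdisco has not published the internals of its Fusion engine, but the scan-once, capture-changes pattern the company describes can be sketched generically: bulk-copy everything present at scan time, journal every write that arrives during or after the scan, and replay the journal against the target until both sides converge. The toy in-memory version below illustrates the pattern only; the class and method names are illustrative, not WANdisco's API.

```python
class LiveMigrator:
    """Toy sketch of scan-once, replay-changes migration.

    Not WANdisco's implementation — just the general pattern of a one-time
    bulk scan plus a change journal that lets source and target converge
    without taking the source offline.
    """

    def __init__(self, source, target):
        self.source = source      # dict of path -> contents, stands in for HDFS
        self.target = target      # stands in for Azure Data Lake Storage
        self.journal = []         # writes captured during/after the scan

    def record_change(self, path, contents):
        """Production keeps writing to the source while migration runs."""
        self.source[path] = contents
        self.journal.append((path, contents))

    def initial_scan(self):
        """One-time bulk copy of everything present when the scan starts."""
        for path, contents in list(self.source.items()):
            self.target[path] = contents

    def replay_journal(self):
        """Apply captured changes to the target until it matches the source."""
        while self.journal:
            path, contents = self.journal.pop(0)
            self.target[path] = contents

src = {"/data/a": "v1"}
tgt = {}
m = LiveMigrator(src, tgt)
m.initial_scan()
m.record_change("/data/a", "v2")   # a write lands mid-migration
m.record_change("/data/b", "v1")   # a new file appears mid-migration
m.replay_journal()
print(tgt == src)  # True: both copies have converged
```

The key property — and the reason nothing has to be shut down — is that the journal absorbs mid-flight writes instead of requiring a frozen source, at the cost of the migration never "finishing" until the journal drains.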
LiveData’s ability to move petabytes of data without interrupting production and without risk of losing the data midflight is something no other vendor does, said Merv Adrian, Gartner research vice president of data and analytics. Moving data at this scale takes a long time, and traditionally involves a combination of physically shipping servers loaded with data to a cloud provider and/or transferring data to the cloud during non-peak hours. The data is inaccessible during migration using these methods. Adrian said as a result, enterprises tend not to move live, active data this way.
“Taking everything down until I’m finished isn’t an option,” Adrian said.
LiveData doesn’t technically “finish” the migration until later, but customers can access and make changes to all the data mid-migration. LiveData ensures those changes are reflected in all copies. Adrian said that’s an important differentiator from other migration tools.
WANdisco LiveData does not yet have similar integration with AWS or Google Cloud, but Adrian said that the Azure integration makes most sense. AWS has larger adoption, but Adrian pointed out that AWS and Google have no on-premises presence — those customers are already on the cloud. Microsoft customers are most likely hybrid, running Microsoft products in their data centers while also dipping into Azure for their cloud needs. They are the customers most likely looking to juggle petabytes of data between on-premises and cloud.
WANdisco CEO and founder David Richards said WANdisco focuses on serving the enterprise market. He said while AWS has higher general market adoption, it has similar adoption among enterprises as Azure. He also said Azure adoption is growing faster among the enterprise, partly because Microsoft’s office productivity and collaboration tools both on- and off-premises are widely popular.
Richards said cloud demand is spiking because of an increase in at-home workers as well as companies investing in AI and machine learning. Business has slowed across the board due to the COVID-19 pandemic, and companies are thinking of ways to modernize and transform their businesses in response. Investing in AI — specifically, the ability to make better decisions automatically — is a way for businesses to differentiate themselves.
“Businesses have to now reinvent themselves, but that has to come with severe IT mobilization,” Richards said. “The boldest move a company can make is looking at AI.”
Adrian brought up another point about the interplay between COVID-19 and cloud: many businesses are looking to cut costs, and CTOs are going to look at putting hardware on the chopping block. He said it depends on the workload, but in most cases, the total cost of ownership over three years for hosting on the cloud is cheaper than provisioning all the necessary hardware, floor space and cooling to host it on-premises.
Determining these costs and identifying which workloads are actually cheaper on the cloud is still a “black art,” Adrian said. It takes meticulous modeling to map out costs, and those models could still be wrong because the demands of the workloads and the cost of the cloud could grow or shrink unpredictably. However, Adrian said AI and machine learning are absolutely better done on the cloud because of the “bursty” nature of their compute demands.
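The modeling Adrian describes can be reduced to a simple comparison: amortized hardware, floor space and cooling on one side, metered cloud hours on the other, with bursty workloads tilting the result toward the cloud because idle capacity costs nothing there. The sketch below shows the shape of such a model; every figure in it is an illustrative assumption, which is precisely why real-world versions of this exercise so often miss.

```python
def on_prem_tco(hardware, floor_space_yr, cooling_power_yr, years=3):
    """Assumed on-prem cost: hardware capex up front plus yearly facility opex."""
    return hardware + (floor_space_yr + cooling_power_yr) * years

def cloud_tco(rate_per_hour, hours_per_year, years=3):
    """Assumed cloud cost: pay only for the hours actually consumed."""
    return rate_per_hour * hours_per_year * years

on_prem = on_prem_tco(hardware=60000, floor_space_yr=6000,
                      cooling_power_yr=9000)
# Steady workload: the capacity is needed around the clock (8,760 h/yr).
steady_cloud = cloud_tco(rate_per_hour=5.0, hours_per_year=8760)
# Bursty ML training: the same capacity is needed only ~15% of the time.
bursty_cloud = cloud_tco(rate_per_hour=5.0, hours_per_year=1300)

print(f"on-prem: ${on_prem:,.0f}")
print(f"cloud, steady: ${steady_cloud:,.0f}")
print(f"cloud, bursty: ${bursty_cloud:,.0f}")
```

Under these assumptions the always-on workload is cheaper to keep on-premises while the bursty one is far cheaper in the cloud, which matches Adrian's point that AI and machine learning workloads in particular favor the cloud.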
The past few months have forced the normally conservative data storage world to make on-the-spot adjustments to the ways people buy and use storage.
Recent earnings reports from leading storage companies provided a look at how they adapted to the changes. While they experienced mixed results, clear buying patterns and industry changes emerged in the data storage market. Storage leaders expect many of the changes will remain in place, even after the COVID-19 threat subsides.
The recent earnings calls showed some trends accelerated — such as a move from large data center arrays to hyper-converged infrastructure (HCI) and the cloud, and a shift from Capex to Opex spending. The pandemic also forced new selling strategies, as face-to-face sales calls and conferences gave way to virtual events and virtual meetings between buyers and sellers working remotely.
One major storage CEO even experienced COVID-19 personally.
“I contracted COVID-19 in mid-March,” Pure Storage CEO Charlie Giancarlo said last week on the company’s earnings call. “And that experience has provided me with a deep personal appreciation for this virus and its impact. The changes in people’s lives and livelihoods are truly extraordinary. And our expectations of what is or will be normal are forever changed. Every day, each new report on the crisis brings an uneasy mixture of anxiety, uncertainty and hope about the future.”
Storage vendors confronted this new normal over the last few months, with their business prospects also filled with uncertainty. Pure came out of it better than its larger direct competitors Dell EMC, NetApp and Hewlett Packard Enterprise. Still, it joined Dell and HPE in declining to give a forecast for this quarter because of uncertainty. NetApp did not give a long-term forecast but predicts a 6% revenue drop this quarter.
The following are some ways the data storage market changed during the first quarter of COVID-19:
Arrays give way to cloud, HCI
Flash array vendor Pure’s revenues increased 12% over last year, to $367 million. Other array vendors didn’t fare so well, while HCI and services revenue grew as organizations shifted to remote work and bought storage remotely.
Dell EMC’s storage revenue fell 5% to $3.8 billion, while its Infrastructure Solutions Group fell 8% overall (servers and networking dropped 10%). But while storage, servers and networking dipped, Dell reported double-digit growth in its VxRail HCI platform that combines those IT infrastructure tiers.
NetApp revenue dropped 12% to $1.4 billion, including a 21% decline in product revenue. NetApp all-flash array revenue of $656 million dropped 3% since last year, while cloud data services revenue of $111 million more than doubled. NetApp claims it has more than 3,500 cloud data services customers.
“I would tell you that as we think about the go-forward strategic roadmap, it’s much more tied to software and cloud services,” NetApp CEO George Kurian said.
HPE storage revenues declined 16% since last year.
Hyper-converged infrastructure specialist Nutanix reported an 11% revenue increase to $318.3 million. Dell-owned VMware also reported revenue from its vSAN HCI software increased more than 20%, as did its NSX software-defined networking product.
It’s no surprise that the VDI expansion would lead to HCI sales, because VDI was among the first common use cases for hyper-convergence. One change since the early days of HCI is that now many of those desktops are sold as a cloud service.
Nutanix CEO Dheeraj Pandey said the increase for VDI and desktop as a service (DaaS) in March and April “brought us back to our roots, when a much larger piece of our business supported virtual desktop workloads.”
VDI also helped flash storage catch on, as a way to deal with boot storms and peak periods required for heavy volume of virtual desktops. Not all flash vendors benefited last quarter, but Pure did.
“Certainly, VDI was one of the major use cases out there,” Pure’s Giancarlo said.
In May, NetApp acquired VDI and DaaS startup CloudJumper to address that market.
Who’s buying? And how?
COVID-19’s impact on storage buying was far from uniform. The pandemic left some industries financially devastated, while others had to expand to keep up.
Dell COO Jeff Clarke said Dell saw demand drop among SMBs and industries such as retail, manufacturing, energy and transportation. But financial services, government, healthcare and life sciences increased spending.
Kurian said NetApp also saw an increase in healthcare spending, driven by the pandemic and a need for digital imaging.
Organizations spending on storage are increasingly going to a utility model, buying storage as a service. Pure’s subscription services jumped 37% year over year to $120 million, making up one-third of its overall revenue.
“What we saw in Q1 was that the urgency was to beef up what they currently had in, and that was largely on prem,” Giancarlo said. “But they wanted the option, they didn’t want to sign on to five years of more on prem or anything along those lines. They wanted the option of being able to move to the cloud at any point in time. And that’s exactly what our Pure as-a-Service is designed to do in several respects.”
While Dell’s overall revenue was flat from last year, its recurring revenue increased 16%, to around $6 billion. That recurring revenue includes utility and as-a-service pricing.
“We have a very, very modern way to consume and digest IT with the very best products in the marketplace,” Clarke said.
Virtualized sales become common
Remote work has changed the way vendors and customers interact. Like with user conferences, sales calls have become a virtual experience.
“Our teams had to be nimble and quickly embrace a new sales motion,” Dell’s Clarke said. “We successfully pivoted to all virtual engagements with hundreds of thousands of virtual customer interactions in the quarter.”
Clarke said there has been no negative impact, as he and his sales team can meet with more customers than in the past.
Nutanix, which shifted its 2020 .NEXT user conference to a virtual event and pushed it until Sept. 8, has also moved in-person regional shows and boot camps online. Pandey said Nutanix has seen no drop-off in qualified leads for its sales team from going virtual.
“We have gone completely virtual and are seeing comparable yield in terms of qualified leads and virtual meetings for our sales organization at less than half the cost,” he said.
Cost-saving: Furloughs, pay cuts, hiring freezes
Unsure of what the immediate future will look like, IT companies are enacting cost reduction plans and realigning their teams.
Dell is implementing a global hiring freeze, reduction in consulting and contractor costs, global travel restrictions and a suspension of its 401(k) match plan.
HPE said it would enact pay cuts across the board, with the executive team taking the biggest reductions. CEO Antonio Neri also said HPE would reduce and realign the workforce as part of a cost reduction plan to save more than $1 billion over three years.
Nutanix implemented two nonconsecutive weeks of furloughs for a quarter of its employees and cut executive team members’ salaries by 10%.
Not all the vendors are reducing staff yet, though. NetApp CEO Kurian said the company reached its target of adding 200 primary sales reps a quarter ahead of schedule.
Pure Storage’s Giancarlo said it’s his “personal mission” to avoid layoffs or furloughs through the rest of 2020, although the company did have layoffs — which he called a “rebalancing” — before COVID-19 hit. “We believe we’re going to be able to perform in such a way that we will not have layoffs or furloughs,” he said.
Despite the changes to the data storage market, one constant is that data keeps growing in volume and importance for businesses around the world.
“While we cannot predict when the world will return to normal, the enduring importance of data is clear,” Kurian said.
Enterprise data governance isn’t just managing the data an organization possesses, it’s also key to managing the data supply chain, according to Charles Link, director of data and analytics at Covanta.
Link detailed his views on data management during a technology keynote at the Talend Connect 2020 Virtual Summit on May 27. Executives from other Talend customers, including AutoZone, also spoke at the event.
Covanta, based in Morristown, N.J., is in the waste-to-energy business, operating 41 facilities across North America and Europe. Data is at the core of Covanta’s operations as a way to help make business decisions and improve efficiency, Link said.
“We’re never just pushing data; we’re never just handing off the reports,” Link said. “The outcome is not data; it is always a business result.”
Link said he’s often observed that there can be a disconnect between decision-makers and the data that should be used to help make decisions.
To help connect data with decisions, “you really need both the data use and data management strategy to drive business outcomes,” Link said.
Enterprise data governance strategy defined
Link defined data use strategy as identifying business objectives for data and quantifying goals. The process includes key performance indicators to measure the success of data initiatives.
An enterprise data management strategy, on the other hand, is more tactical, defining the methods, tools and technologies used to access, analyze, manage and share data, he said.
At Covanta, Link said enterprise data governance is essentially about the need to have what he referred to as data supply chain management.
Link defined data supply chain management as data governance that manages where data comes from and helps ensure consistent quality from a reliable supplier.
For that piece, Covanta has partnered with Talend and is using the Talend Data Fabric, a suite of data integration and management tools that includes a data catalog that helps enable data supply chain management. With Talend as the technology base, Link said that his company has deployed a central hub for users within the organization to find and use trusted data.
“There is now a shared understanding across business and IT of what our data means,” Link said. “So now we trust the quality of the data we use to operate our facilities.”
The chaos of data demands driving AutoZone
For auto parts retailer AutoZone, managing the complexity of data and overcoming data challenges is a foundation of the company’s success, said Jason Vogel, IT manager of data management at AutoZone.
AutoZone has 6,400 stores and each store carries nearly 100,000 parts. In the background, AutoZone is moving data across its disparate data hubs and stores, making it available to the company’s business analysts. Data also helps ensure that AutoZone customers can get the parts they need quickly.
“We have 20 different types of databases — not instances, types,” Vogel emphasized. “We have thousands of instances and Talend serves as the glue to connect all these systems together.”
Vogel noted that AutoZone is looking to expand its real-time data processing so that it can do more in less time, getting parts to its customers faster. The company is also looking to expand operations overall.
“The only way to accomplish that is by moving more data, having more insight into how data is used and accomplishing it all faster,” Vogel said.
Many organizations continue to struggle with data
AutoZone isn’t the only organization that is trying to deal with data coming from many different sources. In another keynote at Talend Connect, Stewart Bond, research director of data integration and data intelligence software at IDC, provided some statistics about the current state of data integration challenges.
Bond cited a 2019 IDC survey of enterprise users’ experience with data integration and integrity that found most organizations are integrating up to six different types of data.
Those data types include transaction, file, object, spatial, internet of things and social data. Further adding to the complexity, the same study found that organizations are using up to 10 different data management technologies.
While enterprises are managing a lot of data, Bond said the survey shows that not all the organizations are using the data effectively. Data workers are wasting an average of 15 hours per week on data search, preparation and governance processes, IDC found. To improve efficiency, Bond suggested that organizations better manage and measure how data is used.
“Measurements don’t need to be complex; they can be as simple as measuring how much time people spend on data-oriented activity,” Bond said. “Set a benchmark and see if you can improve over time.”
Improving enterprise data governance with data trust
During her keynote, Talend CEO Christal Bemont emphasized that data quality and trust are keys to making the most efficient use of data.
She noted that it’s important to measure the quality of data, to make sure that organizations are making decisions based on good information. Talend helps its users enable data quality with a trust score for data sources, as part of the Talend Data Fabric.
“When you think about what Talend does, you know, you think of us as an integration company,” Bemont said. “Quite frankly we put equal, and maybe even in some cases more, importance on not only just being able to have a lot of data, but also having complete data.”
Restoring from backups is often the last resort when data is compromised by ransomware, but savvy criminals are also targeting those backups.
Arcserve enhanced its Sophos partnership to provide cybersecurity aimed at safeguarding backups, preventing cybercriminals from taking out organizations’ last line of ransomware defense. The Secured by Sophos line of Arcserve products, originally consisting of on-premises appliances that integrated Arcserve backup and Sophos security, extended its coverage to SaaS and cloud with two new entries: Arcserve Cloud Backup for Office 365 and Arcserve Unified Data Protection (UDP) Cloud Hybrid.
Arcserve UDP Cloud Hybrid Secured by Sophos is an extension to existing Arcserve software and appliances. It replicates data to the cloud, and the integrated Sophos Intercept X Advanced software scans the copies for malware and other security threats. The Sophos software recognizes the difference between encryption performed by normal backup processes and unauthorized encryption from bad actors.
Arcserve Cloud Backup for Office 365 Secured by Sophos is a stand-alone product for protecting and securing Office 365 data. It also uses Sophos Intercept X Advanced endpoint security, and it can do backup and restore for Microsoft Exchange emails, OneDrive and SharePoint.
Both new products are sold on an annual subscription model, with pricing based on storage and compute.
IDC research director Phil Goodwin described what has been an escalating battle between organizations and cybercriminals. Data protection vendors keep improving their products, and organizations keep learning more about backups. This trend allows companies to quickly and reliably restore their data from backups and avoid paying ransoms. Criminals, in turn, learn to target backups.
“Bad guys are increasingly attacking backup sets,” Goodwin said.
Arcserve’s Secured by Sophos products combine security and backup, specifically protecting backup data from cyberthreats. Organizations can realign their security to encompass backup data, but Arcserve’s products provide security out of the box. Goodwin said Acronis is the only other vendor he could think of that has security integrated into backup, while others such as IBM have data protection and security as separate SKUs.
From a development standpoint, security and data protection call on different skill sets, but both are necessary for combating ransomware. Goodwin said combining the two makes for a stronger defense system.
Oussama El-Hilali, CTO at Arcserve, said adding Office 365 to the Secured by Sophos line was important because more businesses are adopting the platform than in the past. There was already an upward trend of businesses putting mission-critical data on SharePoint and OneDrive, but the boost in remote work deployments caused by the COVID-19 pandemic accelerated that.
El-Hilali said the pandemic has increased the need for protecting data in clouds and SaaS applications more for SMBs than enterprises, because larger organizations may have large, on-premises storage arrays they can use. The Office 365 product is sold stand-alone because many smaller businesses only need an Office 365 data protection component, and nothing for on premises.
“The [coronavirus] impact is more visible in the SMB market. A small business is probably using a lot of SaaS, and probably doesn’t have a lot of data on-prem,” El-Hilali said.
Unfortunately, Office 365’s native data retention, backup and security features are insufficient in a world where many users are accessing their data from endpoint and mobile devices. Goodwin said there is a strong market need, and third parties such as Arcserve are seizing that opportunity.
“There’s a big opportunity there with Office 365 — it’s one of the greatest areas of vulnerability from the perspective of SaaS apps,” Goodwin said.
Microsoft 365 (formerly Office 365) provides a wide set of options for managing data classification, retention of different types of data, and archiving data. This article will show the options a Microsoft 365 administrator has when setting up retention policies for Exchange, SharePoint, and other Microsoft 365 workloads and how those policies affect users in Outlook. It’ll also cover the option of an Online Archive Mailbox and how to set one up.
There’s also an accompanying video to this article, which shows you how to configure a retention policy and retention labels, enable archive mailboxes, and create a move-to-archive retention tag.
How To Manage Retention Policies in Microsoft 365
There are many reasons to consider labeling data and using retention policies but before we discuss these let’s look at how Office 365 manages your data in the default state. For Exchange Online (where mailboxes and Public Folders are stored if you use them), each database has at least four copies, spread across two datacenters. One of these copies is a lagged copy which means the replication to it is delayed, to provide the option to recover from a data corruption issue. In short, a disk, server, rack, or even datacenter failure isn’t going to mean that you lose your mailbox data.
Further, the default policy (for a few years now) is that deleted items in Outlook stay in the Deleted Items folder “forever”, until you empty it or they are moved to an archive mailbox. If end users delete items out of their Deleted Items folder, the items are kept for another 30 days (as long as the mailbox was created in 2017 or later), meaning users can recover them by opening the Deleted Items folder and clicking the link.
Where to find recoverable items in Outlook
This opens the dialogue box where a user can recover one or more items.
Additionally, it’s also important to realize that Microsoft does not back up your data in Microsoft 365. Through native data protection in Exchange and SharePoint Online they make sure that they’ll never lose your current data, but if you have deleted an item, document or mailbox for good, it’s gone. There’s no secret place where Microsoft’s support can get it back from (although it doesn’t hurt to try), hence the popularity of third-party backup solutions such as Altaro Office 365 Backup.
Litigation Hold – the “not so secret” secret
One option that I have seen some administrators employ is to use litigation or in-place hold (the latter feature is being retired in the second half of 2020) which keeps all deleted items in a hidden subfolder of the Recoverable Items folder until the hold lapses (which could be never if you make it permanent). Note that you need at least an E3 or Exchange Online Plan 2 for this feature to be available. This feature is designed to be used when a user is under some form of investigation and ensures that no evidence can be purged by that user and it’s not designed as a “make sure nothing is ever deleted” policy. However, I totally understand the job security it can bring when the CEO is going ballistic because something super important is “gone”.
Litigation hold settings for a mailbox
If the default settings and options described above don’t satisfy the needs of your business or regulatory requirements you may have, the next step is to consider retention policies. A few years ago, there were different policy frameworks for the different workloads in Office 365, showing the on-premises heritage of Exchange and SharePoint. Thankfully we now have a unified service that spans most Office 365 workloads. Retention in this context refers to ensuring that the data can’t be deleted until the retention period expires.
There are two flavors here. The first is label policies, which publish labels to your user base, letting users pick a retention setting by assigning individual emails or documents a label (only one label per piece of content). Note that labels can do two things that retention policies can’t: first, they can apply from the date the content was labeled, and second, you can trigger a disposition/manual review of the SharePoint or OneDrive for Business document when the retention expires.
Labels only apply to objects that you label; they don’t retroactively scan through email or documents at rest. While labels can be part of a bigger data classification story, my recommendation is that anything that relies on users remembering to do something extra to manage data will only work with extensive training and for a small subset of very important data. You can (if you have E5 licensing for the users in question) use label policies to automatically apply labels to sensitive content, based on a search query you build (particular email subject lines or recipients, or SharePoint document types in particular sites, for instance) or on a set of trainable classifiers for offensive language, resumes, source code, harassment, profanity, and threats. You can also apply a retention label to a SharePoint library, folder, or document set.
As an aside, Exchange Online also has personal labels that are similar to retention labels but created by users themselves instead of being created and published by administrators.
A more holistic flavor, in my opinion, is retention policies. These apply to all items stored in the various repositories and can apply across several different workloads. Retention policies can also both ensure that data is retained for a set period of time AND disposed of after the expiry of the data, which is often a regulatory requirement. A quick note here if you’re going to play around with policies is that they’re not instantaneously applied – it can take up to 24 hours or even 7 days, depending on the workload and type of policy – so prepare to be patient.
These policies can apply across Exchange, SharePoint (which means files stored in Microsoft 365 Groups, Teams, and Yammer), OneDrive for Business, and IM conversations in Skype for Business Online / Teams and Groups. Policies can be broad and apply across several workloads, or narrow and only apply to a specific workload or location in that workload. An organization-wide policy can apply to the workloads above (except Teams, which needs a separate policy for its content) and you can have up to 10 of these in a tenant. Non-org-wide policies can be applied to specific mailboxes, sites, or groups, or you can use a search query to narrow down the content that the policy applies to. The limits are 10,000 policies in a tenant, each of which can apply to up to 1,000 mailboxes or 100 sites.
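When planning a larger rollout, it can help to sanity-check a proposed set of policies against the tenant limits quoted above before building them in the portal. The following is a minimal Python sketch; the dictionary shape for a policy is an illustration of mine, not a Microsoft API:

```python
# Tenant-wide limits for Microsoft 365 retention policies, as quoted above.
TENANT_LIMITS = {
    "org_wide_policies": 10,
    "total_policies": 10_000,
    "mailboxes_per_policy": 1_000,
    "sites_per_policy": 100,
}

def check_policy_plan(policies):
    """Return a list of limit violations for a proposed set of policies.

    Each policy is a dict like {"org_wide": bool, "mailboxes": int, "sites": int};
    this shape is purely illustrative.
    """
    problems = []
    if len(policies) > TENANT_LIMITS["total_policies"]:
        problems.append("more than 10,000 policies in tenant")
    if sum(1 for p in policies if p.get("org_wide")) > TENANT_LIMITS["org_wide_policies"]:
        problems.append("more than 10 org-wide policies")
    for i, p in enumerate(policies):
        if p.get("mailboxes", 0) > TENANT_LIMITS["mailboxes_per_policy"]:
            problems.append(f"policy {i} targets more than 1,000 mailboxes")
        if p.get("sites", 0) > TENANT_LIMITS["sites_per_policy"]:
            problems.append(f"policy {i} targets more than 100 sites")
    return problems

# An 11th org-wide policy trips the limit:
print(check_policy_plan([{"org_wide": True}] * 11))
```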
Especially with org-wide policies, be aware that they apply to ALL selected content, so if you set one to retain everything for four years and then delete it, data is going to automatically start disappearing after four years. Note that you can set the “timer” to start when the content is created or when it was last modified; the latter is probably more in line with what people would expect. Otherwise, you could have a list that someone updates weekly disappear suddenly because it was created several years ago.
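The difference between the two trigger choices can be sketched in a few lines of Python. The function name and dates here are illustrative, not part of any Microsoft API:

```python
from datetime import date, timedelta

def retention_expiry(created: date, last_modified: date,
                     retention_days: int, start_from: str = "modified") -> date:
    """Return the date on which a retention timer expires.

    start_from: "created" starts the clock at item creation;
    "modified" restarts it on every edit (usually what users expect).
    """
    anchor = created if start_from == "created" else last_modified
    return anchor + timedelta(days=retention_days)

# A list created in 2016 but still being edited in May 2020,
# under a roughly four-year (1,460-day) retain-then-delete policy:
created = date(2016, 1, 10)
modified = date(2020, 5, 20)

# With a "created" trigger the content is already past expiry...
print(retention_expiry(created, modified, 1460, "created"))   # 2020-01-09
# ...but with a "modified" trigger it survives until 2024.
print(retention_expiry(created, modified, 1460, "modified"))  # 2024-05-19
```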
To create a retention policy, log in to the Microsoft 365 admin center, expand Admin centers, and click on Compliance. In this portal click on Policies and then Retention under Data.
Retention policies link in the Compliance portal
Select the Retention tab and click New retention policy.
Retention policies and creating a new one
Give your policy a name and a description, select which data stores it’s going to apply to and whether the policy is going to retain and then delete data or just delete it after the specified time.
Retention settings in a policy
Outside of the scope of this article but related are sensitivity labels, instead of classifying data based on how long it should be kept, these policies classify data based on the security needs of the content. You can then apply policies to control the flow of emails with this content, or automatically encrypt documents in SharePoint for instance. You can also combine sensitivity and retention labels in policies.
Since there can be multiple policies applied to the same piece of data and perhaps even retention labels in play there could be a situation where conflicting settings apply. Here’s how these conflicts are resolved.
Retention wins over deletion, making sure that nothing is deleted that you expected to be retained. The longest retention period wins: if one policy says two years and another says five years, content will be kept for five. The third rule is that explicit wins over implicit, so if a policy has been applied to a specific area such as a SharePoint library, it takes precedence over an organization-wide general policy. Finally, the shortest deletion period wins, so if an administrator has chosen to delete content after a set period of time, it will be deleted then even if another policy requires deletion after a longer period. Here’s a graphic that shows the four rules and their interaction:
Policy conflict resolution rules (courtesy of Microsoft)
As you can see, building a set of retention policies that really work for your business and don’t unintentionally cause problems is a project for the whole business, working out exactly what’s needed across different workloads, rather than the job of a “click-happy” IT administrator.
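The four precedence rules above lend themselves to a small worked example. The following is a simplified Python sketch of my own, not Microsoft's actual resolution engine, and it glosses over edge cases the real service handles:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Policy:
    name: str
    retain_days: int = 0                      # 0 = this policy does not retain
    delete_after_days: Optional[int] = None   # None = this policy does not delete
    explicit: bool = False                    # scoped to a specific location/label

def resolve(policies: List[Policy]) -> Tuple[int, Optional[int]]:
    """Resolve conflicting settings per the four rules described above.

    Returns (effective_retention_days, effective_deletion_days).
    """
    # Rule 3: explicit wins over implicit - if any location-scoped
    # policy applies, only those are considered.
    explicit = [p for p in policies if p.explicit]
    pool = explicit if explicit else policies

    # Rule 2: the longest retention period wins.
    retain = max((p.retain_days for p in pool), default=0)

    # Rule 4: the shortest deletion period wins.
    deletions = [p.delete_after_days for p in pool if p.delete_after_days is not None]
    delete = min(deletions) if deletions else None

    # Rule 1: retention wins over deletion - nothing is deleted
    # while it is still being retained.
    if delete is not None and delete < retain:
        delete = retain
    return retain, delete

# Two-year vs. five-year retention: content is kept for five years.
print(resolve([Policy("a", retain_days=730), Policy("b", retain_days=1825)]))
```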
It all started with trying to rid the world of PST stored emails. Back in the day, when hard drive and SAN storage only provided small amounts of storage, many people learnt to “expand” the capacity of their small mailbox quota with local PST files. The problem is that these local files aren’t backed up and aren’t included in regulatory or eDiscovery searches. Office 365 largely solved part of this problem by providing generous quotas, the Business plans provide 50 GB per mailbox whereas the Enterprise plans have 100 GB limits.
If you need more mailbox storage, one option is to enable online archiving, which provides another 50 GB mailbox for the Business plans and an unlimited (see below) mailbox for the Enterprise plans. There are some limitations on this “extra” mailbox: it can only be accessed online, and it’s never synchronized to your offline (OST) file in Outlook. When you search for content, you must select “all mailboxes” to see matches in your archive mailbox. ActiveSync and the Outlook client on Android and iOS can’t see the archive mailbox, and users may need to manually decide what to store in which location (unless you’ve set up your policies correctly).
For these reasons many businesses avoid archive mailboxes altogether, just making sure that all mailbox data is stored in the primary mailbox (after all, 100 GB is quite a lot of emails). Other businesses, particularly those with a lot of legacy PST storage find these mailboxes fantastic and use either manual upload or even drive shipping to Microsoft 365 to convert all those PSTs to online archives where the content isn’t going to disappear because of a failed hard drive and where eDiscovery can find it.
For those that really need it and are on E3 or E5 licensing you can also enable auto-expanding archives which will ensure that as you use up space in an online archive mailbox, additional mailboxes will be created behind the scenes to provide effectively unlimited archival storage.
Click on a user’s name to be able to enable the archive mailbox.
Archive mailbox settings
Once you have enabled archive mailboxes, you’ll need a policy to make sure that items are moved into them at the cadence you need. Go to the Exchange admin center and click on Compliance management – Retention tags.
Exchange Admin Center – Retention tags
Here you’ll find the Default 2 year move to archive tag or you can create a new policy by clicking on the + sign.
Exchange Retention tags default policies
Pick Move to Archive as the action, give the policy a name and select the number of days that have to pass before the move happens.
Creating a custom Move to archive policy
Note that online archive mailboxes have NOTHING to do with the Archive folder that you see in the folder tree in Outlook; that is just an ordinary folder that you can move items into from your inbox for later processing. The Archive folder is available on mobile clients and also when you’re offline, and you can swipe in Outlook mobile to automatically store emails in it.
Now you know how and when to apply retention policies and retention tags in Microsoft 365, as well as when online archive mailboxes are appropriate and how to enable them and configure policies to archive items.