Tag Archives: data

The business benefits of enterprise data governance and MDM

With seemingly overwhelming amounts of data coming from myriad sources, the need for effective enterprise data governance strategies is of paramount importance to many organizations.

Enterprise data governance has many facets and can often intersect with master data management (MDM) efforts. That convergence was on display at Informatica’s MDM 360 and Data Governance virtual summit hosted on March 19.

The enterprise cloud data management vendor, based in Redwood City, Calif., has been particularly active in recent months, hiring a new CEO in January and expanding the company’s product portfolio with updated governance, data catalog and analytics capabilities.

“We all want tomorrow’s data yesterday, to make a decision for today,” Informatica CEO Amit Walia said during the event’s opening keynote.

Informatica’s virtual conference was among the many similar events that tech vendors have held or are planning to substitute for in-person events canceled because of the coronavirus pandemic.

One notable tech conference producer, O’Reilly Media, sponsor of the Strata Data and AI conferences, among others, said March 24 it is closing its in-person conference business altogether because of the pandemic.


How Hertz is mastering enterprise data governance and management

Meanwhile, with its global car rental operations, Hertz Corporation needs to collect and govern data on some 100 million customers and a fleet of nearly a million vehicles.


Speaking at the virtual event, Richard Leaton, master data leader at Hertz, outlined the challenges his organization faces and the best practices for data governance and data management Hertz has used.

“The overall business objectives of MDM from an IT perspective, was a $1 billion transformation, changing our reservation system, rental system, sales engine and fleet management,” Leaton said. “If it had an electronic component to it, I think we changed it.”

As part of that effort, Hertz needed to improve data quality and data governance so that there could be a single source of information for customer and fleet vehicle data.

Leaton noted that when he joined Hertz in 2017, the company had multiple sets of customer and vehicle master data sources and 30 years of mainframe-based proprietary databases. The systems were highly customized, not easy to upgrade and not uniformly governed.

Leaton emphasized that Hertz started with a process to engage all the right constituencies in the business.

“Data is an asset,” he said. “Data can have real hard numbers committed to it, and when you have hard numbers associated with a data program, you’re going to have people who are helping you to make that data program successful.”

The technology should be the easy part of data transformation, Leaton said. The business processes, the buy-in and making sure the right data quality is present become the hard parts.

Enterprise data governance is the key to master data management

The first step for enabling MDM is to start with data governance, according to Leaton.

“If you don’t have your terms defined, you can’t build an MDM suite effectively,” Leaton said. “We were partway along the governance journey and started into MDM the first time and that’s where we ran into trouble.”

Hertz IT managers thought that they had defined enterprise data governance terms, but they came to realize that the terms were not agreed upon across the multiple platforms of the business.

Securing executive buy-in for defining data governance across an organization is critical, Leaton said. He also emphasized that financial metrics and business value need to be associated with the effort. Business leaders need to understand what the business will get out of a data governance effort; it’s not enough just to want good data, and leaders need to define terms.

The defined terms for data governance can outline how the effort will help ensure regulatory compliance and how it will help to grow the business because all the systems talk to each other and there is better operational efficiency.

Data governance at Invesco

Rich Turnock, global head of enterprise data services at financial services firm Invesco, based in Louisville, Ky., also has a structured process for data governance.

The Invesco enterprise data platform incorporates three core steps for data governance and quality. In the planning phase, much like at Hertz, Turnock said the organization needs to define and document data requests in terms of business outcomes.

In the data capture phase, enterprise data governance policies for mapping and cataloging data are important. For data delivery, Turnock said data output should be delivered in the agreed-upon format and with preferred mechanisms that were defined up front in the planning process.

Using data to improve healthcare at Highmark Health

Using enterprise data governance and MDM best practices isn’t just about improving business outcomes. Those best practices can also improve healthcare.

Also at the Informatica virtual event, Anthony Roscoe, director of enterprise data governance at Highmark Health in Pittsburgh, explained how his organization embraced data governance and MDM. The key challenge for Highmark Health was that the organization had grown via acquisitions and ended up with multiple disparate data systems.

Operational integration of data is also part of Highmark Health’s data journey, making sure that clinical data from health systems can be correlated with health plans. It’s an approach that Roscoe said can help to streamline care decisions between the health insurance and care delivery portions of Highmark Health’s business.

The overriding goal of Highmark Health’s enterprise data platform is to take all the individual parts, find where the organization needs to gather data from so it can be organized, and ultimately govern the data so that appropriate access is in place.

“Mastering the data so that we speak a common language across the entire enterprise is key,” Roscoe said. “Speaking from the same language can deliver accurate data statements and reports and other metrics across the different business units.”


Canon breach exposes General Electric employee data

Canon Business Process Services, which processes current and former employees’ documents and beneficiary-related documents for General Electric, suffered a security incident, according to a data breach disclosure by GE.

GE systems were not impacted by the cyberattack, according to the company’s disclosure, but personally identifiable information for current and former employees as well as their beneficiaries was exposed in the Canon breach. The breach, which was first reported by BleepingComputer, took place between Feb. 3 and Feb. 14 of this year, and GE was notified of it on Feb. 28. According to the disclosure, “an unauthorized party gained access to an email account that contained documents of certain GE employees, former employees and beneficiaries entitled to benefits that were maintained on Canon’s systems.”

Said documents included “direct deposit forms, driver’s licenses, passports, birth certificates, marriage certificates, death certificates, medical child support orders, tax withholding forms, beneficiary designation forms and applications for benefits such as retirement, severance and death benefits with related forms and documents.” Personal information stolen “may have included names, addresses, Social Security numbers, driver’s license numbers, bank account numbers, passport numbers, dates of birth, and other information contained in the relevant forms.”

GE’s disclosure also said Canon retained “a data security expert” to conduct a forensic investigation. At GE’s request, Canon is offering two years of free identity protection and credit monitoring services.

GE shared the following statement with SearchSecurity regarding the Canon breach.

“We are aware of a data security incident experienced by one of GE’s suppliers, Canon Business Process Services, Inc. We understand certain personal information on Canon’s systems may have been accessed by an unauthorized individual. Protection of personal information is a top priority for GE, and we are taking steps to notify the affected employees and former employees,” the statement read.

Canon did not return SearchSecurity’s request for comment and, at press time, had not released a public statement.


Data center energy usage combated by AI efficiency

Data centers have become an important part of our data-driven world. They act as a repository for servers, storage systems, routers and all manner of IT equipment, and can stretch as large as an entire building — especially in an age of AI that requires advanced computing.

Establishing how much power these data centers utilize and the environmental impact they have can be difficult, but according to a recent paper in Science Magazine, the entire data center industry in 2018 utilized an estimated 205 TWh. This roughly translates to 1% of global electricity consumption.

Enterprises that utilize large data centers can use AI, advancements in storage capacity and more efficient servers to mitigate the power required for the necessary expansion of data centers.

The rise of the data center

Collecting and storing data is fundamental to business operations, and while maintaining your own infrastructure can be costly and challenging, ready access to this information is crucial to advancement.

Provoking the most coverage because of their massive size, data centers of tech giants like Google and Amazon often require the same amount of energy as small towns. But there is more behind these numbers, according to Eric Masanet, associate professor of Mechanical Engineering and Chemical and Biological Engineering at Northwestern University and coauthor of the aforementioned article.

The last detailed estimates of global data center energy use appeared in 2011, Masanet said.

Since that time, Masanet said, there have been many claims that the world’s data centers were requiring more and more energy. This has given policymakers and the public the impression that data centers’ energy use and related carbon emissions have become a problem.

Counter to this, Masanet and his colleagues’ studies on the evolution of storage, server and network technology found that efficiency gains have significantly mitigated the growth in energy usage in this area. From 2010 to 2018, compute instances went up by 550%, while energy usage increased just 6% in the same time frame. While data center energy usage is on the rise, it has been curbed dramatically through the development of different strategies.
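A back-of-the-envelope calculation using just those two figures shows the scale of the implied efficiency gain:

\[
\frac{\text{energy per compute instance, 2018}}{\text{energy per compute instance, 2010}} \approx \frac{1.06}{1 + 5.5} \approx 0.16
\]

In other words, the energy used per compute instance fell by roughly 84% over the period.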

Getting a step ahead of the data center footprint

The moderation in energy growth is tied to advancements in technology. Servers have become more efficient, and the partitioning of servers through server virtualization has curbed the energy required for the rapid growth of compute instances.

A similar trend is noticeable in the storage of data. While demand has significantly increased, the combination of storage-drive efficiencies and densities has limited the total increase in global storage energy usage to just threefold. To further curb the rising demand for data, and therefore the rising energy costs and environmental impact, companies are integrating AI when designing their data centers.

Data center efficiency has increased greatly but may be leveling off.

“You certainly could leverage AI to analyze utility consumption data and optimize cost,” said Scott Laliberte, a managing director with Protiviti and leader of the firm’s Emerging Technologies practice.

“The key for that would be having the right data available and developing and training the model to optimize the cost.”  

By having AI collect data on their data centers and optimize energy usage, these companies can mitigate power costs, especially for cooling, one of the most costly processes within data centers.

“The strategy changed a little bit — like trying to build data centers below ground or trying to be near water resources,” said Juan José López Murphy, Technical Director and Data Science Practice Lead at Globant, a digitally native services company.

But cooling these data centers has been such a large part of their energy usage that companies have had to be creative. Companies like AWS and GCP are trying new locations like the middle of the desert or underground and trying to develop cooling systems that are based on water and not just air, Murphy said.

Google uses an algorithm to manage cooling at some of its data centers; the algorithm learns from gathered data and limits energy consumption by adjusting cooling configurations.
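The article does not detail Google's algorithm. As a rough, hypothetical sketch of the general approach (learn a model of energy use from telemetry, then pick the cooling setting with the lowest predicted consumption), a Python example under those assumptions might look like this:

```python
# A rough, hypothetical sketch (not Google's production system): learn a model
# of facility energy use from telemetry, then search candidate cooling
# setpoints for the lowest predicted consumption.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)

# Hypothetical historical telemetry: ambient temp (C), IT load (kW),
# chiller water setpoint (C).
X = np.column_stack([
    rng.uniform(5, 35, 5000),
    rng.uniform(200, 800, 5000),
    rng.uniform(16, 27, 5000),
])
# Synthetic energy label: IT load plus a cooling penalty that grows when the
# setpoint is low relative to ambient temperature, plus noise.
y = X[:, 1] + 8 * np.maximum(X[:, 0] - X[:, 2], 0) + rng.normal(0, 10, 5000)

model = GradientBoostingRegressor().fit(X, y)

def recommend_setpoint(ambient_temp, it_load,
                       candidates=np.arange(16.0, 27.5, 0.5)):
    """Return the candidate setpoint with the lowest predicted energy use."""
    grid = np.column_stack([
        np.full_like(candidates, ambient_temp),
        np.full_like(candidates, it_load),
        candidates,
    ])
    return candidates[np.argmin(model.predict(grid))]

print(recommend_setpoint(ambient_temp=30.0, it_load=500.0))
```

In production systems the recommendation would be validated against safety constraints before any setpoint change is applied.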

Energy trends

For the time being, both the demand for data centers and their efficiency have grown. So far, advancements in servers and storage drives, along with the use of AI in data center design, have nearly kept pace with growing energy demand. This may not continue, however.

“Historical efficiency gains may not be able to outpace rapidly rising demand for data center services in the not-too-distant future,” Masanet said. “Clearly greater attention to data center energy use is warranted.”

The increased efficiencies have done well to stem the tide of rising demand, but the future of data centers’ energy requirements remains uncertain.


Databricks bolsters security for data analytics tool

One of the biggest challenges with data management and analytics efforts is security.

Databricks, based in San Francisco, is well aware of the data security challenge, and recently updated its Unified Analytics Platform with enhanced security controls to help organizations minimize their data analytics attack surface and reduce risks. Alongside the security enhancements, new administration and automation capabilities make the platform easier to deploy and use, according to the company.

Organizations are embracing cloud-based analytics for the promise of elastic scalability, supporting more end users, and improving data availability, said Mike Leone, a senior analyst at Enterprise Strategy Group. That said, greater scale, more end users and different cloud environments create myriad challenges, with security being one of them, Leone said.

“Our research shows that security is the top disadvantage or drawback to cloud-based analytics today. This is cited by 40% of organizations,” Leone said. “It’s not only smart of Databricks to focus on security, but it’s warranted.”

He added that Databricks is extending foundational security in each environment with consistency across environments, and that the vendor is making it easy to simplify administration proactively.


“As organizations turn to the cloud to enable more end users to access more data, they’re finding that security is fundamentally different across cloud providers,” Leone said. “That means it’s more important than ever to ensure security consistency, maintain compliance and provide transparency and control across environments.”

Additionally, Leone said that with its new update, Databricks provides intelligent automation to enable faster ramp-up times and improve productivity across the machine learning lifecycle for all involved personas, including IT, developers, data engineers and data scientists.

Gartner said in its February 2020 Magic Quadrant for Data Science and Machine Learning Platforms that Databricks Unified Analytics Platform has had a relatively low barrier to entry for users with coding backgrounds, but cautioned that “adoption is harder for business analysts and emerging citizen data scientists.”

Bringing Active Directory policies to cloud data management

Data access security is handled differently on-premises compared with how it needs to be handled at scale in the cloud, according to David Meyer, senior vice president of product management at Databricks.

Meyer said the new updates to Databricks enable organizations to more efficiently use their on-premises access control systems, like Microsoft Active Directory, with Databricks in the cloud. A member of an Active Directory group becomes a member of the same policy group with the Databricks platform. Databricks then maps the right policies into the cloud provider as a native cloud identity.

Databricks uses the open source Apache Spark project as a foundational component and provides more capabilities, said Vinay Wagh, director of product at Databricks.

“The idea is, you, as the user, get into our platform, we know who you are, what you can do and what data you’re allowed to touch,” Wagh said. “Then we combine that with our orchestration around how Spark should scale, based on the code you’ve written, and put that into a simple construct.”

Protecting personally identifiable information

Beyond just securing access to data, there is also a need for many organizations to comply with privacy and regulatory compliance policies to protect personally identifiable information (PII).

“In a lot of cases, what we see is customers ingesting terabytes and petabytes of data into the data lake,” Wagh said. “As part of that ingestion, they remove all of the PII data that they can, which is not necessary for analyzing, by either anonymizing or tokenizing data before it lands in the data lake.”
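The article does not show customers' actual pipelines. A minimal PySpark sketch of the tokenize-before-landing pattern Wagh describes, assuming a hypothetical raw source with email and Social Security number columns, could look like the following:

```python
# A rough sketch of the tokenize-before-landing pattern (a common approach,
# not Databricks' specific implementation). Assumes a hypothetical raw source
# with `email` and `ssn` columns.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pii-tokenize").getOrCreate()

raw = spark.read.json("/mnt/raw/customer_events/")  # hypothetical ingest path

tokenized = (
    raw
    # Replace direct identifiers with one-way tokens so downstream users can
    # still join and count on them without seeing the raw values. In practice,
    # a salted hash or a token vault is preferable for low-entropy fields.
    .withColumn("email_token", F.sha2(F.col("email"), 256))
    .withColumn("ssn_token", F.sha2(F.col("ssn"), 256))
    .drop("email", "ssn")
)

# Only the tokenized records land in the data lake.
tokenized.write.mode("append").parquet("/mnt/lake/customer_events/")
```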

In some cases, though, there is still PII that can get into a data lake. For those cases, Databricks enables administrators to perform queries to selectively identify potential PII data records.

Improving automation and data management at scale

Another key set of enhancements in the Databricks platform update is for automation and data management.

Meyer explained that historically, each Databricks customer had essentially one workspace in which they put all their users. That model doesn’t really let organizations isolate different users, however, or maintain different settings and environments for various groups.

To that end, Databricks now enables customers to have multiple workspaces to better manage and provide capabilities to different groups within the same organization. Going a step further, Databricks now also provides automation for the configuration and management of workspaces.

Delta Lake momentum grows

Looking forward, the most active area of development at Databricks is the company’s Delta Lake and data lake efforts.

Delta Lake is an open source project started by Databricks and now hosted at the Linux Foundation. The core goal of the project is to enable an open standard around data lake connectivity.

“Almost every big data platform now has a connector to Delta Lake, and just like Spark is a standard, we’re seeing Delta Lake become a standard and we’re putting a lot of energy into making that happen,” Meyer said.
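As a small illustration of that interoperability, the open source delta-spark package lets a Spark job write and read the Delta format; the table path and sample data below are hypothetical:

```python
# A small, self-contained example of the open Delta Lake format using the
# open source delta-spark package (assumed to be installed); the table path
# and sample data are made up for illustration.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-demo")
    # Register Delta Lake's SQL extensions and catalog with the session.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

df = spark.createDataFrame(
    [(1, "click", 0.5), (2, "purchase", 42.0)],
    ["event_id", "event_type", "value"],
)

# Writing produces Parquet data files plus a transaction log, which is what
# gives Delta tables ACID guarantees and time travel.
df.write.format("delta").mode("overwrite").save("/tmp/delta/events")

# Any engine with a Delta connector can read the same table back.
spark.read.format("delta").load("/tmp/delta/events").show()
```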

Other data analytics platforms ranked similarly by Gartner include Alteryx, SAS, Tibco Software, Dataiku and IBM. Databricks’ security features appear to be a differentiator.


Commvault storage story expands with Hedvig for primary data

Of all the changes data protection vendor Commvault made in the last year, perhaps the most striking was its acquisition of primary storage software startup Hedvig.

The $225 million deal in October 2019 — eight months into Sanjay Mirchandani’s tenure as CEO — marked Commvault’s first major acquisition. It also brought the backup specialist into primary storage as it tries to adapt to meet demand for analytics on data everywhere.

Hedvig gives Commvault a distributed storage platform that spans traditional and cloud-hosted workloads. The Hedvig software runs primary storage on commodity hardware and has already been integrated into the Commvault storage software stack, including the new Commvault Metallic SaaS-based backup.

Don Foster, a vice president of storage solutions at Commvault, said data centers want to centralize all their data, from creation to retention, without adding third-party endpoints.

“We envision Hedvig as a way to ensure that your storage and backup will work in a symbiotic fashion,” Foster said.

Hedvig provides unified storage that allows Commvault to tackle new cloud-application use cases. The storage software runs on clustered commodity nodes as a distributed architecture for cloud and scale-out file and object storage across multiple hypervisors.

Commvault plans to use Hedvig to converge storage and data management and enhance Commvault HyperScale purpose-built backup appliances. Revenue from Commvault HyperScale appliances was up 10% year over year last quarter, and the vendor said six of its top 10 customers have deployed HyperScale appliances.

Commvault has expanded Hedvig into more primary workloads with the addition of support for the Container Storage Interface and erasure coding. In the near term, Hedvig will also remain available for purchase as primary storage, and existing Hedvig customers with in-force contracts will be supported. The larger plan is to integrate Hedvig as a feature in the Commvault Complete suite of backup and data management tools, Foster said.

Integrating technology and integrating culture

Mirchandani replaced retired CEO Bob Hammer, who led Commvault for 20 years. The change at the top also brought about a raft of executive changes and the launch of the Metallic SaaS offering under a brand outside of Commvault. But the Hedvig deal was most significant in moving the Commvault storage strategy from data protection to data management — a shift backup vendors have talked about for years.

Because Hedvig didn’t have a large installed base, the key for Commvault was gaining access to Hedvig’s engineering IP, said Steven Hill, a senior analyst of applied infrastructure and storage technologies at 451 Research, part of S&P Global Market Intelligence.


“Growing adoption of hybrid cloud infrastructure and scale-out secondary storage has changed the business model for backup vendors. Hedvig gives Commvault a software-defined storage platform that combines block, file and object storage services, along with cloud-level automation and support for containers. It checks a lot of boxes for the next generation of storage buyers,” Hill said.

“The future of hybrid secondary storage lies in the management of data based on the business value of its content, and makes the need for broader, cloud-optimized information management a major factor in future storage buying decisions,” Hill added. He said Cohesity and Rubrik “discovered this [idea] a while ago” and other backup vendors now are keying in on secondary storage to support AI and analytics.

A research note by IDC said the Hedvig deal signals “orthogonal and expansionary thinking” by Commvault that paves a path to primary storage and multi-cloud data management. Commvault is a top five backup vendor in revenue; its revenue has declined year over year for each of the last four quarters. Commvault reported $176.3 million in revenue last quarter, down 4.3% from the same period a year ago.

IDC researchers note the difference between traditional Commvault storage and the Hedvig product. Namely, that Commvault is a 20-year-old public company in an entrenched market, while Hedvig launched in 2018. The companies share only a few mutual business partners and resellers.

“Market motion matters here, as each company is selling into different buyer bases.  … Melding a unified company and finding synergies between different buying centers may be more difficult than the technical integration,” IDC analysts wrote in a report on the Commvault-Hedvig acquisition.

‘Belts and suspenders’ approach

Pittsburg State University (PSU) in Kansas has deployed Hedvig primary storage and Commvault backup for several years. Tim Pearson, the university’s assistant director of IT infrastructure and security, said he was not surprised to hear about the Hedvig deal.

“I knew Hedvig was looking for a way to grow the company,” Pearson said, adding that he spoke with Commvault representatives in the run-up to the transaction.

PSU runs Hedvig storage software on Hewlett Packard Enterprise ProLiant servers as frontline storage for its VMware farm and protects data with Commvault backup. Pearson said the “belts and suspenders” approach designed by Hedvig engineers enables Commvault to bridge production storage and secondary use cases.

“What I hope to gain out of this is a unified pane of glass to manage not only my traditional Commvault backups, but also point-in-time recovery by scheduling Hedvig storage-level snapshots,” Pearson said.


Data Science Central co-founder talks AI, data science trends

The data science field has changed greatly with the advent of AI. Artificial intelligence has enabled the rise of citizen data scientists and the automation of data scientists’ workloads, while also increasing the need for more skilled data scientists.

Vincent Granville, co-founder of Data Science Central, a community and resource site for data specialists, expects to see an increase in AI and IoT in data science over the next few years, even as AI continues to change the data science field.

In this Q&A, Granville discusses data science trends, the impact of AI and IoT on data scientists, how organizations and data scientists will have to adapt to increased data privacy regulations, and the evolution of AI.

Data Science Central was acquired by TechTarget on March 4.

Will an increase in citizen data scientists due to AI, as well as an increase of more formal data science education programs, help fix the so-called data scientist shortage?

Vincent Granville: I believe that we will see an increase on two fronts. We will see more data science programs being offered by universities, perhaps even doctorates in addition to master’s degrees, as well as more bootcamps and online training aimed at practitioners working with data but lacking some skills such as statistical programming or modern techniques such as deep learning — something old but that became popular recently due to the computational power now available to train and optimize these models.


There is also a parallel trend that will increase, consisting of hiring professionals not traditionally thought of as data scientists, such as physicists, who have significant experience working with data. This is already the case in fintech, where these professionals learn the new skills required on the job. Along with corporations training staff internally via sending selected employees to tech and data bootcamps, this will help increase the pipeline of potential recruits for the needed positions.

Also, AI itself will help build more tools to automate some of the grunt work, like data exploration, that many data scientists do today, currently eating up to 80% of their time. Think of it as AI to automate AI.

Similarly, how will, or how has, data science changed with the advent of AI that can automate various parts of the data science workflow?

Granville: We will see more automation of data science tasks. In my day-to-day activities, I have automated as much as I can, or outsourced or used platforms to do a number of tasks — even automating pure mathematical work such as computing integrals or finding patterns in number sequences.

The issue is resistance by employees to use such techniques, as they may perceive it as a way to replace them. But the contrary is true: Anything you do [manually] that can be automated actually lowers your job security. A change in mentality must occur for further adoption of automated data science for specific tasks, simple or not so simple, such as the creation of taxonomies, or programs that write programs.

The trend probably started at least 15 years ago, with the advent of machine-to-machine communications, using APIs and the internet at large for machines, aka robots, to communicate between themselves, and even make decisions. Now with a huge amount of unexploited sensor data available, it even has a term of its own: IoT.

An example is this: eBay purchases millions of keywords on Google; the process, including predicting the value, ROI and set[ting] the pricing for keywords, is fully automated. There is a program at eBay that exchanges info with one running at Google to make this transparent, including keyword purchasing, via programmed APIs. Yet eBay employs a team of data scientists and engineers to make sure things run smoothly and are properly maintained, and same with Google.

How will increased data privacy regulations and a larger focus on cybersecurity change data science?

Granville: It will always be a challenge to find the right balance. People are getting concerned that their data is worth something, more than just $20, and don’t like to see this data sold and resold time and over by third parties, or worse, hijacked or sold for nefarious purposes such as surveillance. Anything you post on Facebook can be analyzed by third parties and end up in the hands of government agencies from various countries, for profiling purposes, or detection of undesirable individuals.

Some expectations are unrealistic: You cannot expect corporations to tell what is hidden in the deep layers of their deep learning algorithms. This is protected intellectual property. When Google shows you search results, nobody, not even Google, knows how what you see — sometimes personalized to you — came up that way. But Google publishes patents about these algorithms, and everyone can check them.

The same is true with credit scoring and refusal to offer a loan. I think in the future, we will see more and more auditing of these automated decisions. Sources of biases will have to be found and handled. Sources of errors due to ID theft, for example, will have to be found and addressed. The algorithms are written by human beings, so they are not less biased than the human beings who designed them in the first place. Some seemingly innocuous decisions such as deciding which features, or variables, to introduce in your algorithm, potentially carry a bias.

I could imagine some companies [may] relocate … or even stop doing business altogether in some countries that cause too many challenges. This is more likely to happen to small companies, as they don’t have the resources to comply with a large array of regulations. Yet we might see in the future AI tools that do just that: help your business comply transparently with all local laws. We have that already for tax compliance.

What other data science trends can we expect to see in 2020 and beyond?

Granville: We live in a world with so many problems arising all the time — some caused by new technologies. So, the use of AI and IoT will increase.


Some problems will find solutions in the next few years, such as fake news detection or robocalls, just like it took over 10 years to fix email spamming. But it is not just a data science issue: if companies benefit financially short-term from the bad stuff, like more revenue to publishers because of fake news or clickbait, or more revenue to mobile providers due to robocalls, it needs to be addressed with more than just AI.

Some industries evolve more slowly and will see benefits in using AI in the future: Think about automated medical diagnostics or personalized dosing of drugs, small lawsuits handled by robots, or even kids at school being taught, at least in part, by robots. And one of the problems I face all the time with my spell-checker is its inability to detect if I write in French or English, resulting in creating new typos rather than fixing them.

Chatbots will get better too, eventually, for tasks such as customer support, or purchasing your groceries via Alexa without even setting foot in a grocery store or typing your shopping list. In the very long term, I could imagine the disappearance of written language, replaced by humans communicating orally with machines.


Alteryx 2020.1 highlighted by new data profiling tool

Holistic Data Profiling, a new tool designed to give business users a complete view of their data while in the process of developing workflows, highlighted the general availability of Alteryx 2020.1 on Thursday.

Alteryx, founded in 1997 and based in Irvine, Calif., is an analytics and data management specialist, and Alteryx 2020.1 is the vendor’s first platform update of 2020. The vendor released its previous update, Alteryx 2019.4, in December 2019, featuring a new integration with Tableau.

The vendor revealed the platform update in a blog post; in addition to Holistic Data Profiling, it includes 10 new features and upgrades. Among them is a new language toggling feature in Alteryx Designer, the vendor’s data preparation product.

“The other big highlights are more workflow efficiency features,” said Ashley Kramer, Alteryx’s senior vice president of product management. “And the fact that Designer now ships with eight languages that can quickly be toggled without a reinstall is huge for our international customers.”

Holistic Data Profiling is a low-code/no-code feature that gives business users an instantaneous view of their data to help them better understand their information during the data preparation process — without having to consult a data scientist.

After a user drags a Browse Tool — Alteryx’s means of displaying data from a connected tool as well as data profile information, maps, reporting snippets and behavior analysis information — onto Alteryx’s canvas, Holistic Data Profiling provides an immediate overview of the data.

Holistic Data Profiling is designed to help business users understand data quality and how various columns of data may be related to one another, spot trends, and compare one data profile to another as they curate their data.

A sample Holistic Data Profiling gif from Alteryx gives an overview of an organization’s data.

Users can zoom in on a certain column of data to gain deeper understanding, with Holistic Data Profiling providing profile charts and statistics about the data such as the type, quality, size and number of records.
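Holistic Data Profiling itself is no-code, but the kind of summary statistics a data profile surfaces (types, record counts, missing values, cardinality) can be sketched in a few lines of pandas; the column names and values below are made up for illustration:

```python
# Not Alteryx code -- just an illustration of the kind of summary statistics
# a data profile surfaces: types, record counts, missing values, cardinality.
# The column names and values are made up.
import pandas as pd

df = pd.DataFrame({
    "order_id": [1001, 1002, 1003, 1004],
    "region": ["West", "East", None, "West"],
    "amount": [250.0, 125.5, 310.0, None],
})

profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),         # data type per column
    "records": len(df),                      # total rows
    "non_null": df.count(),                  # populated values per column
    "missing_pct": (df.isna().mean() * 100).round(1),
    "unique": df.nunique(),                  # distinct values per column
})
print(profile)

# Distribution-style statistics, similar to per-column profile charts.
print(df.describe(include="all"))
```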

That knowledge then informs the next step toward ultimately making a data-driven decision.


“It’s easy to get tunnel vision when analyzing data,” said Mike Leone, analyst at Enterprise Strategy Group. “Holistic Data Profiling enables end users — via low-code/no-code tooling — to quickly gain a comprehensive understanding of the current data estate. The exciting part, in my opinion, is the speed at which end users can potentially ramp up an analytics project.”

Similarly, Kramer noted the importance of being able to more fully understand data before the final stage of analysis.

“It is really important for our customers to see and understand the landscape of their data and how it is changing every step of the way in the analytic process,” she said.

Alteryx customers were previously able to view their data at any point — on a column-by-column or defined multi-column basis — but not to get a complete view, Kramer added.

“Experiencing a 360-degree view of your data with Holistic Data Profiling is a brand-new feature,” she said.

In addition to Holistic Data Profiling, the new language toggle is perhaps the other signature feature of the Alteryx platform update.

Using Alteryx Designer, customers can now switch between eight languages to collaborate using their preferred language.

Alteryx previously supported multiple languages, but for users to work in their preferred language, each individual user had to install Designer in that language. With the updated version of Designer, they can click on a new globe icon in their menu bar and select the language of their choice to do analysis.

“To truly enable enterprise-wide collaboration, breaking down language barriers is essential,” Leone said. “And with Alteryx serving customers in 80 different countries, adding robust language support further cements Alteryx as a continued leader in the data management space.”

Among the other new features and upgrades included in Alteryx 2020.1 are a new Power BI on-premises loader that will give users information about Power BI reports and automatically load those details into their data catalog in Alteryx Connect; the ability to input selected rows and columns from an Excel spreadsheet; and new Virtual Folder Connect to save custom queries.

Meanwhile, a streamlined loader of big data from Alteryx to the Snowflake cloud data warehouse is now in beta testing.

“This release and every 2020 release will have a balance of improving our platform … and fast-forwarding more innovation baked in to help propel their efforts to build a culture of analytics,” Kramer said.


Small business analytics success hinges on resources, skills

The inability to harness the power of data and turn it into fuel for growth hampers the success of many SMBs.

Unlike large enterprises with massive budgets, SMBs are often unable to employ data scientists to build and maintain analytics operations and interpret data to make fully informed decisions. Instead of investing in small business analytics strategies, they rely on instinct and experience, neither of which is foolproof.

Onepath, an IT services provider based in Kennesaw, Ga., sought to quantify the struggles of the SMBs it serves. It surveyed more than 100 managers and executives of organizations ranging in size from 100 to 500 employees to gauge their experience with analytics, and on Thursday released a report entitled “Onepath 2020 Trends in SMB Data Analytics Report.”

Key discoveries included that despite dedicating time and money to analytics, 86% felt they weren’t able to fully harness the power of data, 59% believed analytic capabilities would help them go to market faster and 54% felt that they risked making poor business decisions without the benefits of data analysis.

Phil Moore, Onepath’s director of applications management services, spoke with SearchBusinessAnalytics about the report as well as the difficulties involved in small business analytics efforts.

In Part I of this two-part Q&A, he discussed the findings of the report in detail. Here he talks about the perils SMBs face if they don’t develop a data-driven decision-making process.

As the technology in business intelligence platforms gets better and better, will SMBs be able to improve data utilization as well as large enterprises?


Phil Moore: The Fortune 500s of the world have deep pockets and can hire their army of IT guys and go after it, but the small and medium-sized businesses tend to have far less volume of data unless they are in the unique position where they are a high-data business. But the core [of the SMB market] is around legal, construction, health care, doctor’s offices, and their data doesn’t get to the volume of larger organizations. They’re just looking for the metrics that help them run their business more efficiently, help them service their clients.

If you go to the other bookend and see an Amazon, of course they’re on a grand scale in terms of the size of their business. And they’re using analytics all up and down throughout their business, whether it be shipping, fulfillment, robotics, managing their warehouses. The SMB market won’t have the same types of complexities that the big guys have. The market is different.

Are there SMBs who are able to harness the power of data?


Moore: The survey shows that 86% of the companies that are taking a swing at analytics — that have some solution — say they’re underachieving, and they could be getting more out of their data. That leaves 14% that are delighted with what they’re getting. There are always leading guys, the cutting edge, the folks that are more technology-centric or that appreciate and understand the value of technology and how it can help the business. Those guys are going to lead the way.

What will happen to companies that don’t figure out a way to use data, and is there a timetable for when they need to get with it?

Moore: If you break down the SMB market into the different disciplines — health care, legal, construction — the folks that get and use analytics, their first benefit over their competitors is a better line of sight to their business. They’re going to be able to make crisper decisions, which lead to either faster delivery of something to the market or better customer service, which indirectly will lead to higher profits. Right away they get a competitive advantage over their competitors that aren’t using analytics, that are running their business by shooting from the hip — which is running it with their intuition and their knowledge and their experience. That knowledge and experience may get proven wrong with data, because the eye in the sky doesn’t lie. At some point, things get revealed in the data that lead to transforming business decisions.

For example, in the IT space, one of the transforming business decisions is how to go to market, changing from charging by the hour for every hour worked when a ticket is opened to offering a fixed-price, all-you-can-eat model. The data shows a fixed price will still be profitable if they optimize internal processes. So, IT companies are shifting, and the companies that are now going to market with a fixed-price, all-you-can-eat support model are crushing the guys that are still out there charging by the hour. The guys charging by the hour either have to transform or die. Those transformations that get driven by the data will happen in an industry-vertical way.

Is it critical for small business analytics expenditures to be part of the budget right off the bat?

Moore: Yes, but the challenge we see is that they know they want to have analytics but they don’t know how to budget for it. Therefore, it becomes unaffordable. One of the things we’re trying to do is make it affordable so people can bridge the mental gap from wanting analytics but not being able to get it by offering a monthly, low-entry, very affordable template set of [key performance indicators], so once they see the value they know how to put a dollar figure on the value and then adjust their budget for the next year. If you go to a small business and tell them they need analytics and need to budget for it, they struggle with how much to budget. They put a line item in the budget but they don’t know what they’re getting, so it often winds up getting cut from the budget.

Editor’s note: This Q&A has been edited for brevity and clarity.


A closer look at new and updated Microsoft security features

Data breaches occur on a daily basis. They can’t be avoided in our interconnected world, but you can take a proactive approach to reduce your risk.

While the internet has been a boon for organizations that rely on remote users and hybrid services, it’s now easier than ever for an intrepid hacker to poke at weak points at the perimeter to try and find a way inside. Windows Server is a key IT infrastructure component for most enterprises that handles numerous tasks — such as authentication — and runs critical workloads, namely Exchange Server, SQL Server and Hyper-V. Due to its ubiquitous nature, Windows Server is a natural target for hackers seeking a foothold inside your company. There are many Microsoft security products and native features in the newer Windows Server designed to keep sensitive information from spreading beyond your organization’s borders.

Microsoft security in Windows Server improved with the Server 2019 release by updating existing protections and adding new functionality geared to prevent the exposure of sensitive information. The company also offers several cloud-based products that integrate with the Windows operating system to warn administrators of trending threats that could affect their systems.

What are some features in Microsoft Defender ATP?

Microsoft Defender Advanced Threat Protection — formerly, Windows Defender ATP — supplements existing security measures while also providing a cloud-based platform with a range of capabilities, including response to active attacks, automated investigation of suspicious incidents and a scoring system that determines the level of vulnerability for each endpoint.

Microsoft Defender ATP, which underwent a name change in 2019 when the product was extended to protect Mac systems, features multiple proactive and reactive methods to protect organizations from many forms of cyberattacks. For example, to keep an endpoint from being susceptible to a common intrusion method via a Microsoft Office application, Microsoft Defender ATP can prevent the application from launching a child process.

Microsoft Defender ATP gathers information from a vast array of resources — such as different events on on-premises Windows systems and the Office 365 cloud collaboration platform — that Microsoft analyzes to detect patterns, such as certain command-line actions, that could indicate malicious behavior. Microsoft Defender ATP integrates with several Azure security products for additional protection. For example, by connecting to Azure Security Center, administrators get a dashboard that highlights suspicious activity in the organization with recommended actions to execute to prevent further damage.
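Microsoft does not publish Defender ATP's detection logic. Purely as an illustration of what pattern-based flagging of command-line activity can look like, a toy rule set in Python might resemble the following (the patterns and sample events are hypothetical):

```python
# Illustrative only -- not Microsoft Defender ATP's detection logic. A toy
# rule set that flags process command lines matching patterns often
# associated with living-off-the-land techniques, such as encoded PowerShell.
import re

SUSPICIOUS_PATTERNS = [
    re.compile(r"powershell(\.exe)?\s+.*-enc(odedcommand)?\s+", re.IGNORECASE),
    re.compile(r"certutil(\.exe)?\s+.*-urlcache\s+.*http", re.IGNORECASE),
    re.compile(r"mshta(\.exe)?\s+.*http", re.IGNORECASE),
]

def flag_command_line(cmd: str) -> bool:
    """Return True if the command line matches any suspicious pattern."""
    return any(p.search(cmd) for p in SUSPICIOUS_PATTERNS)

# Hypothetical process events collected from an endpoint.
events = [
    "powershell.exe -NoProfile -EncodedCommand SQBFAFgA...",
    "notepad.exe C:\\Users\\alice\\notes.txt",
]
for cmd in events:
    print(flag_command_line(cmd), cmd)
```

Real detection systems combine many such signals with behavioral and cloud-based analysis rather than relying on static patterns alone.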

Microsoft security features in this offering were tailored for Windows Server 2019 customers to prevent attacks that start in either the kernel or the memory of the operating system, sometimes called file-less attacks. Microsoft Defender ATP eases the onboarding process for this server OS through System Center Configuration Manager with a script.

What new SDN security features are in Windows Server 2019?

Abstracting the operations work associated with networking offers administrators a way to add some agility in an area not typically known for its nimbleness. Software-defined networking (SDN) gives IT newfound abilities via a centralized management platform for network devices to make it easier to perform certain tasks, such as ensuring specific workloads get enough bandwidth to meet performance expectations. But SDN is not immune to traditional threats if a malicious actor gains network access and proceeds to sniff traffic to scoop up credentials and other valuable information.

Microsoft enhanced the security aspect of its Windows Server 2019 SDN functionality by introducing several features to avoid data leakage, even if the data center defenses failed to stop unauthorized system access.

By implementing the “encrypted networks” feature, organizations add another layer of security around data that moves between VMs inside a particular subnet by encrypting the information. Other noteworthy SDN security additions for the Server 2019 OS include more granular control over access control lists to avoid security gaps and firewall auditing on Hyper-V hosts for further investigation of suspicious incidents.

Where can I use BitLocker encryption in my environment?

Microsoft released its BitLocker encryption feature for on-premises Windows systems, starting with the Vista operating system in 2007. Since that time, the company has continued to develop ways to use this technology in more places, both in the data center and beyond.

BitLocker started out as an encryption method to protect all the contents in a hard drive. That way, even if a laptop was stolen, prying eyes would not be able to do anything with the confidential data stored on the device due to the length of time it would take to do a brute-force hack of even a less-secure 128-bit key.

Using BitLocker, while effective at thwarting hackers, can frustrate users when they need to authenticate every time they use a device, or when a BitLocker-encrypted server requires an additional login process after a reboot. Microsoft developed a feature dubbed BitLocker Network Unlock, debuting with Windows 8 and Windows Server 2012, that uses the physical network to deliver the encrypted network key so protected systems can unlock if they are connected to the corporate network.

Microsoft extended BitLocker technology to the cloud to give administrators a way to put additional safeguards around sensitive Azure VMs with the platform’s Azure Disk Encryption feature for full volume protection of disks. For this type of deployment, the Azure Key Vault is used for key management.

What are some recent security features added to Hyper-V?

Data leakage can tarnish a company’s reputation, but it can be an expensive lesson for lax security practices if regulators determine a privacy law, such as the GDPR, was broken.

Organizations that use the Hyper-V platform get the typical benefits acquired by consolidating multiple workloads on a single host in a virtualized arrangement.

But Microsoft continues to help administrators who operate in sensitive environments by adding virtualization-based security features with each successive Windows Server release to reduce the probability of a data breach, even if an intruder makes their way past the firewall and other defensive schemes.

Microsoft added shielded VMs in Windows Server 2016, which encrypts these virtualized workloads to prevent access to their data if, for example, the VM is copied from the sanctioned environment. In Windows Server 2019, Microsoft extended this protection feature to Linux workloads that run on Hyper-V when the VMs are at rest or as they shift to another Hyper-V host.


Sigma analytics platform’s interface simplifies queries

In desperate need of data dexterity, Volta Charging turned to the Sigma analytics platform to improve its business intelligence capabilities and ultimately help fuel its growth.

Volta, based in San Francisco and founded in 2010, is a provider of electric vehicle charging stations, and three years ago, when Mia Oppelstrup started at Volta, the company faced a significant problem.

Because there aren’t dedicated charging stations the same way there are dedicated gas stations, Volta has to negotiate with organizations — mostly retail businesses — for parking spots where Volta can place its charging stations.

Naturally, Volta wants its charging stations placed in the parking spots with the best locations near the businesses they serve. But before an organization gives Volta those spots, Volta has to show that it makes economic sense: that putting electric car charging stations closest to the door will help boost customer traffic through the door.

That takes data. It takes proof.

Volta, however, was struggling with its data. It had the necessary information, but finding the data and then putting it in a digestible form was painstakingly slow. Queries had to be submitted to engineers, and those engineers then had to write code to transform the data before delivering a report.

Any slight change required an entirely new query, which involved more coding, time and labor for the engineers.

But then the Sigma analytics platform transformed Volta’s BI capabilities, Volta executives said.


“If I had to ask an engineer every time I had a question, I couldn’t justify all the time it would take unless I knew I’d be getting an available answer,” said Oppelstrup, who began in marketing at Volta and now is the company’s business intelligence manager. “Curiosity isn’t enough to justify engineering time, but curiosity is a way to get new insights. By working with Sigma and doing queries on my own I’m able to find new metrics.”

Metrics, Oppelstrup added, that she’d never be able to find on her own.

“It’s huge for someone like me who never wrote code,” Oppelstrup said. “It would otherwise be like searching a warehouse with a forklift while blindfolded. You get stuck when you have to wait for an engineer.”

Volta looked at other BI platforms — Tableau and Microsoft’s Power BI, in particular — but just under two years ago chose Sigma and has forged ahead with the platform from the 2014 startup.

The product

Sigma Computing was founded by the trio of Jason Frantz, Mike Speiser and Rob Woollen.

Based in San Francisco, the vendor has gone through three rounds of financing and to date raised $58 million, most recently attracting $30 million in November 2019.

Sigma was founded, and the ideas for the Sigma analytics platform first developed, in response to what the founders viewed as a lack of access to data.

“Gartner reported that 60 to 73 percent of data is going unused and that only 30 percent of employees use BI tools,” Woollen, Sigma’s CEO, said. “I came back to that — BI was stuck with a small number of users and data was just sitting there, so my mission was to solve that problem and correct all this.”

Woollen, who previously worked at Salesforce and Sutter Hill Ventures — a main investor in Sigma — and his co-founders set out to make data more accessible. They aimed to design a BI platform that could be used by ordinary business users — citizen data scientists — without having to rely so much on engineers, and one that responds quickly no matter what queries users ask of it.

Sigma launched the Sigma analytics platform in November 2018.

Like other BI platforms, Sigma — entirely based in the cloud — connects to a user’s cloud data warehouse in order to access the user’s data. Unlike most BI platforms, however, the Sigma analytics platform is a low-code BI tool that doesn’t require engineering expertise to sift through the data, pull the data relevant to a given query and present it in a digestible form.

A key element of that is the Sigma analytics platform’s user interface, which resembles a spreadsheet.

With SQL running in the background to automatically write the necessary code, users can simply make entries and notations in the spreadsheet and Sigma will run the query.

“The focus is always on expanding the audience, and 30 percent employee usage is the one that frustrates me,” Woollen said. “We’re focused on solving that problem and making BI more accessible to more people.”

The interface is key to that end.

“Products in the past focused on a simple interface,” Woollen said. “Our philosophy is that just because a businessperson isn’t technical that shouldn’t mean they can’t ask complicated questions.”

With the Sigma analytics platform’s spreadsheet interface, users can query their data, for example, to examine sales performance in a certain location, time or week. They can then tweak it to look at a different time, or a different week. They can then look at it on a monthly basis, compare it year over year, add and subtract fields and columns at will.

And rather than file a ticket to the IT department for each separate query, they can run the query themselves.

“The spreadsheet interface combines the power to ask any question of the data without having to write SQL or ask a programmer to do it,” Woollen said.

Giving end users power to explore data

Volta knew it had a data dexterity problem — an inability to truly explore its data given its reliance on engineers to run time- and labor-consuming queries — even before Oppelstrup arrived. The company was looking at different BI platforms to attempt to help, but most of the platforms Volta tried out still demanded engineering expertise, Oppelstrup said.

The outlier was the Sigma analytics platform.

“Within a day I was able to set up my own complex joins and answer questions by myself in a visual way,” Oppelstrup said. “I always felt intimidated by data, but Sigma felt like using a spreadsheet and Google Drive.”

One of the significant issues Volta faced before it adopted the Sigma analytics platform was the inability of its salespeople to show data when meeting with retail outlets and attempting to secure prime parking spaces for Volta’s charging stations.

Because of the difficulty accessing data, the salespeople didn’t have the numbers to prove that placing charging stations near the door would increase customer traffic.

With the platform’s querying capability, however, Oppelstrup and her team were able to make the discoveries that armed Volta’s salespeople with hard data rather than simply anecdotes.

They could now show a bank a surge in the use of charging stations near banks between 9 a.m. and 4 p.m., movie theaters a similar surge in use just before the matinee and again before the evening feature, and grocery stores a surge near stores at lunchtime and after work.
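The article does not show Volta's actual Sigma workbooks. A hypothetical sketch of the underlying hour-by-hour aggregation, done here in pandas with made-up session data, could look like this:

```python
# Hypothetical illustration (not Volta's actual Sigma workbook): bucket
# charging sessions by venue type and hour of day to surface usage surges.
import pandas as pd

sessions = pd.DataFrame({
    "venue_type": ["bank", "bank", "grocery", "grocery", "theater", "theater"],
    "start_time": pd.to_datetime([
        "2020-03-02 09:15", "2020-03-02 14:40",
        "2020-03-02 12:05", "2020-03-02 17:30",
        "2020-03-02 13:10", "2020-03-02 18:45",
    ]),
    "kwh": [6.2, 4.8, 5.1, 7.3, 3.9, 6.0],
})

by_hour = (
    sessions
    .assign(hour=sessions["start_time"].dt.hour)
    .groupby(["venue_type", "hour"])
    .agg(session_count=("kwh", "size"), total_kwh=("kwh", "sum"))
    .reset_index()
)
print(by_hour)
```

In Sigma, the same grouping would be expressed through the spreadsheet interface, with the platform generating the equivalent SQL against the cloud data warehouse.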

They could also show that the charging stations were being used by actual customers, and not by random people charging up their vehicles and then leaving without also going into the bank, the movie theater or the grocery store.

“It’s changed how our sales team approaches its job — it used to just be about relationships, but now there’s data at every step,” Oppelstrup said.

Sigma enables Oppelstrup to give certain teams access to certain data, give everyone access to other data, and, importantly, easily redact data fields within a set that might otherwise prevent her from sharing the information entirely, she said.

And that gets to the heart of Woollen’s intent when he helped start Sigma — enabling business users to work with more data and giving more people that ability to use BI tools.

“Access leads to collaboration,” he said.
