Cloudera open source route seeks to keep big data alive

Cloudera has had a busy 2019. The vendor started off the year by merging with its primary rival Hortonworks to create a new Hadoop big data juggernaut. However, in the ensuing months, the newly merged company has faced challenges as revenue has come under pressure and the Hadoop market overall has shown signs of weakness.

Against that backdrop, Cloudera said July 10 that it would be changing its licensing model, taking a fully open source approach. The Cloudera open source route is a new strategy for the vendor. In the past, Cloudera had supported and contributed to open source projects as part of the larger Hadoop ecosystem but had kept its high-end product portfolio under commercial licenses.

The new open source approach is an attempt to emulate the success that enterprise Linux vendor Red Hat has achieved with its open source model. Red Hat was acquired by IBM for $34 billion in a deal that closed in July. In the Red Hat model, the code is all free and organizations pay a subscription fee for support services.

New subscription model

Under the new model, starting in September 2019, Cloudera will require users to buy a subscription agreement to access binaries and the Cloudera-hosted source they are built from, for all new versions and maintenance releases of supported products. Not all of the vendor's products are open source today, but all of them will be by February 2020, if everything goes according to the Cloudera open source plan.

Over the past several weeks, Cloudera has briefed customers, partners and analysts on the new licensing model and the feedback thus far has been largely positive, said David Moxey, VP of product marketing at Cloudera.

“We believe that reflects market understanding and acceptance of the Red Hat model which we have chosen to emulate,” Moxey said. “Influential analysts have been positive and have helped clients and colleagues accurately understand the change and rationale.”

Open source path a departure

The shift to a completely open source model is a departure from earlier comments the company made about maintaining both open source and some proprietary tooling, said James Curtis, analyst at 451 Research. Overall, Curtis said he sees the Cloudera open source move settling a number of things for the vendor.

“Trying to drive both a hybrid and OS [open source] strategy would have had its challenges,” Curtis said. “Now the company can move forward and concentrate on its products and services.”

Doug Henschen, an analyst at Constellation Research, said he wasn’t entirely surprised by Cloudera’s all open source commitment as it’s consistent with Hortonworks’ strategy before the merger with Cloudera.

“It makes sense to maintain that strategy not only from the standpoint of keeping faith with former Hortonworks customers, but also to trade on the momentum that open source software is seeing in the enterprise,” Henschen said.

Also, now that it’s apparent that cloud services — in particular AWS EMR (Amazon Elastic MapReduce) — are cutting into Cloudera’s business, it makes sense to back the hybrid- and multi-cloud messaging and differentiation from cloud services with a purely open source approach, Henschen added. Looking forward, Henschen said he sees Cloudera continuing to emphasize its hybrid- and multi-cloud story by continuing to develop and deliver its Altus Cloud services.

“Cloud services are clearly winning in the market because they offer both elasticity and minimal administration, two traits that save customers money,” Henschen said. “What wouldn’t be surprising is seeing new Altus services options, perhaps including pared down, lower-cost options featuring Spark, object storage and fewer, if any, Hadoop components.”

Hadoop’s prospects uncertain

At the core of Cloudera’s business is Hadoop, a technology and a market that is arguably in retreat as organizations choose different options for handling large data sets and big data.

While the financial fortunes of Cloudera and other Hadoop vendors, notably MapR, have been under pressure, Curtis noted that there is still demand for Hadoop and related distributed data processing frameworks. The market is shifting and settling, however.

Cloudera's Palo Alto headquarters (on screen is Workload XM)

“What is happening is that where enterprises are doing their analytics is changing and specifically to the cloud,” he said. “Cloudera was late to fully adopt the cloud, but it wasn’t completely behind either.”

Curtis added that while the Cloudera open source route still leaves the vendor with catchup to do, he doesn’t expect Hadoop to go away completely as there is still a demand for that type of processing capability.

But the big data market is vibrant

As for Henschen, he said he doesn’t think of it as ‘the Hadoop market’ so much as ‘the big data market,’ and that’s not going away.

“Companies are continuing to harness big data to understand their businesses and to spot new opportunities,” he said. “What is clear is that companies are moving away from complexity, if they can avoid it.”

The demand to reduce complexity has forced changes in Hadoop software and vendors associated with Hadoop. For example, Henschen noted that Cloudera embraced Apache Spark as early as 2017, with Cloudera executives making the point that they were behind more Spark software deployments than any other vendor. Indeed, as part of the company’s recent open source announcements, Henschen said that Cloudera executives emphasized that Cloudera plans to invest in Spark, Kubernetes and Kafka.

Henschen emphasized that it’s important to note that AWS, Microsoft, Google and other cloud vendors still offer Hadoop services because that software addresses certain big data needs well.

Also, there are thousands of on-premises Hadoop deployments that aren't disappearing right away, Henschen pointed out.

The installed base of deployments is a legacy that the vendor can build on as the Cloudera open source strategy unfolds, he said. But the big question, according to Henschen, is whether the vendor can succeed in offering the right mix of software and services that will address today’s big data needs, appeal to customers and drive growth.

“Hadoop helped usher in the big data era, but today there are more choices and combinations of software and services that companies can use to address their big data needs,” he said.

Microsoft seeks broader developer appeal with Azure DevOps

Microsoft has rebranded its primary DevOps platform as Azure DevOps to reach beyond Windows developers or Visual Studio developers and appeal to those who just want a solid DevOps platform.

Azure DevOps encompasses five services that span the breadth of the development lifecycle. The services aim to help developers plan, build, test, deploy and collaborate to ship software faster and with higher quality. These services include the following:

  • Azure Pipelines is a CI/CD service.
  • Azure Repos offers source code hosting with version control.
  • Azure Boards provides project management with support for Agile development using Kanban boards and bug tracking.
  • Azure Artifacts is a package management system to store artifacts.
  • Azure Test Plans lets developers define, organize, and run test cases and report any issues through Azure Boards.
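The glue among these services is the pipeline definition itself, which lives as a YAML file in the repository that Azure Pipelines builds. As a rough illustration of how that fits together, here is a minimal sketch of such a file; the trigger, agent image and build steps shown are illustrative assumptions, not details from the article:

```yaml
# azure-pipelines.yml -- minimal illustrative sketch of an Azure Pipelines
# CI definition; the branch name, agent image and npm steps are assumptions
# chosen for demonstration.
trigger:
  - main                     # run the pipeline on pushes to the main branch

pool:
  vmImage: ubuntu-latest     # Microsoft-hosted Linux build agent

steps:
  - script: npm install      # restore project dependencies
    displayName: Install dependencies
  - script: npm test         # run the project's test suite
    displayName: Run tests
```

Because the definition is versioned alongside the code in Azure Repos, a change to the build process is reviewed and shipped the same way as any other change.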

Microsoft customers wanted the company to break up the Visual Studio Team Services (VSTS) platform so they could choose individual services, said Jamie Cool, Microsoft’s program manager for Azure DevOps. By doing so, the company also hopes to attract a wider audience that includes Mac and Linux developers, as well as open source developers in general, who avoid Visual Studio, Microsoft’s flagship development tool set.

Open source software continues to achieve broad acceptance within the software industry. However, many developers who want Git source control don't want to adopt VSTS for everything else. Over the past few years, Microsoft has technically separated some of its developer tool functions.

But the company has struggled to convince developers about Microsoft’s cross-platform capabilities and that they can pick and choose areas from Microsoft versus elsewhere, said Rockford Lhotka, CTO of Magenic, an IT services company in St. Louis Park, Minn.

“The idea of a single vendor or single platform developer is probably gone at this point,” he said. “A Microsoft developer may use ASP.NET, but must also use JavaScript, Angular and a host of non-Microsoft tools, as well. Similarly, a Java developer may well be building the back-end services to support a Xamarin mobile app.”

Most developers build for a lot of different platforms and use a lot of different development languages and tools. However, the features of Azure DevOps will work for everyone, Lhotka said.

Azure DevOps is Microsoft’s latest embrace of open source development, from participation in open source development to integrating tools and languages outside its own ecosystem, said Mike Saccotelli, director of modern apps at SPR, a digital technology consulting firm in Chicago.

In addition to the rebranded Azure DevOps platform, Microsoft also plans to provide free CI/CD technology for any open source project, including unlimited compute on Azure, with the ability to run up to 10 jobs concurrently, Cool said. Microsoft has also made Azure Pipelines the first of the Azure DevOps services to be available on the GitHub Marketplace.

Bringing Device Support to Windows Server Containers

When we introduced containers to Windows with the release of Windows Server 2016, our primary goal was to support traditional server-oriented applications and workloads. As time has gone on, we’ve heard feedback from our users about how certain workloads need access to peripheral devices—a problem when you try to wrap those workloads in a container. We’re introducing support for select host device access from Windows Server containers, beginning in Insider Build 17735 (see table below).

We’ve contributed these changes back to the Open Containers Initiative (OCI) specification for Windows. We will be submitting changes to Docker to enable this functionality soon. Watch the video below for a simple example of this work in action (hint: maximize the video).

What’s Happening

To provide a simple demonstration of the workflow, we have a simple client application that listens on a COM port and reports incoming integer values (the PowerShell console on the right). We did not have any devices on hand that speak over physical COM, so we ran the application inside a VM and assigned the VM's virtual COM port to the container. To mimic a COM device, an application was created to generate random integer values and send them over a named pipe to the VM's virtual COM port (the PowerShell console on the left).

As we see at the beginning of the video, if we do not assign the COM port to our container, then when the application runs in the container and tries to open a handle to the COM port, it fails with an IOException (because, as far as the container knew, the COM port didn't exist!). On our second run of the container, we assign the COM port to the container and the application successfully receives and prints the incoming random ints generated by our app running on the host.

How It Works

Let’s look at how it will work in Docker. From a shell, a user will type:

docker run --device="<IdType>/<Id>"

For example, if you wanted to pass a COM port to your container:

docker run --device="class/86E0D1E0-8089-11D0-9CE4-08003E301F73" mcr.microsoft.com/windowsservercore-insider:latest

The value we’re passing to the device argument is simple: it consists of an IdType and an Id, delimited by a slash, “/”. For this coming release of Windows, we only support an IdType of “class”. For Id, this is a device interface class GUID. Whereas in Linux a user assigns individual devices by specifying a file path in the “/dev/” namespace, in Windows we’re adding support for a user to specify an interface class, and all devices which identify as implementing this class will be plumbed into the container.
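The IdType/Id split described above can be sketched in a few lines. This hypothetical `parse_device_arg` helper is an illustration of the argument format only, not Docker's actual implementation:

```python
# Hypothetical sketch of parsing a --device value such as
# "class/<GUID>" into its IdType and Id parts; for illustration
# only, this is not Docker's real parsing code.

def parse_device_arg(value: str) -> tuple[str, str]:
    """Split an --device value like 'class/<GUID>' into (IdType, Id)."""
    id_type, sep, dev_id = value.partition("/")
    if not sep or not id_type or not dev_id:
        raise ValueError(f"expected '<IdType>/<Id>', got {value!r}")
    # Per the post, this Windows release supports only the 'class'
    # IdType, where Id is a device interface class GUID.
    if id_type != "class":
        raise ValueError(f"unsupported IdType {id_type!r}")
    return id_type, dev_id

print(parse_device_arg("class/86E0D1E0-8089-11D0-9CE4-08003E301F73"))
```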

If a user wants to specify multiple classes to assign to a container:

docker run --device="class/86E0D1E0-8089-11D0-9CE4-08003E301F73" --device="class/DCDE6AF9-6610-4285-828F-CAAF78C424CC" --device="…" mcr.microsoft.com/windowsservercore-insider:latest

What are the Limitations?

Process isolation only: We only support passing devices to containers running in process isolation; Hyper-V isolation is not supported, nor do we support host device access for Linux Containers on Windows (LCOW).

We support a distinct list of devices: In this release, we targeted enabling a specific set of features and a specific set of host device classes. We're starting with simple buses. The complete list that we currently support is below.

Device Type   Interface Class GUID
GPIO          916EF1CB-8426-468D-A6F7-9AE8076881B3
I2C Bus       A11EE3C6-8421-4202-A3E7-B91FF90188E4
COM Port      86E0D1E0-8089-11D0-9CE4-08003E301F73
SPI Bus       DCDE6AF9-6610-4285-828F-CAAF78C424CC

Stay tuned for a Part 2 of this blog that explores the architectural decisions we chose to make in Windows to add this support.

What’s Next?

We’re eager to get your feedback. What specific devices are most interesting for you and what workload would you hope to accomplish with them? Are there other ways you’d like to be able to access devices in containers? Leave a comment below or feel free to tweet at me.


Craig Wilhite (@CraigWilhite)