Adapting to Data Architectures in Motion

Read Clive Bearman’s take on the implications of the Cloudera and Hortonworks merger.

Cloudera and Hortonworks Merger

High school physics defines Newton’s first law as follows:

“An object will remain at rest or move at constant velocity unless acted on by an external force.”

That means stuff just sits around or continues to meander in a straight line unless you “give it a push”, and yesterday the big data market got an enormous shove. The external force in question: Cloudera and Hortonworks agreed to a $5.2 billion merger that will form a new entity. Its name has not yet been decided, but the transaction is slated to close sometime in the first quarter of 2019.

The natural question to ask if you’re a Hadoop user is “what does this mean for the technology and where is it headed?” It’s too early to tell, but if the press releases are any indication then there are exciting changes on the way. The combined entity will provide a unified data platform that spans from IoT at the edge of the enterprise to AI at the heart of the business. It’s also noteworthy that the overarching message is one of living in a hybrid world where data and components bridge both the cloud and traditional data centers. In addition, generation 3.x of Hadoop had already committed to containerization, erasure coding, data tiering and support for specialized hardware such as GPUs. As Hortonworks partner of the year, we couldn’t be more excited for the future.

However, the big Hadoop elephant in the room is “what do you do with petabytes of data stored in a particular distribution?”, especially since we can’t predict the future. While the merger might signal that a data migration is on the horizon, we can’t be certain of that, let alone of a timeline. To stick with the physics metaphor, we’re witnessing Heisenberg’s uncertainty principle at work: “it is impossible to know simultaneously the exact position and momentum of any particular Hadoop distribution.” The question remains valid, though: “what do we do with all that data?”

If you’ve been in the technology space for any amount of time, then you’ve experienced these sorts of mergers before. The immediate advice is pure Douglas Adams: “Don’t Panic!” If you’ve done your job correctly, then you’ve planned for change. Here at Attunity we’ve codified planning for change in our products and promote the philosophy of “architectures in motion”. We know that the data space is extremely dynamic and the analytics technology you use today might not be the data solution that you use tomorrow. I discussed this viewpoint on a recent webinar with DBTA. Watch the Taking Your Data and Analytics to the Cloud webinar replay.

“Architectures in motion” is our shorthand for an ever-changing data environment. For example, your data services might start out on-premises, then later move to the cloud. We’ve seen this with many of our customers who started with an on-prem deployment of Hortonworks and have been moving to Azure HDInsight.

You might also bridge environments in the future and operate a hybrid model. Additionally, you could run data analytics on Amazon Web Services and add complementary services from Microsoft Azure or Google Cloud Platform. Whichever path you take, the ability to adapt to change flexibly and with little effort is the key to future-proofing your data architecture.

Attunity Replicate provides flexibility to handle constantly changing environments in several ways:

  1. Universal connectivity supports data flows between virtually any enterprise data source and any target. For example, data can move from databases, data warehouses and legacy/mainframe systems into any major data lake based on HDFS, or into cloud targets like AWS S3.
  2. A graphical UI removes the need for lengthy and expensive custom development. Simply use the UI to configure a new data target.
  3. Agent-less CDC technology and a 100% automated process mean that there’s no need for a major retrofit or forklift upgrade. Once the new target is configured, data will begin to flow as soon as there’s a change to the source data or schema.
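
To make the idea behind change data capture (CDC) concrete, here is a minimal sketch in Python. It is not Attunity Replicate’s API or implementation; every name in it (ChangeEvent, Target, Replicator) is hypothetical and exists only to illustrate the general pattern: changes captured at the source are fanned out to every configured target, and a newly added target is first brought up to date and then receives subsequent changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChangeEvent:
    """One captured change from the source (hypothetical shape)."""
    op: str                     # "insert", "update" or "delete"
    key: str                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents; None for deletes


@dataclass
class Target:
    """Stand-in for any destination: a data lake, a warehouse, a cloud bucket."""
    name: str
    rows: Dict[str, dict] = field(default_factory=dict)

    def apply(self, event: ChangeEvent) -> None:
        # Deletes remove the row; inserts and updates overwrite it.
        if event.op == "delete":
            self.rows.pop(event.key, None)
        else:
            self.rows[event.key] = dict(event.row or {})


class Replicator:
    """Toy replicator: keeps a change log and forwards changes to all targets."""

    def __init__(self) -> None:
        self.log: List[ChangeEvent] = []
        self.targets: List[Target] = []

    def add_target(self, target: Target) -> None:
        # Replay the history so a late-added target catches up (assumption:
        # the full log is retained; real products typically do a bulk load).
        for event in self.log:
            target.apply(event)
        self.targets.append(target)

    def capture(self, event: ChangeEvent) -> None:
        # Record the change, then push it to every configured target.
        self.log.append(event)
        for target in self.targets:
            target.apply(event)


# Usage: start on-premises, then add a cloud target without touching the source.
replicator = Replicator()
on_prem = Target("on-prem HDFS")
replicator.add_target(on_prem)
replicator.capture(ChangeEvent("insert", "42", {"customer": "Acme"}))

cloud = Target("cloud object store")
replicator.add_target(cloud)
replicator.capture(ChangeEvent("update", "42", {"customer": "Acme Corp"}))

assert on_prem.rows == cloud.rows
```

The point of the sketch is the design choice rather than the code itself: because the source only emits changes and knows nothing about destinations, adding or swapping a target is a configuration step, not a rewrite.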

Attunity Replicate Diagram

This flexibility also means you can freely experiment on your own timescale. Perhaps you’ve been meaning to investigate some of the Hadoop offerings from public cloud vendors such as Amazon EMR, Azure HDInsight and Google Cloud Dataproc. Or maybe now is the perfect time to revisit MapR. Attunity will work with whatever vendor you choose, no matter your data gravity.

In closing, in the late seventeenth century English physicist Sir Isaac Newton formulated theories that described mechanical events that involved forces acting on matter. He couldn’t have predicted that we’d still be talking about his genius over 300 years later. By the same token, we can’t predict what is going to happen in the Hadoop ecosystem, but we can implement technology to mitigate adverse outcomes. One thing is certain, however: change is a constant.
