Seven Attributes of Big Data 2.0: My Advice for CIOs

By Scott Gnau, CTO, Hortonworks

When the term Web 2.0 was coined, it was about new design patterns and business models emerging from the bubbles and shakeouts of successive computing revolutions. The term worked for everyone. It took off and became a meme that, to this day, others are trying to repeat.

At its recent I/O Conference, for example, Google spoke of AI and machine learning as a basis of Cloud 2.0. Artificial intelligence in the cloud, delivered via voice, will talk to you and predict what it thinks you want to know. This will be the next-generation experience in your home or car: put that on screen, Jarvis?

Many think the Internet of Things 2.0 will result from machine-to-machine connectivity to create ‘an interface of things’ and spawn smart connected environments in schools, businesses and even across industries. Systems will understand each other and work ‘in the context of a larger, commonly understood purpose.’

But the persistent problem is that CIOs today often find it impossible to think of all of these clusters of technologies as one. And what would all of these technologies actually do as one?

I’m not into the meme game. I believe data is the common factor in all of them and, whatever you name it, data is the tipping point that is changing everything for enterprise IT, from people and processes to platforms and architectures.

I believe there are seven key attributes of this tipping point, and CIOs should be thinking about and acting on them to lead their organizations on this journey.

1) It Starts with Access to All Data

The implication of Cloud 2.0 and IoT 2.0 is the need for real-time access to data, however it’s consumed or processed. It’s a world where we need to look into and across all the data.

This makes the data scientist the super admin of the future. He or she needs access to all the data to do their job, not just what IT thinks is relevant. As in mining for gold or iron, more raw material leads to more refined product.

Start by making sure your scientists have access to as much data as possible with the fewest constraints possible. This implies new and important governance and security designs.

2) Bring the Processing to the Data

DevOps was about making development and operations work closer together while automating software delivery and infrastructure.

With zettabytes involved, data movement is still relatively expensive and difficult, and it sometimes leads to governance nightmares. It’s just more efficient to bring the processing to the data, which requires multi-tenancy and mixed workload management. We need ops and data (and development) to work together tightly to make it happen.

Consider the rise of DataOps and its impact on your IT organization.
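
As a rough illustration of bringing the processing to the data, the sketch below runs a small summary function against each partition where it sits and moves only the summaries to a central step. The node names, the sample records and the functions `local_summary` and `combine` are hypothetical, chosen for this example only; a real platform would schedule this work across a cluster.

```python
from collections import Counter

# Hypothetical stand-ins for data partitions that live on three separate nodes.
PARTITIONS = {
    "node-a": ["error", "ok", "ok", "error"],
    "node-b": ["ok", "ok", "timeout"],
    "node-c": ["error", "ok"],
}


def local_summary(records):
    """Runs where the data lives and returns only a small summary."""
    return Counter(records)


def combine(summaries):
    """Runs centrally and merges the small per-node summaries."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total


def count_statuses(partitions):
    # Only the per-node counters cross the "network", never the raw records.
    return combine(local_summary(records) for records in partitions.values())


totals = count_statuses(PARTITIONS)
print(dict(totals))
```

The point of the design is that the amount of data moved depends on the size of the summaries, not on the size of the raw data.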

3) Be Connected not Converged

Big Data ‘next’ depends on a huge ecosystem to get to the level of technology investment required. That’s why we’ve already seen the explosion of new technologies and companies. For enterprises to succeed with data, apps and data need to be connected via a set of platforms within a logical framework. Convergence of such divergent requirements into a single proprietary product will not work, and is not a good long-term investment for any large corporation.

4) Data Scientist 2.0

We’ve already moved beyond the time of data as ‘experiment’ by early adopters, as today’s organizations see data as a mainstream part of business transformation.

Clearly, Data Scientist 2.0 is a new character entering the play from stage right, with new working methods. To scale, CIOs will need data analysts and scientists to re-use each other’s work and collaborate, while also building healthy competition that creates a lasting stream of innovative new algorithms.

This will require internal and external communities of analysts and data scientists to foster collaboration and friendly competition. Call this the new agile development applied to the world of big data.

5) The Platform, Not the Tool, Will Matter

Tools have always been tools: useful for one or two things, but you can’t build a house with just one hammer.

It’s certain that Hadoop is rapidly emerging as the open data technology on which the real-time world of Data 2.0 is being based. Tools for capturing, managing and analyzing data are proliferating at an accelerating pace. But none of these tools is enough on its own.

A successful IT strategy lies in a platform that can connect new tools to the legacy ones while handling the workload prioritization, security, governance, and operations.

6) Data Will Put IT in Fast Reverse

IT project and program management is completely different in the next-generation world of data.

Traditional projects went like this:

1. Requirements
2. Data Sourcing
3. ETL
4. App Dev
5. Delivery

Big Data 2.0 projects go like this:

1. Land the data raw
2. Data Science/Machine Learning
3. Data Sourcing, Requirements
4. Define Requirements
5. App Dev
6. Land more Data
7. Repeat

Your IT strategy needs to go into reverse too.
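
One way to picture "land the data raw" first and define requirements later is a schema-on-read sketch like the one below. It is a minimal illustration under stated assumptions, not a description of any particular platform: the in-memory `RAW_LANDING_ZONE`, the functions `land_raw` and `read_with_schema`, and the sample device records are all hypothetical.

```python
import json

# Hypothetical stand-in for raw storage; a real landing zone would be a file system or object store.
RAW_LANDING_ZONE = []


def land_raw(line):
    """Store the record exactly as it arrived, with no schema enforced."""
    RAW_LANDING_ZONE.append(line)


def read_with_schema(fields):
    """Apply only the fields the current question needs, at read time."""
    rows = []
    for line in RAW_LANDING_ZONE:
        record = json.loads(line)
        # Missing fields become None instead of blocking ingestion up front.
        rows.append({name: record.get(name) for name in fields})
    return rows


land_raw('{"device": "pump-1", "temp": 71.5, "site": "plant-a"}')
land_raw('{"device": "pump-2", "temp": 68.0}')

# Requirements can change between passes without re-ingesting anything.
first_pass = read_with_schema(["device", "temp"])
second_pass = read_with_schema(["device", "site"])
print(first_pass)
print(second_pass)
```

Because nothing is discarded at landing time, a later pass can ask for fields that the first pass ignored.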

7) Legacy Doesn’t Have To Be

Existing operational systems are often the last mile to a business process or customer experience. I predict the world of big data will help these systems flourish again, with new insights and speed, rather than forcing a rip and replace.

So don’t throw them away. Work out how to connect them.
