Why is On-Demand Data Aggregation Important?

    Taming the Data Sprawl

    Enterprise information has never been more fragmented. Modern data systems are often isolated, distributed and difficult to untangle from applications, making it hard to see and access all the relevant pieces at once. The resulting architecture limits a firm's ability to find and exploit information, which hurts worker productivity. To solve the data sprawl problem, businesses need to get smarter about how they prepare and distribute critical information.

Data fragmentation is a costly problem. Information scattered across silos can affect performance, compromise security and increase operational risk, making regulatory compliance a challenge. The disjointed nature of enterprise data in its many formats also makes it difficult to adapt to changes in data structure, and that makes it hard for businesses to stay competitive.

Implementing a centralized data storage or warehouse solution, even with low-cost technologies like Hadoop or NoSQL, is often impossible given the geography and variety of modern data sources. Besides, if data silos are the problem, another silo is probably not the solution. So the focus of enterprise architecture is shifting from data capture and management to enabling data agility.

Agility is all about how fast you can extract value from mountains of data stored in disparate systems and how quickly information can be turned into action. All that raw data also needs some common structure before it can be analyzed. After all, just because you can get data into Hadoop easily doesn't mean an analyst can easily get it out. Some data preparation must therefore happen before the data can be used by data-crunching applications and tools.

The Reactive Data Platform™ is a powerful new technology for on-demand data aggregation, preparation and analysis. Application Dataspaces combine data integration with a high-performance query engine and transactional memory, letting users exploit information regardless of volume, format or location. A rich semantics layer lets data analysts declare schema on the fly, making it possible to explore, query and join data across multiple disparate sources using familiar SQL-like syntax. So you can access, shape and publish critical information using one unified technology.
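
To make the idea of joining disparate sources concrete, here is a minimal sketch in plain Python. It does not use the Reactive Data Platform™ or its query syntax; the two sources, their field names and the helper functions are all hypothetical, and the SQL in the docstring only indicates the kind of query being imitated.

```python
import csv
import io
import json

# Two hypothetical sources in different formats, kept in their native form.
CUSTOMERS_CSV = "customer_id,name\n1,Acme\n2,Globex\n"
ORDERS_JSON = (
    '[{"customer_id": 1, "total": 250.0},'
    ' {"customer_id": 2, "total": 75.5},'
    ' {"customer_id": 1, "total": 40.0}]'
)


def customers():
    """Read customer rows from CSV, typing the join key while reading."""
    for row in csv.DictReader(io.StringIO(CUSTOMERS_CSV)):
        yield {"customer_id": int(row["customer_id"]), "name": row["name"]}


def orders():
    """Read order rows from JSON."""
    yield from json.loads(ORDERS_JSON)


def join_totals(customer_rows, order_rows):
    """Rough equivalent of:
    SELECT name, SUM(total)
    FROM customers JOIN orders USING (customer_id)
    GROUP BY name
    """
    # Build a lookup on the join key from the smaller side.
    names = {c["customer_id"]: c["name"] for c in customer_rows}
    totals = {}
    for order in order_rows:
        name = names.get(order["customer_id"])
        if name is not None:  # inner join: skip orders with no matching customer
            totals[name] = totals.get(name, 0.0) + order["total"]
    return totals


result = join_totals(customers(), orders())
print(result)
```

Each source is read in its own format and only the fields needed for the query are typed, which is the general pattern a unified query layer would hide behind a single statement.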

Stream processing and schema-on-read capabilities make data more nimble, allowing users to aggregate information without creating additional copies. This reduces data sprawl and eliminates the people and process bottleneck of traditional, ETL-centric data preparation. And combining all this capability into a single, scalable data layer makes implementation a snap while reducing cost.

On-demand data aggregation is a transformative technology. Its goal is to present all critical assets and resources as a single, always-on view that is contextually relevant to the observer. This greatly simplifies data-driven decision making, improving one's ability to mitigate risk, track performance and identify threats or opportunities on a global level.

In the past, data aggregation initiatives often focused on creating a so-called master copy of all critical data and required years of integration work as companies tackled mundane but critical issues, like how data should be labeled and organized for optimal access.

This 'schema-first' approach, sometimes called 'Schema on Write', required complex models to be developed for organizing data. As systems evolved and business requirements changed, enterprise data became fragmented while data schemas became more rigid. It became impossible to reorganize critical data in a timely fashion, with major impact on the business.

To improve flexibility, architects are looking at more agile ways of packaging and reorganizing data. Recently introduced technologies let schema be applied to raw, unprocessed data at query time, allowing data to be aggregated on demand when users ask for it. This 'schema-last' approach to data preparation, called Schema on Read, dramatically reduces data integration effort and cost by eliminating many data ingest and preparation steps. It also improves analyst productivity, because analysts can be more independent and agile when engaged in exploratory analytics.
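
The difference is easiest to see in a small example. The following is a minimal sketch of Schema on Read in plain Python, not the Reactive Data Platform™ itself; the raw records, `ORDER_SCHEMA` and both functions are hypothetical names chosen for illustration. The raw lines are stored untouched, and types are applied only when a query reads them.

```python
import json

# Raw, unprocessed records stored exactly as they arrived (hypothetical data).
RAW_LINES = [
    '{"id": "1", "region": "EMEA", "amount": "120.50"}',
    '{"id": "2", "region": "APAC", "amount": "80", "note": "rush"}',
    '{"id": "3", "region": "EMEA"}',
]

# A schema declared by the analyst at query time: field name -> converter.
ORDER_SCHEMA = {"id": int, "region": str, "amount": float}


def read_with_schema(lines, schema):
    """Yield one typed record per raw line, applying the schema while reading."""
    for line in lines:
        raw = json.loads(line)
        # Fields missing from a raw record become None; extra fields are ignored.
        yield {
            name: convert(raw[name]) if name in raw else None
            for name, convert in schema.items()
        }


def total_by_region(records):
    """Sum the amount per region, skipping records that have no amount."""
    totals = {}
    for rec in records:
        if rec["amount"] is not None:
            totals[rec["region"]] = totals.get(rec["region"], 0.0) + rec["amount"]
    return totals


totals = total_by_region(read_with_schema(RAW_LINES, ORDER_SCHEMA))
print(totals)
```

Because the schema lives with the query rather than with the stored data, a different analyst could read the same raw lines with a different schema (for example one that also picks up `note`) without any reload or migration step.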

Try it yourself: check out
the Reactive Data Platform™ now!

