Distributed Architecture for Operational Workloads

There has been tremendous growth in Big Data over the past few years, with Machine Learning being used to draw intelligent insights for competitive advantage. Big Data has been fueled by social media, IoT, and a variety of other data types and sources. As companies have matured in their Machine Learning implementations, there is an increasing need to apply the models generated by these algorithms to events as they stream in, to trigger real-time actions or to drive operational dashboards based on those events. To reduce the latency between receiving these events, analyzing them, and deploying the models, these operational workloads need to coexist on the same scale-out data platform that is receiving the events and processing business intelligence and analytic queries.

Additionally, contextual data is needed for both the operational and analytical workloads. Master data, such as customer and product data, is needed to provide context for these events. Historical data is needed to provide longer-term context, and transactional data is needed to provide the current context. In other words, a distributed data platform is needed to support a hybrid of operational and analytical processing workloads, not only to gain intelligence from streaming data, but to make that intelligence operationally actionable in real time.

Preceding the Big Data revolution, scale-out operational workloads motivated the rise of NoSQL technologies. The early creators and adopters of these technologies faced volumes and velocities of operational events that databases such as Oracle and Microsoft SQL Server could not keep up with, and scale-out implementations of those databases would have been prohibitively expensive. Document databases such as MongoDB, where a transaction is encapsulated within a single JSON document, along with eventual-consistency models and row-level immediate-consistency solutions, work well for a certain set of operational applications, but they do not address a large number of more complex SQL-based ACID transactional requirements. Those capabilities are now provided by distributed databases, such as EsgynDB, on scale-out platforms, enabling the hosting of many more OLTP and operational workloads that need elastic linear scalability and high concurrency, without compromising performance or enterprise-class manageability.

The widespread growth in online transactions is being driven by the ubiquitous mobile revolution: the proliferation of mobile apps, mobile and micro-payments, and a dramatic increase in the number of users worldwide with mobile devices and internet access, down to farmers in villages. Current OLTP systems are bursting at the seams, and vendors like Oracle cannot keep up with this increase at a reasonable price per transaction without compromising performance. These systems need to be offloaded, migrated, or completely revamped to provide new interfaces and functionality, or to leverage microservices and container-based distributed application architectures.

Finally, the need for scale-out Operational Data Stores has also increased, commensurate with the increase in OLTP volumes. Call centers, voice recognition systems, and bots that provide support and services; real-time monitoring of goods in transit through the supply chain; immediate notification of financial transactions to mobile devices; and online or mobile account inquiries are among a myriad of back-end operational applications that increasingly need to scale out.

All this points to the fact that if distributed architectures are not part of your current OLTP / operational IT landscape, they soon will be. You want to architect a system that is not siloed, since the lines between operational and analytical workloads are blurring, and the latency introduced by moving, replicating, and transforming data between silos can be a huge competitive disadvantage.

Let’s explore some use cases.

Real-time paradigm shift

If your business has applications that need to process IoT data or events as they happen, then you have to deal with the volume and velocity of streaming data. You often have to deal with the variety of data as well, depending on the unstructured or semi-structured nature of the content. This could be:
• Security events captured from customer environments to be displayed in real-time on aggregated dashboards, for customers to monitor security and be notified of intrusions in their environment
• Managing buildings or shopping plazas for security and energy efficiency, with data from multiple sensors, such as from heating and air-conditioning systems, security camera feeds, lighting controls
• Billing for the use of toll roads, and detecting if the same license plate is being fraudulently used elsewhere at the same time
• Processing feeds from smart energy meters to manage the electric grid efficiently
• Reacting to real-time telemetry data from vehicles for various purposes

Examples of IoT or event-based systems are plentiful. Often, these events need contextual data (master data and transactional data). Telemetry data, for example, has additional relevance based on information about the driver, his or her current location, what the sensors are indicating about the current status of the car, and so on. Telling a driver in real time about dining and shopping options available at her current location is a commonly used example. More dramatically, during the recent Camp Fire in California, roads were being blockaded as the fire spread. GM provided its OnStar customers real-time information on alternate routes to avoid the fire and these blockages, based on telemetry data collected from the vehicles, real-time updates on how the fire was spreading and which roads were being blocked, and detailed maps of the area.

To deal with this increase in data volume and velocity, you need a scale-out, high-ingestion distributed database engine that can deliver operational queries and dashboards in real time or near-real time, and can deploy event-based actions driven by a rules engine or by models developed using BI and analytics. As 5G technologies become more prevalent, the volume and velocity of such events will increase even more dramatically, accentuating this need further.
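As a loose illustration of the event-based-action pattern described above, the following Python sketch evaluates a small rules engine against incoming telemetry events. All field names, thresholds, and actions here are hypothetical; in a production system the rules or models would run inside, or alongside, the distributed database as events are ingested.

```python
# Hypothetical sketch: applying simple rules to streaming telemetry events.
# Field names and thresholds are illustrative, not from any specific system.

RULES = [
    # (rule name, predicate over one event, action to trigger)
    ("overheat", lambda e: e.get("engine_temp_c", 0) > 110, "alert driver"),
    ("speeding", lambda e: e.get("speed_kmh", 0) > 130, "log incident"),
]

def evaluate(event):
    """Return the list of actions triggered by a single event."""
    return [action for name, pred, action in RULES if pred(event)]

def process_stream(events):
    """Evaluate each event as it arrives; a real system would dispatch
    the triggered actions (notifications, dashboard updates) immediately."""
    return [(e["vehicle_id"], evaluate(e)) for e in events]

events = [
    {"vehicle_id": "v1", "engine_temp_c": 120, "speed_kmh": 90},
    {"vehicle_id": "v2", "engine_temp_c": 95, "speed_kmh": 140},
]
print(process_stream(events))
# → [('v1', ['alert driver']), ('v2', ['log incident'])]
```

A model scored per event could slot in exactly where the rule predicates sit, which is why low-latency coexistence of scoring and ingestion matters.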

Existing OLTP / operational applications

Most enterprises do not consider migrating existing mission-critical applications unless there is a compelling motivator. Motivators can be a changing business environment necessitating a new architecture, new business requirements resulting in enhancements to applications, or both. Examples include the need to scale out to meet the demands of much larger transaction volumes, such as from mobile devices, the expansion of the business into an online consumer market, a merger of two companies, or moving into new markets – cities, regions, or countries. Such business changes could also result in rewriting or significantly enhancing applications. User interfaces may need to be redesigned. How transactions are completed may change due to differences among in-store, online, and mobile app interactions. The person completing the entire transaction may now be the end consumer rather than a clerk or cashier, and the interfaces for such a user would be substantially different. In all these cases, companies need to consider what architecture to host the revamped or extended application on, so that it is well positioned for the future. Other considerations, such as Cloud deployment models, integrating other types of data, and removing silos, may also play into these decisions.

An associated reason to migrate or offload existing operational environments may be the cost of scaling out or maintaining old or obsolete hardware and software technologies. For example, the specialized Oracle hardware and software that the current application is hosted on may be too expensive to maintain and grow, or may not even meet the scaling requirements at a competitive price.

Operational Data Store

Operational Data Stores (ODSs) consolidate pertinent enterprise operational data to monitor and improve the operational aspects of the business, and can be used to offload operational systems, for example by hosting call centers for customer product support and services. This applies to many enterprises that need scale-out ODS implementations because of the amount of operational data and its historical nature. Banks can have mission-critical operational systems feed data continuously to a distributed system, using change data capture (CDC), to offload those systems. Customer inquiries from mobile or online apps for account balances and statements, queries against historical transactions, and mailed or emailed statement generation, for both consumer and commercial customers, can then be serviced by the ODS. There are many applications of ODSs that can facilitate scale-out and provide relief to, and curtail the growth of, critical OLTP systems.
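As a minimal sketch of the CDC-based offload pattern described above, the following Python snippet replays insert/update/delete change events against an in-memory copy of an accounts table. The event shape and account fields are hypothetical; in practice the feed would come from a CDC tool and the target would be a distributed database serving the inquiry workload.

```python
# Hypothetical sketch of applying change-data-capture (CDC) events to an
# Operational Data Store copy of an accounts table.

def apply_change(store, event):
    """Apply one CDC event (insert/update/delete) to a keyed 'table'."""
    op, key, row = event["op"], event["key"], event.get("row")
    if op == "insert":
        store[key] = row
    elif op == "update":
        # Merge changed columns into the existing row.
        store[key] = {**store.get(key, {}), **row}
    elif op == "delete":
        store.pop(key, None)
    return store

ods_accounts = {}
cdc_feed = [
    {"op": "insert", "key": "acct1", "row": {"balance": 100}},
    {"op": "update", "key": "acct1", "row": {"balance": 80}},
    {"op": "insert", "key": "acct2", "row": {"balance": 50}},
    {"op": "delete", "key": "acct2"},
]
for ev in cdc_feed:
    apply_change(ods_accounts, ev)

print(ods_accounts)  # → {'acct1': {'balance': 80}}
```

Balance inquiries and statement queries then read this continuously updated copy, sparing the source OLTP system that read traffic.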

Greenfield operational applications

To develop new applications, there are many more architectures and technology stacks to consider than were available before. You need to consider options that offer:
• Deployment flexibility – capability to run on-prem or on different Cloud platforms
• Potential to scale elastically
• Ability to run multi-tenant environments
• Separation of compute and storage so that each can be scaled independently, potentially leveraging on-demand compute pricing models
• Open source software so that you are not dependent on a single company to deliver the functionality you need when you need it, and at a much lower price point
• A portable architecture that works with standards to allow integration and deployment flexibility – the ability to change components in the stack in the future

There are myriad examples of how Amazon, Google, LinkedIn, Facebook, Yahoo, Netflix, Uber, Lyft, Airbnb, and other such enterprises have developed greenfield applications in their respective areas to bring innovation and competitive advantage: from online shopping, to renting movies and streaming media content, to the real-time complexities of ride-sharing systems, to renting vacation homes, to hosting social media content, to building a search engine, to various other innovative ventures. These are all examples of scale-out operational systems.


Certainly, there has been, and will continue to be, a focus on scale-out implementations for Big Data BI and analytics applications. But it may be time to start assessing the OLTP and operational applications in your portfolio for opportunities where a scale-out architecture can provide a strategic advantage and position you well for the future.



About the Author:

Rohit Jain is Esgyn's Chief Technology Officer. Rohit has worn many hats in his career, including solutions architect, database consultant, developer, development manager, and product manager. Prior to joining Esgyn, Rohit was a Chief Technologist at Hewlett-Packard for SeaQuest and Trafodion. In his 39 years in applications and databases, Rohit has driven pioneering efforts in Massively Parallel Processing and distributed computing solutions for both operational and analytical workloads.