ETL 3.0 is driven by the business cost of putting data in motion. ETL is “plumbing” — it offers no direct value, but it is a critical part of the support system for the data warehouses and data marts which do provide direct value. In this light, ETL is a necessary cost. ETL 3.0 is shaped to manage and control that cost.
The driving factor of the modern IT shop is to operate at the speed of business. More than anything else, this means being able to respond rapidly to changes in the business climate. These changes come from the business units within the enterprise, from trading partners outside the enterprise and – for IT – from the continuous advancements of technology itself.
Whenever the business makes a new demand, IT must be prepared to satisfy that demand. It must do so rapidly. It must do so without major disruption. And it must be able to move forward when it needs to move forward.
The approach to technology that underlies a system contributes to the responsiveness of IT. And the core principles underlying the system’s architecture drive that approach.
Core Principles for an ETL Architecture
It is important to give appropriate weight to the principles that must drive the architectural foundation of the Extract, Transform and Load (ETL) system.
These three principles – Flexibility, Extensibility and Autonomy – have the greatest impact on IT’s ability to respond to changing business demand.
Flexibility is the key principle to guide all design, adoption and implementation choices. Flexibility means being able to adapt to forces of change, easily, swiftly and with minimum risk. Technology, products, the marketplace, or – especially – the business may impose necessary and beneficial change. Flexibility is essential to avoid “tear-up” when the inevitable changes occur.
Extensibility ranks right behind Flexibility. It is especially important when Flexibility must be compromised because of a limitation in a product or design choice. Extensibility means being able to take a product beyond its intended capabilities. This is the enabler of discovery and invention, two key elements of a vibrant IT organization. Extensibility allows you to overcome limitations, not with “workarounds”, but with solutions that are well-designed and architecturally sound.
Autonomy is the IT organization’s capacity for moving forward at its own pace. Autonomy is enabled by Flexibility, by Extensibility and by a skilled workforce. If an architected solution supports Autonomy, then the IT organization can take an active role in creating what is needed, when it is needed, to respond to a specific business demand. The IT organization is not dependent on, or encumbered by, the ability or desire or timetable of vendors or markets.
The Inevitability of Change
The business community will change – organizations recombine and refocus; partners and vendors join and separate; people come and go; priorities rise and fall.
And certainly technology changes – in fact, “technology churn” has been with us for decades. Often, the churning just produces cosmetic changes to old favorites. But, on occasion, the churning signals a significant adjustment – the replacement of some tried-and-true technology with a “new kid on the block” that will make a real difference.
When architecting a solution, we must account for the fact that today’s innovations are tomorrow’s legacy systems and legacy products. Anticipate the change that will inevitably come, by adopting Flexibility and Extensibility as core architectural principles.
ETL 3.0 Begins With Flexibility
The core principle of ETL 3.0 is rooted in the recognition of change as a constant and driving force. ETL 3.0 strives to embrace change — the same external pressures that prevented IT from servicing customer demands are anticipated by ETL 3.0, enabling IT to rapidly respond to the customer.
The benefits of an ETL 3.0 architecture address the costs of technical changes, but they go well beyond technology. An ETL 3.0 architecture enables direct business benefits as well.
In every IT shop, there has come the moment when the market unveils a new product, one which is better, faster and – most importantly – cheaper than what you have in place. Yet you cannot take advantage of these cost savings because the product doesn’t work just like your legacy products. The parts don’t fit together, so the cost savings are offset by the costs and risks of “rip-up-and-replace”.
ETL 3.0 attacks this problem by dictating that the ETL solution must be made up of discrete parts, with standardized, open, and loosely-coupled interfaces between parts. When the transitions in the process (from source system to Extract, from Extract to Transform, from Transform to Load, and from Load to target system) conform to the ETL 3.0 model, then any individual part of the process can be enhanced or replaced without disrupting the remaining parts.
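To make the idea of discrete parts with loosely-coupled interfaces concrete, here is a minimal sketch in Python. It is an illustration only, not part of the ETL 3.0 model itself: the record shape (a plain dict), the function names and the in-memory target are all assumptions made for the example. The point is that each stage only has to honor an agreed call signature, so one transform can be swapped for another without touching the extract or the load.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict  # assumed record shape for this sketch: one dict per row

# The "interfaces" are nothing more than agreed call signatures.
Extract = Callable[[], Iterable[Record]]
Transform = Callable[[Iterable[Record]], Iterable[Record]]
Load = Callable[[Iterable[Record]], int]


def run_pipeline(extract: Extract, transform: Transform, load: Load) -> int:
    """Wire three independent parts together; return the number of rows loaded."""
    return load(transform(extract()))


def extract_from_list() -> Iterator[Record]:
    """Hypothetical source system: two hard-coded rows."""
    yield {"id": 1, "name": " alice "}
    yield {"id": 2, "name": "BOB"}


def tidy_names(records: Iterable[Record]) -> Iterator[Record]:
    """One transform part: trim whitespace and title-case the name."""
    for r in records:
        yield {**r, "name": r["name"].strip().title()}


def upper_names(records: Iterable[Record]) -> Iterator[Record]:
    """A replacement transform part with the same signature."""
    for r in records:
        yield {**r, "name": r["name"].strip().upper()}


class ListTarget:
    """Hypothetical target system: collects rows in memory."""

    def __init__(self) -> None:
        self.rows: List[Record] = []

    def __call__(self, records: Iterable[Record]) -> int:
        before = len(self.rows)
        self.rows.extend(records)
        return len(self.rows) - before


# Swap the transform; the extract and load parts are untouched.
target_a = ListTarget()
loaded_a = run_pipeline(extract_from_list, tidy_names, target_a)

target_b = ListTarget()
loaded_b = run_pipeline(extract_from_list, upper_names, target_b)
```

In a real shop the interface would more likely be a file format, a message queue or a staging table than a Python call signature, but the principle is the same: the contract sits between the parts, not inside any one product.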
The business value lies in empowering IT to explore new product opportunities, by minimizing the disruptive cost of the exploration. A new transform component can be added in, co-existing with the existing legacy transform component, for a direct side-by-side comparison of features, functions and fit.
When a good fit is found, the ETL 3.0 model permits that same replacement model to ease the transition from the legacy tool to the updated tool. Because the interfaces are open, standardized and loosely-coupled, the old and new tools can run side-by-side. This allows the transition to be paced, driven not by the demands of the technology but by the capacity and capabilities and needs of the business.
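The side-by-side comparison described above can be sketched as a small harness that feeds the same input to a legacy transform and a candidate transform and reports where their outputs differ. This is a hedged illustration in Python: the `compare_transforms` helper, the `id` key and the two sample transforms are hypothetical names invented for the example, and a real comparison would also weigh performance, cost and operational fit.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Record = dict  # assumed record shape for this sketch
Transform = Callable[[List[Record]], List[Record]]


def compare_transforms(
    records: Iterable[Record],
    legacy: Transform,
    candidate: Transform,
    key: str = "id",
) -> List[Tuple[object, Optional[Record], Optional[Record]]]:
    """Run both transforms on the same input and list the rows that differ.

    Each entry is (key value, legacy row or None, candidate row or None).
    """
    rows = list(records)
    legacy_out = {r[key]: r for r in legacy(rows)}
    candidate_out = {r[key]: r for r in candidate(rows)}
    diffs = []
    for k in sorted(legacy_out.keys() | candidate_out.keys()):
        if legacy_out.get(k) != candidate_out.get(k):
            diffs.append((k, legacy_out.get(k), candidate_out.get(k)))
    return diffs


def legacy_transform(records: List[Record]) -> List[Record]:
    """Hypothetical legacy behavior: drop rows that have no amount."""
    return [
        {**r, "amount": round(r["amount"], 2)}
        for r in records
        if r["amount"] is not None
    ]


def candidate_transform(records: List[Record]) -> List[Record]:
    """Hypothetical candidate behavior: keep such rows with an amount of 0.0."""
    return [{**r, "amount": round(r["amount"] or 0.0, 2)} for r in records]


sample = [
    {"id": 1, "amount": 10.456},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 7.0},
]
differences = compare_transforms(sample, legacy_transform, candidate_transform)
for row_key, old_row, new_row in differences:
    print(row_key, old_row, new_row)
```

Because both transforms accept and return the same record shape, the harness needs no knowledge of either tool's internals. That is what allows the old and new parts to run next to each other for as long as the business needs.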
Bottom Line: The ETL 3.0 model empowers IT to better serve the business, by investigating and adopting new products with minimal disruption and side-effects, leveraging the historic technology trend of better, faster, cheaper products.
As a data warehouse grows to meet the needs of the business, it is likely that the number of systems and tools making up that warehouse – the “plumbing” – will expand. Today’s market is dominated by a few very large vendors and a smattering of smaller vendors. The technologies and tools offered by both large and small vendors are continuously in flux, and every tool and platform requires a different set of skills, experience and expertise. As this mix grows within a single IT shop, staffing that shop becomes more difficult – and more expensive.
ETL 3.0 attacks this problem by dictating that the ETL solution must be made up of discrete parts, with standardized, open, and loosely-coupled interfaces between parts. Because the parts are discrete and the interfaces are loosely-coupled, the ETL development staff can likewise be comprised of discrete and loosely-coupled staff members.
Consider: if an ETL environment consists of moving data from an IBM mainframe to an Oracle database using Informatica, the ETL staff is typically made up of people who know Informatica and Oracle, or Informatica and mainframe. Candidates who are expert in Informatica but don’t know Oracle, or who are strong in Oracle but don’t know Informatica, are not as highly regarded as someone who is less expert in both.
Now consider what happens when change occurs: imagine that the ETL environment is expanded to include a Teradata system, co-existing with Oracle. And some of the ETL will be processed using Talend. What skills does the new ETL staff candidate need to have? The search for someone experienced with mainframe and Informatica and Talend and Oracle and Teradata will probably reduce to zero candidates.
In an ETL 3.0 environment, IT is freed from making such a choice. Each tool or platform can be staffed by the most affordable skilled candidate for that position. The staff members work as loosely-coupled teams to obtain the maximum value from each tool or from each platform.
Bottom Line: The ETL 3.0 model empowers IT to better serve the business, by drawing the best, most skilled, most affordable ETL staff members from a large pool of candidates.
What are other ways that the ETL 3.0 model empowers IT? In the next dispatch, I’ll be talking about the two essentials of advancing business value – invention and “one-off solutions”.
ETL 3.0™ is a trademark of BVWatson LLC, copyright 2010, all rights reserved.