
Read and comment, that's what makes it work.

Monday, August 15, 2005

DBMS2 - A new box for your old shoes

A week ago, Curt Monash wrote a piece for Computerworld titled "Time for a New View of Data Management", decrying the proliferation of schema complexity and sounding the death knell for traditional database management systems. Curt puts forth an idea he calls DBMS2, short for "database management systems services". This is basically a distributed data architecture, achieved either through federation or within a service-oriented framework.

While I don't disagree with the direction of Curt's piece, there are a few very important nits I would pick with it.

(1) This is hardly new. All the overblown drama aside ("Database management is in a crisis", "the pure relational model is collapsing under its own weight"), slapping a new name on something doesn't make it new, especially if the name is as unwieldy as "DBMS2". What the article describes is the logical implication of best practices around SOA.

In the initial iteration of service oriented architectures, we saw application and process "hooks" exposed so that they could be invoked or addressed by external systems. This did not put EII "in a crisis", or cause anyone to conclude that EAI is "collapsing under its own weight". Instead, web services became just another tool in the toolkit for addressing application and database interoperability.

In the next round, which is unfolding right now, we are seeing true federation of applications and databases. This is the scenario Curt describes as DBMS2. This means that data does not need to be centralized and normalized in one gigantic storehouse, but instead you can "play it where it lies". Certainly metadata management becomes a more critical skill as you reconcile multiple native schemas, but the schemas do not need to be consolidated, merely cross-referenced.
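To make "cross-referenced, not consolidated" concrete, here is a minimal sketch. Everything in it is hypothetical and for illustration only: the two systems (`crm`, `billing`), their field names, and the `resolve` helper are my assumptions, not anything from Curt's article or from a particular product. The point is just that the metadata layer maps a logical attribute to wherever it natively lives, and the native schemas stay untouched.

```python
# Hypothetical native records, each left in its own system's schema.
CRM_RECORD = {"cust_id": "C-1001", "full_name": "Pat Lee", "postal": "30301"}
BILLING_RECORD = {"acct_no": "887-22", "name_on_account": "Pat Lee", "zip_code": "30301"}

# Stand-ins for the live systems; in practice these would be service calls.
SOURCES = {"crm": CRM_RECORD, "billing": BILLING_RECORD}

# The cross-reference metadata: logical attribute -> (system, native field).
# This is the only thing that has to be managed centrally.
CROSS_REFERENCE = {
    "customer_name": [("crm", "full_name"), ("billing", "name_on_account")],
    "postal_code": [("crm", "postal"), ("billing", "zip_code")],
}


def resolve(attribute):
    """Return {system: value} for a logical attribute, reading each source in place."""
    return {
        system: SOURCES[system][native_field]
        for system, native_field in CROSS_REFERENCE[attribute]
    }


if __name__ == "__main__":
    print(resolve("postal_code"))
```

Neither source record is copied or reshaped; adding a third system means adding entries to the cross-reference, not redesigning a master schema.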

The article is basically arguing that since our dictionaries are getting to be so big, we should shorten up all our words (Curt proposes "drastic limitations on relational schema complexity") rather than try to alphabetize by the second, third, fourth letters and beyond. This is flawed logic, primarily since much of the "verbiage" generated by data management processes today is purpose-specific. Why do we need to incorporate one-off data into a master schema that applies on an enterprise basis? From an SOA perspective, we don't. That data is mostly overhead, useful only to the application environment in which it is generated.

If we need, as Curt suggests by example, full click-through website utilization data in the customer profile (and I disagree, but for the sake of argument let's see it through), then the logical question is "where do we need it?" Certainly not at the bank teller window, not on the ATM screen, not in the mail room. For those few places where I DO need it, I can dynamically request it from the web channel management application where it is natively stored, I can parse and normalize and reformat it on the fly as needed, and then flush it. All I need in the customer master is a pointer to show where the data lies. That's more logical, it works well within the existing SOA roadmaps I've seen, and it is, in fact, the way it is being done today.
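The pointer-and-fetch pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any real system: `CustomerMaster`, `WEB_CHANNEL_STORE`, `fetch_clickstream`, and the record layout are all names I made up, and the in-memory dictionary stands in for what would be a service call to the web channel application.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerMaster:
    """Customer master record: core attributes plus pointers, no clickstream data."""
    customer_id: str
    name: str
    # logical data set -> (system name, key inside that system)
    pointers: dict = field(default_factory=dict)


# Hypothetical stand-in for the web channel management application's own store.
WEB_CHANNEL_STORE = {
    "web-42": [
        {"URL": "/Loans/Rates", "ts": "2005-08-01T10:15:00"},
        {"URL": "/loans/apply", "ts": "2005-08-01T10:17:30"},
    ]
}


def fetch_clickstream(system, key):
    """Request the raw data from the system that natively stores it."""
    if system != "web_channel":
        raise LookupError(f"no adapter for system {system!r}")
    return WEB_CHANNEL_STORE[key]


def normalize(raw_hits):
    """Reformat native records into the shape the caller wants, on the fly."""
    return [{"page": hit["URL"].lower(), "timestamp": hit["ts"]} for hit in raw_hits]


def clickstream_summary(customer):
    """Follow the pointer, fetch, normalize, summarize, and keep nothing else."""
    system, key = customer.pointers["clickstream"]
    hits = normalize(fetch_clickstream(system, key))
    # Only the summary is returned; the normalized hits are discarded on return.
    return {
        "customer_id": customer.customer_id,
        "page_views": len(hits),
        "pages": sorted({hit["page"] for hit in hits}),
    }


if __name__ == "__main__":
    pat = CustomerMaster("C-1001", "Pat Lee", {"clickstream": ("web_channel", "web-42")})
    print(clickstream_summary(pat))
```

The teller window and the ATM never call `clickstream_summary`, so they never pay for the data; the customer master carries one small pointer either way.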

(2) There are three critical factors in determining the architecture of an enterprise database application: performance, performance and performance. One bank calls it a "newspaper event" when the customer master system goes down. In the largest institutions, where you're architecting systems to hold hundreds of millions of customer accounts and handle over a thousand transactions per second, a mainframe-based solution is - at least for now - your only viable choice.

EAI, for example, is an elegant solution for consolidation of denormalized data from multiple source systems. At high transaction rates though, the BPM layer becomes a bottleneck and simply cannot keep pace. At that point, the data needs to be normalized and consolidated for rapid request-response performance. And that usually requires a big database.

I've even talked to one bank so concerned about performance that they are not confident a J2EE environment - foundational for their SOA - can handle it. They are simultaneously testing CICS transactions in a side-by-side speed test with J2EE.

It isn't just the hardware that is a concern. Firmware and software also need to be optimized and tested for HA and HR. The DO-178B Level A standard requires extensive design, test, and process documentation. Testing includes demanding criteria such as modified condition/decision coverage (MC/DC). This methodology requires that every statement and every decision outcome in the code be exercised, and that each condition within a decision be shown to independently affect that decision's outcome. It is very laborious and therefore greatly increases the cost of software components. Since every ad hoc data environment envisioned under the article's "DBMS2" would conceivably be different, this kind of testing and certification for HA and HR is not merely difficult, it's impossible. And for many customers, that's unacceptable.
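For a feel of what condition/decision coverage demands, here is a small sketch. The `decision` function and its three conditions are hypothetical, invented for illustration, and the checker implements the simple "unique cause" reading of the criterion: for each condition there must be a pair of tests that differ only in that condition and produce different outcomes. It is a toy, not a certification tool.

```python
from itertools import combinations


def decision(authenticated, has_funds, has_overdraft):
    """Hypothetical decision under test: approve a withdrawal or not."""
    return authenticated and (has_funds or has_overdraft)


# Four test vectors for three conditions (n + 1 is the usual minimum).
TEST_VECTORS = [
    (True, True, False),    # approved
    (False, True, False),   # denied: only 'authenticated' changed from the first
    (True, False, False),   # denied: only 'has_funds' changed from the first
    (True, False, True),    # approved: only 'has_overdraft' changed from the third
]


def conditions_with_independence_pair(func, vectors):
    """Return indexes of conditions shown to independently flip the outcome."""
    covered = set()
    for first, second in combinations(vectors, 2):
        differing = [i for i in range(len(first)) if first[i] != second[i]]
        # A valid pair changes exactly one condition and changes the result.
        if len(differing) == 1 and func(*first) != func(*second):
            covered.add(differing[0])
    return covered


if __name__ == "__main__":
    covered = conditions_with_independence_pair(decision, TEST_VECTORS)
    print(sorted(covered))
```

Even this three-condition toy needs four carefully chosen tests. Multiply that by every decision in every component, then by every distinct ad hoc data environment, and the cost argument makes itself.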

(I'm trying - really trying - to envision MySQL "farms" of massively parallel database operations, and it's not coming together for me.)

(3) Not this week, not this year, not this decade. A truly federated data environment is a glorious vision, but realistically in terms of benefits achieved, it probably takes a back seat to SOA. I don't think that the explosion in schema complexity is likely to outpace the continuing explosion in storage capacity and computing power (sort of Moore's law for data). Despite the ominous overtones in the article, the pressure to decommission the massive DB2 and Oracle and Sybase database infrastructures in corporate America is not very intense at the moment, and I doubt we'll feel the heat any time soon.

Don't give up on managing your metadata and your schemas just yet, because they're not going away any time soon!

1 Comment:

john parker said...

Comment from the author here.

My reply here.

12:27 PM  


Copyright 2005 CDI Station. All Rights Reserved. Reproduction of this publication in any form without prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. CDI Station disclaims all warranties as to the accuracy, completeness or adequacy of such information. CDI Station shall have no liability for errors, omissions or inadequacies in the information contained herein or for interpretations thereof. The opinions expressed herein are subject to change without notice.