Monday, 29 October 2012

Schema-less or Schema-more?

If you are an advocate of schema-less design, this post isn't for you. Someone will come along later and tidy up your mess, not to worry. If, however, you are the one trying to assemble a fragmented patchwork of conflicting information by unravelling a granny-knot of select distincts and full outer joins, fan traps, chasm traps, and traps that defy description, this might just help.

The rules for laying out a data model are not complicated. I'm not really sure why they aren't more widely followed. They enable such rapid acquisition of insight that you will quickly feel you have an unfair advantage over others who do not approach data modelling or analysis this way.

The basic idea is that cardinality flows down the page, and process flows across it.

The rules are as below. They are pretty dry, so you'll probably want to flick back to them. In my next post, I'll illustrate how they work and what you can learn from this visual language (the answer is - A LOT!)
  1. Detail flows down - 1:n parent-child relationships go vertically STRAIGHT down, with the parent at the top. If you find they are not straight, make the entities wider!
  2. Process flows left to right - 1:1 relationships or 1:n relationships that result from a state transition or high level process are horizontally STRAIGHT across, with the parent to the left. If they start zig-zagging, make the entities taller!
  3. One subject area per page - As a rule of thumb that's between about 5 and 35 entities. If you need to make lines cross a lot, you are either looking at duplicated relationships, or multiple subject areas on a single diagram.
  4. Organise the entities into tiers - The tiers go down the page. I use Coad's UML entity types (Description, Party/Place/Thing, Role, Moment), expressed in more human-friendly language (Reference data,  Master Data, Agreements or Roles, Transactions)
  5. Represent Subtypes with the Barker notation 
    • Use nested boxes for strongly-typed subtypes (when subtypes fall into different entities). 
    • Don't USUALLY represent weakly-typed subtypes on the diagram (i.e. when subtypes are distinguished by a SUBTYPE attribute rather than by an explicitly declared subtype).
    • Strongly typed subtypes are mutually exclusive (i.e. an entity can't be two things at once).
    • Make the subtypes complete (a subtype of "Other" draws attention to an area of incomplete understanding).

Saturday, 18 June 2011

The Data Architect's Manifesto

It's easy to find pithy and authoritative exhortations to a data-driven design approach.

Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

Representation Is the Essence of Programming
Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious.

Smart data structures and dumb code works a lot better than the other way around.

But we need to move beyond the subjective measures ("right", "organized", "obvious", "smart" or "dumb") if we are to escape the endless circular debates about data and data quality. Dave suggested some less subjective measures, which he expressed in his Data Architect's Manifesto:

The success of an information system is dependent on the following qualities of its data:

1 Completeness
2 Correctness
3 Clear Shared Meaning
4 Conciseness
5 Adaptability

One of the key practical lessons Dave taught me was that you can organize the presentation of a data model in such a way that, if you know how to read it, these qualities are pretty much revealed both to you and, with a little effort, to your business.

In the next post, I'll begin exploring how this is done.

Monday, 13 June 2011

The Matisse of Data Modelling

[ These ] miracles of pure line ... beguile us and take our breath away at Matisse’s sheer virtuosity in making a simple line evoke the complexities of space and form.
I am fortunate to have worked alongside some very good software architects. The best show refined instincts for designing solutions to complex and seemingly intractable problems with apparent ease. These solutions seem natural, with well-defined components inter-operating predictably. They are readily adapted to a changing environment. The same practitioners also show skill in immediately locating exactly how and where other solutions fail.

Between 2005 and 2008 I led a database team during a period of frenetic change. One of the first recruits was an experienced contractor called Dave. Dave had learned data-centric development while working under Richard Barker (author of the Oracle Case Method) at Oracle. He effortlessly and repeatedly absorbed awkward and incomplete problem specifications, producing blueprints that both described and addressed the problem more accurately, more completely, and infinitely more plainly than its author or sponsor had begun to articulate it, and with infinitely more clarity than any accompanying documentation.

His designs were complete, correct and concise. They were rarely substantially re-drafted, largely proofed against challenge, and almost magically adaptable to "new" requirements. The resulting software was delivered on time, and worked. The designs and resulting software seemed to define the essence of the subject quite completely. To my eye they were done with more than a degree of artistry.

I was forced to accept that Dave had perfected a practical art-form in our professional domain, an art-form in which I was no more than a neophyte, and in which it was clear most of our peers lacked both skill and even interest.

I was compelled to learn how this was done. I don't believe there is a process to follow. Rather, as fortune favours the prepared mind, there are concepts and techniques which prepare a data architect's mind to understand, design and communicate information problems and solutions.

What follows will, I hope, be a formalisation of what I learned: from Dave's zen-mastery of this art, and from the other expert practitioners I have been privileged to work with.