View Single Post
  #12  
Old February 25th, 2010, 06:32 PM posted to microsoft.public.access.tablesdbdesign
David W. Fenton
external usenet poster
 
Posts: 3,373
Default I was told "Fields are expensive, records are cheap"

=?Utf-8?B?RGVubmlz?= wrote in
news
However, most personal lines and
commercial lines insurance policy and claim tables can get very
wide.


That suggests they are probably denormalized so some degree.

I know a number of popular applications (ACT springs to mind) use
very wide tables, and this means their application has to be more
complicated to work with the data. It also is one of the main causes
of the performance problems that ACT has in terms of filtering large
datasets -- it's harder to efficiently use indexes when you've got
the data denormalized (and some of the data in ACT (last time I
checked, which was many, many years ago) seems to be stored in
multi-value fields, which is yet another reason for performance
problems, since even if they are indexed, they can't be used
efficiently).

I will agree that a wide table *can* be normalized.

I will also agree that there can be perfectly good reasons to
denormalize.

But I have very few tables in any of my apps that have even close to
50 fields.

A common reason for denormalization that looks properly normalized
is mis-definition of the entities (i.e., a table holds records, each
of which represents an entity, and each field is an attribute of the
entity). In a survey, for instance, it's very common to create a
single record for each survey, with a field for each question. But
this approach is completely unextensible -- any change to the
questionaire requires a change to the table. Or you have to create a
table for each questionaire. Or you have to add fields that are used
in some questionaires and not in others.

I learned this lesson with a medical study database that was tied to
the structure of the UNOS national transplant registry. There was
only one medical study and for it to work the variables being
examined had to be established at the beginning, so it was never
going to change. It seemed logical to use a record per patient. But
before the app was even finished, UNOS changed the data structure of
how they recorded data, and we ended up having to add fields to the
existing tables and then change all the forms and reports to reflect
that. If we'd used a proper structure, i.e., with a record for each
variable, and a table that defined what the variables were, we would
have needed to make absolutely no changes to the *structure* of the
data or the application -- we would have been able to accomodate the
additional variables just by adding *data* to the existing tables.

That's the essence of the problem. Fields are expensive, records are
cheap -- it's easy to change the data, hard to change the data
storage *structure*.

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/