View Single Post
  #29  
Old February 27th, 2010, 12:07 AM posted to microsoft.public.access.tablesdbdesign
David W. Fenton
external usenet poster
 
Posts: 3,373
Default I was told "Fields are expensive, records are cheap"

=?Utf-8?B?RGVubmlz?= wrote in
news
Your comment “When table modifications call for lots of new
fields it often means that a one-to-many relationship that ought
to be set up in multiple tables is being shoe-horned into a single
table.”

It also means that you have to alter your front-end forms and
reports by adding all the fields, whereas if you're adding records
instead, whatever continuous or datasheet form you're using to
display those records will just have more records in it, so there
is no need to alter then user interface objects.

Response: I don’t understand you comment. It is absolutely
that that is I add more records, I don’t’ have altered
anything. But when the user asks where to they enter these new
fields and where the new reports are, what am I suppose to say.
Well, I did not do add those fields but you can enter more data?
That makes no sense at all.


I suspect you haven't looked at Duane's survey database. In the
structure there (which is a fairly extreme version of what I'm
talking about), when you add a new question to a survey, you add a
new record to the questions table, and link it to a particular
survey. This will then be displayed by the same user interface that
was used to display the other questions.

In the structure you're suggestion, you'd have to add a field for
the question to your survey table and then alter your form for data
entry, and your report for printing out, and all of the tools you
use to analyze and summarize the results.

Now, a lot of statistical tools expect a flat-file structure, with
one record for each completed survey, so it's usually the case that
one takes the many-records structure and denormalizes it into a flat
structure that is exported for use by the statistics program. The
key there is that you don't choose your data structure based on how
you're going to output it. Instead, you choose the structure that is
going to be most extensible with the least amount of work, and then
you take that data and convert it to the structures required for
particular purposes, which is very often a denormalized structure.

I’ve yet to figure how to capture additional data and produce
reports with this additional data without modifying the UI or
creating the new reports. Sorry, I have no idea what you mean by
the above statement.


I strongly recommend you examine Duane's survey sample database.

Your comment: To *me*, that's what the phrase means, that adding
fields is more complicated in terms of UI, which, to be frank, is
where 90% of our time is spent in the development process (rather
than in schema design/changes).

My response: You are preaching to the choir! I fully agree.

However, this project started out a simple membership mailing
list. And that is all it was. Once the users saw it working,
their imagination went wild and they wanted to start capturing
additional information so they could produce additional reports.


I have plenty of databases that are badly structured because they
started out small, when it seemed it wasn't necessary to choose an
industrial-strength schema, and then when they got extended, I
greatly regretted not doing it the right way on the front end.

So, the existing app is no justification for continuing to use a
less than optimal schema.

When this project started, I knew about the other report and
strongly suggested that they just capture the additional fields
and produce all of the report. However, the users kept saying no,
no, no we don’t need at that stuff. The process is working just
fine. And the boss man said – just do the mailing labels and
that all. So that is what I did. I designed it for the full
requirements, but only implemented the mailing label piece.

But what are you supposed to do when the user, who is paying you,
say no we don’t need that when you know that are going to need
it? You can only push the customer so far.


The approach I'm advocating doesn't take more time to implement to
cover the initial requirements and is inherently extensible, without
any additional programming for the UI.

You've front-loaded the programming to handle the future extensions,
which is something I do, too, but the suggestion by those advocating
records over fields is that there is no extra work involved. You
don't have to do anything extra to prepare for the N new fields,
because the structure you've built handles any number of records
from the very beginning.

Have you ever implemented a many-to-many join when the current
requirement was only a many-to-one? Say the client has customers and
there's a customer type. You can store it in a field in the customer
record, but what if the client later decides that a customer can be
more than one customer type simultaneously? At that point you need a
many-to-many join between customers and customer types. If you
structured it that way from the beginning, it would handle 1 type
per customer, and then would handle as many customer types as you
needed when the scope expands.

This is the simplest example of the cheap records vs. expensive
fields equation. And the point is that as soon as the client needs
more than 1 type per customer, you have to decide how many you're
going to accomodate. You could, I guess, add a second customer type
field, but then when they need three, you'd have to add a third new
field. With the join table, you accomodate anything from 0 to
however many you could ever conceive of. It's not more complicated
to implement, and when the customer needs more than 1, there is
nothing that needs to be done (other than, perhaps, altering your UI
to remove any limitations you may have previously put on entry).

But again, none of this had anything to do with the original disk
access speed vs. Access internal process speed question.


That was your red herring. You asked for an explanation of the
expression. You misinterpreted it and went down the access speed
garden path. I'm explaining what the phrase refers to, and it is, in
fact, more efficient to add records than fields (consider the number
of indexes that have to be maintained/searched).

--
David W. Fenton http://www.dfenton.com/
usenet at dfenton dot com http://www.dfenton.com/DFA/